  1. May 2024
    1. Reviewer #2 (Public Review):

      The authors characterized activity of the dorsal periaqueductal gray (dPAG) - basolateral amygdala (BLA) circuit. They show that BLA cells that are activated by dPAG stimulation are also more likely to be activated by a robot predator. These same cells are also more likely to display synchronous firing.

      The authors also replicate prior results showing that dPAG stimulation evokes fear and the dPAG is activated by a predator.

      Lastly, the report performs anatomical tracing to show that the dPAG may act on the BLA via the paraventricular thalamus (PVT). Indeed, the PVT receives dPAG projections and also projects to the BLA. However, the authors do not show if the PVT mediates dPAG to BLA communication with any functional behavioral assay. Furthermore, the authors also do not thoroughly characterize the activity of BLA cells during the predatory assay.

      The major impact in the field would be to add evidence to their prior work, strengthening the view that the BLA can be downstream of the dPAG.

    2. Reviewer #3 (Public Review):

      In the present study, the authors examined how dPAG neurons respond to predatory threats and how dPAG and BLA communicate threat signals. The authors employed single-unit recording and optogenetics tools to address these issues in an 'approach food-avoid predator' paradigm. They characterized dPAG and BLA neurons responsive to a looming robot predator and found that dPAG opto-stimulation elicited fleeing and increased BLA activity. Importantly, they found that dPAG stimulation produces activity changes in subpopulations of BLA neurons related to predator detection, thus supporting the idea that dPAG conveys innate fear signals to the amygdala. In addition, injections of anterograde and retrograde tracers into the dPAG and BLA, respectively, along with the examination of c-FOS activity in midline thalamic relay stations, suggest that the paraventricular nucleus of the thalamus (PVT) may serve as a mediator of dPAG to BLA neurotransmission. Of relevance, the study helps to validate an important concept that dPAG mediates primal fear emotion and may engage upstream amygdalar targets to evoke defensive responses. The series of experiments provide a compelling case for supporting their conclusions. The study brings important concepts revealing dynamics of fear-related circuits particularly attractive to a broad audience, from basic scientists interested in neural circuits to psychiatrists.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      We sincerely value the insightful and constructive feedback (italicized) provided by the reviewers, which has been instrumental in identifying areas of our manuscript that required further clarification or amendment. In response to these valuable comments, we have significantly revised the manuscript to enhance clarity and accuracy. Specifically, we have corrected an oversight related to the robot’s velocity and secondary antibody ratios, and addressed previously missing values in Figs. 3E and 4E. Importantly, these corrections did not alter the outcomes of our results. Additionally, we have enriched our manuscript with new data analyses, as reflected in Figures 1B, 1F, 2H-J, 4D, 4F-H, S1A, S1C-E, S3H, S5, and Table 1, ensuring a more comprehensive presentation of our findings. Below are our responses detailing each comment and explaining the modifications integrated into the revised manuscript.

      Reviewer 1:

      (1) To address the question of whether PAG photostimulation biases the cells that respond to the robot, a counterbalanced experiment, in which the BLA activity is initially recorded during the foraging vs. robot test and the PAG stimulation happens at the end of the session, should have been performed.

      In our study, we investigated fear behavior and BLA cell responses to intrinsic dPAG photostimulation (320 pulses) in naïve animals, followed by their reactions to an extrinsic predatory robot. We recognize the reviewer's concern regarding the potential influence of initial dPAG photostimulation on BLA neuron responses to the robot. We address this issue in our discussion (pg. 13) as follows: “However, it is crucial to consider the recent discovery that optogenetic stimulation of CA3 neurons (3000 pulses) leads to gain-of-function changes in CA3-CA3 recurrent (monosynaptic) excitatory synapses (Oishi et al., 2019). Although there is no direct connection between dPAG neurons and the BLA (Vianna and Brandao 2003, McNally, Johansen, and Blair 2011, Cameron et al. 1995), and no studies have yet demonstrated gain-of-function changes in polysynaptic pathways to our knowledge, the potential for our dPAG photostimulation (320 pulses) to induce similar changes in amygdalar neurons, thereby enhancing their sensitivity to predatory threats, cannot be dismissed.”

      (2) In Figure 3, it is unclear which criteria (e.g. response latency, minimum Z score, spike fidelity) were used to identify the BLA neurons that were indirectly activated by PAG stimulation. A graphic containing at least the distribution of the response latencies for each BLA neuron after PAG laser activation is needed.

      We have specified the criteria for determining the responsiveness of BLA neurons to dPAG stimulation on page 22. This involves analyzing the first 500-ms post-stimulation across five 0.1-s bins. Units were classified as ‘stim cells’ if they showed z-scores greater than 3 (z > 3) in any of the bins during the initial 500-ms period post-stimulation. Neurons activated by both pellet procurement and dPAG stimulation were not included in the 'stim cell' category. Additionally, we have included a graphic in the revised manuscript (Fig. S3C) that presents the distribution of response latencies of BLA neurons to dPAG stimulation.
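      As a rough illustration of the classification rule described in this response, the Python sketch below flags a unit as a 'stim cell' when any of the five 0.1-s bins in the first 500 ms after stimulation exceeds z = 3, and excludes units that also respond to pellet procurement. The function name, the `pellet_responsive` flag, and the input layout are assumptions made for illustration; this is not the authors' analysis code.

```python
import numpy as np

def is_stim_cell(z_post, pellet_responsive=False, z_threshold=3.0, n_bins=5):
    """Apply the 'z > 3 in any of the first five 0.1-s bins' rule to one unit.

    z_post: z-scored firing rates in 0.1-s bins, starting at stimulation onset
            (assumed layout).
    pellet_responsive: hypothetical flag; units also activated by pellet
            procurement are not counted as stim cells.
    """
    z_post = np.asarray(z_post, dtype=float)
    if z_post.size < n_bins:
        raise ValueError("need at least n_bins post-stimulation bins")
    if pellet_responsive:
        return False
    # Only the initial 500 ms (five 0.1-s bins) is inspected.
    return bool(np.any(z_post[:n_bins] > z_threshold))
```

      Under these assumptions, a unit whose only suprathreshold bin falls after the first 500 ms would not be classified as a stim cell.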

      (3) To strengthen the claim that it is a BLA-PVT-PAG circuit that carries information about predatory threat, a new experiment using CTB and cFos could be used to demonstrate that PAG neurons that project to PVT are recruited during the robot exposure.

      Our study primarily aimed to explore the transmission of threat signals between the dPAG and BLA. We acknowledge that our evidence for the PVT’s intermediary role, derived from CTB injections in the BLA and subsequent CTB+cFos co-labeling analysis in the PVT (Fig. 4G and 4H), is limited. Accordingly, we have moderated the emphasis on the PVT’s involvement in both the abstract and introduction. We now present the PVT’s role as a promising direction for future research in the discussion section of our revised manuscript.

      (4) In Fig 2, the authors' interpretation is that photostimulation of PAG neurons elicits fleeing responses in the rats. However, there is a vast literature demonstrating that the PAG is also involved in nociception. Although this is recognized by the authors in the first part of the introduction and briefly described in the discussion, the authors should more explicitly explain that PAG stimulation produces analgesia and thus is unlikely to underlie the escaping responses observed. This may not be intuitive for a broader audience.

      We appreciate the reviewer's insightful suggestion to elaborate on the PAG involvement in nociception and analgesia, as supported by the literature. While our initial manuscript acknowledged these functions, we have now expanded our discussion to address the PAG’s multifaceted roles (pg. 12): “As mentioned in the introduction, the dPAG is recognized as part of the ascending nociceptive pathway to the BLA (De Oca et al. 1998, Gross and Canteras 2012, Herry and Johansen 2014, Kim, Rison, and Fanselow 1993, Ressler and Maren 2019, Walker and Davis 1997). The dPAG is also implicated in non-opioid analgesia (e.g., Bagley and Ingram 2020, Cannon et al. 1982, Fields 2000). However, it is essential to emphasize that, despite its roles in pain modulation, the primary behavior observed in dPAG-stimulated, naive rats foraging for food in an open arena was goal-directed escape to the safe nest, underscoring the dPAG’s critical function in survival behaviors.” Note that this aligns with human studies on PAG stimulation (e.g., Carrive and Morgan 2012, Magierek et al. 2003), particularly those by Amano et al. (Amano et al. 1982), which reported patients feeling an urge to escape, similar to being chased, upon PAG stimulation.

      (5) To truly demonstrate the functional links between the PAG and BLA, more experiments are needed. For example, one could record from BLA neurons during the robot surge while performing optogenetic inhibition of the PAG neurons. There is also no evidence that activity in the indirect pathway that connects the PAG to the BLA is indispensable for the expression of defensive responses towards the robot (e.g., causality tests using chemogenetic or optogenetic inactivation).

      We agree that incorporating optogenetic inhibition of PAG neurons while simultaneously recording from BLA neurons during a robot surge would strengthen the evidence for the functional connectivity between the PAG and BLA. Such an experiment would necessitate the transfection and photoinhibition of a wide array of dPAG neurons responsive to predatory threats. This procedure is technically more viable in transgenic mouse models, given their suitability for genetic manipulation. In light of this, and in response to the suggestions in the Joint Public Review, we have revised the abstract, introduction, and discussion to offer a more cautious interpretation of our findings. This revision reflects a careful consideration of both the evidence and the limitations inherent in our study (pg. 13): “While our findings demonstrate that opto-stimulation of the dPAG is sufficient to trigger both fleeing behavior and increased BLA activity, we have not established that the dPAG is necessary for the BLA’s response to predatory threats. To establish causality, it is essential to conduct experiments such as optogenetic inhibition to determine whether the dPAG is indispensable for activating BLA neurons and initiating escape behavior in the face of threats. The complexity of targeting the dPAG, which includes its dorsomedial, dorsolateral, lateral, and ventrolateral subdivisions (e.g., Bandler, Carrive, and Zhang 1991, Bandler and Keay 1996, Carrive 1993), suggests the need for future studies using transgenic mouse models. Should inactivation of the dPAG negate the BLA's response to predatory threats, it would underscore the dPAG's central role in this defensive mechanism. Conversely, if BLA responses remain unaffected by dPAG inactivation, this could indicate the existence of multiple pathways for antipredatory defense mechanisms.”

      (6) The manuscript lacks information about the number of rats and trials that were used across the experiments (e.g. Fig 2G-J). On some occasions, the authors start the experiments with a specific number of animals and then reduce the N by half without providing a rationale (e.g. Fig. 3). Equally confusing is the experimental timeline. For example: a) Were the pre-robot, robot, and post-robot sessions always performed within the same day? b) It was described that microdrivable arrays were used, but did the same rats experience the robot test more than one time? c) How many bins were used for normalization during the Z-score calculation and when were the data binned at 100 ms versus 1 s? d) How many trials were used for each analysis? For example, to identify robot cells, did the authors establish a minimum number of trials per animal to calculate the peristimulus time histograms? Having a significant number of trials is critical to make sure that the observed neuronal responses are replicable across the trials. e) How was the neuronal activity related to "pellet retrieval" aligned during robot sessions? Was the activity aligned with the moment in which the rat touches the pellet or when the animal returns to the nest with the pellet? f) How did the authors control for trials in which the rat consumed the pellets in the same location vs. those in which they returned to the nest to eat it? All these points are extremely important for future replicability.

      We apologize for any confusion caused by the initial lack of detail in our experimental procedures. The revised manuscript has been updated with comprehensive methodological details:  

      (i) The study involved thirteen rats (ChR2, n = 9; EYFP, n = 4), subjected to dPAG stimulation using fixed light parameters (473 nm, 20 Hz, 10-ms pulse width, 2 s duration) during Long and Short pellet distance trials (refer to Fig. 2E-G). The stimulation intensity was adjusted to each animal's response (fleeing behavior), ranging from 1-3 mW. Additional testing occurred over multiple days, with incremental adjustments to stimulation parameters (intensity, frequency, duration) after confirming normal baseline foraging behavior (Fig. 2H-J, at x = 0). These details are now clearly depicted in the manuscript.

      (ii) The primary objective was to investigate BLA neuron responses to dPAG opto-stimulation. Six rats were initially tested, with three later assessed for their reactions to dPAG stimulation in the presence of an actual predator, to gauge behavioral effects.

      (iii) Regarding the experimental timeline:

      a) Pre-robot, robot, and post-robot sessions were conducted successively on the same day.

      b) Sessions with the robot predator were repeated until habituation occurred or when unit recordings were deemed invalid due to microdrive limitations or the absence of unit detection. Throughout these sessions, the success rate for pellet retrieval remained consistently low. Specifically, the mean success rate for the dPAG recordings was 2.803% ± 1.311. For the BLA recordings, animals did not succeed in retrieving pellets during any of the robot trials. To provide a more detailed account of the methodology, the manuscript has been updated to include the number of recording days and the units recorded in the "Behavioral Procedures" section.

      c) As described in Materials and Methods, unit recording data were binned at 0.1-s intervals and normalized against a 5-s pre-event baseline (50 bins). For statistical analyses in Figure 1F’s rightmost column, 1-s bins were used to simplify post-hoc analysis corrections.

      d) Each recording session consisted of 5-15 trials. Trials were excluded if rats attempted to procure the pellet within 10 s post-dPAG stimulation or robot activation, ensuring accurate characterization of unit responsiveness. Consequently, the number of trials varied among subjects.

      e) Pellet retrieval was indicated by the animal entering a designated zone 19 cm from the pellet, driven by hunger.

      f) Animals were trained to retrieve pellets and return to their nest for consumption prior to robot testing sessions, as elaborated in the “Baseline foraging” section.
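      The binning and normalization described in item c) can be sketched as follows. This is a minimal Python illustration, assuming spike and event times in seconds, a trial-averaged histogram, a sample standard deviation for the baseline, and a 5-s post-event window; the function name and these choices are hypothetical and are not taken from the manuscript.

```python
import numpy as np

BIN_S = 0.1        # bin width stated in the response (0.1 s)
BASELINE_S = 5.0   # pre-event baseline stated in the response (50 bins)

def peth_zscore(spike_times, event_times, post_s=5.0):
    """Trial-averaged PETH in 0.1-s bins, z-scored against the pre-event baseline.

    post_s is an assumed post-event window; the source does not specify it.
    """
    n_base = int(round(BASELINE_S / BIN_S))
    n_post = int(round(post_s / BIN_S))
    edges = np.linspace(-BASELINE_S, post_s, n_base + n_post + 1)
    spike_times = np.asarray(spike_times, dtype=float)
    counts = np.zeros((len(event_times), n_base + n_post))
    for i, t0 in enumerate(event_times):
        # Align spikes to each event and count them per bin.
        counts[i], _ = np.histogram(spike_times - t0, bins=edges)
    rate = counts.mean(axis=0) / BIN_S          # mean firing rate (Hz) per bin
    base = rate[:n_base]
    sd = base.std(ddof=1)
    if sd == 0:
        raise ValueError("baseline has zero variance; z-score undefined")
    return (rate - base.mean()) / sd
```

      Trials excluded under item d) would simply be left out of `event_times` before calling the function.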

      (7) In the abstract, the authors mention that predictive cues are ambiguous during naturalistic predatory threats, but it is not clear what they mean by ambiguous. In addition, in the introduction section, the authors describe that the present study will investigate how the dPAG and BLA communicate threat signals. However, the authors should clarify right in the beginning that these two regions are not monosynaptically connected with each other and cite the proper references.

      The abstract’s original sentence, “…where predictive cues are ambiguous and do not afford reiterative trial-and-error learning…” has been refined to “…characterized by less explicit cues and the absence of reiterative trial-and-error learning events …” This adjustment more accurately reflects that cues in natural settings often lack the clear and consistent quality of those in controlled experimental settings, which is necessary for the straightforward process of trial-and-error learning.

      Regarding the dPAG and BLA connectivity, the revised introduction (pg. 5) now states: “Considering the lack of direct monosynaptic projections between dPAG and BLA neurons (Vianna and Brandao 2003, McNally, Johansen, and Blair 2011, Cameron et al. 1995), we utilized anterograde and retrograde tracers in the dPAG and BLA, respectively. This was complemented by c-Fos expression analysis following exposure to predatory threats. Our anatomical findings suggest that the paraventricular nucleus of the thalamus (PVT) may be part of a network that conveys predatory threat information from the dPAG to the BLA.”

      (8) In the introduction section, the authors should clarify that the US information is conveyed from the PAG to BLA via the lateral thalamus (posterior intralaminar nucleus, medial geniculate nucleus) or dorsal midline thalamus (paraventricular nucleus of the thalamus). The statement regarding how "the PAG functions as part of the ascending pain transmission pathway, providing footshock US information to the BLA" is misleading because the PAG does not send monosynaptic projections directly to the BLA.

      The revised text (pg. 3) now reads: “…suggest that the dPAG is part of the ascending US pain transmission pathway to the BLA, the presumed site for CS-US association formation (De Oca et al. 1998, Gross and Canteras 2012, Herry and Johansen 2014, Kim, Rison, and Fanselow 1993, Ressler and Maren 2019, Walker and Davis 1997). This pathway is thought to be mediated through the lateral and dorsal-midline thalamus regions, including the posterior intralaminar nucleus and paraventricular nucleus of the thalamus (Krout and Loewy, 2000; McNally, Johansen, and Blair, 2011; Yeh, Ozawa, and Johansen, 2021; but see Brunzell and Kim, 2001).”

      (9) The authors' assumption that threat information flows from the PAG to the BLA, rather than BLA to PAG, based on electrical stimulation and lesion experiments performed in previous studies is problematic for at least three reasons: a) Electrical stimulation can activate fibers of passage as well as presynaptic neurons antidromically. b) The lesion approach may not have targeted 100% of the neurons in PAG, which extends anatomically along the antero-posterior axis of the midbrain for several millimeters in rats. This observation also disagrees with more recent studies using optogenetics and imaging tools demonstrating that the PAG is the downstream target of the BLA-CeA pathway. c) The authors cited prior reports describing the role of the amygdala-PAG pathway in dampening the US response and providing a negative signal to the PAG. However, a series of previous studies demonstrating that the PAG serves as the downstream target of the central nucleus of the amygdala for the expression of defensive response are completely ignored by the authors. Here are just some examples: Massi et al, 2023, PMID: 36652513; Tovote et al 2016, PMID: 27279213; Penzo et al, 2014, PMID: 24523533.

      We recognize the complexities in interpreting findings from electrical stimulation and lesion studies. Our prior work (Kim et al. 2013) supports the conclusion that predatory threat information directionally flows from the dPAG to the BLA, as evidenced by distinct behavioral outcomes from experimental manipulations of dPAG and BLA. Specifically, dPAG stimulation-induced fleeing behavior was blocked by BLA lesions (as well as muscimol inactivation), whereas BLA stimulation-induced fleeing was unaffected by dPAG or combined dPAG+vPAG lesions (refer to Fig. 5A), suggesting a flow from dPAG to BLA. Our manuscript further clarifies that dPAG optostimulation results confirmed that escape behavior in foraging rats, induced by dPAG electrical stimulation (Kim et al. 2013), was activated by intrinsic dPAG neurons rather than by fibers of passage or current spread to other brain regions.

      Furthermore, the PAG’s anatomical and functional diversity, with distinct segments along its longitudinal axis associated with different defensive behaviors, reinforces our conclusions. The dPAG is implicated in flight responses, while the vPAG is associated with freezing behavior (e.g., Bandler and Shipley 1994, Kim, Rison, and Fanselow 1993, Lefler, Campagner, and Branco 2020, Morgan, Whitney, and Gold 1998). The studies referenced in the critique primarily focus on the BLA-CeA-vPAG circuit's role in freezing during Pavlovian fear conditioning, contrasting with our emphasis on the dPAG-PVT-BLA circuit and its mediation in escape behavior in response to naturalistic predatory threats.

      We also note that different invasive procedures can yield varying behavioral outcomes. For example, both acute (e.g., optogenetic and muscimol inactivation) and chronic (e.g., surgical ablation) manipulations within the same brain circuit have shown diverse effects across species (Otchy et al. 2015). Moreover, optogenetics comes with its own set of conceptual and technical challenges (Adamantidis et al. 2015), including the difficulty of targeting, quantifying and photo-inhibiting 100% of PAG neurons. Despite the limitations of each technique, our collective evidence from lesions, inactivation, electrical stimulation (Kim et al. 2013), optostimulation, and single-unit recordings (the present study) supports the premise that the dPAG acts upstream of the BLA in processing predatory threat information.

      (10) In the discussion, the authors suggest that the PVT may be the interface between the PAG and the BLA for the expression of antipredatory defensive behavior during their foraging vs. robot test, but previous studies looking at the role of PVT in antipredator defensive behavior and/or approach-avoidance conflict tasks are not cited and discussed in the manuscript (Engelke et al, 2021, PMID: 33947849; Choi et al 2019, PMID: 30979815; Choi and McNally 2017, PMID: 28193686).

      We thank the reviewer for pointing out these pivotal studies, which we have carefully reviewed and integrated into the revised manuscript (pg. 14): “These results, in conjunction with previous research on the roles of the dPAG, PVT, and BLA in producing flight behaviors in naïve rats (Choi and Kim 2010, Daviu et al. 2020, Deng, Xiao, and Wang 2016, Kim et al. 2013, Kim et al. 2018, Kong et al. 2021, Ma et al. 2021, Reis et al. 2021), the anterior PVT’s involvement in cat odor-induced avoidance behavior (Engelke et al. 2021), and the PVT’s regulation of behaviors motivated by both appetitive and aversive stimuli (Choi and McNally 2017, Choi et al. 2019), suggest the involvement of the dPAG→PVT→BLA pathways in antipredatory defensive mechanisms, particularly as rats leave the safety of the nest to forage in an open arena (Figure 4I) (Reis et al. 2023).”

      (11) The authors use the expression "looming robot predator" in many cases throughout the manuscript. However, it is unclear whether the defensive responses observed in the rats are elicited by the looming stimulus produced by the movement of the robot towards the rats. The authors describe that rats do not respond to a stationary robot, but would the sound produced by the movement of the robot elicit defensive responses? Would non-approaching lateral or dorsoventral movements (not associated with looming) be sufficient to induce defensive behavior in the rats? There is a vast literature in the field about defensive behaviors induced by looming stimuli. The authors should empirically demonstrate that the escaping responses induced by the robot are mediated by looming or refrain from using the looming terminology to avoid confusion.

      Our use of "looming robot predator" is based on empirical evidence from a prior parametric study, which identified the forward, or 'looming,' motion of the Robogator as the key stimulus eliciting a flight response in rats (Kim, Choi, and Lee 2016). This reaction significantly decreased when the robot moved backward from the same starting position, producing a similar sound, and was absent when the robot remained stationary. This suggests that neither sound alone nor the mere presence of a novel object provokes goal-directed escape behavior (Kong et al. 2021). This aligns with studies indicating that simulated looming stimuli, like an expanding disk, induce flight or freezing responses in mice (De Franceschi et al. 2016, Yilmaz and Meister 2013).

      It should be noted that the 2013 study by Yilmaz & Meister (Yilmaz and Meister 2013) on the looming disk paradigm showed that not all mice responded to the stimuli (e.g., Figs. 2A and 3A), with those that did exhibiting rapid habituation by the second exposure. This contrasts with our predatory robot paradigm (Choi and Kim 2010), where all rats consistently fled from the looming robotic predator across multiple trials, underscoring the critical role of looming motion in simulating predator attacks that trigger flight behavior in rats.

      Thus, the term "looming" accurately captures the nature of the robot's movement and its effect on eliciting defensive responses in rats. Nonetheless, should the editors agree with the reviewer's suggestion to minimize potential confusion, we are willing to substitute "looming" with "approaching," although we consider the terms to be synonymous in the context of our study.

      (12) If the authors are citing the Rescorla-Wagner model, they should include at least one additional sentence to explain it, as many people in the field are not familiar with this model.

      In response to the request for clarification on the Rescorla-Wagner model, we have added an explanatory sentence (pg. 4): “Fundamentally, the negative feedback circuit between the amygdala and the dPAG serves as a biological implementation of the Rescorla–Wagner (1972) model, a foundational theory of associative learning that emphasizes the importance of prediction errors in reinforcement (i.e., US), as applied to FC (Fanselow 1998).”

      (13) The authors need to include the normality test used to determine whether a parametric or non-parametric statistical analysis was the most appropriate test for each experiment.

      We have included the outcomes of the normality tests, detailed in Table S1.

      (14) In Fig. 1F, the authors show a representative PAG neuron with peristimulus-time histogram and rasters reaching frequencies higher than 100 Hz and sustained firing rates of >50 Hz following robot activation. The authors should include a firing rate analysis (e.g., average firing rate and maximum firing rate before and after robot activation) of the 22 robot-responsive PAG neurons recorded during the session to clarify whether this high firing rate, which is atypical in other brain regions, is commonly observed in the PAG. Showing the isolated waveforms of some representative neurons would help to clarify whether the activity is being recorded from a single-isolated unit instead of multiple units within the same channel.

      In response to the critique, we have expanded our analysis to include both average and maximum firing rates before and after robot activation for the 22 robot-responsive PAG neurons. This detailed firing rate analysis, illustrating their distribution, has been incorporated into the revised manuscript (refer to Figure S1C and S1D). Furthermore, to alleviate concerns regarding the identification of single-unit activity versus potential multi-unit recordings, we have included peri-event raster plots and waveforms for two additional representative neurons in Figure 1F.

      (15) In Figure 2, the authors should indicate when the recordings are performed on anesthetized vs. freely-moving awake animals.

      In the original manuscript, we specified that the optrode recordings depicted in Figure 2B were conducted on anesthetized rats. To enhance clarity and directly address the critique, we have now clearly indicated this condition in Figure 2A as well.

      (16) The optogenetic stimulation parameters used in Fig 2H indicate that 0.5 mW was sufficient to induce behavioral changes. This is surprising because most optogenetic experiments in the field use much higher intensities (> 5 mW). If much lower intensities are sufficient to drive PAG-mediated behaviors, this may be a very important observation that should be conveyed to the field. I recommend the authors clarify if they in fact used 0.5 mW and then discuss that the laser intensity used in the experiments was 10X lower than that required for other brain regions.

      In our study, we indeed observed that 0.5 mW of dPAG stimulation increased the latency to procure the pellet without completely preventing the action. Notably, at 1 mW, more than half of the animals (n = 5/9 rats; Fig. 2H) and at 3 mW, all rats (9/9) failed to procure the pellet and fled from the foraging area to the nest (Fig. 2G). These results indicate that even lower intensities were sufficient to elicit behavioral changes through dPAG stimulation in a large foraging arena, highlighting the dPAG's sensitivity to optogenetic manipulation. This finding is consistent with our earlier research on dPAG electrical stimulation, which required significantly lower intensities to provoke defensive behaviors compared to the BLA. Specifically, the stimulation intensity needed for aversive behavior in the dPAG was substantially lower (dPAG: 65.0 ± 6.85 µA) than for the BLA (BLA: 275.0 ± 24.44 µA) (Kim et al. 2013). Furthermore, Deng et al. (Deng, Xiao, and Wang 2016) showed that 1 mW of blue light could elicit a 60% freezing response, with 2 mW triggering flight behavior within a latency of 0.6 seconds.

      (17) In Fig 2 G-J, how many animals are being used per group and how was the sequence of the experiments performed? This is very important for replicability.

      A total of three rats were utilized for the robot testing experiments depicted in Fig. 2 G-J. The experimental sequence for these animals consisted of successive pre-stimulation, stimulation, post-stimulation, and robot sessions. We have updated the manuscript to include this information.

      (18) For the photostimulation of PAG neurons in Figs. 2 and 3, the authors need to clarify if the same parameters of laser stimulation used during the anesthetized recordings were also used during the behavioral tests. Also, the wavelength corresponding to the blue laser should be 473 nm instead of 437 nm.

      We thank the reviewer for identifying the error. We confirm that the opto-stimulation parameters (473 nm, 10-ms pulse width, 2 s duration) were consistently applied across both anesthetized recordings and behavioral tests. This consistency has been explicitly stated in the revised manuscript to ensure clarity regarding our experimental approach.

      (19) In Fig. 3I, how were the representative trials selected? Instead of picking the most representative trials, the authors should demonstrate the response of the cell during the entire session.

      In response to the critique, we clarify that the color-coded PETH shown in Fig. 3I represents averaged BLA activity across a comprehensive set of trials. This includes 8 pre-stimulation, 10 stimulation, and 8 post-stimulation trials for the robot-activated sessions, with a similar distribution for non-stimulated sessions. This approach was chosen to provide a representative overview of the cell's response throughout the entire session. To address the request for more detailed data, we have added traditional PETHs to the revised manuscript (see Fig. S3H), which depict the cell's response across all trials.

      (20) Fig 4 D should demonstrate a colabeling between the anterograde PAG fibers in the PVT and the retrogradely labeled neurons from BLA instead of PAG fibers only.

      We wish to clarify that Fig. 4D is intended to show the distribution of dPAG terminals within the midline thalamic nuclei, as noted in prior research (Krout and Loewy 2000). Although dPAG terminals are distributed throughout the midline thalamus, our observations have specifically highlighted a notable increase in c-Fos expression within the paraventricular nucleus of the thalamus (PVT) in rats subjected to the robotic predator stimulus, in contrast to those in the foraging-only control condition (Fig. 4E). Addressing the reviewer's point, we direct attention to Fig. 4G, which includes images labeled "Robot-experienced" and "Merge." This figure demonstrates a subset of PVT neurons that were retrogradely labeled with CTB injected into the BLA, anterogradely labeled with AAV injected into the dPAG, and activated (as indicated by c-Fos expression) in response to the robotic predator. This provides specific colabeling evidence between anterograde PAG fibers in the PVT and retrogradely labeled neurons from the BLA, directly addressing the critique.

      (21) The resolution of the cFos images is very low, which makes them hard to appreciate.

      We have updated Figs. 4F and 4G with high-resolution versions to ensure the details are more clearly visible. Furthermore, should there be a need for even greater clarity, we are prepared to supply the images as TIFF files, which are known for preserving high image quality.

      Reviewer 2:

      (1) The text is clearly written, and I appreciated the inclusion of interesting citations, such as the one about paintings by cavemen. The authors also do a good job of discussing the underlying theoretical framework and the figures are easy to understand. Although the topic is very interesting, the amount of novel work is somewhat low. Figure 1 shows that dPAG cells are activated by the predator, and this has been shown by many prior reports. Similarly, Figure 2 shows that dPAG activation creates defensive responses, and this too has been shown by many prior reports.

      We appreciate the reviewer’s positive remarks. We acknowledge the rich body of research documenting dPAG neuronal activation by various predator cues such as odors (e.g., fox urine) (Lu et al. 2023), and scenarios involving anesthetized or spontaneously moving rat/cat predators, either physically partitioned or harness-restrained (Bindi et al. 2022, Deng, Xiao, and Wang 2016, Esteban Masferrer et al. 2020). Nevertheless, our study distinguishes itself by examining dPAG neuronal responses to a robotic predator, uniquely designed to replicate consistent looming motions across multiple trials and subjects within an environment that simulates natural foraging conditions, inclusive of a safe nest (cf. Choi and Kim, 2010). This approach allowed us to not only reveal the immediate activation of dPAG neurons in response to a rapidly approaching predator but also to explore the consequent fleeing behavior towards safety, thereby providing new insights into the dPAG's role in mediating goal-directed defensive responses in a more ecologically-relevant setting. Furthermore, our investigation extends beyond these findings to assess the impact of dPAG activation on BLA neuronal responses and their functional connectivity during predator-prey interactions, offering a fresh perspective on the neural circuits that support survival behaviors in animals when confronted with naturalistic threats.

      (2) The results in Figure 3 are novel and interesting, but the characterization of BLA activity is incomplete. For example, what are the percentages of BLA cells that are inhibited or activated by all major behaviors observed? These behaviors include approach to pellet, escape from robot, freezing, stretch-attend postures, etc. These same analyses should also be added to dPAG activity in Figure 1. How does BLA single cell encoding of these behaviors relate to their responsivity to dPAG stimulation? And, finally, it is unclear what is the significance of BLA correlated synchronous firing. Is the animal more or less likely to be performing certain behaviors when correlated BLA firing occurs?

      Our analysis, as presented in Figs. 3I, 3K, and S3D-F, selectively focused on BLA cell responses during distinct behaviors such as approaching a pellet and escaping from the robot. These behaviors were selected because their precise temporal markers allow for accurate correlation with BLA cell activity, building on the findings of our previous research (Kim et al. 2018, Kong et al. 2021).

      The robot's motion, programmed to advance a fixed distance before retreating to its starting position, is designed to repeatedly elicit foraging, thus facilitating analysis of neural changes during conflict situations involving food approach and predator avoidance. However, this also leads to the rapid diminution of freezing and stretch-attend postures inside the nest as animals quickly adapt to the robot's movement pattern, rendering a time-stamped analysis of these behaviors unfeasible under our experimental conditions. While the inclusion of these behaviors in our analysis would be insightful, especially in extended interaction scenarios where the robot advances to the nest opening and remains before returning in a less predictable manner, such conditions would likely reduce foraging behavior due to increased fear, deviating from our study's primary objective of elucidating the interactions between the dorsal periaqueductal gray (dPAG) and the basolateral amygdala (BLA) functions.

      Regarding the significance of BLA correlated synchronous firing, our findings, particularly in Figures 3M-O and S4, demonstrate significant synchronous activity among BLA neuronal pairs during encounters with the robot, as opposed to pre-stim, stim, and post-stim sessions. This synchrony is notably prominent among neurons responsive to dPAG stimulation, indicating that BLA neurons involved in processing dPAG signals may play a crucial role in enhancing BLA network coherence to effectively manage predatory threat information (pg. 13).
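
      Synchrony between a pair of simultaneously recorded neurons is commonly assessed with a cross-correlogram, in which a peak near zero lag indicates near-coincident firing. The sketch below shows only that basic computation. The function name, lag range, and bin size are assumptions for illustration, and the manuscript's actual synchrony analysis (including any shuffle correction or significance testing) is not reproduced here.

```python
def cross_correlogram(spikes_a, spikes_b, max_lag=0.1, bin_size=0.01):
    """Histogram of spike-time lags (b minus a) within +/- max_lag seconds.

    Returns a list of counts; with the defaults, bins run from -0.1 s
    to +0.1 s in 10 ms steps. All parameters are illustrative.
    """
    n_bins = 2 * round(max_lag / bin_size)
    counts = [0] * n_bins
    for ta in spikes_a:
        for tb in spikes_b:
            lag = tb - ta
            if -max_lag <= lag < max_lag:
                idx = min(int((lag + max_lag) / bin_size), n_bins - 1)
                counts[idx] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical pair: neuron b fires about 5 ms after neuron a.
    a = [1.0, 2.0, 3.0]
    b = [1.005, 2.005, 3.005, 3.5]
    print(cross_correlogram(a, b, max_lag=0.05, bin_size=0.01))
```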

      (3) In Figure 4, the authors identify the PVT as a potential region that can mediate dPAG to BLA communication via anatomical tracing. However, functional assays are missing. For example, if the PVT is inhibited chemogenetically, does this result in a smaller number of BLA cells that are activated by dPAG stimulation? Does activation of the dPAG-PVT or the PVT-BLA projections cause defensive behaviors? Functionally showing that the dPAG-PVT-BLA circuit controls defensive actions would be a major advance in the field and would greatly enhance the significance of this paper. It would also provide an anatomical substrate to support the view that the BLA is downstream of the dPAG, which was first demonstrated by the authors in their elegant 2013 PNAS paper.

      We appreciate the reviewer’s constructive critique and valuable suggestions on the necessity for functional validation of the dPAG-PVT-BLA circuit's involvement in mediating defensive behaviors. In light of these comments, we have carefully considered and included a discussion on the importance of these proposed experiments as a direction for future research in our manuscript revision (also see response to Reviewer 1’s critique #5).

      Our initial work in 2013 (Kim et al. 2013) laid the groundwork for identifying BLA neurons responsive to dPAG stimulation and suggested the PVT as a potential relay in this neural circuit. Recognizing the limitations of our current study, which does not include direct functional assays, we have adjusted our manuscript to convey the speculative aspect of the dPAG-PVT-BLA circuit’s role more accurately. Moreover, we have enriched our discussion by citing relevant studies that lend support to our proposed circuit mechanism. These references serve to place our findings within the broader context of existing research and highlight the imperative for subsequent studies to empirically confirm the functional significance of the dPAG-PVT-BLA pathway in driving defensive behaviors.

      Reviewer 3:

      (1) The Introduction refers to a negative feedback amygdala-dPAG from a study of the Johansen group, but in this case, the authors were referring to the ventrolateral and not the dorsal PAG.

      We thank the reviewer for pointing out the need to distinguish between the dPAG and vPAG regions in our introduction. While Johansen et al. (2010) investigated the roles of PAG (including both dPAG and vPAG regions; see their Supplementary Figs. 4, 5, and 10), the differentiation between their specific contributions to the amygdala's negative feedback mechanism was not explicitly detailed in their initial publication. This distinction was further elaborated upon in later work by the same group (Yeh, Ozawa, and Johansen 2021), which specifically illuminated the dPAG's role in conditioned fear memory formation and its neural pathways to the PVT that influence fear learning. To reflect this nuanced understanding, we have revised our introduction (pg. 3): “In parallel, Johansen et al. (2010) found that pharmacological inhibition of the PAG, encompassing both dPAG and vPAG regions, diminishes the behavioral and neural responses in the amygdala elicited by periorbital shock US, thereby impairing the acquisition of auditory FC.”

      (2) In the experiments recording dPAG in response to the predator threat, the authors mentioned cells activated by the predator threat, referred to as "robot cells." Were these cells inhibited in response to threat?

      In the Results and Materials and Methods sections, we report that 23.4% (22 out of 94) of dPAG neurons, termed “robot cells,” showed a significant increase in firing rates (z > 3) within a latency of less than 500 ms during exposure to the looming robot threat, but not during the pre- and post-robot sessions. These cells are highlighted in Figures 1E-G. In contrast, we identified only a single unit exhibiting a decrease in activity (z-score < -3) in response to the robot threat. Given the overwhelming prevalence of cells with excitatory responses to the threat, our discussions and analyses have primarily centered on these excited cells. Nevertheless, to ensure a full depiction of our observations, we have included data on the inhibited unit in the revised manuscript, specifically in Figure S1E.
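
      The thresholding step in this criterion (|z| > 3 within 500 ms of robot onset) can be illustrated with a short sketch. It rests on assumptions that are not taken from the manuscript: the baseline window used for z-scoring, the 100 ms bin size, and the function name `classify_unit` are all hypothetical, and the additional requirement that the response be absent in the pre- and post-robot sessions is not modeled.

```python
from statistics import mean, stdev


def classify_unit(baseline_rates, response_rates, bin_size_s=0.1,
                  z_thresh=3.0, max_latency_s=0.5):
    """Label a unit from binned firing rates (Hz).

    baseline_rates: rates in bins before robot onset (assumed baseline).
    response_rates: rates in consecutive bins starting at robot onset.
    Only bins within max_latency_s of onset are examined.
    """
    mu = mean(baseline_rates)
    sd = stdev(baseline_rates)
    if sd == 0:
        return "unclassified"  # z-score undefined for a flat baseline
    n_bins = round(max_latency_s / bin_size_s)
    z = [(r - mu) / sd for r in response_rates[:n_bins]]
    if any(v > z_thresh for v in z):
        return "excited"
    if any(v < -z_thresh for v in z):
        return "inhibited"
    return "non-responsive"


if __name__ == "__main__":
    baseline = [4, 5, 6, 5, 4, 6]  # hypothetical pre-onset rates
    print(classify_unit(baseline, [5, 12, 6, 5, 5]))
```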

      (3) The authors claim that tetrodes were implanted in the dorsal PAG; however, the electrodes' tips shown in the figures are positioned more ventrally in the lateral PAG (see Figures 1B, S5A).

      The PAG is anatomically organized into dorsomedial (dmPAG), dorsolateral (dlPAG), lateral (lPAG), and ventrolateral (vlPAG) columns along the rostro-caudal axis of the aqueduct. The designation "dorsal PAG" (dPAG) traditionally encompasses the dmPAG, dlPAG, and lPAG regions, a classification supported by extensive tract-tracing, neurochemical, and immunohistochemical evidence (e.g., Bandler, Carrive, and Zhang 1991, Bandler and Keay 1996, Carrive 1993). As Bandler and Shipley (1994) summarized, “These findings suggest that what has been traditionally called the 'dorsal PAG' (a collective term for regions dorsal and lateral to the aqueduct), consists of three anatomically distinct longitudinal columns: dorsomedial and lateral columns…and a dorsolateral column…” Similarly, Schenberg et al. (2005) clarified in their review that, “According to this parcellation...the defensive behaviors (freezing, flight or fight) and aversion-related responses (switch-off behavior) were ascribed to the DMPAG, DLPAG, and LPAG (usually named the ‘dorsal’ PAG).” In our study, electrode placements were strictly within these specified dPAG regions. The electrode tip locations depicted in Figures 1B and S5A correspond with the -6.04 mm template (left panel below) from Paxinos & Watson’s atlas (Paxinos and Watson 1998), situated anterior to the emergence of the vlPAG (right panel below). To clarify this in our manuscript, we provide a detailed definition of the dPAG that includes the dmPAG, dlPAG, and lPAG, and support our electrode placement rationale with references to established literature (pg. 5).

      Author response image 1.

      (4) It would be nice to include a series of observations applying inhibitory tools (i.e., optogenetic photo inhibition) in the dPAG and BLA and see how they affect the behavioral responses in the 'approach food-avoid predator' paradigm. Moreover, it would be interesting to explore how inhibiting the dPAG to PVT pathway influences the flee response during the robot surge.

      We appreciate the suggestion to explore the effects of optogenetic inhibition in the dPAG and BLA on behavioral responses within the 'approach food-avoid predator' paradigm, as well as the potential impact of inhibiting the dPAG to PVT pathway on flee responses during robot surge incidents. As mentioned in our response to Reviewer 1’s critique #5, the application of optogenetic inhibition necessitates transfecting, quantifying, and photoinhibiting a comprehensive set of dPAG neurons activated by predatory threats. This approach is more viable in future studies that can leverage transgenic mouse models for their genetic tractability. Following the Joint Public Review’s recommendations, we have revised our manuscript to ensure a more measured interpretation of our data, carefully balancing the evidence from tracer studies against the limitations of our current methodology.

      Furthermore, referencing Reviewer 1’s critique #9, it is important to consider that various invasive techniques can yield different behavioral outcomes. For instance, research by Olveczky and colleagues (Otchy et al. 2015) demonstrated that acute manipulations (i.e., optogenetic and muscimol inactivation) and chronic surgical ablation of the same brain circuit can produce distinct effects in rats and finches. Despite these methodological constraints, our collective results from lesion, inactivation, electrical stimulation (Kim et al. 2013), optostimulation, and single-unit recording (present) studies cohesively suggest that the dPAG functions upstream of the BLA in processing predatory threat signals.

      (5) The authors should also examine whether 'synaptic' appositions exist between the anterogradely labeled terminals from the dPAG and the double labeled CTB and cFOS neurons in the PVT.

      We appreciate the suggestion to investigate the presence of synaptic appositions, which could potentially offer valuable insights into the synaptic connections and functional interactions within this neural circuit. However, due to the specialized nature of electron microscopy required for these examinations and the extensive resources it entails, this line of inquiry falls beyond the scope of our current study. We hope to address this aspect in future studies, where we can dedicate the necessary resources and expertise to conducting these intricate analyses.

      (6) It is odd to see the projection fields shown in Fig. 4D, where the projection to the PVT looks much sparser compared to other targets in the thalamus and hypothalamus. If the projection to the PVT has such an important function, why does it seem so weak? This should be discussed. Also, because the projection to the PVT seems sparse, the authors should consider alternative paths like the one involving the cuneiform nucleus. The cuneiform nucleus is an important region responding to looming shadows with strong bidirectional links to the dorsolateral periaqueductal gray, providing strong projections to the rostral PVT.

      The apparent sparseness of the dPAG-PVT projection might not accurately reflect its functional significance. The PVT's small size could make its projections appear less dense in broad anatomical studies. To address this, we have updated Figure 4D with a high-resolution image that offers a detailed view of the PVT region. This enhancement (refer to the updated Fig. 4, bottom) more accurately depicts the projection density within the PVT. It is also critical to consider that the functional impact of neural pathways is not solely dependent on the quantity of projecting neurons. For instance, work by Deisseroth and colleagues (Rajasethupathy et al. 2015) has shown that even relatively sparse monosynaptic projections from the anterior cingulate cortex to the hippocampus can exert significant effects on neural circuit dynamics. Additionally, we have expanded our discussion to consider the potential roles of other circuits, such as the cuneiform nucleus, in driving the behavioral responses observed in our study (pg. 15): “Given the recent significance attributed to the superior colliculus in detecting innate visual threats (Lischinsky and Lin 2019, Wei et al. 2015, Zhou et al. 2019) and the cuneiform nucleus in the directed flight behavior of mice (Bindi et al. 2023, Tsang et al. 2023), further exploration into the communication between these structures and the dPAG-BLA circuitry is warranted.”

      (7) Finally, in the Discussion, it would be nice to comment on how the BLA mediates flee responses. Which pathways are likely involved?

      This excellent suggestion has been incorporated in the discussion (pg. 15): “Future studies will also need to delineate the downstream pathways emanating from the BLA that orchestrate goal-directed flight responses to external predatory threats as well as internal stimulations from the dPAG/BLA circuit. Potential key structures include the dorsal/posterior striatum, which has been associated with avoidance behaviors in response to airpuff in head-fixed mice (Menegas et al. 2018) and flight reactions triggered by auditory looming cues (Li et al. 2021). Additionally, the ventromedial hypothalamus (VMH) has been implicated in flight behaviors in mice, evidenced by responses to the presence of a rat predator (Silva et al. 2013) and upon optogenetic activation of VMH Steroidogenic factor 1 (Kunwar et al. 2015) or the VMH-anterior hypothalamic nucleus pathway (Wang, Chen, and Lin 2015). Investigating the indispensable role of these structures in flight behavior could involve lesion or inactivation studies. Such interventions are anticipated to inhibit flight behaviors elicited by amygdala stimulation and predatory threats, confirming their critical involvement. Conversely, activating these structures in subjects with an inactivated or lesioned amygdala, which would typically inhibit fear responses to external threats (Choi and Kim 2010), is expected to induce fleeing behavior, further elucidating their functional significance.”

      Adamantidis, A., S. Arber, J. S. Bains, E. Bamberg, A. Bonci, G. Buzsaki, J. A. Cardin, R. M. Costa, Y. Dan, Y. Goda, A. M. Graybiel, M. Hausser, P. Hegemann, J. R. Huguenard, T. R. Insel, P. H. Janak, D. Johnston, S. A. Josselyn, C. Koch, A. C. Kreitzer, C. Luscher, R. C. Malenka, G. Miesenbock, G. Nagel, B. Roska, M. J. Schnitzer, K. V. Shenoy, I. Soltesz, S. M. Sternson, R. W. Tsien, R. Y. Tsien, G. G. Turrigiano, K. M. Tye, and R. I. Wilson. 2015. "Optogenetics: 10 years after ChR2 in neurons--views from the community."  Nat Neurosci 18 (9):1202-12. doi: 10.1038/nn.4106.

      Amano, K., T. Tanikawa, H. Kawamura, H. Iseki, M. Notani, H. Kawabatake, T. Shiwaku, T. Suda, H. Demura, and K. Kitamura. 1982. "Endorphins and pain relief. Further observations on electrical stimulation of the lateral part of the periaqueductal gray matter during rostral mesencephalic reticulotomy for pain relief."  Appl Neurophysiol 45 (1-2):123-35.

      Bagley, E. E., and S. L. Ingram. 2020. "Endogenous opioid peptides in the descending pain modulatory circuit."  Neuropharmacology 173:108131. doi: 10.1016/j.neuropharm.2020.108131.

      Bandler, R., P. Carrive, and S. P. Zhang. 1991. "Integration of somatic and autonomic reactions within the midbrain periaqueductal grey: viscerotopic, somatotopic and functional organization."  Prog Brain Res 87:269-305. doi: 10.1016/s0079-6123(08)63056-3.

      Bandler, R., and K. A. Keay. 1996. "Columnar organization in the midbrain periaqueductal gray and the integration of emotional expression."  Prog Brain Res 107:285-300. doi: 10.1016/s0079-6123(08)61871-3.

      Bandler, R., and M. T. Shipley. 1994. "Columnar organization in the midbrain periaqueductal gray: modules for emotional expression?"  Trends Neurosci 17 (9):379-89. doi: 10.1016/0166-2236(94)90047-7.

      Bindi, R. P., C. C. Guimaraes, A. R. de Oliveira, F. F. Melleu, M. A. X. de Lima, M. V. C. Baldo, S. C. Motta, and N. S. Canteras. 2023. "Anatomical and functional study of the cuneiform nucleus: A critical site to organize innate defensive behaviors."  Ann N Y Acad Sci 1521 (1):79-95. doi: 10.1111/nyas.14954.

      Bindi, R. P., R. G. O. Maia, F. Pibiri, M. V. C. Baldo, S. L. Poulter, C. Lever, and N. S. Canteras. 2022. "Neural correlates of distinct levels of predatory threat in dorsal periaqueductal grey neurons."  Eur J Neurosci 55 (6):1504-1518. doi: 10.1111/ejn.15633.

      Cameron, A. A., I. A. Khan, K. N. Westlund, and W. D. Willis. 1995. "The efferent projections of the periaqueductal gray in the rat: a Phaseolus vulgaris-leucoagglutinin study. II. Descending projections."  J Comp Neurol 351 (4):585-601. doi: 10.1002/cne.903510408.

      Cannon, J. T., G. J. Prieto, A. Lee, and J. C. Liebeskind. 1982. "Evidence for opioid and non-opioid forms of stimulation-produced analgesia in the rat."  Brain Res 243 (2):315-21. doi: 10.1016/0006-8993(82)90255-4.

      Carrive, P., and M. M. Morgan. 2012. "Periaqueductal Gray." In The Human Nervous System, edited by J. K. Mai and G. Paxinos, 367-400. London: Academic Press.

      Carrive, P. 1993. "The periaqueductal gray and defensive behavior: functional representation and neuronal organization."  Behav Brain Res 58 (1-2):27-47. doi: 10.1016/0166-4328(93)90088-8.

      Choi, E. A., P. Jean-Richard-Dit-Bressel, C. W. G. Clifford, and G. P. McNally. 2019. "Paraventricular Thalamus Controls Behavior during Motivational Conflict."  J Neurosci 39 (25):4945-4958. doi: 10.1523/JNEUROSCI.2480-18.2019.

      Choi, E. A., and G. P. McNally. 2017. "Paraventricular Thalamus Balances Danger and Reward."  J Neurosci 37 (11):3018-3029. doi: 10.1523/JNEUROSCI.3320-16.2017.

      Choi, J. S., and J. J. Kim. 2010. "Amygdala regulates risk of predation in rats foraging in a dynamic fear environment."  Proc Natl Acad Sci U S A 107 (50):21773-7. doi: 10.1073/pnas.1010079108.

      De Franceschi, G., T. Vivattanasarn, A. B. Saleem, and S. G. Solomon. 2016. "Vision Guides Selection of Freeze or Flight Defense Strategies in Mice."  Curr Biol 26 (16):2150-4. doi: 10.1016/j.cub.2016.06.006.

      De Oca, B. M., J. P. DeCola, S. Maren, and M. S. Fanselow. 1998. "Distinct regions of the periaqueductal gray are involved in the acquisition and expression of defensive responses."  J Neurosci 18 (9):3426-32. doi: 10.1523/JNEUROSCI.18-09-03426.1998.

      Deng, H., X. Xiao, and Z. Wang. 2016. "Periaqueductal Gray Neuronal Activities Underlie Different Aspects of Defensive Behaviors."  J Neurosci 36 (29):7580-8. doi: 10.1523/JNEUROSCI.4425-15.2016.

      Engelke, D. S., X. O. Zhang, J. J. O'Malley, J. A. Fernandez-Leon, S. Li, G. J. Kirouac, M. Beierlein, and F. H. Do-Monte. 2021. "A hypothalamic-thalamostriatal circuit that controls approach-avoidance conflict in rats."  Nat Commun 12 (1):2517. doi: 10.1038/s41467-021-22730-y.

      Esteban Masferrer, M., B. A. Silva, K. Nomoto, S. Q. Lima, and C. T. Gross. 2020. "Differential Encoding of Predator Fear in the Ventromedial Hypothalamus and Periaqueductal Grey."  J Neurosci 40 (48):9283-9292. doi: 10.1523/JNEUROSCI.0761-18.2020.

      Fanselow, M. S. 1998. "Pavlovian conditioning, negative feedback, and blocking: mechanisms that regulate association formation."  Neuron 20 (4):625-7. doi: 10.1016/s0896-6273(00)81002-8.

      Fields, H. L. 2000. "Pain modulation: expectation, opioid analgesia and virtual pain."  Prog Brain Res 122:245-53. doi: 10.1016/s0079-6123(08)62143-3.

      Gross, C. T., and N. S. Canteras. 2012. "The many paths to fear."  Nat Rev Neurosci 13 (9):651-8. doi: 10.1038/nrn3301.

      Herry, C., and J. P. Johansen. 2014. "Encoding of fear learning and memory in distributed neuronal circuits."  Nat Neurosci 17 (12):1644-54. doi: 10.1038/nn.3869.

      Kim, E. J., O. Horovitz, B. A. Pellman, L. M. Tan, Q. Li, G. Richter-Levin, and J. J. Kim. 2013. "Dorsal periaqueductal gray-amygdala pathway conveys both innate and learned fear responses in rats."  Proc Natl Acad Sci U S A 110 (36):14795-800. doi: 10.1073/pnas.1310845110.

      Kim, E. J., M. S. Kong, S. G. Park, S. J. Y. Mizumori, J. Cho, and J. J. Kim. 2018. "Dynamic coding of predatory information between the prelimbic cortex and lateral amygdala in foraging rats."  Sci Adv 4 (4):eaar7328. doi: 10.1126/sciadv.aar7328.

      Kim, J. J., J. S. Choi, and H. J. Lee. 2016. "Foraging in the face of fear: Novel strategies for evaluating amygdala functions in rats." In Living without an amygdala, edited by D. G. Amaral and R. Adolphs, 129-148. The Guilford Press.

      Kim, J. J., R. A. Rison, and M. S. Fanselow. 1993. "Effects of amygdala, hippocampus, and periaqueductal gray lesions on short- and long-term contextual fear."  Behav Neurosci 107 (6):1093-8. doi: 10.1037//0735-7044.107.6.1093.

      Kong, M. S., E. J. Kim, S. Park, L. S. Zweifel, Y. Huh, J. Cho, and J. J. Kim. 2021. "'Fearful-place' coding in the amygdala-hippocampal network."  Elife 10. doi: 10.7554/eLife.72040.

      Krout, K. E., and A. D. Loewy. 2000. "Periaqueductal gray matter projections to midline and intralaminar thalamic nuclei of the rat."  J Comp Neurol 424 (1):111-41. doi: 10.1002/1096-9861(20000814)424:1<111::aid-cne9>3.0.co;2-3.

      Kunwar, P. S., M. Zelikowsky, R. Remedios, H. Cai, M. Yilmaz, M. Meister, and D. J. Anderson. 2015. "Ventromedial hypothalamic neurons control a defensive emotion state."  Elife 4. doi: 10.7554/eLife.06633.

      Lefler, Y., D. Campagner, and T. Branco. 2020. "The role of the periaqueductal gray in escape behavior."  Curr Opin Neurobiol 60:115-121. doi: 10.1016/j.conb.2019.11.014.

      Li, Z., J. X. Wei, G. W. Zhang, J. J. Huang, B. Zingg, X. Wang, H. W. Tao, and L. I. Zhang. 2021. "Corticostriatal control of defense behavior in mice induced by auditory looming cues."  Nat Commun 12 (1):1040. doi: 10.1038/s41467-021-21248-7.

      Lischinsky, J. E., and D. Lin. 2019. "Looming Danger: Unraveling the Circuitry for Predator Threats."  Trends Neurosci 42 (12):841-842. doi: 10.1016/j.tins.2019.10.004.

      Lu, B., P. Fan, M. Li, Y. Wang, W. Liang, G. Yang, F. Mo, Z. Xu, J. Shan, Y. Song, J. Liu, Y. Wu, and X. Cai. 2023. "Detection of neuronal defensive discharge information transmission and characteristics in periaqueductal gray double-subregions using PtNP/PEDOT:PSS modified microelectrode arrays."  Microsyst Nanoeng 9:70. doi: 10.1038/s41378-023-00546-8.

      Magierek, V., P. L. Ramos, N. G. da Silveira-Filho, R. L. Nogueira, and J. Landeira-Fernandez. 2003. "Context fear conditioning inhibits panic-like behavior elicited by electrical stimulation of dorsal periaqueductal gray."  Neuroreport 14 (12):1641-4. doi: 10.1097/00001756-200308260-00020.

      McNally, G. P., J. P. Johansen, and H. T. Blair. 2011. "Placing prediction into the fear circuit."  Trends Neurosci 34 (6):283-92. doi: 10.1016/j.tins.2011.03.005.

      Menegas, W., K. Akiti, R. Amo, N. Uchida, and M. Watabe-Uchida. 2018. "Dopamine neurons projecting to the posterior striatum reinforce avoidance of threatening stimuli."  Nat Neurosci 21 (10):1421-1430. doi: 10.1038/s41593-018-0222-1.

      Morgan, M. M., P. K. Whitney, and M. S. Gold. 1998. "Immobility and flight associated with antinociception produced by activation of the ventral and lateral/dorsal regions of the rat periaqueductal gray."  Brain Res 804 (1):159-66. doi: 10.1016/s0006-8993(98)00669-6.

      Otchy, T. M., S. B. Wolff, J. Y. Rhee, C. Pehlevan, R. Kawai, A. Kempf, S. M. Gobes, and B. P. Olveczky. 2015. "Acute off-target effects of neural circuit manipulations."  Nature 528 (7582):358-63. doi: 10.1038/nature16442.

      Paxinos, G., and C. Watson. 1998. The Rat Brain in Stereotaxic Coordinates. San Diego: Academic Press.

      Rajasethupathy, P., S. Sankaran, J. H. Marshel, C. K. Kim, E. Ferenczi, S. Y. Lee, A. Berndt, C. Ramakrishnan, A. Jaffe, M. Lo, C. Liston, and K. Deisseroth. 2015. "Projections from neocortex mediate top-down control of memory retrieval."  Nature 526 (7575):653-9. doi: 10.1038/nature15389.

      Ressler, R. L., and S. Maren. 2019. "Synaptic encoding of fear memories in the amygdala."  Curr Opin Neurobiol 54:54-59. doi: 10.1016/j.conb.2018.08.012.

      Schenberg, L. C., R. M. Povoa, A. L. Costa, A. V. Caldellas, S. Tufik, and A. S. Bittencourt. 2005. "Functional specializations within the tectum defense systems of the rat."  Neurosci Biobehav Rev 29 (8):1279-98. doi: 10.1016/j.neubiorev.2005.05.006.

      Silva, B. A., C. Mattucci, P. Krzywkowski, E. Murana, A. Illarionova, V. Grinevich, N. S. Canteras, D. Ragozzino, and C. T. Gross. 2013. "Independent hypothalamic circuits for social and predator fear."  Nat Neurosci 16 (12):1731-3. doi: 10.1038/nn.3573.

      Tsang, E., C. Orlandini, R. Sureka, A. H. Crevenna, E. Perlas, I. Prankerd, M. E. Masferrer, and C. T. Gross. 2023. "Induction of flight via midbrain projections to the cuneiform nucleus."  PLoS One 18 (2):e0281464. doi: 10.1371/journal.pone.0281464.

      Vianna, D. M., and M. L. Brandao. 2003. "Anatomical connections of the periaqueductal gray: specific neural substrates for different kinds of fear."  Braz J Med Biol Res 36 (5):557-66. doi: 10.1590/s0100-879x2003000500002.

      Walker, D. L., and M. Davis. 1997. "Involvement of the dorsal periaqueductal gray in the loss of fear-potentiated startle accompanying high footshock training."  Behav Neurosci 111 (4):692-702. doi: 10.1037//0735-7044.111.4.692.

      Wang, L., I. Z. Chen, and D. Lin. 2015. "Collateral pathways from the ventromedial hypothalamus mediate defensive behaviors."  Neuron 85 (6):1344-58. doi: 10.1016/j.neuron.2014.12.025.

      Wei, P., N. Liu, Z. Zhang, X. Liu, Y. Tang, X. He, B. Wu, Z. Zhou, Y. Liu, J. Li, Y. Zhang, X. Zhou, L. Xu, L. Chen, G. Bi, X. Hu, F. Xu, and L. Wang. 2015. "Processing of visually evoked innate fear by a non-canonical thalamic pathway."  Nat Commun 6:6756. doi: 10.1038/ncomms7756.

      Yeh, L. F., T. Ozawa, and J. P. Johansen. 2021. "Functional organization of the midbrain periaqueductal gray for regulating aversive memory formation."  Mol Brain 14 (1):136. doi: 10.1186/s13041-021-00844-0.

      Yilmaz, M., and M. Meister. 2013. "Rapid innate defensive responses of mice to looming visual stimuli."  Curr Biol 23 (20):2011-5. doi: 10.1016/j.cub.2013.08.015.

      Zhou, Z., X. Liu, S. Chen, Z. Zhang, Y. Liu, Q. Montardy, Y. Tang, P. Wei, N. Liu, L. Li, R. Song, J. Lai, X. He, C. Chen, G. Bi, G. Feng, F. Xu, and L. Wang. 2019. "A VTA GABAergic Neural Circuit Mediates Visually Evoked Innate Defensive Responses."  Neuron 103 (3):473-488 e6. doi: 10.1016/j.neuron.2019.05.027.

    1. eLife assessment

      This valuable work describes a new protein factor that is required for filamentous phage assembly. Convincing evidence is provided for the binding of PSB15 to the packaging signal of the single-stranded DNA, Trx, and cardiolipin, and a mechanism for how the phage DNA is targeted to the assembly site in the bacterial inner membrane is presented. The work will be of interest to microbiologists.

    2. Reviewer #1 (Public Review):

      Summary:

      This work describes a new protein factor required for filamentous phage assembly. The protein PSB15 binds to the packaging signal of the ssDNA, Trx, and cardiolipin. A mechanism for how the phage DNA is targeted to the assembly site in the bacterial inner membrane is discussed.

      Strengths:

      The work describes a clever way to detect factors required for phage propagation by looking at the plaque size of pseudorevertants that arise after infection of a phage with a directed mutation in the packaging signal. This led to the detection of a phage protein expressed from ORF9, the PSB15.

      The authors convincingly show that PSB15 is expressed in infected cells and can complement a phage with a mutated orf9.

      Weaknesses:

      Given that the phage LF-UK is not well explored, many open questions should be mentioned in the introduction. For the study, it is important to know whether the phage LF-UK has a mimic or homolog of gV and gXI, and if not, whether PSB15 could take their role.

      I am not convinced of the proposition of their term "checkpoint". The truth is that the authors do not know the real purpose of PSB15. I do not see an advantage for a checkpoint that only adds an additional step to enter the phage assembly site. There must be a biochemical reason for the action of PSB15. Looking at Figure 7, the step from "Release" to "Loading" is just adding many unknowns, e.g. how to transfer the DNA, how to dispose of PSB15 and Trx? Also, in the previous step are three question marks that do not add any solid information.

      The in vivo study of subcellular localization is very questionable. Why is there a single fluorescent dot if there are thousands of PSB15 molecules expressed in the cell? I have my doubts that the conclusions the authors make here are correct and meaningful. The movies do not add anything significant.

    3. Reviewer #2 (Public Review):

      Secretion of the prototypical F-associated filamentous phage (Ff) of E. coli depends on the selective binding of a hairpin (the packaging signal, PS) by two phage-encoded proteins, pVII and pIX. pVII and pIX target the PS to IM channels formed by pI and pIV. However, integrative filamentous phages lack a homologue of pIX and pIV, and many of them also lack a homologue of pVII, raising questions on the assembly and secretion of new phages. In the manuscript, Yueh et al. present the identification of a phage-encoded protein, PSB15, which binds to the PS signal of a Xanthomonas integrative filamentous phage, ΦLf-UK. They showed that PSB15 is required for viral assembly and is conserved in several other integrative filamentous phages. They further analyzed how PSB15 binds to PS and demonstrated that it associates with the IM, which targets phage DNA to it. Finally, they show that thioredoxin, the only host protein that was found to be essential for Ff secretion, interacts with PSB15 and releases the PSB15-PS complex from the IM. These results are important because they elucidate a major step in the secretion of integrative filamentous phage, and the role of thioredoxin on filamentous phage secretion in general.

      I found the data and interpretation convincing. However, the presentation and description are confusing in places because the reader has to juggle between figures. A scheme depicting what is known and unknown in the integration of Ff phages and integrative filamentous phages in the introduction would be useful to the general reader.

    1. eLife assessment

      This study presents important data describing cell states of olfactory ensheathing cells, and how these cell states may relate to repair after spinal cord injury. While the overall framework used for characterizing these cells is solid, the quantification and contextualization of results are incomplete, given that measurements, significance statistics, and discussion of both previous work and experimental methods that would be necessary to support several claims are not provided. With more thorough quantification and discussion, this work will be of interest to stem cell biologists and spinal cord injury researchers.

    2. Joint Public Review:

      Summary

      This manuscript explores the transcriptomic identities of olfactory ensheathing cells (OECs), glial cells that support life-long axonal growth in olfactory neurons, as they relate to spinal cord injury repair. The authors show that transplantation of cultured, immunopurified rodent OECs at a spinal cord injury site can promote injury-bridging axonal regrowth. They then characterize these OECs using single-cell RNA sequencing, identifying five subtypes and proposing functional roles that include regeneration, wound healing, and cell-cell communication. They identify one progenitor OEC subpopulation and also report several other functionally relevant findings, notably, that OEC marker genes contain mixtures of other glial cell type markers (such as for Schwann cells and astrocytes), and that these cultured OECs produce and secrete Reelin, a regrowth-promoting protein that has been disputed as a gene product of OECs.

      This manuscript offers an extensive, cell-level characterization of OECs, supporting their potential therapeutic value for spinal cord injury and suggesting potential underlying repair mechanisms. The authors use various approaches to validate their findings, providing interesting images that show the overlap between sprouting axons and transplanted OECs, and showing that OEC marker genes identified using single-cell RNA sequencing are present in vivo, in both olfactory bulb tissue and spinal cord after OEC transplantation.

      Despite the breadth of information presented, however, further quantification of results and explanation of experimental approaches would be needed to support some of the authors' claims. Additionally, a more thorough discussion is needed to contextualize their findings relative to previous work.

      (1) Important quantification is lacking for the data presented. For example, multiple figures include immunohistochemistry or immunocytochemistry data (Figures 1, 5, 6), but they are presented without accompanying measures like fractions of cells labeled or comparisons against controls. As a result, for axons projecting via OEC bridges in Figure 1, it is unclear how common these bridges are in the presence or absence of OECs. For Figure 6, it is unclear whether cells having an alternative OEC morphology coincide with progenitor OEC subtype marker genes to a statistically significant degree. Similar quantification is missing in other types of data such as Western blot images (Fig. 9) and OEC marker gene data (for which p-values are not reported; Table S2).

      The addition of quantitative measures and, where appropriate, statistical comparisons with p-values or other significance measures, would be important for supporting the authors' claims and more rigorously conveying the results.

      (2) Some aspects of the experimental design that are relevant to the interpretation of the results are not explained. For example, OECs appear to be collected from only female rats, but the potential implications of this factor are not discussed.

      Additionally, it is unclear from the manuscript to what degree immunopurified cells are OECs as opposed to other cell types. The antibody used to retain OECs, nerve growth factor receptor p75 (Ngfr-p75), can also be expressed by non-OEC olfactory bulb cell types including astrocytes [1-3]. The possible inclusion of Ngfr-p75-positive but non-OEC cell types in the OEC culture is not sufficiently addressed. Such non-OEC cell types are also not distinguished in the analysis of single-cell RNA sequencing data (only microglia, fibroblasts, and OECs are identified; Figure 2). Thus, it is currently unclear whether results related to the OEC subtype may have been impacted by these experimental factors.

      (3) The introduction, while well written, does not discuss studies showing no significant effect of OEC implantation after spinal cord injury. The discussion also fails to sufficiently acknowledge this variability in the efficacy of OEC implantation. This omission amplifies bias in the text, suggesting that OECs have significant effects that are not fully reflected in the literature. The introduction would need to be expanded to properly address the nuance suggested by the literature regarding the benefits of OECs after spinal cord injury. Additionally, in the discussion, relating the current study to previous work would help clarify how varying observations may relate to experimental or biological factors.

      (a) Cragnolini, A.B. et al., Glia, (2009), doi: 10.1002/glia.20857.
      (b) Vickland H. et al., Brain Res., (1991), doi: 10.1016/0006-8993(91)91659-O.
      (c) Ung K. et al., Nat Commun., (2021), doi: 10.1038/s41467-021-25444-3.

    1. eLife assessment

      This study presents valuable research comparing three different species of extant cartilaginous fishes and describes new data on ratfish. The methods are convincing although the reviewers noted that standardized methods are essential when comparing numerical datasets. This study would be of interest to skeletal biologists working on the evolution of chondrichthyan skeletons.

    2. Reviewer #1 (Public Review):

      Summary:

      It seems as if the main point of the paper is the new data on ratfish, although your title describes it as being about extant cartilaginous fishes and you bounce around between the little skate and ratfish. So here's an opportunity for you to adjust the title to emphasize ratfish, given that you later describe this as your significant new data contribution. Either way, the organization of the paper can be adjusted so that the reader can follow the same order in all sections, making it very clear for comparative purposes what the new data are and what they mean. My opinion is that I want to read, for each subheading in the results, about the ratfish first because this is your most interesting novel data. Then I want to know any confirmation about morphology in little skate. And then I want to know about any gaps you fill with the catshark. (It is ok if you keep the order of "skate, ratfish, then shark", but I think it undersells the new data.)

      Strengths:

      The imagery and new data availability for ratfish are valuable and may help to determine new phylogenetically informative characters for understanding the evolution of cartilaginous fishes. You also allude to the fossil record.

      Opportunities:

      I am concerned about the statement of ratfish paedomorphism because stages 32 and 33 were not statistically significantly different from one another (figure and prior sentences). So, these ratfish TMDs overlap the range of both 32 and 33. I think you need more specimens and stages to state this definitively based on TMD. What else leads you to think these are paedomorphic? Right now they are different, but it's unclear why. You need more outgroups.

      Your headings for the results subsection and figures are nice snapshots of your interpretations of the results and I think they would be better repurposed in your abstract, which needs more depth.

      Historical literature is more abundant than what you've listed. Your first sentence describes a long fascination and only goes back to 1990. But there are authors that have had this fascination for centuries and so I think you'll benefit from looking back. Especially because several of them have looked into histology and development of these fishes.

      I agree that in the past 15 years or so a lot more work has been done because it can be done using newer technologies, and I don't think your list is exhaustive. You need to expand this list and history, which will help with your ultimate comparative analysis without you needing to sample too many new data yourself.

      I'd like to see modifications to figure 7 so that you can add more continuity between the characters illustrated in figure 7 and the body of the text. Generally Holocephalans are the outgroup to elasmobranchs - right now they are presented as sister taxa with no ability to indicate derivation. Why isn't the catshark included in this diagram?

      In the last paragraph of the introduction, you say that "the data argue" and I admit, I am confused. Whose data? Is this a prediction, a result, or a summary of other people's work? Either way, this could be clarified to emphasize the contribution you are about to present.

    3. Reviewer #2 (Public Review):

      General comment:

      This is a very valuable and unique comparative study. An excellent combination of scanning and histological data from three different species is presented. Obtaining the material for such a comparative study is never trivial. The study presents new data and thus provides the basis for an in-depth discussion about chondrichthyan mineralised skeletal tissues. I have, however, some comments. Some information is lacking and should be added to the manuscript text. I also suggest changes in the result and the discussion section of the manuscript.

      Introduction:

      The reader gets the impression that almost no research on chondrichthyan skeletal tissues was done before 2010 ("last 15 years", L45). I suggest correcting that and also citing previous studies on chondrichthyan skeletal tissues, including studies from before 1900.

      Material and Methods:

      Please complete L473-492: Three different Micro-CT scanners were used for three different species? SkyScan 117 for the skate samples. Catshark different scanner, please provide full details. Chimera synchrotron scan? Please provide full details for all scanning protocols.

      TMD is established in the same way in all three scanners? Actually not possible. Or, all specimens were scanned with the same scanner to establish TMD? If so please provide the protocol.

      Please complete L494 ff: Tissue embedding medium and embedding protocol is missing. Specimens have been decalcified, if yes how? Have specimens been sectioned non-decalcified or decalcified?

      Please complete L506 ff: Tissue embedding medium and embedding protocol is missing. Description of controls is missing.

      Results:

      L147: It is valuable and interesting to compare the degree of mineralisation in individuals from the three different species. It appears, however, not possible to provide numerical data for Tissue Mineral Density (TMD). First requirement, all specimens must be scanned with the same scanner and the same calibration values. This is not stated in the M&M section. But even if this was the case, all specimens derive from different sample locations and have been preserved differently. Type of fixation, extension of fixation time in formalin, frozen, unfrozen, conditions of sample storage, age of the samples, and many more parameters, all influence TMD values. Likewise the relative age of the animals (adult is not the same as adult) influences TMD. One must assume different sampling and storage conditions and different types of progression into adulthood. Thus, the observation of different degrees of mineralisation is very interesting but I suggest not to link this observation to numerical values.

      Parts of the results are mixed with discussion. Sometimes, a result chapter also needs a few references but this result chapter is full of references.

      Based on different protocols, the staining characteristics of the tissue are analysed. This is very good and provides valuable additional data. The authors should inform the reader not only about the staining (positive or negative) but also about the histochemical characteristics of the staining. L218: "fast green positive" means what? L234: "marked by Trichrome acid fuchsin" means what? And so on, see also L237, L289, L291.

      Discussion:

      Please completely remove figure 7, please adjust and severely downsize the discussion related to figure 7. It is very interesting and valuable to compare three species from three different groups of elasmobranchs. Results of this comparison also validate an interesting discussion about possible phylogenetic aspects. This is, however, not the basis for claims about the skeletal tissue organisation of all extinct and extant members of the groups to which the three species belong. The discussion refers to "selected representatives" (L364), but how representative are the selected species? Can there be an extant species that represents the entire large group, all sharks, rays or chimeras? Are the three selected species basal representatives with a generalist life style?

      Please completely remove the discussion about paedomorphosis in chimeras (already in the result section). This discussion is based on a wrong idea about the definition of paedomorphosis. Paedomorphosis can occur in members of the same group. Humans have paedomorphic characters within the primates, Ambystoma mexicanum is paedomorphic within the urodeles. Paedomorphosis does not extend to members of different vertebrate branches. That elasmobranchs have a developmental stage that resembles chimera vertebra mineralisation does not define chimera vertebra centra as paedomorphic. Teleosts have a heterocercal caudal fin anlage during development; that does not mean the heterocercal fins in sturgeons or elasmobranchs are paedomorphic characters.

      L432-435: In the times of Gadow & Abbott (1895), science had completely wrong ideas about the phylogenetic position of chondrichthyans within the gnathostomes. It is curious that Gadow & Abbott (1895) are being cited in support of the paedomorphosis claim.

      The SCPP part of the discussion is unrelated to the data obtained by this study. Kawasaki & Weiss (2003) describe a gene family (called SCPP) that controls Ca-binding extracellular phosphoproteins in enamel, in bone and dentine, in saliva and in milk. It evolved by gene duplication and differentiation. They date it back to a first enamel matrix protein in conodonts (Reif 2006). Conodonts, a group of enigmatic invertebrates, have mineralised structures but these structures are neither bone nor mineralised cartilage. Catfish (6 % of all vertebrate species), on the other hand, have bone but do not have SCPP genes (Lui et al. 206). Other calcium binding proteins, such as osteocalcin, were initially believed to be required for mineralisation. It turned out that osteocalcin is rather a mineralisation inhibitor; at best it regulates the arrangement of collagen fiber bundles. The osteocalcin -/- mouse has fully mineralised bone. As the function of the SCPP gene product for bone formation is unknown, there is no need to discuss SCPP genes. It would perhaps be better to finish the manuscript with a summary that focuses on the subject and the methodology of this nice study.

    1. eLife assessment

      This useful study reports that epididymal proteins are required for embryogenesis after fertilization. The data presented are generally convincing, but the study is incomplete because it does not investigate in detail how those proteins cause DNA fragmentation and compromised embryonic development. This work will be of interest to reproductive biologists and andrologists.

    2. Reviewer #1 (Public Review):

      Summary:

      The main observation that sperm from the CRISP1 and CRISP3 KO lines are less developmentally competent post-fertilization is convincing. However, the molecular characterization of the mechanism that leads to these defects and the temporal appearance of the defects requires additional studies.

      Strengths:

      The generation of these double mutant mice is valuable for the field. Moreover, the fact that the double mutant line of Crisp 1 and 3 is phenotypically different from the Crisp 1 and 4 line suggests different functions of these epididymis proteins. The methods used to demonstrate that developmental defects are largely due to post-fertilization defects are also a considerable strength. The initial characterization showing that these sperm have altered intracellular Ca2+ levels and increased rates of DNA fragmentation is valuable.

      Weaknesses:

      The study is mechanistically incomplete because there is no direct demonstration that the absence of these proteins alters the epididymal environment and fluid, wherein during the passage through the epididymis the sperm become affected. Also, a direct demonstration of how the proteins in question cause or lead to DNA damage and increased Ca2+ requires further characterization.

    3. Reviewer #2 (Public Review):

      The authors showed that CRISP1 and CRISP3, secreted proteins in the epididymis, are required for early embryogenesis after fertilization through DNA integrity in cauda epididymal sperm. This paper is the first report showing that the epididymal proteins are required for embryogenesis after fertilization. However, some data in this paper (Table 1 and Figure 2A) overlap with a published paper (Curci et al., FASEB J, 34,15718-15733, 2020; PMID: 33037689). Furthermore, the authors did not address with direct evidence why the disruption of CRISP1/3 leads to these phenomena (the increased intracellular Ca2+ level and impaired DNA integrity in sperm). Therefore, if the authors can address the following comments to improve the paper's novelty and clarity, this paper may be worthwhile to readers.

    1. Author response

      Reviewer #1 (Public Review):

      The authors aimed to investigate if 2-hydroxybutyrate (2HB), a metabolite induced by exercise, influences physiological changes, particularly metabolic alterations post-exercise training. They treated young mice and cultured myoblasts with 2HB, conducted exercise tests, metabolomic profiling, gene expression analysis, and knockdown experiments to understand 2HB's mechanisms. Their findings indicate that 2HB enhances exercise tolerance, boosts branched-chain amino acid (BCAA) enzyme gene expression in skeletal muscles, and increases oxidative capacity. They also highlight the role of SIRT4 in these effects. This study establishes 2HB, once considered a waste product, as a regulator of exercise-induced metabolic processes. The study's strength lies in its consistent results across in vitro, in vivo, and ex vivo analyses.

      The authors propose a mechanism in which 2HB inhibits BCAA breakdown, raises NAD+/NADH ratio, activates SIRT4, increases ADP ribosylation, and controls gene expression.

      However, some questions remain unclear based on these findings:

      This study focused on the effects of short-term exercise (1 or 5 bouts of treadmill running) and short-term 2HB treatment (1 or 4 days of treatment). Adaptations to exercise training typically occur progressively over an extended period. It's important to investigate the effects of long-term 2HB treatment and whether extended combined 2HB treatment and exercise training have independent, synergistic, or antagonistic effects.

      We agree with the reviewer that investigation of longer-term 2HB treatment may potentially yield interesting findings with more implications to exercise physiology. To investigate the effects of 2HB treatment against or in combination with a progressive exercise training protocol would require an experiment duration of between 4 and 12 weeks, based on previous studies (Systematic Review by Massett et al., Frontiers in Physiology, 2021, 10.3389/fphys.2021.782695). However, our experience with these types of experiments is that such a pursuit would require a breadth of work beyond the scope of this current study. For instance, if there were evidence of weakened effect of 2HB over time, one may be compelled to investigate other organs such as the liver to find signs of metabolic adaptation to the exogenous metabolite. If there were additive or synergistic effects on exercise performance, one may be compelled to investigate changes to the cardiovascular system in addition to the skeletal muscle. Additional questions would be raised around the skeletal muscle as well, including assessment of structural and fibre-type changes. Further, these additional mechanisms would need to be characterized in a time course fashion. Rather, we view the scope of the current study to be the acute response to 2HB as an initial report on mechanistic effects of 2HB.

      Exercise training leads to significant mitochondrial changes, including increased mitochondrial biogenesis in skeletal muscle. It would be valuable to compare the impact of 2HB treatment on mitochondrial content and oxidative capacity in treated mice to that in exercised mice.

      We agree with the reviewer that it is of interest to investigate how 2HB may affect mitochondrial biogenesis. However, our preliminary findings were that 2HB-treated MEFs, C2C12s, and mouse soleus muscles showed no change in PGC1α gene expression after four days of treatment (data not shown). As a follow-up assessment of mitochondrial protein expression, although not specific to mtDNA derived genes, we quantified the expression of the respiratory chain proteins in cells and soleus muscle and found no effect of 2HB treatment (SFig. 5,6). At this stage we conclude that there is no evidence of 2HB modifying mitochondrial biogenesis in this time frame and that further investigation would be best suited to a follow-up study such as one interested in long-term exercise training.

      The authors demonstrate that 2-ketobutyrate (2KB) can serve as an oxidative fuel, suggesting a role for the intact BCAA catabolic pathway. However, it's puzzling that the knockout of BCKDHA, a subunit crucial for the second step of BCAA catabolism, did not result in changes in oxidative capacity in cultured myoblasts.

      While we report the BCKDH complex to be dispensable for 2KB oxidation it is important to note that previous studies have reported the following: (1) that 2KB is a viable substrate for BCKDH, (2) that 2KB is a viable substrate for pyruvate dehydrogenase, and (3) that pyruvate dehydrogenase is also dispensable for 2KB oxidation (see Steele et al., J Nutr., 114: 701-710, and Paxton et al. Biochem J., 234:295-303). Collectively, these data have led previous studies to conclude that BCKDH and pyruvate dehydrogenase are redundant for the first step of 2KB oxidation, with a preference for BCKDH. The flux through either may depend upon the metabolic environment. The aim for figure 3C was to determine whether the BCAA degradation pathway was required for 2KB oxidation. We conclude that this pathway is required, first at the step of PCC.

      While these past studies were mentioned in paragraph 2 of the discussion, in light of the reviewer’s comment we have expanded this paragraph. We have added language to explain that future research interested in the presented 2HB mechanism should carefully consider BCKDH and PDH expression in the cell or tissue of interest, as the metabolism of 2KB is quite central to the presented mechanism.

      Nevertheless, this innovative model of metabolic signaling during exercise will serve as a valuable reference for informing future.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript entitled "A 2-HB-mediated feedback loop regulates muscular fatigue" by the Johnson group reports interesting findings with implications for the health benefits of exercise. The authors use a combination of metabolic/biochemical in vivo and in vitro assays to delineate a metabolic route triggered by 2-HB (a relatively stable metabolite induced by exercise in humans and mice) that controls branched-chain amino transferase enzymes and mitochondrial oxidative capacity. Mechanistically, the authors show that 2-HB is a direct inhibitor of BCAT enzymes that in turn control levels of SIRT4 activity and ADP-ribosylation in the nucleus targeting C/EBP transcription factor, affecting BCAA oxidation genes (see Fig 4i in the paper). Overall, these are interesting and novel observations and findings with relevance to human exercise, with the potential implication of using these metabolites to mimic exercise benefits, or conditions of muscular fatigue that occur in different human chronic diseases including rheumatic diseases or long COVID.

      Weaknesses:

      There are several experiments/comments that will strengthen the manuscript:

      (1) A final model in Figure 6 integrating the exercise/mechanistic findings (expanding on Fig 4i) will clarify the findings.

      We appreciate the reviewer’s suggestion to incorporate the exercise findings into a summary figure. However, upon internal review we find that such a figure is too similar to Fig 4i to warrant a new diagram.

      (2) In some of the graphs, statistics are missing (e.g., Fig 6G).

      Some figures are included primarily for the reader to visualize the data while statistical comparison is conducted in a separate figure, for example Fig 2D-G. However, we have revised the figure legends to ensure that statistical comparisons are described for all appropriate figures, including Fig 6G identified by the reviewer.

      (3) The conclusions on SIRT4 dependency should be carefully written, as it is likely that this is only one potential mechanism, further validation with mouse models would be necessary.

      We appreciate the reviewer's feedback and take the point well that a NAD-dependent mechanism will likely stimulate other sirtuins, which are often in fact expressed at greater levels than SIRT4. To reflect this comment in the manuscript we have altered paragraph 5 of the discussion to now focus on sirtuins. We briefly discuss SIRT4 and highlight the need for future consideration of other sirtuins, perhaps particularly mitochondrial sirtuins.

      (4) One of the needed experiments to support the oxidative capacity effects that could be done in cultured cells is the use of radioisotope metabolites including BCAAs to determine the ability to produce CO2. Alternatively or in combination, metabolite flux using isotopes would be useful to strengthen the current results.

      We appreciate the suggestion from the reviewer and we will look to conduct such an experiment in our follow-up work.

      We sincerely thank the reviewers for their input on this study as their suggestions have led to an improved manuscript for the version of record. The reviewer comments are well taken and we are glad that they will be present alongside the final manuscript to provide an important perspective on the work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Using a cross-modal sensory selection task in head-fixed mice, the authors attempted to characterize how different rules reconfigured representations of sensory stimuli and behavioral reports in sensory (S1, S2) and premotor cortical areas (medial motor cortex or MM, and ALM). They used silicon probe recordings during behavior, a combination of single-cell and population-level analyses of neural data, and optogenetic inhibition during the task.

      Strengths:

      A major strength of the manuscript was the clarity of the writing and motivation for experiments and analyses. The behavioral paradigm is somewhat simple but well-designed and well-controlled. The neural analyses were sophisticated, clearly presented, and generally supported the authors' interpretations. The statistics are clearly reported and easy to interpret. In general, my view is that the authors achieved their aims. They found that different rules affected preparatory activity in premotor areas, but not sensory areas, consistent with dynamical systems perspectives in the field that hold that initial conditions are important for determining trial-based dynamics.

      Weaknesses:

      The manuscript was generally strong. The main weakness in my view was in interpreting the optogenetic results. While the simplicity of the task was helpful for analyzing the neural data, I think it limited the informativeness of the perturbation experiments. The behavioral read-out was low dimensional (a change in hit rate or false alarm rate), but it was unclear what perceptual or cognitive process was disrupted that led to changes in these read-outs. This is a challenge for the field, and not just this paper, but was the main weakness in my view. I have some minor technical comments in the recommendations for authors that might address other minor weaknesses.

      I think this is a well-performed, well-written, and interesting study that shows differences in rule representations in sensory and premotor areas and finds that rules reconfigure preparatory activity in the motor cortex to support flexible behavior.

      Reviewer #2 (Public Review):

      Summary:

      Chang et al. investigate neuronal activity firing patterns across various cortical regions in an interesting context-dependent tactile vs visual detection task, developed previously by the authors (Chevee et al., 2021; doi: 10.1016/j.neuron.2021.11.013). The authors report the important involvement of a medial frontal cortical region (MM, probably a similar location to wM2 as described in Esmaeili et al., 2021 & 2022; doi: 10.1016/j.neuron.2021.05.005; doi: 10.1371/journal.pbio.3001667) in mice for determining task rules.

      Strengths:

      The experiments appear to have been well carried out and the data well analysed. The manuscript clearly describes the motivation for the analyses and reaches clear and well-justified conclusions. I find the manuscript interesting and exciting!

      Weaknesses:

      I did not find any major weaknesses.

      Reviewer #3 (Public Review):

      This study examines context-dependent stimulus selection by recording neural activity from several sensory and motor cortical areas along a sensorimotor pathway, including S1, S2, MM, and ALM. Mice are trained to either withhold licking or perform directional licking in response to visual or tactile stimulus. Depending on the task rule, the mice have to respond to one stimulus modality while ignoring the other. Neural activity to the same tactile stimulus is modulated by task in all the areas recorded, with significant activity changes in a subset of neurons and population activity occupying distinct activity subspaces. Recordings further reveal a contextual signal in the pre-stimulus baseline activity that differentiates task context. This signal is correlated with subsequent task modulation of stimulus activity. Comparison across brain areas shows that this contextual signal is stronger in frontal cortical regions than in sensory regions. Analyses link this signal to behavior by showing that it tracks the behavioral performance switch during task rule transitions. Silencing activity in frontal cortical regions during the baseline period impairs behavioral performance.

      Overall, this is a superb study with solid results and thorough controls. The results are relevant for context-specific neural computation and provide a neural substrate that will surely inspire follow-up mechanistic investigations. We only have a couple of suggestions to help the authors further improve the paper.

      (1) We have a comment regarding the calculation of the choice CD in Fig S3. The text on page 7 concludes that "Choice coding dimensions change with task rule". However, the motor choice response is different across blocks, i.e. lick right vs. no lick for one task and lick left vs. no lick for the other task. Therefore, the differences in the choice CD may be simply due to the motor response being different across the tasks and not due to the task rule per se. The authors may consider adding this caveat in their interpretation. This should not affect their main conclusion.

      We thank the Reviewer for the suggestion. We have discussed this caveat and performed a new analysis to calculate the choice coding dimensions using right-lick and left-lick trials (Fig. S3h) on page 8. 

      “Choice coding dimensions were obtained from left-lick and no-lick trials in respond-to-touch blocks and right-lick and no-lick trials in respond-to-light blocks. Because the required lick directions differed between the block types, the difference in choice CDs across task rules (Fig. S4f) could have been affected by the different motor responses. To rule out this possibility, we did a new version of this analysis using right-lick and left-lick trials to calculate the choice coding dimensions for both task rules. We found that the orientation of the choice coding dimension in a respond-to-touch block was still not aligned well with that in a respond-to-light block (Fig. S4h;  magnitude of dot product between the respond-to-touch choice CD and the respond-to-light choice CD, mean ± 95% CI for true vs shuffled data: S1: 0.39 ± [0.23, 0.55] vs 0.2 ± [0.1, 0.31], 10 sessions; S2: 0.32 ± [0.18, 0.46] vs 0.2 ± [0.11, 0.3], 8 sessions; MM: 0.35 ± [0.21, 0.48] vs 0.18 ± [0.11, 0.26], 9 sessions; ALM: 0.28 ± [0.17, 0.39] vs 0.21 ± [0.12, 0.31], 13 sessions).”
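      As a minimal sketch of the alignment measure quoted above (the magnitude of the dot product between two choice CDs), the following assumes that a coding dimension is the unit-normalized difference of per-neuron mean firing rates between two trial types. That definition, the function names, and the firing rates are illustrative assumptions and are not taken from the manuscript.

```python
import math


def unit_vector(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def coding_dimension(mean_rates_a, mean_rates_b):
    """Assumed definition: normalized difference of per-neuron mean rates."""
    return unit_vector([a - b for a, b in zip(mean_rates_a, mean_rates_b)])


def alignment(cd_1, cd_2):
    """Magnitude of the dot product between two unit-length coding dimensions.

    1 means the two dimensions are parallel (the sign is ignored),
    0 means they are orthogonal.
    """
    return abs(sum(a * b for a, b in zip(cd_1, cd_2)))


# Hypothetical mean firing rates (Hz) for three neurons in each block type.
cd_touch = coding_dimension([5.0, 2.0, 1.0], [1.0, 2.0, 1.0])  # [1.0, 0.0, 0.0]
cd_light = coding_dimension([3.0, 6.0, 4.0], [3.0, 2.0, 1.0])  # [0.0, 0.8, 0.6]

print(alignment(cd_touch, cd_light))  # 0.0: orthogonal in this toy example
```

      Taking the magnitude makes the measure insensitive to the arbitrary sign of each coding dimension, which is why values are compared against a shuffled baseline rather than against zero.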

      We also have included the caveats for using right-lick and left-lick trials to calculate choice coding dimensions on page 13.

      “However, we also calculated choice coding dimensions using only right- and left-lick trials. In S1, S2, MM and ALM, the choice CDs calculated this way were also not aligned well across task rules (Fig. S4h), consistent with the results calculated from lick and no-lick trials (Fig. S4f). Data were limited for this analysis, however, because mice rarely licked to the unrewarded water port (# of licks to the unrewarded port / total # of licks, respond-to-touch: 0.13, respond-to-light: 0.11). These trials usually came from rule transitions (Fig. 5a) and, in some cases, were potentially caused by exploratory behaviors. These factors could affect choice CDs.”

      (2) We have a couple of questions about the effect size on single neurons vs. population dynamics. From Fig 1, about 20% of neurons in frontal cortical regions show task rule modulation in their stimulus activity. This seems like a small effect in terms of population dynamics. There is somewhat of a disconnect from Figs 4 and S3 (for stimulus CD), which show remarkably low subspace overlap in population activity across tasks. Can the authors help bridge this disconnect? Is this because the neurons showing a difference in Fig 1 are disproportionally stimulus selective neurons?

      We thank the Reviewer for the insightful comment and agree that it is important to link the single-unit and population results. We have addressed these questions by (1) improving our analysis of task modulation of single neurons  (tHit-tCR selectivity) and (2) examining the relationship between tHit-tCR selective neurons and tHit-tCR subspace overlaps.  

      Previously, we averaged the AUC values of time bins within the stimulus window (0-150 ms, 10 ms bins). If the 95% CI on this averaged AUC value did not include 0.5, this unit was considered to show significant selectivity. This approach was highly conservative and may have underestimated the percentage of units showing significant selectivity, particularly any units showing transient selectivity. In the revised manuscript, we now define a unit as showing significant tHit-tCR selectivity when three consecutive time bins (>30 ms, 10 ms bins) of AUC values were significant. Using this new criterion, the percentage of tHit-tCR selective neurons increased compared with the previous analysis. We have updated Figure 1h and the results on page 4:

      “We found that 18-33% of neurons in these cortical areas had area under the receiver-operating curve (AUC) values significantly different from 0.5, and therefore discriminated between tHit and tCR trials (Fig. 1h; S1: 28.8%, 177 neurons; S2: 17.9%, 162 neurons; MM: 32.9%, 140 neurons; ALM: 23.4%, 256 neurons; criterion to be considered significant: Bonferroni corrected 95% CI on AUC did not include 0.5 for at least 3 consecutive 10-ms time bins).”
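      The revised criterion (a corrected 95% CI on AUC that excludes 0.5 for at least three consecutive 10-ms bins) can be sketched as follows. This is an illustration only: the per-bin confidence bounds are assumed to have been computed already, the function names and numbers are hypothetical, and the sketch does not require the consecutive bins to deviate from 0.5 in the same direction, which the text does not specify.

```python
def longest_significant_run(ci_bounds, chance=0.5):
    """Length of the longest run of consecutive bins whose CI excludes chance.

    ci_bounds is a sequence of (lower, upper) confidence bounds, one per bin.
    """
    longest = current = 0
    for low, high in ci_bounds:
        if low > chance or high < chance:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def is_selective(ci_bounds, chance=0.5, min_consecutive_bins=3):
    """True if enough consecutive bins are significantly different from chance."""
    return longest_significant_run(ci_bounds, chance) >= min_consecutive_bins


# Hypothetical unit with transient selectivity: three adjacent significant bins.
transient_unit = [(0.45, 0.55), (0.56, 0.70), (0.58, 0.72), (0.55, 0.69), (0.47, 0.60)]
# Hypothetical unit with scattered significant bins, never three in a row.
scattered_unit = [(0.52, 0.60), (0.45, 0.55), (0.53, 0.65), (0.51, 0.60), (0.40, 0.60)]

print(is_selective(transient_unit))  # True
print(is_selective(scattered_unit))  # False
```

      Compared with testing a single AUC averaged over the whole stimulus window, a run-length criterion of this kind can detect brief selectivity that averaging would dilute.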

      Next, we checked how tHit-tCR selective neurons were distributed across sessions. We found that the percentage of tHit-tCR selective neurons in each session varied (S1: 9-46%, S2: 0-36%, MM: 25-55%, ALM: 0-50%). We examined the relationship between the numbers of tHit-tCR selective neurons and tHit-tCR subspace overlaps. Sessions with more neurons showing task rule modulation tended to show lower subspace overlap, but this correlation was modest and only marginally significant (r = -0.32, p = 0.08, Pearson correlation, n = 31 sessions). While we report the percentage of neurons showing significant selectivity as a simple way to summarize single-neuron effects, this does neglect the magnitude of task rule modulation of individual neurons, which may also be relevant.

      In summary, the apparent disconnect between the effect sizes of task modulation of single neurons and of population dynamics could be explained by three factors: (1) the percentages of tHit-tCR selective neurons were underestimated in our old analysis, (2) tHit-tCR selective neurons were not uniformly distributed among sessions, and (3) the percentages of tHit-tCR selective neurons were only weakly correlated with tHit-tCR subspace overlaps.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      For the analysis of choice coding dimensions, it seems that the authors are somewhat data limited in that they cannot compare lick-right/lick-left within a block. So instead, they compare lick/no lick trials. But given that the mice are unable to initiate trials, the interpretation of the no lick trials is a bit complicated. It is not clear that the no lick trials reflect a perceptual judgment about the stimulus (i.e., a choice), or that the mice are just zoning out and not paying attention. If it's the latter case, what the authors are calling choice coding is more of an attentional or task engagement signal, which may still be interesting, but has a somewhat different interpretation than a choice coding dimension. It might be worth clarifying this point somewhere, or if I'm totally off-base, then being more clear about why lick/no lick is more consistent with choice than task engagement.

      We thank the Reviewer for raising this point. We have added a new paragraph on page 13 to clarify why we used lick/no-lick trials to calculate choice coding dimensions, and we now discuss the caveat regarding task engagement.  

      “No-lick trials included misses, which could be caused by mice not being engaged in the task. While the majority of no-lick trials were correct rejections (respond-to-touch: 75%; respond-to-light: 76%), we treated no-licks as one of the available choices in our task and included them to calculate choice coding dimensions (Fig. S4c,d,f). To ensure stable and balanced task engagement across task rules, we removed the last 20 trials of each session and used stimulus parameters that achieved similar behavioral performance for both task rules (Fig. 1d; ~75% correct for both rules).”

      In addition, to address a point made by Reviewer 3 as well as this point, we performed a new analysis to calculate choice coding dimensions using right-lick vs left-lick trials. We report this new analysis on page 8:

      “Choice coding dimensions were obtained from left-lick and no-lick trials in respond-to-touch blocks and right-lick and no-lick trials in respond-to-light blocks. Because the required lick directions differed between the block types, the difference in choice CDs across task rules (Fig. S4f) could have been affected by the different motor responses. To rule out this possibility, we did a new version of this analysis using right-lick and left-lick trials to calculate the choice coding dimensions for both task rules. We found that the orientation of the choice coding dimension in a respond-to-touch block was still not aligned well with that in a respond-to-light block (Fig. S4h;  magnitude of dot product between the respond-to-touch choice CD and the respond-to-light choice CD, mean ± 95% CI for true vs shuffled data: S1: 0.39 ± [0.23, 0.55] vs 0.2 ± [0.1, 0.31], 10 sessions; S2: 0.32 ± [0.18, 0.46] vs 0.2 ± [0.11, 0.3], 8 sessions; MM: 0.35 ± [0.21, 0.48] vs 0.18 ± [0.11, 0.26], 9 sessions; ALM: 0.28 ± [0.17, 0.39] vs 0.21 ± [0.12, 0.31], 13 sessions).” 

      We added discussion of the limitations of this new analysis on page 13:

      “However, we also calculated choice coding dimensions using only right- and left-lick trials. In S1, S2, MM and ALM, the choice CDs calculated this way were also not aligned well across task rules (Fig. S4h), consistent with the results calculated from lick and no-lick trials (Fig. S4f). Data were limited for this analysis, however, because mice rarely licked to the unrewarded water port (# of licks to the unrewarded port / total # of licks, respond-to-touch: 0.13, respond-to-light: 0.11). These trials usually came from rule transitions (Fig. 5a) and, in some cases, were potentially caused by exploratory behaviors. These factors could affect choice CDs.”

      The authors find that the stimulus coding direction in most areas (S1, S2, and MM) was significantly aligned between the block types. How do the authors interpret that finding? That there is no major change in stimulus coding dimension, despite the change in subspace? I think I'm missing the big picture interpretation of this result.

      That there is no significant change in stimulus coding dimensions but a change in subspace suggests that the subspace change largely reflects a change in the choice coding dimensions.

      As I mentioned in the public review, I thought there was a weakness with interpretation of the optogenetic experiments, which the authors generally interpret as reflecting rule sensitivity. However, given that they are inhibiting premotor areas including ALM, one might imagine that there might also be an effect on lick production or kinematics. To rule this out, the authors compare the change in lick rate relative to licks during the ITI. What is the ITI lick rate? I assume pretty low, once the animal is well-trained, in which case there may be a floor effect that could obscure meaningful effects on lick production. In addition, based on the reported CI on delta p(lick), it looks like MM and AM did suppress lick rate. I think in the future, a task with richer behavioral read-outs (or including other measurements of behavior like video), or perhaps something like a psychological process model with parameters that reflect different perceptual or cognitive processes could help resolve the effects of perturbations more precisely.

      Eighteen and ten percent of trials had at least one lick in the ITI in respond-to-touch and respond-to-light blocks, respectively. These relatively low rates of ITI licking could indeed make an effect of optogenetics on lick production harder to observe. We agree that future work would benefit from more complex tasks and measurements, and have added the following to make this point (page 14):

      “To more precisely dissect the effects of perturbations on different cognitive processes in rule-dependent sensory detection, more complex behavioral tasks and richer behavioral measurements are needed in the future.”

      Reviewer #2 (Recommendations For The Authors):

      I have the following minor suggestions that the authors might consider in revising this already excellent manuscript:

      (1) In addition to showing normalised z-score firing rates (e.g. Fig 1g), I think it is important to show the grand-average mean firing rates in Hz.

      We thank the Reviewer for the suggestion and have added the grand-average mean firing rates as a new supplementary figure (Fig. S2a). To provide more details about the firing rates of individual neurons, we have also added to this new figure the distribution of peak responses during the tactile stimulus period (Fig. S2b).

      (2) I think the authors could report more quantitative data in the main text. As a very basic example, I could not easily find how many neurons, sessions, and mice were used in various analyses.

      We have added relevant numbers at various points throughout the Results, including within the following examples:

      Page 3: “To examine how the task rules influenced the sensorimotor transformation occurring in the tactile processing stream, we performed single-unit recordings from sensory and motor cortical areas including S1, S2, MM and ALM (Fig. 1e-g, Fig. S1a-h, and Fig. S2a; S1: 6 mice, 10 sessions, 177 neurons, S2: 5 mice, 8 sessions, 162 neurons, MM: 7 mice, 9 sessions, 140 neurons, ALM: 8 mice, 13 sessions, 256 neurons).”

      Page 5: “As expected, single-unit activity before stimulus onset did not discriminate between tactile and visual trials (Fig. 2d; S1: 0%, 177 neurons; S2: 0%, 162 neurons; MM: 0%, 140 neurons; ALM: 0.8%, 256 neurons). After stimulus onset, more than 35% of neurons in the sensory cortical areas and approximately 15% of neurons in the motor cortical areas showed significant stimulus discriminability (Fig. 2e; S1: 37.3%, 177 neurons; S2: 35.2%, 162 neurons; MM: 15%, 140 neurons; ALM: 14.1%, 256 neurons).”

      Page 6: “Support vector machine (SVM) and Random Forest classifiers showed similar decoding abilities (Fig. S3a,b; medians of classification accuracy [true vs shuffled]; SVM: S1 [0.6 vs 0.53], 10 sessions, S2 [0.61 vs 0.51], 8 sessions, MM [0.71 vs 0.51], 9 sessions, ALM [0.65 vs 0.52], 13 sessions; Random Forests: S1 [0.59 vs 0.52], 10 sessions, S2 [0.6 vs 0.52], 8 sessions, MM [0.65 vs 0.49], 9 sessions, ALM [0.7 vs 0.5], 13 sessions).”

      Page 6: “To assess this for the four cortical areas, we quantified how the tHit and tCR trajectories diverged from each other by calculating the Euclidean distance between matching time points for all possible pairs of tHit and tCR trajectories for a given session and then averaging these for the session (Fig. 4a,b; S1: 10 sessions, S2: 8 sessions, MM: 9 sessions, ALM: 13 sessions, individual sessions in gray and averages across sessions in black; window of analysis: -100 to 150 ms relative to stimulus onset; 10 ms bins; using the top 3 PCs; Methods).” 

      Page 8: “In contrast, we found that S1, S2 and MM had stimulus CDs that were significantly aligned between the two block types (Fig. S4e; magnitude of dot product between the respond-to-touch stimulus CDs and the respond-to-light stimulus CDs, mean ± 95% CI for true vs shuffled data: S1: 0.5 ± [0.34, 0.66] vs 0.21 ± [0.12, 0.34], 10 sessions; S2: 0.62 ± [0.43, 0.78] vs 0.22 ± [0.13, 0.31], 8 sessions; MM: 0.48 ± [0.38, 0.59] vs 0.24 ± [0.16, 0.33], 9 sessions; ALM: 0.33 ± [0.2, 0.47] vs 0.21 ± [0.13, 0.31], 13 sessions).”

      Page 9: “For respond-to-touch to respond-to-light block transitions, the fractions of trials classified as respond-to-touch for MM and ALM decreased progressively over the course of the transition (Fig. 5d; rank correlation of the fractions calculated for each of the separate periods spanning the transition, Kendall’s tau, mean ± 95% CI: MM: -0.39 ± [-0.67, -0.11], 9 sessions, ALM: -0.29 ± [-0.54, -0.04], 13 sessions; criterion to be considered significant: 95% CI on Kendall’s tau did not include 0).”

      Page 11: “Lick probability was unaffected during S1, S2, MM and ALM experiments for both tasks, indicating that the behavioral effects were not due to an inability to lick (Fig. 6i, j; 95% CI on Δ lick probability for cross-modal selection task: S1/S2 [-0.18, 0.24], 4 mice, 10 sessions; MM [-0.31, 0.03], 4 mice, 11 sessions; ALM [-0.24, 0.16], 4 mice, 10 sessions; Δ lick probability for simple tactile detection task: S1/S2 [-0.13, 0.31], 3 mice, 3 sessions; MM [-0.06, 0.45], 3 mice, 5 sessions; ALM [-0.18, 0.34], 3 mice, 4 sessions).”

      (3) Please include a clearer description of trial timing. Perhaps a schematic timeline of when stimuli are delivered and when licking would be rewarded. I may have missed it, but I did not find explicit mention of the timing of the reward window or if there was any delay period.

      We have added the following (page 3): 

      “For each trial, the stimulus duration was 0.15 s and an answer period extended from 0.1 to 2 s from stimulus onset.”

      (4) Please include a clear description of statistical tests in each figure legend as needed (for example please check Fig 4e legend).

      We have added details about statistical tests in the figure legends:

      Fig. 2f: “Relationship between block-type discriminability before stimulus onset and tHit-tCR discriminability after stimulus onset for units showing significant block-type discriminability prior to the stimulus. Pearson correlation: S1: r = 0.69, p = 0.056, 8 neurons; S2: r = 0.91, p = 0.093, 4 neurons; MM: r = 0.93, p < 0.001, 30 neurons; ALM: r = 0.83, p < 0.001, 26 neurons.” 

      Fig. 4e: “Subspace overlap for control tHit (gray) and tCR (purple) trials in the somatosensory and motor cortical areas. Each circle is a subspace overlap of a session. Paired t-test, tCR – control tHit: S1: -0.23, 8 sessions, p = 0.0016; S2: -0.23, 7 sessions, p = 0.0086; MM: -0.36, 5 sessions, p < 0.001; ALM: -0.35, 11 sessions, p < 0.001; significance: ** for p<0.01, *** for p<0.001.”

      Fig. 5d,e: “Fraction of trials classified as coming from a respond-to-touch block based on the pre-stimulus population state, for trials occurring in different periods (see c) relative to respond-to-touch → respond-to-light transitions. For MM (top row) and ALM (bottom row), progressively fewer trials were classified as coming from the respond-to-touch block as analysis windows shifted later relative to the rule transition. Kendall’s tau (rank correlation): MM: -0.39, 9 sessions; ALM: -0.29, 13 sessions. Left panels: individual sessions, right panels: mean ± 95% CI. Dashed lines are chance levels (0.5). e, Same as d but for respond-to-light → respond-to-touch transitions. Kendall’s tau: MM: 0.37, 9 sessions; ALM: 0.27, 13 sessions.”

      Fig. 6: “Error bars show bootstrap 95% CI. Criterion to be considered significant: 95% CI did not include 0.”
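      The rank correlation reported in the Fig. 5d,e legend above can be illustrated with a small sketch of Kendall's tau between the period index and the fraction of trials classified as respond-to-touch. The version below is tau-a, which makes no correction for ties; whether the authors used this or another variant is not stated, and the fractions are made-up values for a single hypothetical session.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x2 - x1) * (y2 - y1)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical session: five periods spanning a touch -> light rule transition.
periods = [1, 2, 3, 4, 5]
fraction_classified_touch = [0.90, 0.80, 0.85, 0.50, 0.30]

print(kendall_tau_a(periods, fraction_classified_touch))  # -0.8
```

      A negative tau indicates that the classified fraction falls as the analysis window moves later in the transition, which is the direction reported for touch-to-light transitions.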

      (5) P. 3 - "To examine how the task rules influenced the sensorimotor transformation occurring in the tactile processing stream, we performed single-unit recordings from sensory and motor cortical areas including S1, S2, MM, and ALM using 64-channel silicon probes (Fig. 1e-g and Fig. S1a-h)." Please specify if these areas were recorded simultaneously or not.

      We have added “We recorded from one of these cortical areas per session, using 64-channel silicon probes.”  on page 3.  

      (6) Figure 4b - Please describe what gray and black lines show.

      The gray traces are the distance between tHit and tCR trajectories in individual sessions and the black traces are the averages across sessions in different cortical areas. We have added this information on page 6 and in the Figure 4b legend. 

      Page 6: “To assess this for the four cortical areas, we quantified how the tHit and tCR trajectories diverged from each other by calculating the Euclidean distance between matching time points for all possible pairs of tHit and tCR trajectories for a given session and then averaging these for the session (Fig. 4a,b; S1: 10 sessions, S2: 8 sessions, MM: 9 sessions, ALM: 13 sessions, individual sessions in gray and averages across sessions in black; window of analysis: -100 to 150 ms relative to stimulus onset; 10 ms bins; using the top 3 PCs; Methods).”

      Fig. 4b: “Distance between tHit and tCR trajectories in S1, S2, MM and ALM. Gray traces show the time varying tHit-tCR distance in individual sessions and black traces are session-averaged tHit-tCR distance (S1:10 sessions; S2: 8 sessions; MM: 9 sessions; ALM: 13 sessions).”
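      The distance measure described for Fig. 4b (Euclidean distance between matching time points, for all pairs of tHit and tCR trajectories, averaged within a session) can be sketched as below. The sketch assumes each trajectory is already expressed as a list of points in the top-PC space, one point per time bin; the data layout, function name, and toy values are assumptions for illustration.

```python
import math
from itertools import product


def mean_pairwise_distance(thit_trajectories, tcr_trajectories):
    """Time-resolved mean Euclidean distance over all tHit x tCR trajectory pairs.

    Each trajectory is a list of points (one per time bin); each point is a
    tuple of coordinates in the PC space. All trajectories must share the
    same number of time bins.
    """
    n_time = len(thit_trajectories[0])
    pairs = list(product(thit_trajectories, tcr_trajectories))
    return [
        sum(math.dist(hit[t], cr[t]) for hit, cr in pairs) / len(pairs)
        for t in range(n_time)
    ]


# Toy session: two tHit trajectories and one tCR trajectory, two time bins,
# three PCs. The trajectories coincide at bin 0 and diverge at bin 1.
thit = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
]
tcr = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
]

print(mean_pairwise_distance(thit, tcr))  # [0.0, 5.0]
```

      The returned list corresponds to one gray trace in Fig. 4b; averaging such lists across sessions would give the black trace.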

      (7) In addition to the analyses shown in Figure 5a, when investigating the timing of the rule switch, I think the authors should plot the left and right lick probabilities aligned to the timing of the rule switch time on a trial-by-trial basis averaged across mice.

      We thank the Reviewer for suggesting this addition. We have added a new figure panel to show the probabilities of right- and left-licks during rule transitions (Fig. 5a).

      Page 8: “The probabilities of right-licks and left-licks showed that the mice switched their motor responses during block transitions depending on task rules (Fig. 5a, mean ± 95% CI across 12 mice).” 

      (8) P. 12 - "Moreover, in a separate study using the same task (Finkel et al., unpublished), high-speed video analysis demonstrated no significant differences in whisker motion between respond-to-touch and respond-to-light blocks in most (12 of 14) behavioral sessions.". Such behavioral data is important and ideally would be included in the current analysis. Was high-speed videography carried out during electrophysiology in the current study?

      Finkel et al. has been accepted in principle for publication and will be available online shortly. Unfortunately, we have not yet carried out simultaneous high-speed whisker video and electrophysiology in our cross-modal sensory selection task.

      Reviewer #3 (Recommendations For The Authors):

      (1) Minor point. For subspace overlap calculation of pre-stimulus activity in Fig 4e (light purple datapoints), please clarify whether the PCs for that condition were constructed in matched time windows. If the PCs are calculated from the stimulus period 0-150ms, the poor alignment could be due to mismatched time windows.

      We thank the Reviewer for the comment and clarify our analysis here. We previously used time-matched windows to calculate subspace overlaps. However, the pre-stimulus activity was much weaker than the activity during the stimulus period, so the subspaces of reference tHit were subject to noise and we were not able to obtain reliable PCs. This caused the subspace overlap values between the reference tHit and control tHit to be low and variable (mean ± SD, S1: 0.46 ± 0.26, n = 8 sessions, S2: 0.46 ± 0.18, n = 7 sessions, MM: 0.44 ± 0.16, n = 5 sessions, ALM: 0.38 ± 0.22, n = 11 sessions). Therefore, we used the tHit activity during the stimulus window to obtain PCs and projected pre-stimulus and stimulus activity in tCR trials onto these PCs. We have now added a more detailed description of this analysis in the Methods (page 32).

      “To calculate the separation of subspaces prior to stimulus delivery, pre-stimulus activity in tCR trials (-100 to 0 ms from stimulus onset) was projected to the PC space of the tHit reference group and the subspace overlap was calculated. In this analysis, we used tHit activity during stimulus delivery (0 to 150 ms from stimulus onset) to obtain reliable PCs.”

      We acknowledge this time alignment issue and have now removed the reported subspace overlap between tHit and tCR during the pre-stimulus period from Figure 4e (light purple). However, we think the correlation between pre- and post-stimulus-onset subspace overlaps should remain similar regardless of the time windows that we used for calculating the PCs. For the PCs calculated from the pre-stimulus period (-100 to 0 ms), the correlation coefficient was 0.55 (Pearson correlation, p < 0.01, n = 31 sessions). For the PCs calculated from the stimulus period (0-150 ms), the correlation coefficient was 0.68 (Figure 4f, Pearson correlation, p < 0.001, n = 31 sessions). Therefore, we have kept Figure 4f.

      (2) Minor point. To help the readers follow the logic of the experiments, please explain why PPC and AMM were added in the later optogenetic experiment since these are not part of the electrophysiology experiment.

      We have added the following rationale on page 9.

      “We recorded from AMM in our cross-modal sensory selection task and observed visually-evoked activity (Fig. S1i-k), suggesting that AMM may play an important role in rule-dependent visual processing. PPC contributes to multisensory processing [51–53] and sensory-motor integration [50, 54–58]. Therefore, we wanted to test the roles of these areas in our cross-modal sensory selection task.”

      (3) Minor point. We are somewhat confused about the timing of some of the example neurons shown in figure S1. For example, many neurons show visually evoked signals only after stimulus offset, unlike tactile evoked signals (e.g. Fig S1b and f). In addition, the reaction time for visual stimulus is systematically slower than tactile stimuli for many example neurons (e.g. Fig S1b) but somehow not other neurons (e.g. Fig S1g). Are these observations correct?

      These observations are all correct. We have a manuscript from a separate study using this same behavioral task (Finkel et al., accepted in principle) that examines and compares (1) the onsets of tactile- and visually-evoked activity and (2) the reaction times to tactile and visual stimuli. The reaction times to tactile stimuli were slightly but significantly shorter than the reaction times to visual stimuli (tactile vs visual, 397 ± 145 vs 521 ± 163 ms, median ± interquartile range [IQR], Tukey HSD test, p = 0.001, n = 155 sessions). We examined how well activity of individual neurons in S1 could be used to discriminate the presence of the stimulus or the response of the mouse. For discriminability for the presence of the stimulus, S1 neurons could signal the presence of the tactile stimulus but not the visual stimulus. For discriminability for the response of the mouse, the onsets for significant discriminability occurred earlier for tactile compared with visual trials (two-sided Kolmogorov-Smirnov test, p = 1 × 10⁻¹⁶, n = 865 neurons with DP onset in tactile trials, n = 719 neurons with DP onset in visual trials).

    2. eLife assessment

      This important work advances our understanding of how brains flexibly gate actions in different contexts, a topic of great interest to the broader field of systems neuroscience. Recording neural activity from several sensory and motor cortical areas along a sensorimotor pathway, the authors found that preparatory activity in motor cortical areas of the mouse depends on the context in which an action will be carried out, consistent with previous theoretical and experimental work. Furthermore, the authors provide causal evidence that these changes support flexible gating of actions. The carefully carried out experiments were analyzed using state-of-the-art methodology and provide convincing conclusions.

    3. Reviewer #1 (Public Review):

      Summary:

      Using a cross-modal sensory selection task in head-fixed mice, the authors attempted to characterize how different rules reconfigured representations of sensory stimuli and behavioral reports in sensory (S1, S2) and premotor cortical areas (medial motor cortex or MM, and ALM). They used silicon probe recordings during behavior, a combination of single-cell and population-level analyses of neural data, and optogenetic inhibition during the task.

      Strengths:

      A major strength of the manuscript was the clarity of the writing and motivation for experiments and analyses. The behavioral paradigm is somewhat simple but well-designed and well-controlled. The neural analyses were sophisticated, clearly presented, and generally supported the authors' interpretations. The statistics are clearly reported and easy to interpret. In general, my view is that the authors achieved their aims. They found that different rules affected preparatory activity in premotor areas, but not sensory areas, consistent with dynamical systems perspectives in the field that hold that initial conditions are important for determining trial-based dynamics.

      I think this is a well-performed, well-written and interesting study that shows differences in rule representations in sensory and premotor areas, and finds that rules reconfigure preparatory activity in motor cortex to support flexible behavior.

    4. Reviewer #2 (Public Review):

      Summary:

      Chang et al. investigated neuronal activity firing patterns across various cortical regions in an interesting context-dependent tactile vs visual detection task, developed previously by the authors (Chevee et al., 2021; doi: 10.1016/j.neuron.2021.11.013). The authors report the important involvement of a medial frontal cortical region (MM, probably a similar location to wM2 as described in Esmaeili et al., 2021 & 2022; doi: 10.1016/j.neuron.2021.05.005; doi: 10.1371/journal.pbio.3001667) in mice for determining task rules.

      Strengths:

      The experiments appear to have been well carried out and the data well analysed. The manuscript clearly describes the motivation for the analyses and reaches clear and well-justified conclusions. I find the manuscript interesting and exciting!

      Weaknesses:

      I did not find any major weaknesses.

    5. Reviewer #3 (Public Review):

      Summary:

      This study examines context-dependent stimulus selection by recording neural activity from several sensory and motor cortical areas along a sensorimotor pathway, including S1, S2, MM, and ALM. Mice are trained to either withhold licking or perform directional licking in response to visual or tactile stimulus. Depending on the task rule, the mice must respond to one stimulus modality while ignoring the other. Neural activity to the same tactile stimulus is modulated by task in all the areas recorded, with significant activity changes in a subset of neurons and population activity occupying distinct activity subspaces. Recordings further reveal a contextual signal in the pre-stimulus baseline activity that differentiates task context. This signal is correlated with subsequent task modulation of neural activity. Comparison across brain areas shows that this contextual signal is stronger in frontal cortical regions than sensory regions. Analyses link this signal to behavior by showing that it tracks the behavioral performance switch during task rule transitions. Silencing activity in frontal cortical regions during the baseline period impairs behavioral performance.

      Strengths:

      This is a carefully done study with solid results and thorough controls. The authors identify a contextual signal in baseline neural activity that predicts rule-dependent decision-related activity. The comprehensive characterization across a sensorimotor pathway is another strength. Analyses and perturbation experiments link this contextual signal to animals' behavior. The results provide a neural substrate that will surely inspire follow-up mechanistic investigations.

      Weaknesses:

      None. The authors have further improved the manuscript during the revision with additional analyses.

      Impact:

      This study reports an important neural signature for context-dependent decision-making that has important implications for mechanisms of context-dependent neural computation in general.

    1. eLife assessment

      This fundamental study provides insights into the interplay of endogenous orienting and the planning of goal-directed gaze shifts (saccades). Using an elegant experimental protocol and detailed analyses of the time course of saccadic choices, the authors provide compelling evidence for independent mechanisms that guide early, reflexive eye movements and later, voluntary gaze shifts. This work will be of interest to neuroscientists and psychologists working on vision and motor control and to those researching decision-making across disciplines.

    2. Reviewer #1 (Public Review):

      Summary:

      The classical pro/antisaccade task has become a valuable diagnostic tool in neurology and psychiatry (Antoniades et al., 2013, Vision Res). Although it is well-established that antisaccades require substantially longer latencies than prosaccades, the exact attentional mechanisms underlying these differences are not yet fully elucidated. This study investigates the separate influences of exogenous and endogenous attention on saccade generation. These two mechanisms are often confounded in classical pro/antisaccade tasks. In the current study, the authors build on their previous work using an urgent choice task (Salinas et al., 2019, eLife) to time-resolve the influences of exogenous and endogenous factors on saccade execution. The key contribution of the current study is to show that, when controlling for exogenous capture, antisaccades continue to require longer processing times. This longer processing time may be explained by a coupling between endogenous attention and saccade motor plans.

      Strengths:

      In the classical pro/antisaccade task the direction of exogenous capture (caused by the presentation of the cue) is typically congruent with the direction of prosaccades and incongruent with antisaccades. A key strength of the current study is the introduction of different experimental conditions that control for the effects of exogenous capture on saccade generation. In particular, Experiments 3 and 4 provide strong evidence for two independent (exogenous and endogenous) mechanisms that guide saccadic choices, acting at different times. Differences in timing for pro and antisaccades during the endogenous phase were consistent and independent of whether the exogenous capture biased early saccades toward the correct prosaccade direction or toward the correct antisaccade directions.

      As in previous studies by the same group (Salinas et al., 2019, eLife; Goldstein et al., 2023, eLife), the detailed analysis of the time course of goal-directed saccades allowed the authors to determine the exact, additional time of 30 ms that is necessary to generate a correct antisaccade versus prosaccade.

      Overall, the manuscript is very well written, and the data are presented clearly.

      Weaknesses:

      The main research question could be defined more clearly. In the abstract and at some points throughout the manuscript, the authors indicate that the main purpose of the study was to assess whether the allocation of endogenous attention requires saccade planning [e.g., ll.3-5 or ll.247-248]. While the data show a coupling between endogenous attention and saccades, they do not point to a specific direction of this coupling (i.e., whether endogenous attention is necessary to successfully execute a saccade plan or whether a saccade plan necessarily accompanies endogenous attention).

      Some of the analyses were performed only on subgroups of the participants. The reporting of these subgroup analyses is transparent and data from all participants are reported in the supplementary figures. Still, these subgroup analyses may make the data appear more consistent, compared to when data is considered across all participants. For instance, the exogenous capture in Experiments 1 and 2 appears much weaker in Figure 2 (subgroup) than Figure S3 (all participants). Moreover, because different subgroups were used for different analyses, it is often difficult to follow and evaluate the results. For instance, the tachometric curves in Figure 2 (see also Figure 3 and 4) show no motor bias towards the cue (i.e., performance was at ~50% for rPTs <75 ms). I assume that the subsequent analyses of the motor bias were based on a very different subgroup. In fact, based on Figure S2, it seems that the motor bias was predominantly seen in the unreliable participants. Therefore, I often found the figures that were based on data across all participants (Figures 7 and S3) more informative to evaluate the overall pattern of results.

    3. Reviewer #2 (Public Review):

      Goldstein et al. provide a thorough characterization of the interaction of attention and eye movement planning. These processes have been thought to be intertwined since at least the development of the Premotor Theory of Attention in 1987, and their relationship has been a continual source of debate and research for decades. Here, Goldstein et al. capitalize on their novel urgent saccade task to dissociate the effects of endogenous and exogenous attention on saccades towards and away from the cue. They find that attention and eye movements are, to some extent, linked to one another but that this link is transient and depends on the nature of the task. A primary strength of the work is that the researchers are able to carefully measure the timecourse of the interaction between attention and eye movements in various well-controlled experimental conditions. As a result, the behavioral interplay of two forms of attention (endogenous and exogenous) is illustrated at the level of tens of milliseconds as they interact with the planning and execution of saccades towards and away from the cued location. Overall, the results allow the authors to make meaningful claims about the time course of visual behavior, attention, and the potential neural mechanisms at a timescale relevant to everyday human behavior.

    4. Reviewer #3 (Public Review):

      Summary and overall evaluation:

      Human vision is inherently limited so that only a small part of a visual scene can be perceived at a given moment. To address this limitation, the visual system has evolved a number of strategies and mechanisms that work in concert. First, humans move their eyes using saccadic eye movements. This allows us to place the high-resolution region in the center of the eye's retina (the fovea centralis) on objects of interest so that these are sampled with high acuity. Second, salient, conspicuous stimuli that appear abruptly and/or differ strongly from the other stimuli in the scene, seem to automatically attract ("exogenous") attention, so that a large share of the neuronal "resources" for visual processing is devoted to the stimuli, which improves the perception of the stimuli. Third, stimuli that are important for the current task and the current behavioral goals can be prioritized by attention mechanisms ("endogenous" attention), which also secures their allocated share of processing resources and helps them be perceived. It is well-established that eye movements are closely linked to the mechanisms of attention (for a review, see Carrasco, 2011, cited in the manuscript). However, it is still unclear what role voluntary, endogenous attention plays in the control of saccadic eye movements.

      The present study used an experimental procedure involving time-pressure for responding, in order to uncover how the control of saccades by exogenous and endogenous attention unfolds over time. The findings of the study indicate that saccade planning was indeed influenced by the locus of endogenous attention, but that this influence was short-lasting and could be overcome quickly. Taken together, the present findings reveal new dynamics between endogenous attention and eye movement control, and lead the way for studying them using experiments under time pressure.

      The results provided by the present study advance our understanding of vision, eye movements, and their control by brain mechanisms for attention. In addition, they demonstrate how tasks involving time pressure can be used to study the dynamics of cognitive processes. Therefore, the present study seems highly important not only for vision science, but also for psychology, (cognitive) neuroscience, and related research fields more generally.

      Strengths:

      The experiments of the study are performed with great care and rigor and the data is analyzed thoroughly and comprehensively. Overall, the results support the authors' conclusions, so I have only minor comments (see below). Taken together, the findings seem important for a wide community of researchers in vision science, psychology, and neuroscience.

      Weaknesses (minor points):

      (1) In this experimental paradigm, participants must decide where to saccade based on the color of the cue in the visual periphery (they should have made a prosaccade toward a green cue and an antisaccade away from a magenta cue). Thus, irrespective of whether the cue signaled that a prosaccade or an antisaccade was to be made, the identity of the cue was always essential for the task (as the authors explain on p. 5, lines 129-138). Also, the location where the cue appeared was blocked, and thus known to the participants in advance, so that endogenous attention could be directed to the cue at the beginning of a trial (e.g., p. 5, lines 129-132). These aspects of the experimental paradigm differ from the classic prosaccade/antisaccade paradigm (e.g. Antoniades et al., 2013, Vision Research). In the classic paradigm, the identity of the cues does not have to be distinguished to solve the task, since there is only one stimulus that should be looked at (prosaccade) or away from (antisaccade), and whether a prosaccade or antisaccade was required is constant across a block of trials. Thus, in contrast to the present paradigm, in the classic paradigm, the participants do not know where the cue is about to appear, but they know whether to perform a prosaccade or an antisaccade based on the location of the cue.

      The present paradigm keeps the location of the cue constant in a block of trials by intention, because this ensures that endogenous attention is allocated to its location and is not overpowered by the exogenous capture of attention that would happen when a single stimulus appeared abruptly in the visual field. Thus, the reason for keeping the location of the cue constant seems convincing. However, I wondered what consequences the constant location would have for the task representations that persist across the task and govern how attention is allocated. In the classic paradigm, there is always a single stimulus that captures attention exogenously (as it appears abruptly). In a prosaccade block, participants can prioritize the visual transient caused by the stimulus, and follow it with a saccade to its coordinates. In an antisaccade block, following the transient with a saccade would always be wrong, so that participants could try to suppress the attention capture by the transient, and base their saccade on the coordinates of the opposite location. Thus, in prosaccade and antisaccade blocks, the task representations controlling how visual transients are processed to perform the task differ. In the present task, prosaccades and antisaccades cannot be distinguished by the visual transients. Thus, such a situation could favor endogenous attention and increase its influence on saccade planning, even though saccade planning under more naturalistic conditions would be dominated by visual transients. I suggest discussing how this (and vice versa the emphasis on visual transients in the classic paradigm) could affect the generality of the presented findings (e.g., how does this relate to the interpretation that saccade plans are obligatorily coupled to endogenous attention? See, Results, p. 10, lines 306-308, see also Deubel & Schneider, 1996, Vision Research).

      (2) Discussion (p. 16, lines 472-475): The authors suppose that "It is as if the exogenous response was automatically followed by a motor bias in the opposite direction. Perhaps the oculomotor circuitry is such that an exogenous signal can rapidly trigger a saccade, but if it does not, then the corresponding motor plan is rapidly suppressed regardless of anything else.". I think this interesting point should be discussed in more detail. Could it also be that instead of suppression, other currently active motor plans were enhanced? Would this involve attention? Some attention models assume that attention works by distributing available (neuronal) processing resources (e.g., Desimone & Duncan, 1995, Annual Review of Neuroscience; Bundesen, 1990, Psychological Review; Bundesen et al., 2005, Psychological Review) so that the information receiving the largest share of resources results in perception and is used for action, but this happens without the active suppression of information.

      (3) Methods, p. 19, lines 593-596: It is reported that saccades were scored based on their direction. I think more information should be provided to understand which eye movements entered the analysis. Was there a criterion for saccade amplitude? I think it would be very helpful to provide data on the distributions of saccade amplitudes or on their accuracy (e.g. average distance from target) or reliability (e.g. standard deviation of landing points). Also, it is reported that some data was excluded from the analysis, and I suggest reporting how much of the data was excluded. Was the exclusion of the data related to whether participants were "reliable" or "unreliable" performers?

      (4) Results, p. 9, lines 262-266: Some data analyses are performed on a subset of participants that met certain performance criteria. The reasons for this data selection seem convincing (e.g. to ensure empirical curves were not flat, line 264). Nevertheless, I suggest explaining and justifying this step in more detail. In addition, if not all participants achieved an acceptable performance and data quality, this could also speak to the experimental task and its difficulty. Thus, I suggest discussing the potential implications of this, in particular, how this could affect the studied mechanisms, and whether it could limit the presented findings to a special group within the studied population.

    1. Author response:

      [The following is the authors’ response to the current reviews.]

      In response to Reviewer #2, we agree with the reviewer that it needs to be noted that not all forms of recognition are the same and have added the following: "However, we note that not all forms of recognition are the same; researchers may prefer to have their work featured instead of personal stories or critiques of the scientific environment."


      [The following is the authors’ response to the previous reviews.]

      We thank both reviewers for their detailed comments and insightful suggestions. Below we summarize our responses to each concern in addition to the edits within the manuscript.

      We would also like to add a clarification to the eLife assessment, which states “This important bibliometric analysis shows that authors of scientific papers whose names suggest they are female or East Asian get quoted less often in news stories about their work.” We show that individuals with names predicted to be from women or East Asian name origins are less likely to be quoted or mentioned in Nature’s scientific news stories than expected by publication demographics. In this study, we did not compare the level of coverage of a scientific article by the demographics of the authors of the article.

      Reviewer #1

      The article is not so clearly structured, which makes it hard to follow. A better framing, contextualization, and conceptualization of their analysis would help the readers to better understand the results. There are some unclear definitions and wrong wording of key concepts.

      We have adapted our wording in the text and added a more detailed discussion, which hopefully makes the paper easier to comprehend. These changes are described in the context of the reviewer's suggestions and addressed in the next section.

      Language use: Male/Female refers to sex, not to gender.

      We have now updated the language throughout the text. Thank you for pointing this out.

      Regional disparities are not the same as names' origin. While the first might relate to the academic origin of authors, inferred from their institutional belonging, the latter reflects the authors' inferred identity. Ethnic identities and the construction of prejudice against specific populations need proper contextualization.

      We have added better contextualization in the manuscript and reworded the section in our results and discussion to clarify that we are analyzing disparities related to perceived ethnicity and not regions. We also added the following text to the results section “In our analysis, we use name origin as an estimate for the perceived ethnicity of a primary source by a journalist. Our prediction is not intended to assign ethnicity to an individual, but to be used broadly as a tool to quantify representational differences in a journalist's sociologically constructed perception of a primary source's ethnicity.” We also added the following text to our Discussion: “Our use of name origins is a proxy for a journalist's or referring scholarly peer’s potential perceptions of the ethnicity of a primary source as signaled by an individual's name. We do not intend to assign an identity to an individual, but to generate a broad metric to measure possible bias for particular ethnicities during journalists' primary source gathering.”

      It would be helpful to have a clear definition of what are quotes, mentions, and citations. For me, it was not so clear and made understanding the results more difficult.

      We added the following text to the results section Extracted Data Used for Analysis: “Quoted names are any names that were attached to a quote within the article. Mentioned names are any names that were stated within the article. Cited names are all author names of a scientific paper that was cited in the news article.”

      The comparison against Nature published research articles is not perfect because journalists will also cover articles not published in Nature. If for example, the gender representation in the quoted articles is not the same between Nature journals and other journals, then this source of inequality would be missing (e.g. if the journalists are biased against women, but not as much when they published in Nature, because they are also biased towards Nature articles). Also, the gender representation among Nature authors could not be the same as in general. Nevertheless, this seems to be a fair benchmark, especially if the authors did not have access to other more comprehensive databases. But a statement of limitations including these potential issues would be good to have.

      To add better context to the generalizability of our work, we added the following text to our discussion: “Furthermore, the news articles present on "www.nature.com" are intended for a very specific readership that may not be reflective of more broad scientific news outlets. In a separate analysis, we took a cursory look into a comparison with The Guardian and found similar disparities in gender and name origin. However, it is not clear which publications should be used as a comparator for science-related articles in The Guardian, and difficult to compare relative rates of representation. While other science news outlets may not have a direct comparator, it would be useful to take a broad comparison across multiple science news outlets to compare against one another. Our existing pipeline could be easily applied to other science news outlets and identify if there exists a consistent pattern of disparity regardless of the intended readership.”

      "we select the highest probability origin for each name as the resultant assignment". Threshold based approaches for race/ethnicity name-based inference have been criticized by the literature as they might reproduce biases (see Kozlowski, D., Murray, D. S., Bell, A., Hulsey, W., Larivière, V., Monroe-White, T., & Sugimoto, C. R. (2022). Avoiding bias when inferring race using name-based approaches. Plos one, 17(3), e0264270.). The authors could use the full distribution of probabilities over names instead of selecting one. The formulae proposed (3-5) could be easily adapted to this change.

      We thank the reviewer for pointing this out. We have updated our analysis to use the probabilities instead of hard assignments. Figure 3 and formulae 3-5 have been updated. While we observe a slight shift in the calculated values, the overall trends are unchanged.
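
      The difference between hard (highest-probability) assignment and probability-weighted counting can be illustrated with a minimal sketch. This is not the authors' pipeline and does not reproduce formulae 3-5; the function names, origin labels, and probability values below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def hard_counts(name_probs):
    """Count each name once, under its single most probable origin."""
    counts = defaultdict(float)
    for probs in name_probs:
        top_origin = max(probs, key=probs.get)
        counts[top_origin] += 1.0
    return dict(counts)


def soft_counts(name_probs):
    """Spread each name across origins in proportion to its predicted probabilities."""
    counts = defaultdict(float)
    for probs in name_probs:
        for origin, p in probs.items():
            counts[origin] += p
    return dict(counts)


# Hypothetical predictions for three names (illustrative values, not study data).
example = [
    {"EastAsian": 0.75, "European": 0.25},
    {"EastAsian": 0.25, "European": 0.75},
    {"EastAsian": 0.25, "European": 0.75},
]

hard = hard_counts(example)  # each name contributes a whole count to one origin
soft = soft_counts(example)  # each name contributes fractional counts to every origin
```

      In this toy input, hard assignment discards the minority probability mass of every name, whereas the weighted version retains it, so the two totals for a given origin can differ even though both sum to the number of names.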

      Is it possible to make an analysis that intersects both name origin and gender? I am not sure if the sample size would allow for this, but if some other dimensions were collapsed, it would be very important to show what happens at the intersection of these two dimensions of discrimination.

      We agree that identifying any differences in quotation patterns at the intersection of gender and name origin would be very useful to identify. To address this, we added supplemental table 5. This table identifies the number of quotes per predicted name origin and gender over all years and article types. In this table, we don’t see a significant difference in gender distribution across predicted name origins.

      Given a larger sample size, we would be able to better identify more subtle differences, but at this sample size, we cannot make more detailed inferences. Additionally, this also addresses a QC-issue, where predicted gender accuracy varies by name origin, specifically East Asian name origin. From our data, we don’t see a large difference in proportions across any name origin. We added the following text to the results section to incorporate this analysis:

      “However, it should be noted that the error rate varies by name origin with the largest decrease in performance on names with an Asian origin [@doi:10.7717/peerj-cs.156;@doi:10.5195/jmla.2021.1252]. In our analysis, we did not observe a large difference in names predicted to come from a man or woman between predicted East Asian and other name origins (Table 5).”

      The use of vocabulary should be more homogeneous. For example, in page 13 the authors start to use the concepts of over/under enrichment, which appeared before in a title but was not used.

      The text has been updated to replace all mentions of “over/under enrichment” with “over/under representation”.

      In the discussions section, it would be important to see as a statement of limitations the problems that automatic origin and gender inference have.

      We thank the reviewer for this suggestion. We have added the following paragraph to our discussion.

      Computational tools enabled us to automatically analyze thousands of articles to identify existing disparities by gender and name origin, but these tools are not without limitations. Our tools are unable to identify non-binary people and rely on gender predictors that are known to have region-specific biases, with the largest decrease in performance on names of an Asian origin [@doi:10.7717/peerj-cs.156;@doi:10.5195/jmla.2021.1252]. Furthermore, name origin is only a proxy for externally perceived racial or ethnic origins of a source or author and is not as accurate as self-identified race or ethnicity. Self-identification better captures the lived experience of an individual that computational estimates from a name can not capture. This is highlighted in our inability to distinguish between Black and White people from the US by their names. As the collection of demographic data by publication outlets grows, we believe this will enable a more fine-grained and accurate analysis of disparities in scientific journalism.

      Figures 2a and 3a show that the affiliations of authors and their countries was going to be used in this analysis. Yet, this section is not present in the article. I would encourage the authors to add this to the analysis as it would show important patterns, and to intersect the dimensions of gender, name origin and country.

      We were interested in using this analysis in our work, but unfortunately the sample size of cited works in each country was too small to make inferences. If this work was extended to larger scientific outlets to include larger corpora such as The Guardian or New York Times, we think one could be able to make more robust inferences. Since our work only focuses on Nature, we decided not to include this analysis. However, we do include a section in our discussion for future work.

      “As a proxy for measuring possible geographical bias of a journalist, we attempted to identify if there was any geographical bias of cited authors. To do this, we identified the affiliation of each cited author and identified their affiliated country. Unfortunately, we could not robustly extract a large enough number of cited authors from different countries to make any conclusive statements. Expanding our work to other science journalism outlets could help identify possible ways in which geographic region, genders, and perceived ethnicity interact and affect scientific visibility of specific groups. While we are unable to identify that journalists have a specific geographical bias, having reporters explicitly focused on specific regional sources will broaden coverage of international opinions in science.”

      It is not clear at that point what column dependence means.

      The abstract has been updated to state, “Gender disparity in Nature quotes was dependent on the article type.”

      Reviewer #2

      We thank the reviewer for their very detailed and insightful suggestions regarding our analysis and the key caveats that needed better contextualization in our analysis. We went through each major point the reviewer brought up below and included any additional text that was needed.

      In some cases, the manuscript lacks consistency in terminology, and uses word choice that is strange (e.g., "enrichment" and "depletion" when discussion representation).

      We thank the reviewer for pointing this out; we have replaced all instances of depletion/enrichment with over/under-representation.

      Caveats to Claim 1. So while Claim 1 holds, it does not hold for all comparator sets and for all years. I don't think this is critical of the paper (the authors do discuss the trend in Claim 2), but interpretation of this claim should take care of these caveats, and readers should consider the important differences in first and last authorship.

      We thank the reviewer for their detailed feedback on this section. We have added the missing contextualization of our results. In the results section, we changed the figure caption to: “Speakers predicted to be men are sometimes overrepresented in quotes, but this depends on the year and article type.” We also added the following paragraph: “When considering the relative proportion of authors and speakers predicted to be men, we only find a slight over-representation of men. This overrepresentation is dependent on the authorship position and the year. Before 2010, quotes predicted as from men are overrepresented in comparison to both first and last authors, but between 2010 and 2017 quotes predicted from men are only overrepresented in comparison for first authors. In 2020, we find a slight over-representation of quotes predicted to be from women relative to first and last authors, but still severely under-represented when considering the general population. The choice of comparison between first and last authors can reveal different aspects of the current state of academia. While this does not hold in all scientific fields, first authors are typically early career scientists and last authors are more senior scientists. It has also been shown that early career scientists tend to be more diverse than senior scientists [@doi:10.7554/eLife.60829; @doi:10.1096/fj.201800639]. Since we find that quotes are only slightly more likely to come from a last author, it is reasonable to compare the relative rate of predicted quotes from men to either authorship position. Comparison with last authorships may reveal more how gender bias currently exists whereas comparison with early career scientists may reveal bias in comparison to a future, more possibly diverse academic environment. We hope that increased representation and recognition of women in science, even beyond what is observed in authorship, can increase the proportion of women first and last authors such that it better reflects the general population.”

      Generalizability to other contexts of science journalism:

      We thank the reviewer for their feedback on the generalizability of our work. We have now added the following text to our discussion to provide the reader with a better context of our results: “The articles presented on "www.nature.com" are intended for a very specific readership that may not be reflective of more broad scientific news outlets. In a separate analysis, we took a cursory look into a comparison with The Guardian and found very similar disparities in gender and name origin. However, it is not clear which publications should be used as a comparator for science-related articles in The Guardian, and difficult to compare relative rates of representation. While other science news outlets may not have a direct comparator, it would be useful to take a broad comparison across multiple science news outlets to compare against one another. Our existing pipeline could be easily applied to other science news outlets and identify if there exists a consistent pattern of disparity regardless of the intended readership.”

      Shallow discussion:

      The authors highlight gender parity in career features, but why exactly is there gender parity in this format?

      We thank the reviewer for encouraging us to better contextualize our findings in the broader discourse. We have now added several sections to our Discussion. To address gender parity, we have added the following text: “This finding, coupled with the near equal number of articles written by journalists predicted to be men or women, argues for more diversity in topical coverage. "Career Feature" articles highlight current topics relevant to working scientists and frequently highlight systemic issues with the scientific environment. This column allows space for marginalized people to critique the current state of affairs in science or share their personal stories. This type of content encourages the journalist to seek out a diverse set of primary sources. Including more content that is not primarily focused on recent publications, but all topics surrounding the practice of science, can serve as an additional tool to rapidly achieve gender parity in journalistic recognition.”

      Representation in quotations varies by first and last author, most certainly as a result of the academic division of labor in the life sciences. However, what does it say about the scientific quotation that it appears first authors are more often to be quoted? Does this mean that the division of labor is changing such that the first authors are the lead scientists? Or does it imply that senior authors are being skipped over, or giving away their chance to comment on a study to the first author?

      We thank the reviewer for bringing up these important questions. We have added better context to our first author analysis in our discussion and have included the following two sections to address this. We also want to state that we find last authors to be slightly more quoted than first authors, as depicted in Fig. 2d, with the first author quotation percentage largely appearing below the red line. We included this text in a response above and include it again here for convenience.

      “Before 2010, quotes predicted as from men are overrepresented in comparison to both first and last authors, but between 2010 and 2017 quotes predicted from men are only overrepresented in comparison to first authors. In 2020, we find a slight over-representation of quotes predicted to be from women relative to first and last authors, but still severely under-represented when considering the general population. The choice of comparison between first and last authors can reveal different aspects of the current state of academia. While this does not hold in all scientific fields, first authors are typically early career scientists and last authors are more senior scientists. It has also been shown that early career scientists tend to be more diverse than senior scientists [@doi:10.7554/eLife.60829; @doi:10.1096/fj.201800639]. Since we find that quotes are only slightly more likely to come from a last author, it is reasonable to compare the relative rate of predicted quotes from men to either authorship position. Comparison with last authorships may reveal more about how gender bias currently exists, whereas comparison with early career scientists may reveal bias in comparison to a future, possibly more diverse academic environment. We hope that increased representation and recognition of women in science, even beyond what is observed in authorship, can increase the proportion of women first and last authors such that it better reflects the general population.”

      “In our analysis, we also find that there are more first authors with predicted East Asian name origin than last authors. This is in contrast to predicted Celtic/English and European name origins. Furthermore, we see that the amount of first author people with predicted East Asian name origins is increasing at a much faster rate than quotes are increasing. If this mismatched rate of representation continues, this could lead to an increasingly large erasure of early career scientists with East Asian name origins. As noted before, focusing on increasing engagement with early career scientists can help to reduce the growing disparity of public visibility of scientists with East Asian name origins.”

      What might be the downstream impacts on the public stemming from the under-representation of scientists with East Asian names? According to Figure 3d, not only are East Asian names under-represented in quotations, but they are becoming more under-represented over time as they appear as authors in a greater number of Nature publications; Those with European names are proportionately represented in quotations given their share of authors in Nature. Why might this be, especially seeing as Anglo names are heavily over-represented?

      To address this point, we have added the following text to our discussion: “In our analysis, we also find that there are more first authors with predicted East Asian name origin than last authors. This is in contrast to predicted Celtic/English and European name origins. Furthermore, the amount of first author people with predicted East Asian name origins is increasing at a much faster rate than quotes are increasing. If this mismatched rate of representation continues, this could lead to an increasingly large erasure of early career scientists with East Asian name origins. As noted before, focusing on increasing engagement with early career scientists can help to reduce the growing disparity of public visibility of scientists with East Asian name origins.”

      I am very confused by Figure 1B. It mixes the counts of News-related items with (non-Springer) research articles in a single stacked bar plot, which makes determining the quantity of either difficult. I would advise splitting them out.

      Figure 1B has been updated, and the News and Research articles have been separated.

      When querying the first 2000 or so results from the SpringerNature API, are the authors certain that they are getting a random sample of papers?

      These papers were the first 200 English language "Journal" papers returned by the Springer Nature API for each month, resulting in 2400 papers per year from 2005 through 2020. These papers are the first 200 papers published each month by a Springer Nature journal, which may not be completely random, but we believe to be a reasonably representative sample. Furthermore, the Springer Nature comparator set is being used as an additional comparator to the complete set of all Nature research papers used in our analyses.

      In all figures: the authors use capital letters to indicate panels in the caption, but lowercase letters in the figure itself and in the main text. This should be made consistent.

      This has been updated.

      In all figures: the authors should make the caption letter bold in the figure captions, which makes it much easier to find descriptions of specific panels

      This has been updated.

      In the section "coreNLP": the authors mention "co-reference resolution" but without really remarking why it is being used. This is an issue throughout the methods-the authors describe what method they are using but either they don't mention why they are using that method until later, or else not at all.

      We have added better reasoning behind our coreNLP selected methods: “We used the standard set of annotators: tokenize, ssplit, pos, lemma, ner, parse, coref, and additionally the quote annotator. These perform text tokenization, sentence splitting, part of speech recognition, lemmatization, named entity recognition, division of sentences into constituent phrases, co-reference resolution, and identification of quoted entities, respectively. We used the "statistical" algorithm to perform coreference resolution for speed. Each of these aspects is required to identify the names of quoted or mentioned speakers and identify any of their associated pronouns. All results were output to json format for further downstream processing.”
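
      The annotator list and coreference setting quoted above can be written down as a CoreNLP properties object. The following is a minimal sketch, not the authors' actual invocation: it assumes the properties are passed as JSON (for example to a CoreNLP server), and the names `ANNOTATORS` and `corenlp_properties` are hypothetical helpers introduced only for illustration.

```python
import json

# Annotators in the order quoted in the response; "quote" is the
# additional annotator on top of the standard set.
ANNOTATORS = ["tokenize", "ssplit", "pos", "lemma", "ner", "parse", "coref", "quote"]


def corenlp_properties():
    """Build a CoreNLP properties dict (hypothetical helper, for illustration)."""
    return {
        "annotators": ",".join(ANNOTATORS),
        # The response states the "statistical" coreference algorithm was used for speed.
        "coref.algorithm": "statistical",
        # The response states results were written as JSON.
        "outputFormat": "json",
    }


# Serialized form, e.g. for use as a server request parameter (an assumption).
properties_json = json.dumps(corenlp_properties())
```

      The order matters because later annotators depend on earlier ones: the quote annotator, for instance, relies on named entities and coreference chains to attribute a quote to a speaker.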

      We included a better description of scrapy: “Scrapy is a tool that applies user-defined rules to follow hyperlinks on webpages and return the information contained on each webpage. We used Scrapy to extract all web pages containing news articles and extract the text.”

      We also included our motivation for bootstrapping: “We used the bootstrap method to construct confidence intervals for each of our calculated statistics.”
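
      As an illustration of the bootstrap idea mentioned above, the sketch below builds a percentile confidence interval for a proportion. It is a generic example under stated assumptions, not the authors' pipeline: the data are made up, and `bootstrap_ci`, `proportion`, the number of resamples, and the 95% level are all hypothetical choices.

```python
import random


def bootstrap_ci(values, stat, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(values)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def proportion(xs):
    """Fraction of entries equal to 1."""
    return sum(xs) / len(xs)


# Made-up data: 1 = quote predicted to come from a man, 0 = otherwise.
quotes = [1] * 70 + [0] * 30
low, high = bootstrap_ci(quotes, proportion)
```

      The interval width reflects sampling variability only; it does not account for errors in the gender or name-origin predictions themselves.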

      In the section "Name Formatting for Gender Prediction in Quotes or Mentions", genderizeR is mentioned before an introduction to the tool

      We added the following text to provide context: “Even though genderizeR, the computational method used to predict the name's gender, only uses the first name to make the gender prediction, identifying the full name gives us greater confidence that we correctly identified the first name.”

      In the section "Name Formatting for Gender Prediction of Authors", you state that you exclude papers with only one author. How many papers is this? I assume few, in Nature, but if not I can imagine gender differences based on who writes first-authored papers.

      We find that the number excluded is roughly 7% of all papers, which is consistent across Nature and Springer Nature (1113/15013 for cited Springer articles, 2899/42155 for random Springer articles, 955/12459 for Nature research articles). We have added the following text to the manuscript for better context: “Roughly 7% of all papers were estimated to be by a single author and removed from this analysis: 1113/15013 for cited Springer articles, 2899/42155 for random Springer articles, 955/12459 for Nature research articles.”
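
      The "roughly 7%" figure can be checked directly from the counts quoted above. The short sketch below only redoes that arithmetic; the dictionary keys and the helper name `percent` are labels chosen for this illustration.

```python
# (single-author papers, all papers), as quoted in the response above.
excluded = {
    "cited Springer": (1113, 15013),
    "random Springer": (2899, 42155),
    "Nature research": (955, 12459),
}


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100 * part / whole


# Percentage of single-author papers per article set, to one decimal place.
percentages = {name: round(percent(*counts), 1) for name, counts in excluded.items()}
```

      All three sets fall between about 6.9% and 7.7%, which is consistent with the rounded figure given in the response.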

      In "Name Origin Analysis", for the in-text reference to Equation 3: include the prefix "Eq." or similar to mark this as referencing the equation and not something else

      This has been updated.

      The use of the word "enrichment" in reference to the representation of East Asian authors is strange and does not fit the colloquial definition of the term. I suggest just using a simpler term like "representation" instead

      Similarly, the authors use the word "depletion" to reflect the lower rate of quotes to scientists with East-Asian names, but I feel a simpler word would be more appropriate.

      We thank the reviewer for this suggestion; all instances of “enrichment/depletion” have been replaced with “over/under representation”.

      The authors claim in Figure 2d that there is a steady increase in the rate of first author citations, however, this graph is not convincing. It appears to show much more noise than anything resembling a steady change.

      We have reworded our figure description to state that there is a consistent bias towards quoting last authors. Our figure description now states: “Panel d shows a consistent but slight bias towards quoting the last author of a cited article rather than the first author over time.”

      Supplemental Figures 1b and 1c do not seem to be mentioned in the main text, and I struggle to see their relevance.

      We thank the reviewer for identifying this error; these subpanels have been removed.

    2. eLife assessment

      This important bibliometric analysis shows that authors of scientific papers whose names suggest they are female or East Asian get quoted less often in news stories about their work. While caveats are inevitable in this type of study, the evidence for the authors' claims is convincing, with a rigorous, and importantly, reproducible analysis of over 20,000 articles from across 15 years. This paper will be of interest to science journalists and to researchers who study science communication.

    3. Reviewer #1 (Public Review):

      I thank the authors for addressing almost all my comments on the previous version of this manuscript, which studies the representation by gender and name origin of authors from Nature and Springer Nature articles in Nature News.

      The representation of author identities is an important step towards equality in science, and the authors found that women are underrepresented in news quotes and mentions with respect to the proportion of women authors.

      The research is rigorously conducted. It presents relevant questions and compelling answers. The documentation of the data and methods is thoroughly done, and the authors provide the code and data for reproduction.

    4. Reviewer #2 (Public Review):

      The authors have done well to address the points raised in my previous review.

      The updated version of this manuscript retains the technical competence of the first, but with important changes that make the analysis more legible and results better contextualized. Specifically, the discussion is richer, the interpretation of the results is more nuanced, the terminology is more precise, and issues of clarity related to the methodology and results have been resolved.

      Broad caveats remain about the nature of authorship, and who we should expect to be quoted in science journalism. Namely, who is the lead author? Ideally, the corresponding author would be included as well, or else some bibliometric definition of the most senior author on the byline. However, the authors' approach here is certainly adequate, and they did well to incorporate discussion of authorship and the scholarly division of labour in their discussion.

      In sum, I find the article greatly improved and a competent analysis into the unequal use of quotations in scientific journalism.

    1. Reviewer #2 (Public Review):

      This manuscript illustrates the power of "combined" research, incorporating a range of tools, both old and new to answer a question. This thorough approach identifies a novel target in a well-established signalling pathway and characterises a new player in Drosophila CNS development.

      Largely, the experiments are carried out with precision, meeting the aims of the project, and setting new targets for future research in the field. It was particularly refreshing to see the use of multi-omics data integration and Targeted DamID (TaDa) findings to triage scRNA-seq data. Some of the TaDa methodology was unorthodox, however, this does not affect the main finding of the study. The authors (in the revised manuscript) have appropriately justified their TaDa approaches and mentioned the caveats in the main text.

      Their discovery of Spar as a neuropeptide precursor downstream of Alk is novel, as well as its ability to regulate activity and circadian clock function in the fly. Spar was just one of the downstream factors identified from this study, therefore, the potential impact goes beyond this one Alk downstream effector.

    2. Reviewer #3 (Public Review):

      Summary:

      The receptor tyrosine kinase Anaplastic Lymphoma Kinase (ALK) in humans is expressed in the nervous system and plays an important role as an oncogene. A number of groups have been studying ALK signalling in flies to gain mechanistic insight into its various roles. In flies, ALK plays a critical role in development, particularly embryonic development and axon targeting. In addition, ALK was also shown to regulate adult functions including sleep and memory. In this manuscript, Sukumar et al. used a suite of molecular techniques to identify downstream targets of ALK signalling. They first used targeted DamID, a technique that involves fusing a DNA methylase to RNA polymerase II, so that GATC sites in close proximity to PolII binding sites are marked. They performed these experiments in wild type and ALK loss of function mutants (using an Alk dominant negative, AlkDN), to identify Alk responsive loci. Comparing these loci with a larval single cell RNAseq dataset identified neuroendocrine cells as an important site of Alk action. They further combined these TaDa hits with data from RNAseq in Alk Loss and Gain of Function manipulations to identify a single novel target of Alk signalling - a neuropeptide precursor they named Sparkly (Spar) for its expression pattern. They generated a mutant allele of Spar, raised an antibody against Spar, and characterised its expression pattern and mutant behavioural phenotypes including defects in sleep and circadian function.

      Strengths:

      The molecular biology experiments using TaDa and RNAseq were elegant and very convincing. The authors identified a novel gene they named Spar. They also generated a mutant allele of Spar (using CRISPR/Cas technology) and raised an antibody against Spar. These experiments are lovely, and the reagents will be useful to the community. The paper is also well written, and the figures are very nicely laid out, making the manuscript a pleasure to read.

      Weaknesses:

      The manuscript has improved very substantially in revision. The authors have clearly taken the comments on board in good faith.

      Editors' note: The authors have satisfactorily addressed the concerns raised in the previous rounds of review. These were related to the unconventional analysis of the TaDa data, the addition of other means of down-regulating gene function, and the nature of analyses of behavioural data.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Point-by-point response to concerns raised by reviewer #3:

      The manuscript has improved very substantially in revision. The authors have clearly taken the comments on board in good faith. Yet, some small concerns remain around the behavioural analysis.

      In Fig. 8H and H' average sleep/day is ~100. Is this minutes of sleep? 100 min/day is far too low, is it a typo?

      The numbers for sleep bouts are also too low to me e.g. in Fig 9 number of sleep bouts avg around 4.

      In their response to reviewers the authors say these errors were fixed, yet the figures appear not to have been changed. Perhaps the old figures were left in inadvertently?

      Indeed, this correction was somehow missed, and we thank the reviewer for noticing this. We have now corrected Fig 8H-H’ and Fig 9D.

      The circadian anticipatory activity analyses could also be improved. The standard in the field is to perform eduction analyses and quantify anticipatory activity, e.g. using the method of Harrisingh et al. (PMID: 18003827). This is typically computed as the ratio of activity in the 3hrs preceding light transition to activity in the 6hrs preceding light transition.

      In their response to reviewers, the authors have revised their anticipation analyses by quantifying the mean activity in the 6 hrs preceding light transition. However, in the method of Harrisingh et al., anticipation is the ratio of activity in the 3hrs preceding light transition to activity in the 6hrs preceding light transition. Simply computing the activity in the 6hrs preceding light transition does not give a measure of anticipation; determining the ratio is key.

      We acknowledge the importance of obtaining accurate results in our analysis, therefore we have re-evaluated the anticipation activity by measuring the ratio of the mean activity in the 3h preceding light transition over the activity in the 6h preceding light transition. We have reported the data as percentages in Fig 8F-G and modified the figure legends accordingly.
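
      The ratio described by the reviewer can be sketched in a few lines. This is an illustrative reading of that description, not the authors' analysis code: it assumes activity counts in equal-width time bins (30-minute bins by default), and the function name `anticipation_index` and its parameters are hypothetical.

```python
def anticipation_index(activity, transition_bin, bins_per_hour=2):
    """Activity in the 3 h before a light transition divided by
    activity in the 6 h before it.

    `activity` is a sequence of counts per time bin; `transition_bin` is the
    index of the first bin after the transition (an assumed convention).
    """
    start_6h = transition_bin - 6 * bins_per_hour
    start_3h = transition_bin - 3 * bins_per_hour
    if start_6h < 0 or transition_bin > len(activity):
        raise ValueError("need a full 6 h of data before the transition")
    total_6h = sum(activity[start_6h:transition_bin])
    if total_6h == 0:
        return float("nan")  # no activity at all: the ratio is undefined
    return sum(activity[start_3h:transition_bin]) / total_6h


# Flat activity gives 0.5 (no anticipation); a ramp gives a value above 0.5.
flat = anticipation_index([1] * 12, transition_bin=12)
ramp = anticipation_index(list(range(12)), transition_bin=12)
```

      Multiplying the result by 100 gives the percentage form used in the revised figure; values above 50% indicate that activity is concentrated in the final 3 h before the transition.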

    1. Reviewer #1 (Public Review):

      Olszyński and colleagues present data showing variability from canonical "aversive calls", typically described as long 22 kHz calls rodents emit in aversive situations. Similarly long but higher-frequency (44 kHz) calls are presented as a distinct call type, including analyses both of their acoustic properties and animals' responses to hearing playback of these calls. While this work adds an intriguing and important reminder, namely that animal behavior is often more variable and complex than perhaps we would like it to be, there is some caution warranted in the interpretation of these data.

      The exclusive use of males is a major concern lacking adequate justification and should be disclosed in the title and abstract to ensure readers are aware of this limitation. With several reported sex differences in rat vocal behaviors this means caution should be exercised when generalizing from these findings. The occurrence of an estrus cycle in typical female rats is not justification for their exclusion. Note also that male rodents experience great variability in hormonal states as well, distinguishing between individuals and within individuals across time. The study of endocrinological influences on behavior can be separated from the study of said behavior itself, across all sexes. Similarly, concerns about needing to increase the number of animals when including all sexes are usually unwarranted (see Shansky [2019] and Phillips et al. [2023]).

      Regarding the analysis where calls were sorted using DBSCAN based on peak frequency and duration, my comment on the originally reviewed version stands. It seems that the calls are sorted by an (unbiased) algorithm into categories based on their frequency and duration, and because 44kHz calls differ by definition on frequency and duration, the fact that the algorithm sorts them as a distinct category is not evidence that they are "new calls [that] form a separate, distinct group". I appreciate that the authors have softened their language regarding the novelty and distinctness of these calls, but the manuscript contains several instances where claims of novelty and specificity (e.g. the subtitle on line 193) are emphasized beyond what the data justify.

      The behavioral response to call playback is intriguing, although again more in line with the hypothesis that these are not a distinct type of call but merely represent expected variation in vocalization parameters. Across the board animals respond rather similarly to hearing 22 kHz calls as they do to hearing 44 kHz calls, with occasional shifts of 44 kHz call responses to an intermediate between appetitive and aversive calls. This does raise interesting questions about how, ethologically, animals may interpret such variation and integrate this interpretation in their responses. However, the categorical approach employed here does not address these questions fully.

      I appreciate the amendment in discussing the idea of arousal being the key determinant for the increased emission of 44kHz, and the addition of other factors. Some of the items in this list, such as annoyance/anger and disgust/boredom, don't really seem to fit the data. I'm not sure I find the idea that rats become annoyed or disgusted during fear conditioning to be a particularly compelling argument. As such the list appears to be a collection of emotion-related words, with unclear potential associations with the 44kHz calls.

      Later in the Discussion the authors argue that the 44kHz aversive calls signal an increased intensity of a negative valence emotional state. It is not clear how the presented arguments actually support this. For example, what does the elongation of fear conditioning to 10 trials have to do with increased negative emotionality? Is there data supporting this relationship between duration and emotion, outside anthropomorphism? Each of the 6 arguments presented seems quite distant from being able to support this conclusion.

      In sum, rather than describing the 44kHz long calls as a new call type, it may be more accurate to say that sometimes aversive calls can occur at frequencies above 22 kHz. Individual and situational variability in vocalization parameters seems to be expected, much more so than all members of a species strictly adhering to extremely non-variable behavioral outputs.

      [Editors' note: The reviewer agrees that the additional analysis has ruled out the possibility that the calls are due to fatigue.]

    1. Author response:

      eLife assessment 

      This important study provides evidence for a combination of the latest generation of Oxford Nanopore Technology long reads with state-of-the art variant callers enabling bacterial variant discovery at accuracy that matches or exceeds the current "gold standard" with short reads. The evidence supporting the claims of the authors is convincing, although the inclusion of a larger number of reference genomes would further strengthen the study. The work will be of interest to anyone performing sequencing for outbreak investigations, bacterial epidemiology, or similar studies. 

      We thank the editor and reviewers for the accurate summary and positive assessment. We address the comment about increasing the number of reference genomes in the response to reviewer 2.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors assess the accuracy of short variant calling (SNPs and indels) in bacterial genomes using Oxford Nanopore reads generated on R10.4 flow cells from a very similar genome (99.5% ANI), examining the impact of variant caller choice (three traditional variant callers: bcftools, freebayes, and longshot, and three deep learning based variant callers: Clair3, DeepVariant, and NanoCaller), base calling model (fast, hac and sup) and read depth (using both simplex and duplex reads).

      Strengths: 

      Given the stated goal (analysis of variant calling for reads drawn from genomes very similar to the reference), the analysis is largely complete and results are compelling. The authors make the code and data used in their analysis available for re-use using current best practices (a computational workflow and data archived in INSDC databases or Zenodo as appropriate). 

      Weaknesses: 

      While the medaka variant caller is now deprecated for diploid calling, it is still widely used for haploid variant calling and should at least be mentioned (even if the mention is only to explain its exclusion from the analysis). 

      We agree that this would be an informative addition to the study and will add it to the benchmarking.

      Appraisal: 

      The experiments the authors engaged in are well structured and the results are convincing. I expect that these results will be incorporated into "best practice" bacterial variant calling workflows in the future. 

      Thank you for the positive appraisal.

      Reviewer #2 (Public Review): 

      Summary: 

      Hall et al describe the superiority of ONT sequencing and deep learning-based variant callers to deliver higher SNP and Indel accuracy compared to previous gold-standard Illumina short-read sequencing. Furthermore, they provide recommendations for read sequencing depth and computational requirements when performing variant calling. 

      Strengths: 

      The study describes compelling data showing ONT superiority when using deep learning-based variant callers, such as Clair3, compared to Illumina sequencing. This challenges the paradigm that Illumina sequencing is the gold standard for variant calling in bacterial genomes. The authors provide evidence that homopolymeric regions, a systematic and problematic issue with ONT data, are no longer a concern in ONT sequencing. 

      Weaknesses: 

      (1) The inclusion of a larger number of reference genomes would have strengthened the study to accommodate larger variability (a limitation mentioned by the authors). 

      Our strategic selection of 14 genomes—spanning a variety of bacterial genera and species, diverse GC content, and both gram-negative and gram-positive species (including M. tuberculosis, which is neither)—was designed to robustly address potential variability in our results. Moreover, all our genome assemblies underwent rigorous manual inspection as the quality of the true genome sequences is the foundation this research is built upon. Given this, the fundamental conclusions regarding the accuracy of variant calls would likely remain unchanged with the addition of more genomes. However, we do acknowledge that a substantially larger sample size, which is beyond the scope of this study, would enable more fine-grained analysis of species differences in error rates.

      (2) In Figure 2, there are clearly one or two samples that perform worse than others in all combinations (are always below the box plots). No information about species-specific variant calls is provided by the authors but one would like to know if those are recurrently associated with one or two species. Species-specific recommendations could also help the scientific community to choose the best sequencing/variant calling approaches.

      Thank you for highlighting this observation. The precision, recall, and F1 scores for each sample and condition can be found in Supplementary Table S4. We will investigate the samples that consistently perform below expectation to determine if this is associated with specific species, which may necessitate tailored recommendations for those species. Additionally, we will produce a species-segregated version of Figure 2 for a clearer interpretation and will place it in the supplementary materials.
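
      For readers less familiar with the metrics referred to above, precision, recall, and F1 follow from counts of true-positive, false-positive, and false-negative variant calls. The sketch below uses the standard definitions; the counts are invented for illustration and are not taken from Supplementary Table S4, and the function name is hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from variant-call counts.

    tp: called variants that are in the truth set
    fp: called variants that are not in the truth set
    fn: truth-set variants that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented counts for a single sample and condition.
precision, recall, f1 = precision_recall_f1(tp=95, fp=5, fn=10)
```

      Because F1 is the harmonic mean of precision and recall, a sample that sits below the box plots in all conditions may be driven by either missed variants (low recall) or spurious calls (low precision), which is why inspecting both components per species is informative.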

      (3) The authors support that a read depth of 10x is sufficient to achieve variant calls that match or exceed Illumina sequencing. However, the standard here should be the optimal discriminatory power for clinical and public health utility (namely outbreak analysis). In such scenarios, the highest discriminatory power is always desirable and as such an F1 score, Recall and Precision that is as close to 100% as possible should be maintained (which changes the minimum read sequencing depth to at least 25x, which is the inflection point).

      We agree that the highest discriminatory power is always desirable for clinical or public health applications. In that case, 25x is probably a better minimum recommendation. However, we are also aware that there are resource-limited settings where parity with Illumina is sufficient. In these cases, 10x depth from ONT would provide sufficient data.

      The manuscript currently emphasises the latter scenario, but we will revise the text to clearly recommend 25x depth as a conservative aim in settings where resources are not a constraint, ensuring the highest possible discriminatory power for applications like outbreak analysis.

      (4) The sequencing of the samples was not performed with the same Illumina and ONT method/equipment, which could have introduced specific equipment/preparation artefacts that were not considered in the study. See for example https://academic.oup.com/nargab/article/3/1/lqab019/6193612

      To our knowledge, there is no evidence that sequencing on different ONT machines or barcoding kits leads to a difference in read characteristics or accuracy. To ensure consistency and minimise potential variability, we used the same ONT flowcells for all samples and performed basecalling on the same Nvidia A100 GPU. We will update the methods to emphasise this.

      For Illumina and ONT, the exact machines used for which samples will be added as a supplementary table. We will also add a comment about possible Illumina error rate differences in the ‘Limitations’ section of the Discussion.

      In summary, while there may be specific equipment or preparation artifacts to consider, we took steps to minimise these effects and maintain consistency across our sequencing methods.

      Reviewer #3 (Public Review): 

      Hall et al. benchmarked different variant calling methods on Nanopore reads of bacterial samples and compared the performance of Nanopore to short reads produced with Illumina sequencing. To establish a common ground for comparison, the authors first generated a variant truth set for each sample and then projected this set to the reference sequence of the sample to obtain a mutated reference. Subsequently, Hall et al. called SNPs and small indels using commonly used deep learning and conventional variant callers and compared the precision and accuracy from reads produced with simplex and duplex Nanopore sequencing to Illumina data. The authors did not investigate large structural variation, which is a major limitation of the current manuscript. It will be very interesting to see a follow-up study covering this much more challenging type of variation. 

      We fully agree that investigating structural variations (SVs) would be a very interesting and important follow-up. Identifying and generating ground truth SVs is a nontrivial task and we feel it deserves its own space and study. We hope to explore this in the future.

      In their comprehensive comparison of SNPs and small indels, the authors observed superior performance of deep learning over conventional variant callers when Nanopore reads were basecalled with the most accurate (but also computationally very expensive) model, even exceeding Illumina in some cases. Not surprisingly, Nanopore underperformed compared to Illumina when basecalled with the fastest (but computationally much less demanding) method with the lowest accuracy. The authors then investigated the surprisingly higher performance of Nanopore data in some cases and identified lower recall with Illumina short read data, particularly from repetitive regions and regions with high variant density, as the driver. Combining the most accurate Nanopore basecalling method with a deep learning variant caller resulted in low error rates in homopolymer regions, similar to Illumina data. This is remarkable, as homopolymer regions are (or, were) traditionally challenging for Nanopore sequencing. 

      Lastly, Hall et al. provided useful information on the required Nanopore read depth, which is surprisingly low, and the computational resources for variant calling with deep learning callers. With that, the authors established a new state-of-the-art for Nanopore-only variant calling on bacterial sequencing data. Most likely these findings will be transferred to other organisms as well or at least provide a proof-of-concept that can be built upon.

      As the authors mention multiple times throughout the manuscript, Nanopore can provide sequencing data in nearly real-time and in remote regions, therefore opening up a ton of new possibilities, for example for infectious disease surveillance. 

      However, the high-performing variant calling method as established in this study requires the computationally very expensive sup and/or duplex Nanopore basecalling, whereas the least computationally demanding method underperforms. Here, the manuscript would greatly benefit from extending the last section on computational requirements, as the authors determine the resources for the variant calling but do not cover the entire picture. This could even be misleading for less experienced researchers who want to perform bacterial sequencing at high performance but with low resources. The authors mention it in the discussion but do not make clear enough that the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required. 

      We have provided runtime benchmarks for basecalling in Supplementary Figure S16 and detailed these times in Supplementary Table S7. In addition, we state in the Results section (P10 L228-230) “Though we do note that if the person performing the variant calling has received the raw (pod5) ONT data, basecalling also needs to be accounted for, as depending on how much sequencing was done, this step can also be resource-intensive.”

      Even with super-accuracy basecalling considered, our analysis shows that variant calling remains the most resource-intensive step for Clair3, DeepVariant, FreeBayes, and NanoCaller. Therefore, the statement “the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required” is incorrect. However, we will endeavour to make the basecalling component and considerations more prominent in the Results and Discussion.

    2. eLife assessment

      This important study provides evidence for a combination of the latest generation of Oxford Nanopore Technology long reads with state-of-the-art variant callers enabling bacterial variant discovery at accuracy that matches or exceeds the current "gold standard" with short reads. The evidence supporting the claims of the authors is convincing, although the inclusion of a larger number of reference genomes would further strengthen the study. The work will be of interest to anyone performing sequencing for outbreak investigations, bacterial epidemiology, or similar studies.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors assess the accuracy of short variant calling (SNPs and indels) in bacterial genomes using Oxford Nanopore reads generated on R10.4 flow cells from a very similar genome (99.5% ANI), examining the impact of variant caller choice (three traditional variant callers: BCFtools, FreeBayes, and Longshot; and three deep learning-based variant callers: Clair3, DeepVariant, and NanoCaller), basecalling model (fast, hac, and sup), and read depth (using both simplex and duplex reads).

      Strengths:

      Given the stated goal (analysis of variant calling for reads drawn from genomes very similar to the reference), the analysis is largely complete and results are compelling. The authors make the code and data used in their analysis available for re-use using current best practices (a computational workflow and data archived in INSDC databases or Zenodo as appropriate).

      Weaknesses:

      While the medaka variant caller is now deprecated for diploid calling, it is still widely used for haploid variant calling and should at least be mentioned (even if the mention is only to explain its exclusion from the analysis).

      Appraisal:

      The experiments the authors engaged in are well structured and the results are convincing. I expect that these results will be incorporated into "best practice" bacterial variant calling workflows in the future.

    4. Reviewer #2 (Public Review):

      Summary:

      Hall et al. describe the superiority of ONT sequencing and deep learning-based variant callers to deliver higher SNP and indel accuracy compared to previous gold-standard Illumina short-read sequencing. Furthermore, they provide recommendations for read sequencing depth and computational requirements when performing variant calling.

      Strengths:

      The study describes compelling data showing ONT superiority when using deep learning-based variant callers, such as Clair3, compared to Illumina sequencing. This challenges the paradigm that Illumina sequencing is the gold standard for variant calling in bacterial genomes. The authors provide evidence that homopolymeric regions, a systematic and problematic issue with ONT data, are no longer a concern in ONT sequencing.

      Weaknesses:

      (1) The inclusion of a larger number of reference genomes would have strengthened the study to accommodate larger variability (a limitation mentioned by the authors).

      (2) In Figure 2, there are clearly one or two samples that perform worse than others in all combinations (are always below the box plots). No information about species-specific variant calls is provided by the authors but one would like to know if those are recurrently associated with one or two species. Species-specific recommendations could also help the scientific community to choose the best sequencing/variant calling approaches.

      (3) The authors argue that a read depth of 10x is sufficient to achieve variant calls that match or exceed Illumina sequencing. However, the standard here should be the optimal discriminatory power for clinical and public health utility (namely outbreak analysis). In such scenarios, the highest discriminatory power is always desirable, and as such F1 score, recall, and precision should be kept as close to 100% as possible (which changes the minimum read sequencing depth to at least 25x, which is the inflection point).

      (4) The sequencing of the samples was not performed with the same Illumina and ONT method/equipment, which could have introduced specific equipment/preparation artefacts that were not considered in the study. See for example https://academic.oup.com/nargab/article/3/1/lqab019/6193612.
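
      For readers less familiar with the metrics named in point (3), the following is a minimal sketch of how precision, recall, and F1 score are conventionally computed from true-positive, false-positive, and false-negative variant counts. The function name and the example counts are illustrative assumptions and are not taken from the study.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from variant-call counts.

    tp: called variants that are in the truth set
    fp: called variants that are not in the truth set
    fn: truth-set variants that were not called
    """
    # Guard against division by zero when nothing was called or nothing is true.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

      Because F1 is a harmonic mean, it is pulled towards the lower of the two values, so a drop in recall at low read depth lowers F1 even when precision stays high.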

    5. Reviewer #3 (Public Review):

      Hall et al. benchmarked different variant calling methods on Nanopore reads of bacterial samples and compared the performance of Nanopore to short reads produced with Illumina sequencing. To establish a common ground for comparison, the authors first generated a variant truth set for each sample and then projected this set to the reference sequence of the sample to obtain a mutated reference. Subsequently, Hall et al. called SNPs and small indels using commonly used deep learning and conventional variant callers and compared the precision and accuracy from reads produced with simplex and duplex Nanopore sequencing to Illumina data. The authors did not investigate large structural variation, which is a major limitation of the current manuscript. It will be very interesting to see a follow-up study covering this much more challenging type of variation.

      In their comprehensive comparison of SNPs and small indels, the authors observed superior performance of deep learning over conventional variant callers when Nanopore reads were basecalled with the most accurate (but also computationally very expensive) model, even exceeding Illumina in some cases. Not surprisingly, Nanopore underperformed compared to Illumina when basecalled with the fastest (but computationally much less demanding) method with the lowest accuracy. The authors then investigated the surprisingly higher performance of Nanopore data in some cases and identified lower recall with Illumina short read data, particularly from repetitive regions and regions with high variant density, as the driver. Combining the most accurate Nanopore basecalling method with a deep learning variant caller resulted in low error rates in homopolymer regions, similar to Illumina data. This is remarkable, as homopolymer regions are (or, were) traditionally challenging for Nanopore sequencing.

      Lastly, Hall et al. provided useful information on the required Nanopore read depth, which is surprisingly low, and the computational resources for variant calling with deep learning callers. With that, the authors established a new state-of-the-art for Nanopore-only variant calling on bacterial sequencing data. Most likely these findings will be transferred to other organisms as well or at least provide a proof-of-concept that can be built upon.

      As the authors mention multiple times throughout the manuscript, Nanopore can provide sequencing data in nearly real-time and in remote regions, therefore opening up a ton of new possibilities, for example for infectious disease surveillance.

      However, the high-performing variant calling method as established in this study requires the computationally very expensive sup and/or duplex Nanopore basecalling, whereas the least computationally demanding method underperforms. Here, the manuscript would greatly benefit from extending the last section on computational requirements, as the authors determine the resources for the variant calling but do not cover the entire picture. This could even be misleading for less experienced researchers who want to perform bacterial sequencing at high performance but with low resources. The authors mention it in the discussion but do not make clear enough that the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required.

    1. eLife assessment

      The study, from the group that pioneered migrasome research, describes a novel vaccine platform derived from this newly discovered organelle. Using these cleverly engineered migrasomes – that behave like natural migrasomes – as a novel vaccine platform has the potential to overcome obstacles such as cold chain issues for vaccines like messenger RNA. Although the findings are important with practical implications for the vaccine technology, and the evidence, based on appropriate and validated methodology, is convincing and in line with the current state-of-the-art, there are some critical issues that need to be addressed. These include a head-to-head comparison with proven vaccine platforms, for example, a SARS-CoV-2 mRNA vaccine or an adjuvanted recombinant spike protein.

    2. Reviewer #1 (Public Review):

      Summary:

      This is an excellent study by a superb investigator who discovered and is championing the field of migrasomes. This study contains a hidden "gem" - the induction of migrasomes by hypotonicity and how that happens. In summary, an outstanding fundamental phenomenon (migrasomes) en route to becoming translationally highly significant.

      Strengths:

      Innovative approach at several levels. Migrasomes - discovered by Dr Yu's group - are an outstanding biological phenomenon of fundamental interest and now of potentially practical value.

      Weaknesses:

      I feel that the overemphasis on practical aspects (vaccine), however important, eclipses some of the fundamental aspects that may be just as important and actually more interesting. If this can be expanded, the study would be outstanding.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors' report describes a novel vaccine platform derived from a newly discovered organelle called a migrasome. First, the authors address a technical hurdle in using migrasomes as a vaccine platform. Natural migrasome formation occurs at low levels and is labor intensive; however, by understanding the molecular underpinning of migrasome formation, the authors have designed a method to make engineered migrasomes from cultured cells at higher yields utilizing a robust process. These engineered migrasomes behave like natural migrasomes. Next, the authors immunized mice with migrasomes that either expressed a model peptide or the SARS-CoV-2 spike protein. Antibodies against the spike protein were raised that could be boosted by a 2nd vaccination and these antibodies were functional as assessed by an in vitro pseudoviral assay. This new vaccine platform has the potential to overcome obstacles such as cold chain issues for vaccines like messenger RNA that require very stringent storage conditions.

      Strengths:

      The authors present very robust studies detailing the biology behind migrasome formation and this fundamental understanding was used to form engineered migrasomes, which makes it possible to utilize migrasomes as a vaccine platform. The characterization of engineered migrasomes is thorough and establishes comparability with naturally occurring migrasomes. The biophysical characterization of the migrasomes is well done including thermal stability and characterization of the particle size (important characterizations for a good vaccine).

      Weaknesses:

      With a new vaccine platform technology, it would be nice to compare them head-to-head against a proven technology. The authors would improve the manuscript if they made some comparisons to other vaccine platforms such as a SARS-CoV-2 mRNA vaccine or even an adjuvanted recombinant spike protein. This would demonstrate a migrasome-based vaccine could elicit responses comparable to a proven vaccine technology. Additionally, understanding the integrity of the antigens expressed in their migrasomes could be useful. This could be done by looking at functional monoclonal antibodies binding to their migrasomes in a confocal microscopy experiment.

    1. Reviewer #1 (Public Review):

      Summary:

      Winged seeds or ovules from the Devonian are crucial to understanding the origin and early evolutionary history of wind dispersal strategy. Based on exceptionally well-preserved fossil specimens, the present manuscript documented a new fossil plant taxon (new genus and new species) from the Famennian Series of Upper Devonian in eastern China and demonstrated that three-winged seeds are more adapted to wind dispersal than one-, two- and four-winged seeds by using mathematical analysis.

      Strengths:

      The manuscript is well organised and well presented, with superb illustrations. The methods used in the manuscript are appropriate.

      Weaknesses:

      I would only like to suggest moving the "Mathematical analysis of wind dispersal of ovules with 1-4 wings" section from the supplementary information to the main text, leaving the supplementary figures as supplementary materials.

    2. eLife assessment

      This useful manuscript describes the second earliest known winged ovule without a cupule in the Famennian of Late Devonian. Using solid mathematical analysis, the authors demonstrate that three-winged seeds are more adapted to wind dispersal than one-, two- and four-winged seeds. The manuscript will help the scientific community to understand the origin and early evolutionary history of wind dispersal strategy of early land plants.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript describes the second earliest known winged ovule without a cupule in the Famennian of Late Devonian. Using mathematical analysis, the authors suggest that the integuments of the earliest ovules without a cupule, as in the new taxon and Guazia, evolved functions in wind dispersal.

      Strengths:

      The new ovule taxon's morphological part is convincing. It provides additional evidence for the earliest winged ovules, and the mathematical analysis helps to understand their function.

      Weaknesses:

      The discussion should be enhanced to clarify the significance of this finding. What is the new advance compared with the Guazia finding? The authors can illustrate the character transformations using a simplified cladogram. The present version of the main text looks flat.

    1. eLife assessment

      This important study reports the deep evolutionary conservation of a core genetic program regulating spermatogenesis in flies, mice, and humans. The data presented are supportive of the main conclusion and generally convincing. This work will be of interest to evolutionary and reproductive biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      By combining an analysis of the evolutionary age of the genes expressed in male germ cells, a study of genes associated with spermatocyte protein-protein interaction networks and functional experiments in Drosophila, Brattig-Correia and colleagues provide evidence for an ancient origin of the genetic program underlying metazoan spermatogenesis. This leads to identifying a relatively small core set of functional interactions between deeply conserved gene expression regulators, whose impairment is then shown to be associated with cases of human male infertility.

      Strengths:

      In my opinion, the work is important for three different reasons. First, it shows that, even though reproductive genes can evolve rapidly and male germ cells display a significant level of transcriptional noise, it is still possible to obtain convincing evidence that a conserved core of functionally interacting genes lies at the basis of the male germ transcriptome. Second, it reports an experimental strategy that could also be applied to gene networks involved in different biological problems. Third, the authors make a compelling case that, due to its effects on human spermatogenesis, disruption of the male germ cell orthoBackbone can be exploited to identify new genetic causes of infertility.

      Weaknesses:

      The main strength of the general approach followed by the authors is, inevitably, also a weakness. This is because a study rooted in comparative biology is unlikely to identify newly emerged genes that may adopt key roles in processes such as species-specific gamete recognition. Additionally, using a TPM >1 threshold for protein-coding transcripts may exclude genes, such as those encoding proteins required for gamete fusion, which are thought to be expressed at a very low level. Although these considerations raise the possibility that the chosen approach may miss information that, depending on the species, could be potentially highly functionally important, this by no means reduces its value in identifying genes belonging to the conserved genetic program of spermatogenesis.

    3. Reviewer #2 (Public Review):

      Summary:

      This is a tour de force study that aims to understand the genetic basis of male germ cell development across three animal species (human, mouse, and flies) by performing a genetic program conservation analysis (using phylostratigraphy and network science) with a special emphasis on genes that peak or decline during mitosis-to-meiosis. This analysis, in agreement with previous findings, reveals that several genes active during and before meiosis are deeply conserved across species, suggesting ancient regulatory mechanisms. To identify critical genes in germ cell development, the investigators integrated clinical genetics data, performing gene knockdown and knockout experiments in both mice and flies. Specifically, over 900 conserved genes were investigated in flies, with three of these genes further studied in mice. Of the 900 genes in flies, ~250 RNAi knockdowns had fertility phenotypes. The fertility phenotypes for the fly data can be viewed using the following browser link: https://pages.igc.pt/meionav. The scope of target gene validation is impressive. Below are a few minor comments.

      (1) In Supplemental Figure 2, it is notable that enterocyte transcriptomes are predominantly composed of younger genes, contrasting with the genetic age profile observed in brain and muscle cells. This difference is an intriguing observation and it would be interesting to hear the authors' comments.

      (2) Regarding the document, the figures provided only include supplemental data; none of the main text figures are in the full PDF.

      (3) Lastly, it would be great to section and stain mouse testis to classify the different stages of arrest during meiosis for each of the mouse mutants in order to compare more precisely to flies.

      This paper serves as a vital resource, emphasizing that only through the analysis of hundreds of genes can we prioritize essential genes for germ cell development. It is remarkable that about 60% of conserved genes have no apparent phenotype during germ cell development.

      Strengths:

      The high-throughput screening was conducted on a conserved network of 920 genes expressed during the mitosis-to-meiosis transition. Approximately 250 of these genes were associated with fertility phenotypes. Notably, mutations in 5 of the 250 genes have been identified in human male infertility patients. Furthermore, 3 of these genes were modeled in mice, where they were also linked to infertility. This study establishes a crucial groundwork for future investigations into germ cell development genes, aiming to delineate their essential roles and functions.

      Weaknesses:

      The fertility phenotyping in this study is limited, yet dissecting the mechanistic roles of these proteins falls beyond its scope. Nevertheless, this work serves as an invaluable resource for further exploration of specific genes of interest.

    1. eLife assessment

      This important study reports the developmental dynamics and molecular markers of the rete ovarii during ovarian development. However, the data supporting the main conclusions remain incomplete. This study will be of interest to developmental and reproductive biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Anbarci et al. re-evaluates the function of the enigmatic rete ovarii (RO), a structure that forms in close association with the mammalian ovary. The RO has generally been considered a functionless structure in the adult ovary. This manuscript follows up on a previous study from the lab that analyzed ovarian morphogenesis using high-resolution microscopy (McKey et al., 2022). The present study adds finer details to RO development and possible function by (1) identifying new markers for RO sub-regions (e.g. GFR1a labels the connecting rete), suggesting that the sub-regions are functionally distinct, (2) showing that the RO sub-regions are connected by a luminal system that allows transport of material from the extraovarian rete (EOR) to the intraovarian rete (IOR), (3) identifying proteins that are secreted into the RO lumen and that may regulate ovarian homeostasis, and finally, (4) better defining how the vasculature, nervous, and immune systems integrate with the RO.

      Strengths:

      The data is beautifully presented and convincing. They show that the RO is composed of three distinct domains that have unique gene expression signatures and thus likely are functionally distinct.

      Weaknesses:

      It is not always clear what the novel findings are that this manuscript is presenting. It appears to be largely similar to the analysis done by McKey et al. (2022) but with more time points and molecular markers. The novelty of the present study's findings needs to be better articulated.

    3. Reviewer #2 (Public Review):

      A large number of ovarian experiments have been conducted - especially in morphological and molecular biology studies - specifically removing the ovarian membrane. This experiment is a good supplement to existing knowledge and plays an important role in early ovarian development and the regulation of ovarian homeostasis during the estrous cycle. There are also innovations in research ideas and methods, which will meet the requirements of experimental design and provide inspiration for other researchers.

      This reviewer did not identify any major issues with the article. However, the following points could be further clarified:

      (1) Is there any comparative data on the proteomics of RO and rete testis in early development? With some molecular markers also derived from rete testis, it would be better to provide the data or references.

      (2) Although the size of RO and its components is quite small and difficult to operate, the researchers in this article had already been able to perform intracavitary injection of EOR and extract EOR or CR for mass spectrometry analysis. Therefore, can EOR, CR, or IOR be damaged or removed, providing further strong evidence of ovarian development function?

      (3) Although IOR is shown on the schematic diagram, it cannot be observed in the immunohistochemistry pictures in Figure 1 and Figure 3. The authors should provide a detailed explanation.

    4. Reviewer #3 (Public Review):

      Summary:

      The rete ovarii (RO) has long been disregarded as a non-functional structure within the ovary. In their study, Anbarci and colleagues have delineated the markers and developmental dynamics of three distinct regions of the RO - the intraovarian rete (IOR), the extraovarian rete (EOR), and the connecting rete (CR). Notably focusing on the EOR, the authors presented evidence illustrating that the EOR forms a convoluted tubular structure culminating in a dilated tip. Intriguingly, microinjections into this tip revealed luminal flow towards the ovary containing potentially secreted functional proteins. Additionally, the EOR cells exhibit associations with vasculature, macrophages, and neuronal projections, proposing the notion that the RO may play a functional role in ovarian development during critical ovariogenesis stages. By identifying marker genes within the RO, the authors have also suggested that the RO could serve as a potential structure linking the ovary with the neuronal system.

      Strengths:

      Overall, the reviewer commends the authors for their systematic research on the RO, shedding light on this overlooked structure in developing ovaries. Furthermore, the authors have proposed a series of hypotheses that are both captivating and scientifically significant, with the potential to reshape our understanding of ovarian development through future investigations.

      Weaknesses:

      There is a lack of conclusive data supporting many conclusions in the manuscript. Therefore, the paper's overall conclusions should be moderated until functional validations are conducted.

    1. eLife assessment

      The authors combined human genetic analysis with zebrafish experiments to produce evidence that alleles that impair the function of EPHA4 cause idiopathic scoliosis (IS), a common spinal deformity. The significance of the findings is important because the cellular and molecular mechanisms that contribute to IS remain poorly understood. The human genetic data are quite convincing whereas the zebrafish data, although supportive, are incomplete.

    2. Joint Public Review:

      Summary:

      Idiopathic scoliosis (IS) is a common spinal deformity. Various studies have linked genes to IS, but underlying mechanisms are unclear such that we still lack understanding of the causes of IS. The current manuscript analyzes IS patient populations and identifies EPHA4 as a novel associated gene, finding three rare variants in EPHA4 from three patients (one disrupting splicing and two missense variants) as well as a large deletion (encompassing EPHA4) in a Waardenburg syndrome patient with scoliosis. EPHA4 is a member of the Eph receptor family. Drawing on data from zebrafish experiments, the authors argue that EPHA4 loss of function disrupts the central pattern generator (CPG) function necessary for motor coordination.

      Strengths:

      The main strength of this manuscript is the human genetic data, which provides convincing evidence linking EPHA4 variants to IS. The loss of function experiments in zebrafish strongly support the conclusion that EPHA4 variants that reduce function lead to IS.

      Weaknesses:

      The conclusion that disruption of CPG function causes spinal curves in the zebrafish model is not well supported. The authors' final model is that a disrupted CPG leads to asymmetric mechanical loading on the spine and, over time, the development of curves. This is a reasonable idea, but currently not strongly backed up by data in the manuscript. Potentially, the impaired larval movements simply coincide with, but do not cause, juvenile-onset scoliosis. Support for the authors' conclusion would require independent methods of disrupting CPG function and determining if this is accompanied by spine curvature. At a minimum, the language of the manuscript could be toned down, with the CPG defects put forward as a potential explanation for scoliosis in the discussion rather than as something this manuscript has "shown". An additional weakness of the manuscript is that the zebrafish genetic tools are not sufficiently validated to provide full confidence in the data and conclusions.

    1. eLife assessment

      This work is important because it attempts to elucidate how immune cells migrate across the blood-brain barrier. The authors developed a convincing framework to visualize, recognize and track the movement of different immune cells across primary human and mouse brain microvascular endothelial cells without the need for fluorescence-based imaging using microfluidic devices. The data gathered are solid, and this work will be of interest to the cancer biology, immunology and medical therapeutics fields.

    2. Reviewer #1 (Public Review):

      Summary:

      It is evident that studying leukocyte extravasation in vitro is a challenge. One needs to include physiological flow, culture cells and isolate primary immune cells. Timing is of utmost importance and a reproducible setup essential. Extra challenges are met when extravasation kinetics in different vascular beds is required, e.g., across the blood-brain barrier. In this study, the authors describe a reliable and reproducible method to analyze leukocyte TEM under physiological flow conditions, including this analysis. That the software can also detect reverse TEM is a plus.

      Strengths:

      It is quite a challenge to get this assay reproducible and stable, in particular as there is flow included. Also for the analysis, there is currently no clear software analysis program, and many labs have their own methods. This paper gives the opportunity to unify the data and results obtained with this assay under label-free conditions. This should eventually lead to more solid and reproducible results.

      Also, the comparison between manual and software analysis is appreciated.

      Weaknesses:

      The authors stress that it can be done in BBB models, but I would argue that it is much more broadly applicable. This is not necessarily a weakness of the study but more an opportunity to strengthen the method. So I would encourage the authors to rewrite some parts and make it more broadly applicable.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper develops an under-flow migration tracker to evaluate all the steps of the extravasation cascade of immune cells across the BBB. The algorithm is useful and has important applications.

      Strengths:

      The algorithm is almost as accurate as manual tracking and, importantly, saves time for researchers.

      Weaknesses:

      Applicability can be questioned because the device used is 2D, whereas physiological biology is 3D. Comparisons to other automated tools were not performed by the authors.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors aimed to establish a faster and more efficient method of tracking the steps of T-cell extravasation across the blood-brain barrier. The authors developed a framework to visualize, recognize, and track the movement of different immune cells across primary human and mouse brain microvascular endothelial cells without the need for fluorescence-based imaging. The authors succinctly describe the basic requirements for tracking in the introduction, followed by an in-depth account of the execution.

      Weaknesses and Strengths:

      Materials & methods and results:

      (1) The methods section lacks details of the microfluidic device that the authors discuss in the paper. Under physiological shear stress, the T-cells detach from the pMBMEC monolayer and hence cannot be detected; however, this observation requires an explanation of why it occurs and of potential solutions to circumvent it, to ensure physiologically relevant experimental parameters.

      (2) The authors describe a method for debris exclusion using UFMTrack that eliminates objects of <30 pixels in size from analysis based on a mean pixel size of 400 for T lymphocytes. However, this mean pixel size appears to stem from in-vitro activated CD8 T cells, which rapidly grow and proliferate upon stimulation. In line with this, activated lymphocytes exhibit increased cytoplasmic area, making them appear less dense or "brighter" by phase microscopy compared to naïve lymphocytes, which are relatively compact and consequently appear dimmer. Given this, it is not clear whether UFMTrack is sufficiently trained to identify naïve human lymphocytes in circulating blood or smaller murine lymphocytes. Analysis of each lymphocyte subtype in terms of pixel size and intensity would be beneficial to strengthen the claim that UFMTrack can identify each of these populations. Additionally, demonstrating that UFMTrack can correctly characterize the behavior of naïve versus activated lymphocytes isolated from murine and human sources would strengthen the claim that UFMTrack can be broadly applied to study lymphocyte dynamics in diverse models without additional training.

      (3) Average precision was compared to the analysis of UFMTrack, but it is unclear how average precision was calculated. This information should have been included in the methods section.

      (4) CD4 and CD8 T cells exhibit distinct biology and interaction kinetics driven in part by their MHC molecule affinity and distinct receptor expression profiles. Thus, it is unclear why two distinct mechanisms of endothelial cell activation are needed to see differences between the populations.

      (5) The BMECs are barrier tissues but were cultured on µdishes in this study. To study the transmigration of T-cells across the endothelium, the model would have been more relevant on a semi-permeable membrane instead of a closed surface.

      (6) Methods are provided for the isolation and expansion of human effector and memory CD4+ T cells. However, there is no mention of specific CD4+ T cell populations used for analysis with UFMTrack, nor a clear breakdown of tracking efficiency for each subpopulation. Further, there is no similar method for the isolation of CD8+ T cell compartments. A clear breakdown of the performance efficiency of UFMTrack with each cell population investigated in this study would provide greater insight into the software's performance with regard to tracking the behavior and movement of distinct immune populations.

      (7) The results section is quite extensive and discusses details of the establishment of the framework while highlighting both the pros and cons of different aspects of the process; for example, the limitations of the two models, 2D and 2D+T, were highlighted well. However, the results section includes details that may be more fitting in the methods section.

      (8) A few statements in the results section lacked support from the literature, which was not provided in the discussion either, such as support for the increased variance of T-cell instantaneous speed on stimulated vs non-stimulated pMBMECs. Another example is the enhancement of cytokine stimulation-directed T-cell movement on the pMBMECs, which the authors observed but whose physiological relevance they did not convey. The authors don't provide enough references for developments in the field prior to their work, which form the basis and need for this technology.

      (9) The rationale for use of OT-1 and 2D2-derived murine lymphocytes is unclear here. The OT-1 model has been generated to study antigen-specific CD8+ T cell responses, while the 2D2 model has been generated to recapitulate CD4 T cell-specific myelin oligodendrocyte glycoprotein (MOG) responses.
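
      Regarding point (3) above, one common way to compute average precision for a detector is sketched below. This is only an illustration under stated assumptions: the function name, the toy detections, and the choice of the non-interpolated definition are hypothetical and are not taken from the manuscript or from UFMTrack.

```python
def average_precision(scored_hits, n_ground_truth):
    """Non-interpolated average precision (one common definition).

    scored_hits: list of (confidence, is_true_positive) pairs, one per detection.
    n_ground_truth: total number of annotated objects (assumed > 0).
    """
    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            # Precision at the rank of each correctly detected object.
            precision_sum += true_positives / rank
    # Objects that were never detected contribute zero precision.
    return precision_sum / n_ground_truth


# Hypothetical toy data: four detections, four annotated cells.
detections = [(0.9, True), (0.8, False), (0.7, True), (0.6, True)]
ap = average_precision(detections, n_ground_truth=4)
```

      Other conventions exist (for example, interpolated precision or averaging over several overlap thresholds), which is why stating the exact definition in the methods matters.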

      Figures and text:

      (1) There are certain discrepancies and misarrangements of figures and text. For example, discussing the effect of shear flow on T cell attachment in the introduction with figure 1 and then mentioning it again in the results section with figure 4 is repetitive.

      (2) Section IV, subsection 1 of the results section refers to the 'data acquisition section above' in line 279; however, that section is part of the materials and methods, which is provided towards the end of the manuscript.

      (3) There are figures in the manuscript that have not been referenced in the results section, for example, figure 3A and B. Figure 1 is not addressed until subsection 7 of the materials and methods.

      (4) A lack of significance but an observed trend of increased variance of T cell instantaneous speed is reported in lines 296-298; however, the graph (figure 4G) shows a significant change in instantaneous speed between non-stimulated and TNFα-stimulated systems. This is misleading to readers.

      (5) The authors talk about three beginner experimenters testing the manual T cell tracking process, but figure 5 only showcases data from two experimenters without stating the reason for excluding experimenter 1.

      Discussion:

      (1) While the discussion captures the major takeaways from the paper, it lacks relevant supporting references to relate the observation to physiological conditions and applicability.

      (2) The discussion lacks connection to the results, since the figures were not referenced while discussing an observed trend.

      (3) The authors briefly looked into mouse and human BMECs and their individual interaction with T-cells, but don't discuss the differences between the two, if any, that challenged their framework.

      (4) Even though the imaging tool relies on differences in appearance for detection, the authors talk about the lack of feasibility of detecting transmigration of BMDMs due to their significantly different appearance. The statement lacks a problem-solving approach to discuss how and why this was the case.

      Relevance to the field:

      Utilizing the framework provided by the authors, the application can be adapted and/or utilized for visualizing a range of different cell types, provided they are different in appearance. However, this would require extensive changes to the script and won't be adaptable in its current form.

    1. eLife assessment

      This fundamental study presents a modeling regime that provides new insight into the energy-preservation parameters among schooling fish. The strength of the evidence supporting observations such as distilled dynamics between leading and lagging schooling fish, which are derived from emergent properties, is convincing. Overall, the study provides exciting insights into energetic coupling with respect to group swimming dynamics. Some potential improvements to strengthen the study include clarification regarding degrees of freedom and parameter ranges in the model.

    2. Reviewer #1 (Public Review):

      Summary:

      The study seeks to establish accurate computational models to explore the role of hydrodynamic interactions on energy savings and spatial patterns in fish schools. Specifically, the authors consider a system of (one degree-of-freedom) flapping airfoils that passively position themselves with respect to the streamwise direction, while oscillating at the same frequency and amplitude, with a given phase lag and at a constant cross-stream distance. By parametrically varying the phase lag and the cross-stream distance, they systematically explore the stability and energy costs of emergent configurations. Computational findings are leveraged to distill insights into universal relationships and clarify the role of the wake of the leading foil.

      Strengths:

      (1) The use of multiple computational models (computational fluid dynamics, CFD, for full Navier-Stokes equations and computationally efficient inviscid vortex sheet, VS, model) offers an extra degree of reliability of the observed findings and backing to the use of simplified models for future research in more complex settings.

      (2) The systematic assessment of the stability and energy savings in multiple configurations of pairs and larger ensembles of flapping foils is an important addition to the literature.

      (3) The discovery of a linear phase-distance relationship in the formation attained by pairs of flapping foils is a significant contribution, which helps compare different experimental observations in the literature.

      (4) The observation of a critical size effect for larger in-line formations, above which cohesion and energetic benefits are lost at once, is a new discovery in the field.

      Weaknesses:

      (1) The extent to which observations on one-degree-of-freedom flapping foils could translate to real fish schools is presently unclear, so some of the conclusions on live fish schools are likely to be overstated and would benefit from more biological framing.

      (2) The analysis of non-reciprocal coupling is not as novel as the rest of the study and potentially not as convincing due to the chosen linear metric of interaction (that is, the flow agreement).

      Overall, this is a rigorous effort on a critical topic: findings of the research can offer important insight into the hydrodynamics of fish schooling, stimulating interdisciplinary research at the interface of computational fluid mechanics and biology.

    3. Reviewer #2 (Public Review):

      The document "Mapping spatial patterns to energetic benefits in groups of flow-coupled swimmers" by Heydari et al. uses several types of simulations and models to address aspects of stability of position and power consumption in few-body groups of pitching foils. I think the work has the potential to be a valuable and timely contribution to an important subject area. The supporting evidence is largely quite convincing, though some details could raise questions, and there is room for improvement in the presentation. My recommendations are focused on clarifying the presentation and perhaps spurring the authors to assess additional aspects:

      (1) Why do the authors choose to set the swimmers free only in the propulsion direction? I can understand constraining all the positions/orientations for investigating the resulting forces and power, and I can also understand the value of allowing the bodies to be fully free in x, y, and their orientation angle to see if possible configurations spontaneously emerge from the flow interactions. But why constrain some degrees of freedom and not others? What's the motivation, and what's the relevance to animals, which are fully free?

      (2) The model description in Eq. (1) and the surrounding text is confusing. Aren't the authors computing forces via CFD or the VS method and then simply driving the propulsive dynamics according to the net horizontal force? It seems then irrelevant to decompose things into thrust and drag, and it seems irrelevant to claim that the thrust comes from pressure and the drag from viscous effects. The latter claim may in fact be incorrect since the body has a shape and the normal and tangential components of the surface stress along the body may be complex.

      (3) The parameter τ_diss in the VS simulations takes on unusual values such as 2.45T, making it seem like this value is somehow very special, and perhaps 2.44 or 2.46 would lead to significantly different results. If the value is special, the authors should discuss and assess it. Otherwise, I recommend picking a round value, like 2 or 3, which would avoid distraction.

      (4) Some of the COT plots/information were difficult to interpret because the correspondence between "beneficial" and the mathematical sign kept changing. For example, ΔCOT as introduced on p. 5 is such that negative indicates bad energetics as compared to a solo swimmer. But elsewhere, lower or more negative COT is good in terms of savings. Given the many plots, large amounts of data, and many quantities being assessed, the paper needs a highly uniform presentation to aid the reader.

      (5) I didn't understand the value of the "flow agreement parameter," and I didn't understand the authors' interpretation of its significance. Firstly, it would help if this and all other quantities were given explicit definitions as complete equations (including normalization). As I understand it, the quantity indicates the match of the flow velocity at some location with the flapping velocity of a "ghost swimmer" at that location. This does not seem to be exactly relevant to the equilibrium locations. In particular, if the match were perfect, then the swimmer would generate no relative flow and thus no thrust, meaning such a location could not be an equilibrium. So, some degree of mismatch seems necessary. I believe such a mismatch is indeed present, but the plots such as those in Figure 4 may disguise the effect. The color bar is saturated to the point of essentially being three tones (blue, white, red), so we cannot see that the observed equilibria are likely between the max and min values of this parameter.

      (6) More generally, and related to the above, I am favorable towards the authors' attempts to find approximate flow metrics that could be used to predict the equilibrium positions and their stability, but I think the reasoning needs to be more solid. It seems the authors are seeking a parameter that can indicate equilibrium and another that can indicate stability. Can they clearly lay out the motivation behind any proposed metrics, and clearly present complete equations for their definitions? Further, is there a related power metric that can be appropriately defined and which proves to be useful?

      (7) Why do the authors not carry out CFD simulations on the larger groups? Some explanations should be given, or some corresponding CFD simulations should be carried out. It would be interesting if CFD simulations were done and included, especially for the in-line case of many swimmers. This is because the results seem to be quite nuanced and dependent on many-body effects beyond nearest-neighbor interactions. It would certainly be comforting to see something similar happen in CFD.

      (8) Related to the above, the authors should discuss seemingly significant differences in their results for long in-line formations as compared to the CFD work of Peng et al. [48]. That work showed apparently stable groups for numbers of swimmers considerably larger than those studied here. Why such a qualitatively different result, and how should we interpret these differences regarding the more general issue of the stability of tandem groups?

      (9) The authors seem to have all the tools needed to address the general question about how dynamically stable configurations relate to those that are energetically optimal. Are stable solutions optimal, or not? This would seem to have very important implications for animal groups, and the work addresses closely related topics but seems to miss the opportunity to give a definitive answer to this big question.

      (10) Time-delay particle model: This model seems to construct a simplified wake flow. But does the constructed flow satisfy basic properties that we demand of any flow, such as being divergence-free? If not, then the formulation may be troublesome.
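
      The check raised in point (10) can be done numerically. The sketch below estimates the divergence of a 2D velocity field by central differences; the two example fields and all function names are hypothetical stand-ins, not the time-delay particle model of the manuscript.

```python
import math


def divergence(u, v, x, y, h=1e-5):
    """Central-difference estimate of du/dx + dv/dy at the point (x, y)."""
    du_dx = (u(x + h, y) - u(x - h, y)) / (2 * h)
    dv_dy = (v(x, y + h) - v(x, y - h)) / (2 * h)
    return du_dx + dv_dy


# Hypothetical field derived from the stream function psi = sin(x) * sin(y):
# u = d(psi)/dy and v = -d(psi)/dx, so it is divergence-free by construction.
def u_solenoidal(x, y):
    return math.sin(x) * math.cos(y)


def v_solenoidal(x, y):
    return -math.cos(x) * math.sin(y)


# Hypothetical field with a uniform source: its divergence is 2 everywhere.
def u_source(x, y):
    return x


def v_source(x, y):
    return y


div_ok = divergence(u_solenoidal, v_solenoidal, 0.3, 1.1)
div_bad = divergence(u_source, v_source, 0.3, 1.1)
```

      A constructed wake flow would pass this test only if the estimate stays near zero (up to discretization error) at every sampled point, as it does for the first field and not for the second.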

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This is a valuable study that describes the effects of T. pallidum on neural development by applying single-cell RNA sequencing to an iPSC-derived brain organoid model. The evidence supporting the claims of the authors is solid, although further evidence to understand the differences in infection rates would strengthen the conclusions of the study. In particular, the conclusions would be strengthened by validating infection efficiency as this can impact the interpretation of single-cell sequencing results, and how these metrics affect organoid size as well as comparison with additional infectious agents. Furthermore, additional validations of downstream effectors are not adequate and could be improved. 

      Thank you very much for your valuable comments. Since we used the organoid model for the first time to investigate the effects of T. pallidum on brain development, the study design is not perfect. As you have accurately mentioned, the results of the paper lack in-depth detail, especially regarding verification of the infection rate of T. pallidum. Your valuable comments will be very useful to us in carrying out further research. In addition, the downstream effector validation is inadequate, so we performed an analysis of single-cell sequencing data to strengthen our view in the revised manuscript (See Figure 5F for a description in current manuscript).

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is an interesting study by Xu et al. showing the effects of infection with the bacterium Treponema pallidum (which causes syphilis) on neuronal development using iPSC-derived human brain organoids as a model and single-cell RNA sequencing. This work provides an important insight into the impact of the pathogen on human development, bridging the gap between the phenomena observed in studies using animal models as well as non-invasive human studies showing developmental abnormalities in fetuses infected with the pathogen in utero through maternal vertical transmission.

      Using single-cell RNAseq in combination with qPCR and immunofluorescence techniques, the authors show that T. pallidum-infected organoids are smaller in size, in particular during later growth stages, contain a larger number of undifferentiated neuronal lineage cells, and exhibit decreased numbers of a specific neuronal subcluster, which the authors have identified as undifferentiated hindbrain neurons.

      The study is an important first step in understanding how T. pallidum affects human neuronal development and provides important insight into the potential mechanisms that underlie the neurodevelopmental abnormalities observed in infected human fetuses. Several important weaknesses have also been noted, which need to be addressed to strengthen the study's conclusions.

      Strengths:

      (1) The study is well written, and the data quality is good for the most part.

      (2) The study provides an important first step in utilizing human brain organoids to study the impact of T. pallidum infection on neuronal development.

      (3) The study's conclusions may provide important insight to other researchers focused on studying how viral infections impact neuronal development. 

      Thank you very much for your positive feedback. Below, you will find our detailed responses to your concerns, addressed point-by-point. I once again sincerely appreciate your time and effort in reviewing our manuscript.

      Weaknesses:

      (1) It is unclear how T. pallidum infection was validated in the organoids. If not all cells are infected, this could have important implications for the study's conclusions, in particular the single-cell RNAseq experiments. Were only cells showing the presence of the pathogen selected for sequencing? A detailed description of how infection was validated and the process of selection of cells for RNAseq would strongly support the study's conclusions.

      Thank you for your valuable comment. We completely agree with your point. The infection rate of T. pallidum in brain organoids is a key factor that must be considered. We selected pluripotent stem cell-derived brain organoids to simulate the process of foetal brain neurodevelopment and co-cultured them with T. pallidum to mimic T. pallidum invading brain tissue. Since brain organoids are three-dimensional structures formed by nerve cell aggregation, T. pallidum invades organoids gradually from the periphery to the center. T. pallidum acts on organoids long enough to increase the infection rate; however, the pathogen is selective in invading human cells. If we selected only cells in which T. pallidum is present for sequencing, the authenticity of simulating "real world" infections would be somewhat weakened. Selecting cells from intact organoids for sequencing, without eliminating cells without T. pallidum, better simulates the effect of T. pallidum infection on the nervous system. Of course, we should also set up a blank control group.

      (2) The authors show that T. pallidum infection results in impaired development of hindbrain neurons. How does this finding compare to what has already been shown in animal studies? Is a similar deficit in this brain region observed with this specific pathogen? The study's conclusions would be strengthened if the authors added a discussion of the observed deficits in hindbrain neuronal development and of prior literature on similar studies conducted in animal models or human patients. Does T. pallidum preferentially target these neurons, or is this a limitation of the current organoid model system?

      Thank you for your valuable comments. The finding that T. pallidum infection results in impaired development of hindbrain neurons has not been verified in animal experiments. Of course, it is better to further validate the findings in organoid studies through animal experiments. Unfortunately, due to the technical challenges, mature animal models have not been developed for the study of congenital syphilis. Although our team has been working on the development of animal models of congenital neurosyphilis, the current progress is still not satisfactory. After struggling hard in this field for many years, we decided to attempt to utilize human brain organoids instead of animal models to study the impact of T. pallidum infection on neuronal development.

      We also checked prior literature on similar studies that have referred to the content in human patients. Dan Doherty et al. reported that patients with pontocerebellar hypoplasia develop microcephaly at birth or over time after birth (PMID: 23518331). Based on your constructive suggestions, we have added some content related to hindbrain to the “Discussion” section.

      Our study found that T. pallidum could inhibit the differentiation of subNPC1B in brain organoids, thereby reducing the differentiation from subNPC1B to hindbrain neurons, and ultimately affecting the development and maturation of hindbrain neurons during pregnancy. Based on our results, T. pallidum does not preferentially target hindbrain neurons. Of course, there are limitations to the current organoid model system, see the "Limitations" section.

      PMID: 23518331 - Dan Doherty et al., Midbrain and hindbrain malformations: advances in clinical diagnosis, imaging, and genetics.

      Revision in the “Discussion” section, line 343-352:

      “The vertebrate hindbrain contains a complex network of dedicated neural circuits that play an essential role in controlling many physiological processes and behaviors, including those related to the cerebellum, pons, and medulla oblongata (Shoja et al., 2018). Patients with pontocerebellar hypoplasia represent the less severe end of the spectrum with early hyperreflexia, developmental delay, and feeding problems, eventually developing spasticity and involuntary movements in childhood, while some patients represent the severe end of the spectrum characterised by polyhydramnios, severe hyperreflexia, contracture, and early death from central respiratory failure. Patients with pontocerebellar hypoplasia develop microcephaly at birth or over time after birth (Doherty et al., 2013).”

      (3) The authors show that T. pallidum-infected organoids are smaller in size by measuring organoid diameter during later stages of organoid growth, with no change during early stages. Does that represent insufficient infection at the early stages? Is this due to increased cell death or lack of cell division in the infected organoids? Experiments using IHC to quantify levels of cleaved caspase and/or protein markers for cell proliferation would be able to address these questions. 

      Thank you for your valuable suggestion. The concentration of T. pallidum in patients with syphilis is generally very low (PMID: 21752804, 35315702, 33099614). In this study, a low concentration of T. pallidum was applied to brain organoids to simulate early foetal transmission of syphilis. Nerve cells mainly establish intercellular connections through adhesion to form brain organoids, so treatment with a high concentration of T. pallidum can easily cause organoids to break apart and die. Furthermore, based on your suggestions, we performed additional immunostaining analyses to verify the apoptosis of brain organoids infected by T. pallidum. Cleaved caspase 3 (clCASP3) staining showed that the number of apoptotic cells increased following T. pallidum infection; however, the proportion of apoptotic cells in both groups of brain organoids was very low (Figure supplement 2) (N=12 organoids, each group from three independent bioreactors), which would not be enough to affect the results of the experiment, thereby suggesting that T. pallidum infection mainly inhibited neural differentiation and development of brain organoids (rather than promoting organoid apoptosis).

      PMID: 21752804 - Craig Tipple et al., Getting the measure of syphilis: qPCR to better understand early infection.

      PMID: 35315702 - Cuini Wang et al., Quantified Detection of Treponema pallidum DNA by PCR Assays in Urine and Plasma of Syphilis Patients.

      PMID: 33099614 - Cuini Wang et al., A New Specimen for Syphilis Diagnosis: Evidence by High Loads of Treponema pallidum DNA in Saliva.

      Revision in the “Results” section, line 105-108:

      “… cleaved caspase 3 (clCASP3) staining showed that the number of apoptotic cells increased significantly following T. pallidum infection, but the proportion of apoptotic cells in both groups of brain organoids was very low (Figure supplement 2) (N=12 organoids, each group from three independent bioreactors) …”

      Revision in the “Materials and methods” section, line 446-447:

      “…anti-cleaved caspase 3 (rabbit, 1:100, Cell Signaling Technology, 9661S),”

      Revision in the “Supplementary File” section, line 78-81:

      Author response image 1.

      The number of clCASP3+ cells in the microscopic field of brain organoids. A nonparametric t-test was used to evaluate the statistical differences between the two groups. (**: P < 0.01).

      (4) In Figure 1D authors show differences in rosette-like structure in the infected organoids. The representative images do not appear to be different in any of the discussed components (e.g., the sox2 signal looks fairly similar between the two conditions). No quantification of these structures was presented. Authors should provide quantification or a more representative image to support their statement. 

      Thank you for your valuable suggestion. I have quantified the neural rosette structure and compared the number of intact rosette-like structures between the two groups (See Figure 1D for a description in current manuscript).

      (5) The IHC images shown in Figures 3E, G, and Figure 4E look very similar between the two conditions despite the discussed decrease in the text. A more suitable representative image should be presented, or the analysis should be amended to reflect the observed results. 

      Thank you for your valuable suggestion. I have replaced the images in Figures 3E, G, and Figure 4E with more representative ones in the manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This study provides an important overview of infectious etiology for neurodevelopment delay.

      Strengths:

      Strong RNA evaluation.

      Weaknesses:

      The study lacks an overview of other infectious agents. The study should address the epigenetic contributors (PMID: 36507115) and the role of supplements in improving outcomes (PMID: 27705610). 

      Addressing the above - with references included - is recommended. 

      Thank you for your valuable comment. Our research is mainly inspired by other infectious agents, such as Zika virus; there are many descriptions of Zika virus in the “Discussion” section of the manuscript to better describe and demonstrate our point of view (See pages 12–13). I was unable to retrieve the article (PMID: 36507115), kindly help in confirming the PMID number. I will be very grateful if you can provide the full text. Secondly, I have carefully read the article (PMID: 27705610), which is a very rich and comprehensive review, and summarised and cited it in appropriate places in our manuscript.

      Revision in the “Discussion- limitation” section, line 375-379:

      “First, although several recent protocols have made use of growth factors to promote further neuronal maturation and survival (Lucke-Wold et al., 2018), the organoid culture scheme needs to be further improved owing to the lower percentage of mature neurons and the challenge of cell necrosis within the organoids at this stage in day 55 organoids.”

      Reviewer #3 (Public Review): 

      This article is the first report to study the effects of T. pallidum on the neural development of an iPSC-derived brain organoid model. The study indicates that T. pallidum inhibits the differentiation of subNPC1B neurons into hindbrain neurons, hence affecting brain organoid neurodevelopment. Additionally, the TCF3 and notch signaling pathways may be involved in the inhibition of the subNPC1B-hindbrain neuron differentiation axis. While the majority of the data in this study support the conclusions, there are still some questions that need to be addressed and data quality needs to be improved. The study provides valuable insights for future investigations into the mechanisms underlying congenital neurodevelopment disability. 

      I sincerely appreciate your comments on our paper. The comments have helped us greatly improve the quality of our paper. Thank you for your time and constructive critique.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Paired t-test analysis is not appropriate if two distinct groups are compared. 

      I sincerely apologize for our presentation. We used a nonparametric t-test to compare the two groups. I have confirmed and corrected the statistical method description of this manuscript (Revision in the “Materials and methods” section (line 553-555) and “Figures-legend” section (line 789-790, 817-818, 829-830) in current manuscript).
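
      For two independent groups, an unpaired rank-based comparison is one conventional choice. The sketch below computes the Mann-Whitney U statistic in plain Python; the function name and the toy measurements are hypothetical, and nothing here is taken from the authors' analysis or meant to identify the test they actually used.

```python
def mann_whitney_u(group_a, group_b):
    """Return the Mann-Whitney U statistic for group_a versus group_b.

    Counts the pairs (a, b) in which a exceeds b, scoring ties as 0.5.
    """
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical measurements from two distinct (unpaired) groups.
control = [5.1, 4.8, 5.6, 5.0]
infected = [6.2, 6.9, 5.6, 7.1]
u_control = mann_whitney_u(control, infected)
u_infected = mann_whitney_u(infected, control)
```

      The two statistics always sum to the product of the group sizes, and no pairing between individual samples is assumed, which is the property the reviewer's comment turns on.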

      Reviewer #3 (Recommendations For The Authors): 

      (1) Can the authors explain why the mean size of organoids infected with T. pallidum is smaller?

      Thank you for your valuable comment. In our study, T. pallidum infection resulted in brain organisational changes in neural rosette-like structures resembling the proliferative regions of the human ventricular zone and caused fewer and incomplete rosette-like structures. Next, the ventricular zone is also the main area where neural progenitor cells (NPCs) reside (PMID: 33838105); our results showed that the proportion of neural progenitor cells (NPC)1 was reduced after T. pallidum infection. Rosette-like structure size changes owing to NPC depletion. Therefore, the mean size of organoids infected with T. pallidum is smaller.

      Revision in the “Results” section, line 101-104:

      “T. pallidum infection resulted in brain organisational changes in neural rosette-like structures resembling the proliferative regions of the human ventricular zone where NPC reside (Krenn et al., 2021), and caused fewer and incomplete rosette-like structures (P < 0.01) (Figure 1D)”

      (2) Why were HOXA5, HOXC5, and HOXA4 selected as the target genes for qRT-PCR validation?

      Thank you for your valuable comment. The qRT-PCR experiment was selected here to verify the analysis results of the scRNA-seq. HOX family genes are key factors controlling early hindbrain development, which are expressed in the hindbrain region during the gastrulation stage of early embryonic development and persist into the nerve cell stage, and are essential for the correct induction of hindbrain development and segmentation (PMID: 2571936, 1983472, 1673098, 15930115). Therefore, we selected the HOX family gene for verification.

      PMID: 2571936. WILKINSON D G, et al. Segmental expression of Hox-2 homoeobox-containing genes in the developing mouse hindbrain.

      PMID: 1983472. FROHMAN M A, et al. Isolation of the mouse Hox-2.9 gene; analysis of embryonic expression suggests that positional information along the anterior-posterior axis is specified by mesoderm.

      PMID: 1673098. MURPHY P, et al. Expression of the mouse labial-like homeobox-containing genes, Hox 2.9 and Hox 1.6, during segmentation of the hindbrain.

      PMID: 15930115. MCNULTY C L, et al. Knockdown of the complete Hox paralogous group 1 leads to dramatic hindbrain and neural crest defects.

      (3) Why was qRT-PCR not employed in other experimental validations, but solely to validate early neural-specific transcription factor changes?

      Thank you for your valuable comment. The qRT-PCR experiment was selected to validate early neural-specific transcription factor changes, indicating the reliability of the scRNA-seq. Then, validated scRNA-seq data were used to analyze for other neuro-specific gene differences, such as violin plots and heatmap showing differentially expressed genes (Figure 4D and Figure 5B, C). Of course, we also tested it with other experiments, such as immunohistochemistry and flow cytometric screening.

      (4) The authors found that T. pallidum might reduce the differentiation from subNPC1B to hindbrain neurons by inhibiting subNPC1B differentiation in brain organoids. Why were the subNPC1B-specific markers declining?

      Thank you for your valuable comment. scRNA-seq is aimed at complete brain organoids. Cluster analysis of cell types of organoids is performed according to specific marker genes of different cells. The decrease in the expression of marker genes of certain cell groups indicates that the cell proportion of such cell groups in the whole organoids is reduced. We analysed organoids following T. pallidum infection, uniform manifold approximation and projection (UMAP), and clustering of the NPC1 population demonstrated that T. pallidum downregulated the number of subNPC1B population. Therefore, the results demonstrated a decrease in the subNPC1B -specific markers.

      (5) In comparison to the other figures, Figure 5E letter size is excessively small and ambiguous.

      Thank you for your valuable comment. I have increased the letter size in Figure 5E.

      (6) Figure 5E shows that TCF3, more than one gene, is specifically enriched in subNPC1B of the T. pallidum group. It is best to confirm the impact of the other gene. 

      Thank you for raising this key issue, which we had not addressed properly in the previous version of the manuscript; we have added further analytical data. The SCENIC analysis found that the transcriptional activity of 52 genes changed significantly after T. pallidum infection. Furthermore, GO analyses demonstrated that 27 transcription factors were significantly enriched in four key pathways of neural differentiation and development. TCF3 is the sole transcription factor present in all four terms simultaneously, which leads us to speculate that TCF3 is the key transcription factor for the inhibition of subNPC1B-hindbrain neuron differentiation caused by T. pallidum.

      Revision in the “Results” section, line 261-273:

      “Next, the single-cell regulatory network inference and clustering (SCENIC) analysis for the subNPC1B subcluster was performed to assess the differences in the transcriptional activity of the transcription factors between the two groups and found that the transcriptional activity of 52 genes significantly changed after T. pallidum infection (Figure 5E). Furthermore, GO analyses demonstrated that 27 transcription factors were significantly enriched in key pathways of neural differentiation and development in response to nervous system development, positive regulation of sequence-specific DNA-binding transcription factor activity, positive regulation of neuronal differentiation, and DNA templated transcription regulation. Remarkably, transcription factor 3 (TCF3) is the sole transcription factor present in all four terms simultaneously (Figure 5F), speculating that TCF3 is the key transcription factor for the inhibition of subNPC1B-hindbrain neuron differentiation caused by T. pallidum.”

      Revision in the “Materials and methods” section, line 540-543:

      “The Sankey diagram was created using SankeyMATIC (https://sankeymatic.com/) (Zhang et al., 2023), which was used to characterize the interactions between differential transcription factors and neural differentiation and development.”

      Revision in the “Figure and Figure Legend” section, line 832, 842-844:

      Author response image 2.

      Sankey diagram showing the correspondence between differential transcription factors and neural differentiation and development.

      (7) Are there other experiments demonstrating that TCF3 is a key transcription factor for the inhibition of subNPC1B-hindbrain neuron differentiation caused by T. pallidum?

      Thank you for your valuable comment. In the previous experiment, we attempted to select a subNPC1B subcluster by flow sorting to verify the relevant molecular mechanism. Due to the small proportion of subNPC1B subcluster in the whole organoids, the selected cells were in a poor state and could not reach the number of cells required for the experiment. However, we used scRNA-seq data to further identify TCF3 as a key transcription factor that inhibits subNPC1B - hindbrain neuron differentiation induced by T. pallidum. The relevant results and descriptions of the analysis are detailed in the revised manuscript, please see our response to point (6) above.

    2. eLife assessment

      This is a valuable study that describes the effects of T. pallidum on neural development by applying single-cell RNA sequencing to an iPSC-derived brain organoid model. The evidence supporting the claims of the authors is solid, although further evidence to understand the differences in infection rates would strengthen the conclusions of the study. In particular, the conclusions would be strengthened by validating infection efficiency as this can impact the interpretation of single-cell sequencing results, and how these metrics affect organoid size as well as comparison with additional infectious agents. Furthermore, additional functional validations of downstream effectors could be insightful.

    3. Reviewer #1 (Public Review):

      Summary:

      This is an interesting study by Xu et al showing the effects of infection with the Treponema pallidum virus (which causes syphilis disease) on neuronal development using iPSC-derived human brain organoids as a model and single-cell RNA sequencing. This work provides an important insight into the impact of the virus on human development, bridging the gap between the phenomena observed in studies using animal models as well as non-invasive human studies showing developmental abnormalities in fetuses infected with the virus in utero through maternal vertical transmission.

      Using single-cell RNAseq in combination with qPCR and immunofluorescence techniques, the authors show that T. pallidum infected organoids are smaller in size, in particular during later growth stages, contain a larger number of undifferentiated neuronal lineage cells, and exhibit decreased numbers of specific neuronal subcluster, which the authors have identified as undifferentiated hindbrain neurons.

      The study is an important first step in understanding how T. pallidum affects human neuronal development and provides important insight into the potential mechanisms that underlie the neurodevelopmental abnormalities observed in infected human fetuses.

      Strengths:

      (1) The study is well written, and the data quality is good for the most part.

      (2) The study provides an important first step in utilizing human brain organoids to study the impact of T. pallidum infection on neuronal development.

      (3) The study's conclusions may provide important insight to other researchers focused on studying how viral infections impact neuronal development.

    4. Reviewer #3 (Public Review):

      This article is the first report to study the effects of T. pallidum on the neural development of an iPSC-derived brain organoid model. The study indicates that T. pallidum inhibits the differentiation of subNPC1B neurons into hindbrain neurons, hence affecting brain organoid neurodevelopment. Additionally, the TCF3 and notch signaling pathways may be involved in the inhibition of the subNPC1B-hindbrain neuron differentiation axis. While the majority of the data in this study support the conclusions, there are still some questions that need to be addressed and data quality needs to be improved. The study provides valuable insights for future investigations into the mechanisms underlying congenital neurodevelopment disability.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The authors use the innovative CRISPRi method to uncover regulators of cell density and volume in neutrophils. The results show that cells require NHE activity during chemoattractant-driven cell migration. Before migration occurs, cells also undergo a rapid cell volume increase. These results indicate that water flux, driven by ion channels, appears to play a central role in neutrophil migration. The paper is very well written and clear. I suggest adding some discussion about the role of actin in the process, but this is not essential.

      Strengths

      The use of CRISPRi to uncover cell density regulators is very novel. Some of the uncovered molecules were known before, e.g. discussed in Li & Sun, Frontiers in Cell and Developmental Biology, 2021. Others are more interesting, for example PI3K-gamma. The use of caged fMLP is also nice.

      We thank the reviewer for their positive appraisal of our work and have pursued their suggestions for improving our paper in this revision.

      Weaknesses

      One area of investigation that seems to be absent is mentioned in the introduction. I.e., actin is expected to play a role in regulating cell volume increase. Did the authors perform any experiments with LatA? What was seen there? Do cells still migrate with LatA, or is a different interplay seen? The role of PI3K is interesting, and maybe somewhat related to actin. But this may be a different line of inquiry for the future.

      We agree that we could have done a better job explicitly investigating the role of actin dynamics in volume changes. Towards this end, by using Latrunculin B to depolymerize actin, we find that the volume increase in suspension is not affected (Figure 1 – supplemental figure 2A). In our FxM single cell volume measurements of adherent cells, we similarly observed unhindered swelling following latrunculin treatment. These data indicate that actin is dispensable for chemoattractant-induced cell swelling (Figure 1 – supplemental figure 2B). There was a minor apparent reduction in the final volume reached with the Latrunculin-treated cells as measured by FxM, but this likely reflects minor uptake of the excluded dye following Latrunculin treatment rather than an actual change in final volume. This conclusion is reinforced by the change in 2D footprint area being well modeled by the 2D projection of an isotropically expanding sphere (Figure 1 – supplemental figure 2C). Latrunculin treatment completely abolishes migration, as is expected for unconfined migration on fibronectin (Figure 1 – supplemental figure 2D-E). The second Reviewer also wanted us to dig deeper on the role of PI3K-gamma, so we expanded our analysis of this hit (Figure 3 – supplemental figure 1B-D; Figure 4 – supplemental figure 1D-G).

      Author response image 1.

      Chemoattractant-induced swelling, but not motility, is independent of actin polymerization. (A) Human primary neutrophils were incubated with DMSO or Latrunculin B, activated with 20 nM fMLP, and then volume responses were measured using electronic sizing via a Coulter counter. Latrunculin treatment did not alter cell swelling, indicating that actin polymerization is dispensable for the chemoattractant-induced volume increase. (B) Similar results were obtained using the FxM assay, showing that Latrunculin-treated cells are capable of swelling after stimulation. (C) The Latrunculin-treated cells also increase their footprints, albeit less so than control cells, but this is within the range of what would be expected for this degree of chemoattractant-induced volume increase (modeled by a sphere expanding an equivalent volume). (D) Single cell tracks of primary human neutrophils responding to acute chemoattractant stimulation. Both panels show 15 minutes of tracks with the tracks prior (left) and the 15 minutes post (right) uncaging the chemoattractant. The scale bar is 50 microns. The top panels show the large increase in motility displayed by control cells, while the Latrunculin-treated cells (bottom panels) fail to move. (E) Latrunculin-treated cells consistently fail to move in response to chemoattractant-stimulation. (F) Representative single cell volume traces show that Latrunculin-treated cells (black) lack short-term volume fluctuations but persistently maintain an elevated volume following chemoattractant stimulation. Control cells (blue) exhibit short-term volume fluctuations. (G) The lack of short-term volume fluctuations following latrunculin treatment is borne out across the population, with the coefficient of variation in the volume for single cells (post-swelling) being dramatically lower in Latrunculin-treated cells, suggesting that these short term volume fluctuations depend on actin-based motility.

      Author response image 2.

      Additional validation of swelling screen hits. (A) Mixed WT and CRISPR KO dHL-60 populations post-stimulation show that CA2 (black) and PI3Kγ (green) KO both fail to decrease their densities as much as the WT (cyan) population following chemoattractant stimulation. Cells with negative control guides (light gray) have normal volume responses. All tubes were fractionated and aligned on the fraction containing the median of the WT population. Negative values indicate a fraction with a higher density than WT. (B) To validate the perturbations to cell swelling observed with FxM, primary human neutrophils were stimulated in suspension, and their volumes were measured using a Coulter counter. 20 nM fMLP was added at the 0 minute mark. Shaded regions represent the 95% confidence intervals. (C) PI3Kγ inhibition blocks the chemoattractant-induced volume change in primary human neutrophils, as assayed by FxM. (D) PI3Kγ inhibition also blocked the chemoattractant-driven shape change in human primary neutrophils, as measured by the change in footprint area in FxM. (E) The coefficient of variation in volume for control (cyan) and iNHE1 (gold) inhibited human primary neutrophils undergoing chemokinesis are comparable, suggesting that the volume fluctuations are unchanged in moving cells upon NHE1 and PI3Kγ inhibition despite the different baseline volumes.

      Author response image 3.

      Additional validation of motility phenotypes. (A-D) Single cell tracks of primary human neutrophils responding to acute chemoattractant stimulation. Both panels show tracks of cells 15 minutes prior (left) versus 15 minutes post (right) uncaging the chemoattractant. The scale bar is 50 microns. Color saturation indicates time with tracks progressing from gray to full color. (A) Control cells show a large increase in movement upon uncaging, (B) NHE1 inhibited cells also initiate movement but to a lesser degree, (C) hypo-osmotic shock rescues the NHE1 motility defect. (D) PI3Kγ leads to a large fraction of cells failing to initiate movement. (E) PI3Kγ inhibition showed near complete blockage of the chemoattractant-induced motility increase in primary human neutrophils. (F) Control neutrophils (blue) show an increased angular alignment upon stimulation as their motility becomes directional. NHE1-inhibition (gold, iNHE1) has very little effect on this process, while PI3Kγ inhibition (green) leads to a reduction in this alignment at the population level. (G) For the PI3Kγ inhibited cells that start migrating, the migration-induced volume fluctuations are comparable to iNHE1 and control cells. The top panel shows the track of a representative migrating PI3Kγ inhibited cell and the bottom panel, its corresponding volume normalized to the pre-stimulation volume. The scale bar is 50 microns.

      Reviewer #2 (Public Review):

      Nagy et al investigated the role of volume increase and swelling in neutrophils in response to the chemoattractant. Authors show that following chemoattractant response cells lose their volume slightly owing to the cell spreading phase and then have a relatively rapid increase in the cell volume that is concomitant with cell migration. The authors performed an impressive genome-wide CRISPR screen and buoyant density assay to identify the regulators of neutrophil swelling. This assay showed that stimulating cells with chemoattractant fMLP led to an increase in the cell volume that was abrogated with the FPR1 receptor knockout. The screen revealed a cascade that could potentially be involved in cell swelling including NHE1 (sodium-proton antiporter) and PI3K. NHE1 and PI3K are required for chemoattractant-induced swelling in human primary neutrophils. Authors also suggest slightly different functions of NHE1 and PI3K activity where PI3K is also required to maintain chemoattractant-induced cell shape changes. The authors convincingly show that chemoattractant-induced cell swelling is linked to cell migration and NHE1 is required for swelling at the later stages of swelling since the cells at the early point work in a low-volume and low-velocity regime. Interestingly, the authors also show that lack of swelling in NHE1-inhibited cells could be rescued by mild hypo-osmotic swelling, strengthening the argument that water influx following chemoattractant stimulation is important for potentiation of migration.

      The conclusions of this paper are mostly well supported by data and are pretty convincing, but some aspects of image acquisition and data analysis need to be clarified and extended.

      We thank the reviewer for their positive appraisal of our work and pursued their suggestions for improving our paper in this revision.

      Weaknesses

      (1) It would really help if the authors could add the missing graph for the footprint area when cells are treated with Latrunculin. Graph S1F for volume changes with Lat treatment should be compared with DMSO-treated controls.

      We agree that the Latrunculin condition merits more thorough investigation. To this end, we compared the volume response of human primary neutrophils to chemoattractant addition for Latrunculin B treated cells versus DMSO controls in suspension and show that there is no difference in swelling (Figure 1 – supplemental figure 2A). This is additionally confirmed with FxM measurements with a slight undershooting of the final volume likely due to minor uptake of the excluded dye by Latrunculin treated cells (Figure 1 – supplemental figure 2B). We have also included the requested footprint area changes in the Latrunculin treated cells as compared to controls (Figure 1 – supplemental figure 2C). The treated cell footprints increase much less than the controls, and this is likely due to a lack of active cell spreading in the Latrunculin treated cells. The increase in footprint area observed following latrunculin treatment is within the range of what would be expected for the 2D projection of an isotropically expanding sphere fitted to the Latrunculin volume data (salmon line).
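
      The sphere comparison described above can be illustrated with a minimal sketch. It assumes, as the response does not spell out, that the footprint is the sphere's 2D projection, so that projected area scales as volume to the 2/3 power. The function names and the example values are hypothetical and are not taken from the authors' analysis code.

```python
import math


def projected_area(volume: float) -> float:
    """Projected (2D footprint) area of a sphere with the given volume."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return math.pi * radius ** 2


def footprint_fold_change(volume_fold_change: float) -> float:
    """Expected fold change in footprint area for an isotropically
    expanding sphere: V scales as r^3 and A as r^2, so A scales as V^(2/3)."""
    if volume_fold_change <= 0:
        raise ValueError("volume fold change must be positive")
    return volume_fold_change ** (2.0 / 3.0)


if __name__ == "__main__":
    # Hypothetical swelling values, chosen only for illustration.
    for fold in (1.05, 1.10, 1.15):
        print(f"volume x{fold:.2f} -> footprint x{footprint_fold_change(fold):.3f}")
```

      Under this assumption, a footprint increase well above the value returned by `footprint_fold_change` would point to active spreading rather than swelling alone.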

      Author response image 4.

      Chemoattractant-induced swelling, but not motility, is independent of actin polymerization. (A) Human primary neutrophils were incubated with DMSO or Latrunculin B, activated with 20 nM fMLP, and then volume responses were measured using electronic sizing via a Coulter counter. Latrunculin treatment did not alter cell swelling, indicating that actin polymerization is dispensable for the chemoattractant-induced volume increase. (B) Similar results were obtained using the FxM assay, showing that Latrunculin-treated cells are capable of swelling after stimulation. (C) The Latrunculin-treated cells also increase their footprints, albeit less so than control cells, but this is within the range of what would be expected for this degree of chemoattractant-induced volume increase (modeled by a sphere expanding an equivalent volume).

      (2) The authors show inhibition of NHE1 blocked cell swelling using Coulter counter, a similar experiment should be done with PI3K inhibitions especially since they see PI3K inhibition impact chemoattractant-induced cell shape change.

      Good idea. PI3Kγ inhibition led to a substantial reduction in the chemoattractant-driven swelling in suspension, showing the critical role of PI3K in the swelling of human primary neutrophils (Figure 3 – supplemental figure 1B).

      Author response image 5.

      Additional validation of swelling screen hits. (B) To validate the perturbations to cell swelling observed with FxM, primary human neutrophils were stimulated in suspension, and their volumes were measured using a Coulter counter. 20 nM fMLP was added at the 0 minute mark. Shaded regions represent the 95% confidence intervals.

      (3) It would be more convincing visually if the authors could also include the movie of cell spreading (footprint) and then mobility with PI3K inhibition.

      Included as suggested. We agree this is a more compelling way to present the data (Figure 4 – supplemental figure 1A-D,G).

      Author response image 6.

      Additional validation of motility phenotypes. (A-D) Single cell tracks of primary human neutrophils responding to acute chemoattractant stimulation. Both panels show tracks of cells 15 minutes prior (left) versus 15 minutes post (right) uncaging the chemoattractant. The scale bar is 50 microns. Color saturation indicates time with tracks progressing from gray to full color. (A) Control cells show a large increase in movement upon uncaging. (D) PI3Kγ leads to a large fraction of cells failing to initiate movement. (E) PI3Kγ inhibition showed near complete blockage of the chemoattractant-induced motility increase in primary human neutrophils. (G) For the PI3Kγ inhibited cells that start migrating, the migration-induced volume fluctuations are comparable to iNHE1 and control cells. The top panel shows the track of a representative migrating PI3Kγ inhibited cell and the bottom panel, its corresponding volume normalized to the pre-stimulation volume. The scale bar is 50 microns.

      (4) It is not clear how cell spreading and later volume increase are linked to overall mobility of neutrophils. Are authors suggesting that cell spreading is not required for cell mobility in neutrophils?

      We did not mean to imply that cell spreading is not required for neutrophil motility. We take advantage of the fact that we can inhibit cell swelling without inhibiting spreading to investigate the specific role of swelling on migration (Figure 4). Conversely, cell spreading on a substrate is not required for chemoattractant-induced cell swelling, as chemoattractant-induced swelling occurs in latrunculin-treated cells (Figure 1 – supplemental figure 2A-C). However, these latrunculin-treated cells are not able to migrate, at least not in the context studied here (Figure 1 – supplemental figure 2D-E). Cell spreading and swelling are likely both critical contributors to neutrophil motility, but their relative importance is dependent on the migratory context. The single cell volume fluctuation analysis indicates that migration-associated spreading and shape changes have large impacts on cell volume (Figure 1F). These fluctuations are asynchronous, obscuring their observation at the population level, but the single cell traces clearly demonstrate them and their correlation with movement.

      (5) Volume fluctuations associated with motility were impacted by NHE1 inhibition at the baselines, what about PI3K inhibitions? Does that impact the actual fluctuations?

      PI3K inhibition causes a significant fraction of cells to stop migrating (Figure 4 – supplemental figure 1D), but among those that do move, they are still able to fluctuate in volume (Figure 4 – supplemental figure 1G).

      Author response image 7.

      Additional validation of motility phenotypes. (G) For the PI3Kγ inhibited cells that start migrating, the migration-induced volume fluctuations are comparable to iNHE1 and control cells. The top panel shows the track of a representative migrating PI3Kγ inhibited cell and the bottom panel, its corresponding volume normalized to the pre-stimulation volume. The scale bar is 50 microns.

      In contrast, latrunculin abolishes the volume fluctuations that normally accompany migration (Figure 1 – supplemental figure 2F-G). These data suggest that movement/spreading itself is the driver of the rapid volume fluctuations. In contrast, the sustained volume increase following chemoattractant stimulation is independent of shape change and still occurs in latrunculin-treated cells.

      Author response image 8.

      Chemoattractant-induced swelling, but not motility, is independent of actin polymerization. (F) Representative single cell volume traces show that Latrunculin-treated cells (black) lack short-term volume fluctuations but persistently maintain an elevated volume following chemoattractant stimulation. Control cells (blue) exhibit short-term volume fluctuations. (G) The lack of short-term volume fluctuations following latrunculin treatment is borne out across the population, with the coefficient of variation in the volume for single cells (post-swelling) being dramatically lower in Latrunculin-treated cells, suggesting that these short term volume fluctuations depend on actin-based motility.

      (6) It would really help if the authors compared similar analyses and drew conclusions from that, for example, it is unclear what the authors mean by they found no change in the angular persistence of WT and NHE1 inhibited cells which is in contrast to PI3K inhibition since they do not really have an analysis for angular persistence in PI3K inhibited cells. (S4A and S4B).

      Thanks for catching this oversight: we had previously performed these experiments but neglected to include them in the initial submission. We now include plots for angular persistence, velocity, and footprint size for the PI3K-gamma-inhibited cells. The results show that PI3K-gamma inhibition interferes both with swelling (Figure 3 – supplemental figure 1B-D) and motility (Figure 4 – supplemental figure 1D-F), which aligns with its role upstream of the other hits identified in our screen.

      Author response image 9.

      Additional validation of motility phenotypes. (A-D) Single cell tracks of primary human neutrophils responding to acute chemoattractant stimulation. Both panels show tracks of cells 15 minutes prior (left) versus 15 minutes post (right) uncaging the chemoattractant. The scale bar is 50 microns. Color saturation indicates time with tracks progressing from gray to full color. (A) Control cells show a large increase in movement upon uncaging, (B) NHE1 inhibited cells also initiate movement but to a lesser degree, (C) hypo-osmotic shock rescues the NHE1 motility defect. (D) PI3Kγ leads to a large fraction of cells failing to initiate movement. (E) PI3Kγ inhibition showed near complete blockage of the chemoattractant-induced motility increase in primary human neutrophils. (F) Control neutrophils (blue) show an increased angular alignment upon stimulation as their motility becomes directional. NHE1-inhibition (gold, iNHE1) has very little effect on this process, while PI3Kγ inhibition (green) leads to a reduction in this alignment at the population level. (G) For the PI3Kγ inhibited cells that start migrating, the migration-induced volume fluctuations are comparable to iNHE1 and control cells. The top panel shows the track of a representative migrating PI3Kγ inhibited cell and the bottom panel, its corresponding volume normalized to the pre-stimulation volume. The scale bar is 50 microns.

    2. eLife assessment

      This fundamental study significantly advances our understanding of the role of water influx and swelling on neutrophil migration in response to chemoattractant. The evidence supporting the conclusions, based on a genome-wide CRISPR screen and high quality cellular observations, is compelling. This paper will be of interest to cell biologists and biophysicists working on cell migration.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors use the innovative CRISPRi method to uncover regulators of cell density and volume in neutrophils. The results show that cells require NHE activity during chemoattractant-driven cell migration. Before migration occurs, cells also undergo a rapid cell volume increase. These results indicate that water flux, driven by ion channels, appears to play a central role in neutrophil migration. The paper is very well written and clear. The revised version has addressed all of my questions.

    4. Reviewer #2 (Public Review):

      Nagy et al investigated the role of volume increase and swelling in neutrophils in response to the chemoattractant. Authors show that following chemoattractant response cells lose their volume slightly owing to the cell spreading phase and then have a relatively rapid increase in the cell volume that is concomitant with cell migration. The authors performed an impressive genome-wide CRISPR screen and buoyant density assay to identify the regulators of neutrophil swelling. This assay showed that stimulating cells with chemoattractant fMLP led to an increase in the cell volume that was abrogated with the FPR1 receptor knockout. The screen revealed a cascade that could potentially be involved in cell swelling including NHE1 (sodium-proton antiporter) and PI3K. NHE1 and PI3K are required for chemoattractant-induced swelling in human primary neutrophils. Authors also suggest slightly different functions of NHE1 and PI3K activity where PI3K is also required to maintain chemoattractant-induced cell shape changes. The authors convincingly show that chemoattractant-induced cell swelling is linked to cell migration and NHE1 is required for swelling at the later stages of swelling since the cells at the early point work in a low-volume and low-velocity regime. Interestingly, the authors also show that lack of swelling in NHE1-inhibited cells could be rescued by mild hypo-osmotic swelling, strengthening the argument that water influx following chemoattractant stimulation is important for potentiation of migration.

      The conclusions of this paper are mostly well supported by data and are pretty convincing.

    1. Reviewer #1 (Public Review):

      The revised manuscript "Diffusive lensing as a mechanism of intracellular transport and compartmentalization" is very similar to the original manuscript. The main difference between the revised and the original manuscript is that the authors have removed the reference to viscosity gradient and instead talk of diffusivity gradient. With this change, the analysis and claims in the manuscript are much more aligned. The manuscript, like the original version, explores the role of a spatially varying diffusion constant in three scenarios:

      (i) Spatial localization of non-interacting particles

      (ii) Clustering in presence of inter-particle interactions

      (iii) Moment analysis for non-interacting particles in space with discrete patches of inhomogeneous diffusivity.

      Since the manuscript has not changed much, the strengths and weaknesses, in my opinion, remain similar to those of the original manuscript.

      Strengths: The implications of a heterogeneous environment on phase separation and reaction kinetics in cells are under-explored. This makes the general theme of this manuscript relevant and interesting.

      Weaknesses: The central part of the paper, "diffusive lensing", i.e., particles localizing in the region of low diffusion constant, is not new. Some of the papers the authors cite already show that. The parts on phase separation and FRAP analysis that could provide new results are not rigorous enough for a theory paper.

      I reiterate some of my comments from the original version that are valid for the revised version as well.

      My main criticism was not that one convention or another should be used. Instead, the main point was that a spatially varying diffusion constant does not by itself imply a spatial gradient of particles. From the authors' response to my comments, it is clear that they understand the subtleties around this and are aware of the relevant papers. However, a reader not familiar with this discussion may be left with the impression that a spatially varying diffusion constant in a cell always leads to an accumulation of particles in the region of low diffusivity, which may not be the case. Moreover, localisation of particles in the region of low diffusivity has been reported in many different contexts. Some of the papers that the authors cite already show that. For example, in Rupprecht et al. 2018 a non-isothermal interpretation is applied to the dynamics of objects inside cells.

      Given that the central result is not new, the paper could still be of general interest to the biophysics community if the follow-up sections, (ii) Clustering in presence of inter-particle interactions and (iii) Moment analysis for non-interacting particles in space with discrete patches of inhomogeneous diffusivity, were analysed rigorously.

    2. Reviewer #2 (Public Review):

      Summary:

      The authors study through theory and simulations the diffusion of microscopic particles, and aim to account for the effects of inhomogeneous viscosity and diffusion - in particular regarding the intracellular environment. They propose a mechanism, termed "Diffusive lensing", by which particles are attracted towards low-diffusivity regions where they remain trapped. To obtain these results, the authors rely on agent-based simulations using custom rules performed within the Ito stochastic calculus convention, without drift. They acknowledge the fact that this convention does not describe equilibrium systems, and that their results would not hold at equilibrium - and discard these facts by invoking the fact that cells are out-of-equilibrium. Finally, they show some applications of their findings, in particular enhanced clustering in the low-diffusivity regions. The authors conclude that as inhomogeneous diffusion is ubiquitous in life, so must their mechanism be, and hence it must be important.

      Strengths:

      The article is well-written, clearly intelligible, its hypotheses are stated relatively clearly and the models and mathematical derivations are compatible with these hypotheses. In the appendices, the authors connect their findings to known results for classic stochastic differential equation formalisms.

      Weaknesses:

      This study is, in my opinion, deeply flawed. The main problem lies in the hypotheses, in particular the choice of considering drift-less dynamics in the Ito convention. It is regrettable that the authors choose to use agent-based custom simulations with little physical motivation, rather than a well-established stochastic differential equations framework.

      Indeed, stochastic conventions are a notoriously tricky business, but they are both mathematically and physically well-understood and do not result in any "dilemma" [some citations in the article, such as (Lau and Lubensky) and (Volpe and Wehr), make an unambiguous resolution of these]. In the continuous-time limit, conventions are not an intrinsic, fixed property of a system, but a choice of writing; however, whenever going from one to another, one must include a corresponding "spurious drift" that compensates the effect of this change - a mathematical subtlety that is omitted in the article (except in a quick note in the appendix): in the presence of diffusive gradients, if the drift is zero in one convention, it will thus be non-zero in another. It is well established that for equilibrium systems obeying fluctuation-dissipation, the spurious drift vanishes in the anti-Ito stochastic convention; more precisely one can write in the anti-Ito convention

      dx/dt = - D(x)/kT grad U(x) + sqrt(2D(x)) dW

      with D(x) the diffusion, kT the thermal energy (which is space-independent at equilibrium), and dW a d-dimensional Wiener process. Equivalently one can write in the Ito convention:

      dx/dt = - D(x)/kT grad U(x) + sqrt(2D(x)) dW + div D(x) (*)

      where the latter term is the spurious drift arising from convention change. This ensures that the diffusion gradients do not induce currents and probability gradients, and thus that the steady-state PDF is the Gibbs measure (this form has been confirmed experimentally, for instance, for colloidal particles near walls, that have strong diffusivity gradients despite not having significant forces). It generalizes to near-equilibrium systems with non-conservative forces and/or temperature gradient in the form:

      dx/dt = F(x) + sqrt(2D(x)) dW + div D(x) (**)

      where the drift field F(x) encodes these forces. In some cases, it has been shown through careful microscopic analysis that one can have effectively a different form for the last term, namely

      dx/dt = F(x) + sqrt(2D(x)) dW + alpha div D(x)

      where alpha is a "convention parameter" that would be =1 at equilibrium. For instance, in the Volpe and Wehr review this can occur through memory effects in robotic dynamics, or through strong fluctuation-dissipation breakdown. In a near-equilibrium system, this should be strongly justified, as the continuous-time dynamics with alpha ≠ 1 and drift F would be indistinguishable from one with alpha = 1 and drift F + (1-alpha) div D: the authors would have the burden of proving that the observed (absence of) drift is indeed due to alpha ≠ 1, rather than to much more common force fields F(x).

      Here, without further motivation than the statement that cells are out-of-equilibrium, drifts are arbitrarily set to zero in the Ito convention, which in (**) is equivalent to adding a force with drift -div D exactly compensating the spurious drift. It is the effects of this arbitrary force that are studied in the article. The fact that it results in probability gradients is trivial once formulated this way (and in no way is this new - many of the references, for instance Volpe and Wehr, mention this). Enhanced clustering is also a trivial effect of this probability gradient (the local concentration is increased by this force field, so phase separation can occur). As a side note the "neighbor sensing" scheme to describe interactions is itself very peculiar and not physically motivated - it violates stochastic thermodynamics laws too, as detailed balance is apparently not respected. There again, the authors have chosen to disregard a century of stochastic thermodynamics in favor of a non-justified unphysical custom rule.
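
      The difference between the two choices can be illustrated with a minimal simulation sketch. It assumes a hypothetical one-dimensional box with reflecting walls and a linear diffusivity profile rising from 1 to 4 (arbitrary units); the function names, the profile and all parameters are illustrative assumptions and are not the authors' agent-based code. The sketch integrates the Ito equation dx = alpha D'(x) dt + sqrt(2D(x)) dW with a plain Euler-Maruyama step, once without drift (alpha = 0) and once with the spurious drift included (alpha = 1):

```python
import math
import random


def diffusivity(x):
    """Hypothetical linear profile: D rises from 1 at x=0 to 4 at x=1."""
    return 1.0 + 3.0 * x


def diffusivity_gradient(x):
    """Derivative D'(x) of the assumed linear profile (constant)."""
    return 3.0


def reflect(x):
    """Reflecting walls at x=0 and x=1."""
    if x < 0.0:
        return -x
    if x > 1.0:
        return 2.0 - x
    return x


def simulate(alpha, n_particles=2000, n_steps=600, dt=5e-4, seed=0):
    """Euler-Maruyama for dx = alpha * D'(x) dt + sqrt(2 D(x)) dW (Ito form).

    alpha = 0 is the driftless Ito case; alpha = 1 includes the spurious
    drift. Returns the fraction of particles that end in the
    low-diffusivity half of the box (x < 0.5).
    """
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_particles)]  # uniform initial condition
    for _ in range(n_steps):
        for i, x in enumerate(xs):
            drift = alpha * diffusivity_gradient(x) * dt
            noise = math.sqrt(2.0 * diffusivity(x) * dt) * rng.gauss(0.0, 1.0)
            xs[i] = reflect(x + drift + noise)
    return sum(1 for x in xs if x < 0.5) / n_particles


frac_ito = simulate(alpha=0.0)  # driftless Ito
frac_iso = simulate(alpha=1.0)  # spurious drift included

# Analytic expectation for the driftless Ito steady state p(x) ~ 1/D(x).
expected_ito = math.log(2.5) / math.log(4.0)

print(f"driftless Ito, fraction in x < 0.5: {frac_ito:.3f}")
print(f"with spurious drift, fraction in x < 0.5: {frac_iso:.3f}")
print(f"analytic value for p ~ 1/D: {expected_ito:.3f}")
```

      For this assumed profile, a steady state proportional to 1/D places a fraction ln(2.5)/ln(4), roughly 0.66, of the particles in the low-diffusivity half, whereas including the spurious drift should leave the density close to uniform. The accumulation in the first case is the effect of the compensating force described above.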

      The authors make no further justification of their choice of driftless Ito simulations than the fact that cells are out-of-equilibrium, leaving the feeling that this is a detail. They make mentions of systems (eg glycogen, prebiotic environment) for which (near-)equilibrium physics should mostly prevail, and of fluctuation dissipation ("Diffusivity varies inversely with viscosity", in the introduction). Yet the "phenomenon" they discuss is entirely reliant on an undiscussed mechanism by which these assumptions would be completely violated (the citations they make for this - Gnesotto '18 and Phillips '12 - are simply discussions of the fact that cells are out-of-equilibrium, not on any consequences on the convention).

      Finally, while inhomogeneous diffusion is ubiquitous, the strength of this effect in realistic conditions is not discussed. Even in the most "optimistic" case where alpha=0 would make sense (knowing that in the cellular context we are discussing thermal systems immersed in water and if energy consumption and metabolism were stopped alpha would relax back to 1), the equation (*) above shows that having zero Ito drift is equivalent to having a potential countering the spurious drift, with value

      U(x) = kT log(D(x) / D0 )

      [I have assumed isotropic diffusion for simplicity here, so the div is replaced by a grad]. This means that diffusion contrasts compare only logarithmically with chemical potential ones -- for instance a major diffusion difference of 100x is equivalent to 4.6kT in potential energy, a relatively modest effect. To prove that the authors' effect of "diffusive lensing" is involved in such a system, one would thus have to

      1) observe strong spatial variations of the diffusion coefficient (this is doable, and was done before), AND

      2) show that there is an enrichment of the diffusing species in the low-diffusion region inversely proportional to the diffusion, AND

      3) show that this enrichment cannot be attributed to mild differences in potential energy, for instance by showing that if nonequilibrium energy consumption stops, the concentration fully homogenizes while the diffusion gradients remain.
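
      The logarithmic scaling quoted above is easy to tabulate. The short sketch below evaluates U/kT = log(D_high/D_low) for a few diffusivity contrasts; the function name and the chosen contrasts are illustrative assumptions, and the only input is the expression U(x) = kT log(D(x)/D0) given above:

```python
import math


def equivalent_potential_in_kT(contrast):
    """Potential difference, in units of kT, that mimics a given
    diffusivity contrast D_high / D_low under U = kT log(D / D0)."""
    return math.log(contrast)


for contrast in (2, 4, 10, 100, 1000):
    depth = equivalent_potential_in_kT(contrast)
    print(f"D_high/D_low = {contrast:>4}: U = {depth:.2f} kT")
```

      A 100-fold contrast corresponds to log(100), about 4.6 kT, and each further factor of 10 in diffusivity adds only about 2.3 kT, which is why a mild difference in potential energy is hard to exclude as an alternative explanation.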

      If the authors were to successfully show all that in an experimental system, or design a theoretical framework where these effects convincingly emerge from physically realistic microscopic dynamical rules, they would have indeed discovered a new phenomenon. In contrast, the current article only demonstrates the well-known fact that when using arbitrary dynamical rules in heterogeneous diffusion simulations, one can get concentration gradients.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors discuss an effect, "diffusive lensing", by which particles would accumulate in high-viscosity regions, for instance in the intracellular medium. To obtain these results, the authors rely on agent-based simulations using custom rules performed with the Ito stochastic calculus convention. The "lensing effect" discussed is a direct consequence of the choice of the Ito convention without spurious drift which has been discussed before and is likely to be inadequate for the intracellular medium, causing the presented results to likely have little relevance for biology.

      We thank the editors and the reviewers for their consideration of our manuscript. We argue in this rebuttal and revision that our results and conclusions are in fact likely to have relevance for biology. While we use the Itô convention for ease of modeling considering its non-anticipatory nature upon discretization (see (Volpe and Wehr 2016) for the discretization schemes), we refer to Figure S1B to emphasize that diffusive lensing occurs not only under the Itô convention but across a wide parameter space. Indeed, it is absent only in the normative isothermal convention; note that even a stochastic differential equation conforming to the isothermal convention may be reformulated into the Itô convention by adding suitable drift terms, allowing for diffusive lensing to be seen even in case of the isothermal convention. We note in particular that the choice of the convention is a highly context-dependent one (Sokolov 2010); there is not a universally correct choice, and one can obtain stochastic differential equations consistent with Ito or Stratonovich interpretations in different regimes. Lastly, space-dependent diffusivity is now an experimentally well-recognized feature of the cellular interior, as noted in our references and as discussed further later in this response. This fact points towards the potential relevance of our model for subcellular diffusion.
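
      The statement that lensing is absent only in the isothermal convention can be made concrete with a one-dimensional sketch. The notation below is an assumption introduced here for illustration and is not taken from the manuscript: a convention parameter α (α = 0 for Itô, α = 1/2 for Stratonovich, α = 1 for the isothermal convention), no external force, and no-flux boundaries.

```latex
\begin{align}
  dx &= \alpha\, D'(x)\, dt + \sqrt{2 D(x)}\; dW
      && \text{(written in It\^o form)} \\
  \partial_t p &= -\partial_x\!\left[\alpha\, D'(x)\, p\right]
                  + \partial_x^2\!\left[D(x)\, p\right]
      && \text{(Fokker--Planck equation)} \\
  J = \alpha\, D' p - \partial_x (D p) = 0
     &\;\Longrightarrow\; D\, p' = (\alpha - 1)\, D'\, p
      && \text{(zero-flux steady state)} \\
  p_{\mathrm{ss}}(x) &\propto D(x)^{\alpha - 1}
\end{align}
```

      Under these assumptions the steady-state density scales as 1/D for α = 0 and as D^(-1/2) for α = 1/2, and it is uniform only for α = 1.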

      In our revised preprint, we have made changes to the text and minor changes to figures to address reviewer concerns.

      Responses to the Reviewers

      We thank the reviewers for their feedback and address the issues they raised in this rebuttal and in the revised manuscript. The central point that the reviewers raise concerns the validity of the drift-less Itô interpretation in modeling potential nonequilibrium types of subcellular transport arising from space-dependent diffusivity. If the drift term were considered, the resulting stochastic differential equation (SDE) is equivalent to one arising from the isothermal interpretation of heterogeneous diffusivity (Volpe and Wehr 2016), wherein no diffusive lensing is seen (as shown in Fig. S1B). That is, the isothermal interpretation and the drift-comprising Itô SDE produce the same uniform steady-state particle densities.

      While we agree with the reviewers that for a given interpretation, equivalent stochastic differential equations (SDEs) arising from other interpretations may be drawn, we disagree with the generalization that all types of subcellular diffusion conform to the isothermal interpretation. That is, there is no reason why any and all instances of nonequilibrium subcellular particle diffusion must be modeled using isothermal-conforming SDEs (such as the drift-comprising Itô SDE, for instance). We refer to (Sokolov 2010) which prescribes choosing a convention in a context-dependent manner. In this regard, we disagree with the second reviewer’s characterization of such a choice as merely a “choice of writing” considering that it is entirely dependent on the choice of microscopic parameters, as detailed in the discussion section of the manuscript. The following references have also been added to the manuscript: the reference from the first reviewer (Kupferman et al. 2004) proposes a prescription for choosing an appropriate convention based upon comparing the noise correlation time and the particle relaxation time. The reference notes that the Itô convention is appropriate when the particle relaxation time is large when compared to the noise correlation time and the Stratonovich convention is appropriate in the converse scenario. In (Rupprecht et al. 2018), active noise is considered and the resulting Fokker-Planck equation conforms to the Stratonovich convention when thermal noise is negligible. The related reference (Vishen et al. 2019) compares three timescales: those of particle relaxation, noise correlation and viscoelastic relaxation, to make the choice. Indeed, as noted in the manuscript, lensing is seen in all but one interpretation (without drift additions); only its magnitude is altered by the interpretation/choice of the drift term. The appendix has been modified to include a subsection on the interchangeability of the conventions.

      Separately, with regards to the discussion on anomalous diffusion, the section on mean squared displacement calculation has been amended to avoid confusing our model with canonical anomalous diffusion which considers the anomalous exponent; how the anomalous exponent varies with space-dependent diffusivity offers an interesting future area of study.

      Responses to specific reviewer comments appear below.

      Reviewer #1 (Public Review):

      The manuscript "Diffusive lensing as a mechanism of intracellular transport and compartmentalization", explores the implications of heterogeneous viscosity on the diffusive dynamics of particles. The authors analyze three different scenarios:

      (i)   diffusion under a gradient of viscosity,

      (ii)  clustering of interacting particles in a viscosity gradient, and

      (iii) diffusive dynamics of non-interacting particles with circular patches of heterogeneous viscous medium.

      The implications of a heterogeneous environment on phase separation and reaction kinetics in cells are under-explored. This makes the general theme of this manuscript very relevant and interesting. However, the analysis in the manuscript is not rigorous, and the claims in the abstract are not supported by the analysis in the main text.

      Following are my main comments on the work presented in this manuscript:

      (a) The central theme of this work is that spatially varying viscosity leads to position-dependent diffusion constant. This, for an overdamped Langevin dynamics with Gaussian white noise, leads to the well-known issue of the interpretation of the noise term.

      The authors use the Ito interpretation of the noise term because their system is non-equilibrium.

      One of the main criticisms I have is on this central point. The issue of interpretation arises only when there are ill-posed stochastic dynamics that do not have the relevant timescales required to analyze the noise term properly. Hence, if the authors want to start with an ill-posed equation it should be mentioned at the start. At least the Langevin dynamics considered should be explicitly mentioned in the main text. Since this work claims to be relevant to biological systems, it is also of significance to highlight the motivation for using the ill-posed equation rather than a well-posed equation. The authors refer to the non-equilibrium nature of the dynamics but it is not mentioned what non-equilibrium dynamics the authors have in mind. To properly analyze an overdamped Langevin dynamics a clear source of integrated timescales must be provided. As an example, one can write the dynamics as Eq. (1) \dot x = f(x) + g(x) \eta , which is ill-defined if the noise \eta is delta correlated in time but well-defined when \eta is exponentially correlated in time. One can of course look at the limit in which the exponential correlation goes to a delta correlation which leads to Eq. (1) interpreted in Stratonovich convention. The choice to use the Ito convention for Eq. (1) in this case is not justified.

      We thank the reviewer for detailing their concerns with our model’s assumptions. We have addressed them in the common rebuttal.

      (b) Generally, the manuscript talks of viscosity gradient but the equations deal with diffusion which is a combination of viscosity, temperature, particle size, and particle-medium interaction. There is no clear motivation provided for focus on viscosity (cytoplasm as such is a complex fluid) instead of just saying position-dependent diffusion constant. Maybe authors should use viscosity only when talking of a context where the existence of a viscosity gradient is established either in a real experiment or in a thought experiment.

      The manuscript has been amended to use only “diffusivity” to avoid confusion.

      (c) The section "Viscophoresis drives particle accumulation" seems to not have new results. Fig. 1 verifies the numerical code used to obtain the results in the later sections. If that is the case maybe this section can be moved to supplementary or at least it should be clearly stated that this is to establish the correctness of the simulation method. It would also be nice to comment a bit more on the choice of simulation methods with changing hopping sizes instead of, for example, numerically solving stochastic ODE.

      The main point of this section and of Fig. 1 is the diffusive lensing effect itself: the accumulation of particles in lower-diffusivity areas. To the best of our knowledge, diffusive lensing has not been reported elsewhere as a specific outcome of non-isothermal interpretations of diffusion, with potential relevance to nonequilibrium subcellular motilities. The simulation method has been fully described in the Methods section, and the code has also been shared (see Code Availability).

      A minor comment: the statement "the physically appropriate convention to use depends upon microscopic parameters and timescale hierarchies not captured in a coarse-grained model of diffusion." is not true. As noted in the references that the authors mention, a correct coarse-grained model provides a suitable convention (see also Phys. Rev. E, 70(3), 036120., Phys. Rev. E, 100(6), 062602.).

      This has been addressed in the common rebuttal.

      (d) The section "Interaction-mediated clustering is affected by viscophoresis" makes an interesting statement about the positioning of clusters by a viscous gradient. As a theoretical calculation, the interplay between position-dependent diffusivity and phase separation is indeed interesting, but the problem needs more analysis than that offered in this manuscript. Just a plot showing clustering with and without a gradient of diffusion does not give enough insight into the interplay between density-dependent diffusion and position-dependent diffusion. A phase plot that somehow shows the relative contribution of the two effects would have been nice. Also, it should be emphasized in the main text that the inter-particle interaction is through a density-dependent diffusion constant and not a conservative coupling by an interaction potential.

      The density-dependence has been added from the Methods to the main text. The goal of the work is to present lensing as a natural outcome of the parameter choices we make and present its effects as they relate to clustering and commonly used biophysical methods to probe dynamics within cells. A dense sampling of the phase space and how it is altered as a function of diffusivity, and the subsequent interpretation, lie beyond the scope of the present work but offer exciting future directions of study.

      (e) In the section "In silico microrheology shows that viscophoresis manifests as anomalous diffusion" the authors show that the MSD with and without spatial heterogeneity is different. This is not a surprise - as the underlying equations are different the MSD should be different.

      The goal here is to compare and contrast the ways in which homogeneous and heterogeneous diffusion manifest in simulated microrheology measurements. We hope that an altered saturation MSD, as is observed in our simulations, provokes interest in considering lensing while modeling experimental data.

      There are various analogies drawn in this section without any justification:

      (i) "the saturation MSD was higher than what was seen in the homogeneous diffusion scenario possibly due to particles robustly populating the bulk milieu followed by directed motion into the viscous zone (similar to that of a Brownian ratchet, (Peskin et al., 1993))."

      In case of i), the Brownian ratchet is invoked as a model to explain directed accumulation. We have removed this analogy to avoid confusion as it is not delved into further over the course of our work.

      (ii) "Note that lensing may cause particle displacements to deviate from a Gaussian distribution, which could explain anomalous behaviors observed both in our simulations and in experiments in cells (Parry et al., 2014)." Since the full trajectory of the particles is available, it can be analyzed to check if this is indeed the case.

      This has been addressed in the common rebuttal.

      (f) The final section "In silico FRAP in a heterogeneously viscous environment ... " studies the MSD of the particles in a medium with heterogeneous viscous patches which I find the most novel section of the work. As with the section on inter-particle interaction, this needs further analysis.

      We thank the reviewer for their appreciation. In presenting these three sections discussing the effects of diffusive lensing, we intend to broadly outline the scope of this phenomenon in influencing a range of behaviors. Exploring these directions further is a promising avenue for future study that lies beyond the scope of this manuscript.

      To summarise, as this is a theory paper, just showing MSD or in silico FRAP data is not sufficient. Unlike experiments where one is trying to understand the systems, here one has full access to the dynamics either analytically or in simulation. So just stating that the MSD in heterogeneous and homogeneous environments are not the same is not sufficient. With further analysis, this work can be of theoretical interest. Finally, just as a matter of personal taste, I am not in favor of the analogy with optical lensing. I don't see the connection.

      We value the reviewer’s interest in investigating the causes underlying the differences in the MSDs and agree that it represents a promising future area of study. The main point of this section of the manuscript was to make a connection to experimentally measurable quantities.

      Reviewer #2 (Public Review):

      Summary:

      The authors study through theory and simulations the diffusion of microscopic particles and aim to account for the effects of inhomogeneous viscosity and diffusion - in particular regarding the intracellular environment. They propose a mechanism, termed "Diffusive lensing", by which particles are attracted towards high-viscosity regions where they remain trapped. To obtain these results, the authors rely on agent-based simulations using custom rules performed with the Ito stochastic calculus convention, without spurious drift. They acknowledge the fact that this convention does not describe equilibrium systems, and that their results would not hold at equilibrium - and discard these facts by invoking the fact that cells are out-of-equilibrium. Finally, they show some applications of their findings, in particular enhanced clustering in the high-viscosity regions. The authors conclude that as inhomogeneous diffusion is ubiquitous in life, so must their mechanism be, and hence it must be important.

      Strengths:

      The article is well-written, and clearly intelligible, its hypotheses are stated relatively clearly and the models and mathematical derivations are compatible with these hypotheses.

      We thank the reviewer for their appreciation.

      Weaknesses:

      The main problem of the paper is these hypotheses. Indeed, it all relies on the Ito interpretation of the stochastic integrals. Stochastic conventions are a notoriously tricky business, but they are both mathematically and physically well-understood and do not result in any "dilemma" [some citations in the article, such as (Lau and Lubensky) and (Volpe and Wehr), make an unambiguous resolution of these]. Conventions are not an intrinsic, fixed property of a system, but a choice of writing; however, whenever going from one to another, one must include a "spurious drift" that compensates for the effect of this change - a mathematical subtlety that is entirely omitted in the article: if the drift is zero in one convention, it will thus be non-zero in another in the presence of diffusive gradients. It is well established that for equilibrium systems obeying fluctuation-dissipation, the spurious drift vanishes in the anti-Ito stochastic convention (which is not "anticipatory", contrary to claims in the article, as the "steps" are local and infinitesimal). This ensures that the diffusion gradients do not induce currents and probability gradients, and thus that the steady-state PDF is the Gibbs measure. This equilibrium case should be seen as the default: a thermal system NOT obeying this law should warrant a strong justification (for instance in the Volpe and Wehr review this can occur through memory effects in robotic dynamics, or through strong fluctuation-dissipation breakdown). In near-equilibrium thermal systems such as the intracellular medium (where, although out-of-equilibrium, temperature remains a relevant and mostly homogeneous quantity), deviations from this behavior must be physically justified and go to zero when going towards equilibrium.

      Considering that the physical phenomena underlying diffusion span a range of timescales (particle relaxation, noise, environmental correlation, et cetera), we disagree with the assertion that all types of subcellular diffusion processes can be modeled as occurring at thermal equilibrium: for example, one can easily imagine memory effects arising in the presence of an appropriate hierarchy of timescales. We have added references that describe in more detail the way in which the comparison of timescales can dictate the applicability of different conventions. We also refer the referee to the common rebuttal section of our response in which we discuss factors that govern the choice of the interpretation. The adiabatic elimination arguments highlighted in (Kupferman et al. 2004) provide a clear description of how relevant particle and environment-related timescales can inform the choice of stochastic calculus to use.

      With regards to the use of the term “anticipatory” to refer to the isothermal interpretation, we refer to the comment in (Volpe and Wehr 2016) of the Itô interpretation “not looking into the future”. In any case, whether anticipatory or otherwise, the interpretation’s effect on our model remains unchanged, as highlighted in the section in the Appendix on the conversion between different conventions; this section has been added to minimize confusion about the effects of the choice of convention on lensing.

      Here, drifts are arbitrarily set to zero in the Ito convention (the exact opposite of the equilibrium anti-Ito), which is the equilibrium equivalent to adding a force (with drift $- grad D$) exactly compensating the spurious drift. If we were to interpret this as a breakdown of detailed balance with inhomogeneous temperature, the "hot" region would be effectively at 4x higher temperature than the cold region (i.e. 1200K) in Fig 1A.

      Our work is based on existing observations of space-dependent diffusivity in cells (Garner et al., 2023; Huang et al., 2021; Parry et al., 2014; Śmigiel et al., 2022; Xiang et al., 2020). These papers support a definitive model for the existence of space-dependent diffusivity without invoking space-dependent temperature.

      It is the effects of this arbitrary force (exactly compensating the Ito spurious drift) that are studied in the article. The fact that it results in probability gradients is trivial once formulated this way (and in no way is this new - many of the references, for instance, Volpe and Wehr, mention this).

      Addressed in the common rebuttal.

      Enhanced clustering is also a trivial effect of this probability gradient (the local concentration is increased by this force field, so phase separation can occur). As a side note the "neighbor sensing" scheme to describe interactions is very peculiar and not physically motivated - it violates stochastic thermodynamics laws too, as the detailed balance is apparently not respected.

      The neighbor-sensing scheme used here is just one possible model of an effective attractive potential between particles. Other models that lead to density-dependent attraction between particles should also provide qualitatively similar results as ours; this offers an interesting prospect for future research.

      Finally, the "anomalous diffusion" discussion is at odds with what the literature on this subject considers anomalous (the exponent does not appear anomalous).

      This has been addressed in the common rebuttal, and the relevant part of the manuscript has been modified to avoid confusion.

      The authors make no further justification of their choice of convention than the fact that cells are out-of-equilibrium, leaving the feeling that this is a detail. They make mentions of systems (eg glycogen, prebiotic environment) for which (near-)equilibrium physics should mostly prevail, and of fluctuation-dissipation ("Diffusivity varies inversely with viscosity", in the introduction). Yet the "phenomenon" they discuss is entirely reliant on an undiscussed mechanism by which these assumptions would be completely violated (the citations they make for this - Gnesotto '18 and Phillips '12 - are simply discussions of the fact that cells are out-of-equilibrium, not on any consequences on the convention).

      Finally, while inhomogeneous diffusion is ubiquitous, the strength of this effect in realistic conditions is not discussed (this would be a significant problem if the effect were real, which it isn't). Gravitational attraction is also a ubiquitous effect, but it is not important for intracellular compartmentalization.

      The manuscript text has been supplemented with additional references that detail the ways in which the comparison of timescales can dictate how one can apply different conventions. We refer the reviewer to the common rebuttal section of our response where we detail factors that dictate the choice of the convention to use. As previously noted, the adiabatic elimination arguments highlighted in (Kupferman et al., 2004) provide a prescription for how different timescales are to be considered in deciding the choice of stochastic calculus to use.

      With regards to the strength of space-dependent diffusivity in subcellular milieu, various measurements of heterogeneous diffusivity have been made both across different model systems and via different modalities, as cited in our manuscript. (Garner et al. 2023) used single-particle tracking to determine over 100-fold variability in diffusivity within individual S. pombe cells. Single-molecule measurements in (Xiang et al. 2020) and (Śmigiel et al. 2022) reveal an order-of-magnitude variation in tracer diffusion in mammalian cells and multi-fold variation in E. coli cytoplasm respectively. Fluorescence correlation spectroscopy measurements in (Huang et al. 2022) have found a two-fold increase in short-range diffusion of protein-sized tracers in X. laevis extracts. We have also added a reference to a study that uses 3D single particle tracking in the cytosol of a multinucleate fungus, A. gossypii, to identify regions of low-diffusivity near nuclei and hyphal tips (McLaughlin et al. 2020). Many of these references deploy particle tracking and investigate how mesoscale-sized particles (i.e. tracers spanning biologically relevant size scales) are directly impacted by space-dependent diffusivity. Therefore, we base our model on not only space-dependent diffusivity being a well-recognized feature of the cellular interior, but also on these observations pertaining to mesoscale-sized particles’ motion along relevant timescales.

      These measurements are also relevant to the reviewer’s question about the strength of the effect, which depends directly on the variability in diffusivity: for ten- or a hundred-fold diffusivity variations, the effect would be expected to be significant. If the Itô convention is used directly, the contrast in concentration is, in fact, the same as the contrast in diffusivity.

      To conclude, the "diffusive lensing" effect presented here is not a deep physical discovery, but a well-known effect of sticking to the wrong stochastic convention.

      As detailed in the various responses above, we respectfully disagree with the notion that there exists a singular correct stochastic convention that is applicable for all cases of subcellular heterogeneous diffusion. Further, as detailed in (Volpe and Wehr 2016) and in the Appendix, it is possible to convert between conventions: an isothermal-abiding stochastic differential equation may be suitably altered, by means of adding a drift term, into an Itô-abiding stochastic differential equation. Therefore, one can observe diffusive lensing without discarding the isothermal convention, provided the latter is modified in this way. Indeed, it is only the driftless (or canonical) isothermal convention that does not allow for diffusive lensing.
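
      A one-dimensional sketch of this conversion may help. The notation below (the convention parameter $\alpha$, the Wiener process $W_t$, the density $p$, and writing the noise amplitude as $\sqrt{2D}$) is introduced here purely for illustration and is an assumption, not necessarily the notation of the manuscript or its Appendix. It shows that a driftless equation in convention $\alpha$ has a zero-flux steady state $p \propto D^{\alpha-1}$, so that the driftless Itô case gives $p \propto 1/D$ while the driftless isothermal case gives a uniform density.

```latex
% Convention parameter: alpha = 0 (Ito), alpha = 1/2 (Stratonovich),
% alpha = 1 (isothermal / anti-Ito). Notation is illustrative only.
\begin{align}
  dx_t &= \sqrt{2D(x_t)} \circ_{\alpha} dW_t
  \;\Longleftrightarrow\;
  dx_t = \alpha\, D'(x_t)\, dt + \sqrt{2D(x_t)}\, dW_t
  \quad \text{(equivalent It\^o form)} \\
  % Fokker-Planck equation associated with the Ito form
  \partial_t p &= -\partial_x\!\left[\alpha D' p\right] + \partial_x^2\!\left[D p\right] \\
  % Zero-flux steady state
  0 &= \alpha D' p - (D p)' = (\alpha - 1) D' p - D p'
  \;\Longrightarrow\;
  p(x) \propto D(x)^{\alpha - 1} \\
  % Isothermal equation with an added drift -D' dt
  dx_t &= -D'(x_t)\, dt + \sqrt{2D(x_t)} \circ_{1} dW_t
  \;\Longleftrightarrow\;
  dx_t = \sqrt{2D(x_t)}\, dW_t
  \;\Longrightarrow\;
  p(x) \propto \frac{1}{D(x)}
\end{align}
```

      Under these assumptions, the last line is the sense in which an isothermal equation with an added drift reproduces the Itô steady state, and the concentration contrast between two regions equals the inverse of their diffusivity contrast.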

    1. eLife assessment

      This fundamental study reports differential expression of key genes in full-term placenta between Tibetans and Han Chinese at high elevations, which is more pronounced in the placentas of male fetuses than in those of female fetuses. The gene expression data were collected and analyzed using solid and validated methodology, although there is limited support for hypoxia-specific responses due to a lack of low-altitude samples. Several of the placental genes found in this study have been previously reported to show signatures of positive selection in Tibetans, pointing to a potential mechanism of how human populations adapt to high elevation by mitigating the negative effects of low oxygen on fetal growth. The work will be of interest to evolutionary and population geneticists as well as researchers working on human hypoxic response.

    2. Joint Public Review:

      This manuscript by Yue et al. aims to understand the molecular mechanisms underlying the better reproductive outcomes of Tibetans at high altitude by characterizing the transcriptome and histology of full-term placenta of Tibetans and comparing them to those of Han Chinese at high elevations.

      The approach is innovative, and the data collected are valuable for testing hypotheses regarding the contribution of the placenta to better reproductive success of populations that adapted to hypoxia. The authors identified hundreds of differentially expressed genes (DEGs) between Tibetans and Han, including the EPAS1 gene that harbors the strongest signals of genetic adaptation. The authors also found that such differential expression is more prevalent and pronounced in the placentas of male fetuses than those of female fetuses, which is particularly interesting, as it echoes with the more severe reduction in birth weight of male neonates at high elevation observed by the same group of researchers (He et al., 2022).

      Comments on latest version:

      The revised manuscript has incorporated the suggested changes and weakened conclusions regarding natural selection. Limitations of the study are also clearly stated in the Discussion section.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Review:

      This manuscript by Yue et al. aims to understand the molecular mechanisms underlying the better reproductive outcomes of Tibetans at high altitude by characterizing the transcriptome and histology of full-term placenta of Tibetans and comparing them to those of Han Chinese at high elevations.

      The approach is innovative, and the data collected are valuable for testing hypotheses regarding the contribution of the placenta to better reproductive success of populations that adapted to hypoxia. The authors identified hundreds of differentially expressed genes (DEGs) between Tibetans and Han, including the EPAS1 gene that harbors the strongest signals of genetic adaptation. The authors also found that such differential expression is more prevalent and pronounced in the placentas of male fetuses than those of female fetuses, which is particularly interesting, as it echoes with the more severe reduction in birth weight of male neonates at high elevation observed by the same group of researchers (He et al., 2022).

      This revised manuscript addressed several concerns raised by reviewers in the last round. However, we still find the evidence for natural selection on the identified DEGs--as a group--to be very weak, despite more convincing evidence on a few individual genes, such as EPAS1 and EGLN1.

      The authors first examined the overlap between DEGs and genes showing signals of positive selection in Tibetans and evaluated the significance of a larger overlap than expected with a permutation analysis. A minor issue related to this analysis is that the p-value is inflated, as the authors are counting permutation replicates with MORE genes in overlap than observed, yet the more appropriate way is counting replicates with EQUAL or MORE overlapping genes. Using the latter method of p-value calculation, the "sex-combined" and "female-only" DEGs will become non-significantly enriched in genes with evidence of selection, and the signal appears to solely come from male-specific DEGs. A thornier issue with this type of enrichment analysis is whether the condition on placental expression is sufficient, as other genomic or transcriptomic features (e.g., expression level, local sequence divergence level) may also confound the analysis.

      Following the suggested method, we counted the replicates with equal or more overlapping genes than observed (≥4 for the “combined” set; ≥9 for the “male-only” set; ≥0 for the “female-only” set). We found that the overlaps between DEGs and TSNGs were significantly enriched only in the “male-only” set (p-value < 1e-4; 0 of 10,000 permutations), but not in the “female-only” set (p-value = 1; 10,000 of 10,000 permutations) or the “combined” set (p-value = 0.0603; 603 of 10,000 permutations) (see Table R1 below).
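
      The counting rule discussed here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the manuscript: the function names, the way random gene sets are drawn from a background list, and the default seed are all hypothetical. The only point it is meant to show is the use of `>=` rather than `>` when counting permutation replicates.

```python
import random


def permutation_p_value(observed_overlap, permuted_overlaps):
    """Share of permutation replicates whose overlap is EQUAL TO OR LARGER
    than the observed overlap (the rule recommended by the reviewers)."""
    hits = sum(1 for overlap in permuted_overlaps if overlap >= observed_overlap)
    return hits / len(permuted_overlaps)


def permuted_overlap_counts(background_genes, n_degs, selected_genes, n_perm, seed=0):
    """Hypothetical null model: draw n_degs genes at random from the background
    n_perm times and count how many fall in the selected-gene set each time."""
    rng = random.Random(seed)
    selected = set(selected_genes)
    background = list(background_genes)
    return [
        len(selected.intersection(rng.sample(background, n_degs)))
        for _ in range(n_perm)
    ]
```

      With this rule, an observed overlap of 0 is matched or exceeded by every replicate, so the p-value is 1 by construction; and when no replicate reaches the observed overlap, the estimate is 0, which is reported as p < 1/(number of permutations).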

      We updated this information in the revised manuscript, including Results, Methods, and Figure S9.

      Author response table 1.

      Permutation analysis of the overlapped genes between DEGs and TSNGs.

      The authors next aimed to detect polygenic signals of adaptation of gene expression by applying the PolyGraph method to eQTLs of genes expressed in the placenta (Racimo et al 2018). This approach is ambitious but problematic, as the method is designed for testing evidence of selection on single polygenic traits. The expression levels of different genes should be considered as "different traits" with differential impacts on downstream phenotypic traits (such as birth weight). As a result, the eQTLs of different genes cannot be naively aggregated in the calculation of the polygenic score, unless the authors have a specific, oversimplified hypothesis that the expression increase of all genes with identified eQTL will improve pregnancy outcome and that they are equally important to downstream phenotypes. In general, the PolyGraph method is inapplicable to eQTL data, especially those of different genes (but see Colbran et al 2023 Genetics for an example where the polygenic score is used for testing selection on the expression of individual genes).

      We would recommend removal of these analyses and focus on the discussion of individual genes with more compelling evidence of selection (e.g., EPAS1, EGLN1).

      According to the suggestion, we removed these analyses in the revised manuscript.

    1. eLife assessment

      This study aggregates across five fMRI datasets and reports that a network of brain areas previously associated with response inhibition processes, including several in the basal ganglia, are more active on failed stop than successful stop trials. This study is valuable as a well-powered investigation of fMRI measures of stopping. However, evidence for the authors' conclusions regarding the role of subcortical nodes in stopping is incomplete, due to the limitations in the fMRI analysis.

    2. Reviewer #1 (Public Review):

      This study is one in a series of excellent papers by the Forstmann group focusing on the ability of fMRI to reliably detect activity in small subcortical nuclei - in this case, specifically those purportedly involved in the hyper- and indirect inhibitory basal ganglia pathways. I have been very fond of this work for a long time, beginning with the demonstration by De Hollander, Forstmann et al. (HBM 2017) that 3T fMRI imaging (as well as many 7T imaging sequences) does not afford sufficient signal-to-noise ratio to reliably image these small subcortical nuclei. This work has done a lot to reshape my view of seminal past studies of subcortical activity during inhibitory control, including some that have several thousand citations.

      Comments on revised version:

      This is my second review of this article, now entitled "Multi-study fMRI outlooks on subcortical BOLD responses in the stop-signal paradigm" by Isherwood and colleagues.

      The authors have been very responsive to the initial round of reviews.

      I still think it would be helpful to see a combined investigation of the available 7T data, just to really drive the point home that even with the best parameters and a multi-study sample size, fMRI cannot detect any increases in BOLD activity on successful stop compared to go trials. However, I agree with the authors that these "sub samples still lack the temporal resolution seemingly required for looking at the processes in the SST."

      As such, I don't have any more feedback.

    3. Reviewer #2 (Public Review):

      This work aggregates data across 5 openly available stopping studies (3 at 7 tesla and 2 at 3 tesla) to evaluate activity patterns across the common contrasts of Failed Stop (FS) > Go, FS > stop success (SS), and SS > Go. Previous work has implicated a set of regions that tend to be positively active in one or more of these contrasts, including the bilateral inferior frontal gyrus, preSMA, and multiple basal ganglia structures. However, the authors argue that upon closer examination, many previous papers have not found subcortical structures to be more active on SS than FS trials, bringing into question whether they play an essential role in (successful) inhibition. In order to evaluate this with more data and power, the authors aggregate across five datasets and find many areas that are *more* active for FS than SS, including bilateral preSMA, GPE, thalamus, and VTA. They argue that this brings into question the role of these areas in inhibition, based upon the assumption that areas involved in inhibition should be more active on successful stop than failed stop trials, not the opposite as they observed.

      Since the initial submission, the authors have improved their theoretical synthesis and changed their SSRT calculation method to the more appropriate integration method with replacement for go omissions. They have also done a better job of explaining how these fMRI results situate within the broader response inhibition literature including work using other neuroscience methods.

      They have also included a new Bayes Factor analysis. In the process of evaluating this new analysis, I recognized the following comments that I believe justify additional analyses and discussion:

      First, if I understand the authors' pipeline, for the ROI analyses it is not appropriate to run FSL's FILM method on the data that were generated by repeating the same time series across all voxels of an ROI. FSL's FILM uses neighboring voxels in parts of the estimation to stabilize temporal correlation and variance estimates and was intended and evaluated for use on voxelwise data. Instead, I believe it would be more appropriate to average the level 1 contrast estimates over the voxels of each ROI to serve as the dependent variables in the ROI analysis.

      Second, for the group-level ROI analyses there seem to be inconsistencies when comparing the z-statistics (Figure 3) to the Bayes Factors (Figure 4), in that very similar z-statistics have very different Bayes Factors within the same contrast across different brain areas, which seemed surprising (e.g., a z of 6.64 has a BF of .858 while another with a z of 6.76 has a BF of 3.18). The authors do briefly discuss some instances in which the frequentist and Bayesian results differ, but they never explain why similar z-statistics yield very different Bayes Factors for a given contrast across different brain areas. I believe a discussion of this would be useful.

      Third, since the Bayes Factor analysis appears to be based on repeated measures ANOVA and the z-statistics are from Flame1+2, the BayesFactor analysis model does not pair with the frequentist analysis model very cleanly. To facilitate comparison, I would recommend that the same repeated measures ANOVA model should be used in both cases. My reading of the literature is that there is no need to be concerned about any benefits of using Flame being lost, since heteroscedasticity does not impact type I errors and will only potentially impact power (Mumford & Nichols, 2009 NeuroImage).

      Fourth, though frequentist statistics suggest that many basal ganglia structures are significantly more active in the FS > SS contrast (see 2nd row of Figure 3), the Bayesian analyses are much more equivocal, with no basal ganglia areas showing Log10BF > 1 (which would be indicative of strong evidence). The authors suggest that "the frequentist and Bayesian analyses are mostly in line with one another", but in my view, this frequentist vs. Bayesian analysis for the FS > SS contrast seems to suggest substantially different conclusions. More specifically, the frequentist analyses suggest greater activity in FS than SS in most basal ganglia ROIs (all but 2), but the Bayesian analysis did not find *any* basal ganglia ROIs with strong evidence for the alternative hypothesis (or a difference), and several with more evidence for the null than the alternative hypothesis. This difference between the frequentist and Bayesian analyses seems to warrant discussion, but unless I overlooked it, the Bayesian analyses are not mentioned in the Discussion at all. In my view, the frequentist analyses are treated as the results, and the Bayesian analyses were largely ignored.

      Overall, I think this paper makes a useful and mostly solid contribution to the literature. I have made some suggestions for adjustments and clarification of the neuroimaging pipeline and Bayesian analyses that I believe would strengthen the work further.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1: 

      This is my first review of the article entitled "The canonical stopping network: Revisiting the role of the subcortex in response inhibition" by Isherwood and colleagues. This study is one in a series of excellent papers by the Forstmann group focusing on the ability of fMRI to reliably detect activity in small subcortical nuclei - in this case, specifically those purportedly involved in the hyper- and indirect inhibitory basal ganglia pathways. I have been very fond of this work for a long time, beginning with the demonstration by De Hollander, Forstmann et al. (HBM 2017) that 3T fMRI imaging (as well as many 7T imaging sequences) does not afford sufficient signal-to-noise ratio to reliably image these small subcortical nuclei. This work has done a lot to reshape my view of seminal past studies of subcortical activity during inhibitory control, including some that have several thousand citations.

      In the current study, the authors compiled five datasets that aimed to investigate neural activity associated with stopping an already initiated action, as operationalized in the classic stop-signal paradigm. Three of these datasets are taken from their own 7T investigations, and two are datasets from the Poldrack group, which used 3T fMRI.

      The authors make six chief points: 

      (1) There does not seem to be a measurable BOLD response in the purportedly critical subcortical areas in contrasts of successful stopping (SS) vs. going (GO), neither across datasets nor within each individual dataset. This includes the STN but also any other areas of the indirect and hyperdirect pathways.

      (2) The failed-stop (FS) vs. GO contrast is the only contrast showing substantial differences in those nodes.

      (3) The positive findings of STN (and other subcortical) activation during the SS vs. GO contrast could be due to the usage of inappropriate smoothing kernels.

      (4) The study demonstrates the utility of aggregating publicly available fMRI data from similar cognitive tasks. 

      (5) From the abstract: "The findings challenge previous functional magnetic resonance (fMRI) of the stop-signal task" 

      (6) and further: "suggest the need to ascribe a separate function to these networks." 

      I strongly and emphatically agree with points 1-5. However, I vehemently disagree with point 6, which appears to be the main thrust of the current paper, based on the discussion, abstract, and - not least - the title.

      To me, this paper essentially shows that fMRI is ill-suited to study the subcortex in the specific context of the stop-signal task. That is not just because of the issues of subcortical small-volume SNR (the main topic of this and related works by this outstanding group), but also because of its limited temporal resolution (which is unacknowledged, but especially impactful in the context of the stop-signal task). I'll expand on what I mean in the following.

      First, the authors are underrepresenting the non-fMRI evidence in favor of the involvement of the subthalamic nucleus (STN) and the basal ganglia more generally in stopping actions. 

      - There are many more intracranial local field potential recording studies that show increased STN LFP (or even single-unit) activity in the SS vs. FS and SS vs. GO contrast than listed, which come from at least seven different labs. Here's a (likely non-exhaustive) list of studies that come to mind:

      Ray et al., NeuroImage 2012
      Alegre et al., Experimental Brain Research 2013
      Benis et al., NeuroImage 2014
      Wessel et al., Movement Disorders 2016
      Benis et al., Cortex 2016
      Fischer et al., eLife 2017
      Ghahremani et al., Brain and Language 2018
      Chen et al., Neuron 2020
      Mosher et al., Neuron 2021
      Diesburg et al., eLife 2021

      - Similarly, there is much more evidence than cited that causally influencing STN via deep-brain stimulation also influences action-stopping. Again, the following list is probably incomplete: 

      Van den Wildenberg et al., JoCN 2006
      Ray et al., Neuropsychologia 2009
      Hershey et al., Brain 2010
      Swann et al., JNeuro 2011
      Mirabella et al., Cerebral Cortex 2012
      Obeso et al., Exp. Brain Res. 2013
      Georgiev et al., Exp Br Res 2016
      Lofredi et al., Brain 2021
      van den Wildenberg et al, Behav Brain Res 2021
      Wessel et al., Current Biology 2022

      - Moreover, evidence from non-human animals similarly suggests critical STN involvement in action stopping, e.g.: 

      Eagle et al., Cerebral Cortex 2008
      Schmidt et al., Nature Neuroscience 2013
      Fife et al., eLife 2017
      Anderson et al., Brain Res 2020

      Together, studies like these provide either causal evidence for STN involvement via direct electrical stimulation of the nucleus or provide direct recordings of its local field potential activity during stopping. This is not to mention the extensive evidence for the involvement of the STN - and the indirect and hyperdirect pathways in general - in motor inhibition more broadly, perhaps best illustrated by their damage leading to (hemi)ballism. 

      Hence, I cannot agree with the idea that the current set of findings "suggest the need to ascribe a separate function to these networks", as suggested in the abstract and further explicated in the discussion of the current paper. For this to be the case, we would need to disregard more than a decade's worth of direct recording studies of the STN in favor of a remote measurement of the BOLD response using (provably) sub ideal imaging parameters. There are myriads of explanations of why fMRI may not be able to reveal a potential ground-truth difference in STN activity between the SS and FS/GO conditions, beginning with the simple proposition that it may not afford sufficient SNR, or that perhaps subcortical BOLD is not tightly related to the type of neurophysiological activity that distinguishes these conditions (in the purported case of the stop-signal task, specifically the beta band). But essentially, this paper shows that a specific lens into subcortical activity is likely broken, but then also suggests dismissing existing evidence from superior lenses in favor of the findings from the 'broken' lens. That doesn't make much sense to me.

      Second, there is actually another substantial reason why fMRI may indeed be unsuitable to study STN activity, specifically in the stop-signal paradigm: its limited time resolution. The sequence of subcortical processes on each specific trial type in the stop-signal task is purportedly as follows: at baseline, the basal ganglia exert inhibition on the motor system. During motor initiation, this inhibition is lifted via direct pathway innervation. This is when the three trial types start diverging. When actions then have to be rapidly cancelled (SS and FS), cortical regions signal to STN via the hyperdirect pathway that inhibition has to be rapidly reinstated (see Chen, Starr et al., Neuron 2020 for direct evidence for such a monosynaptic hyperdirect pathway, the speed of which directly predicts SSRT). Hence, inhibition is reinstated (too late in the case of FS trials, but early enough in SS trials, see recordings from the BG in Schmidt, Berke et al., Nature Neuroscience 2013; and Diesburg, Wessel et al., eLife 2021). 

      Hence, according to this prevailing model, all three trial types involve a sequence of STN activation (initial inhibition), STN deactivation (disinhibition during GO), and STN reactivation (reinstantiation of inhibition during the response via the hyperdirect pathway on SS/FS trials, reinstantiation of inhibition via the indirect pathway after the response on GO trials). What distinguishes the trial types during this period is chiefly the relative timing of the inhibitory process (earliest on SS trials, slightly later on FS trials, latest on GO trials). However, these temporal differences play out on a level of hundreds of milliseconds, and in all three cases, processing concludes well under a second overall. To fMRI, given its limited time resolution, these activations are bound to look quite similar. 

      Lastly, further building on this logic, it's not surprising that FS trials yield increased activity compared to SS and GO trials. That's because FS trials are errors, which are known to activate the STN (Cavanagh et al., JoCN 2014; Siegert et al. Cortex 2014) and afford additional inhibition of the motor system after their occurrence (Guan et al., JNeuro 2022). Again, fMRI will likely conflate this activity with the abovementioned sequence, resulting in a summation of activity and the highest level of BOLD for FS trials. 

      In sum, I believe this study has a lot of merit in demonstrating that fMRI is ill-suited to study the subcortex during the SST, but I cannot agree that it warrants any reappreciation of the subcortex's role in stopping, our understanding of which is not chiefly based on fMRI evidence.

      We would like to thank reviewer 1 for their insightful and helpful comments. We have responded point-by-point below and will give an overview of how we reframed the paper here.  

      We agree that there is good evidence from other sources for the presence of the canonical stopping network (indirect and hyperdirect) during action cancellation, and that this should be reflected more in the paper. However, we do not believe that a lack of evidence for this network during the SST makes fMRI ill-suited for studying this task, or other tasks that have neural processes occurring in quick succession. What we believe the activation patterns of fMRI reflect during this task is the large amount of activation caused by failed stops. That is, the role of the STN in error processing may be more pronounced than its role in action cancellation. Due to the replicability of fMRI results, especially at higher field strengths, we believe the activation profile of failed stop trials reflects a paramount role for the STN in error processing. Therefore, while we agree we do not provide evidence against the role of the STN in action cancellation, we do provide evidence that our outlook on subcortical activation during different trial types of this task should be revisited. We have reframed the article to reflect this, and discuss points such as fMRI reliability, validity and the complex overlapping of cognitive processes in the SST in the discussion. Please see all changes to the article indicated by red text.

      A few other points: 

      - As I said before, this team's previous work has done a lot to convince me that 3T fMRI is unsuitable to study the STN. As such, it would have been nice to see a combination of the subsamples of the study that DID use imaging protocols and field strengths suitable to actually study this node. This is especially true since the second 3T sample (and arguably, the Isherwood_7T sample) does not afford a lot of trials per subject, to begin with.

      Unfortunately, this study already comprises the only 7T open access datasets available for the SST. Therefore, unless we combined only the deHollander_7T and Miletic_7T subsamples there is no additional analysis we can do for this right now. While looking at just the sub samples that were 7T and had >300 trials would be interesting, based on the new framing of the paper we do not believe it adds to the study, as the sub samples still lack the temporal resolution seemingly required for looking at the processes in the SST.

      - What was the GLM analysis time-locked to on SS and FS trials? The stop-signal or the GO-signal? 

      SS and FS trials were time-locked to the GO signal as this is standard practice. The main reason for this is that we use contrasts to interpret differences in activation patterns between conditions. By time-locking the FS and SS trials to the stop signal, we are contrasting events at different time points, and therefore different stages of processing, which introduces its own sources of error. We agree with the reviewer, however, that a separate analysis with time-locking on the stop-signal has its own merit, and now include results in the supplementary material where the FS and SS trials are time-locked to the stop signal as well.

      - Why was SSRT calculated using the outdated mean method? 

      We originally calculated SSRT using the mean method as this was how it was reported in the oldest of the aggregated studies. We have now re-calculated the SSRTs using the integration method with go omission replacement and thank the reviewer for pointing this out. Please see response to comment 3.
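
      For readers unfamiliar with the integration method with replacement of go omissions, the calculation can be sketched as follows. This is a minimal illustration of the method as it is commonly described, not the code used in the study: the function name, the input layout, and the choice of rounding the rank up with `ceil` are assumptions made for this example.

```python
import math


def ssrt_integration(go_rts, stop_ssds, stop_responded):
    """Estimate SSRT with the integration method, replacing go omissions.

    go_rts         : go-trial reaction times; None marks a go omission
    stop_ssds      : stop-signal delay of each stop trial (same unit as go_rts)
    stop_responded : True where the participant responded on a stop trial
                     (i.e. a failed stop)
    """
    observed = [rt for rt in go_rts if rt is not None]
    if not observed or not stop_ssds:
        raise ValueError("need at least one go response and one stop trial")

    # Go omissions are replaced by the slowest observed go RT.
    slowest = max(observed)
    go_distribution = sorted(slowest if rt is None else rt for rt in go_rts)

    # p(respond | stop signal) picks the quantile of the go RT distribution.
    p_respond = sum(1 for r in stop_responded if r) / len(stop_responded)
    rank = math.ceil(p_respond * len(go_distribution))  # assumed rounding rule
    rank = min(max(rank, 1), len(go_distribution))
    nth_rt = go_distribution[rank - 1]

    mean_ssd = sum(stop_ssds) / len(stop_ssds)
    return nth_rt - mean_ssd
```

      Unlike the mean method, this estimate uses the observed probability of responding on stop trials, so it does not rely on that probability being exactly 0.5.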

      - The authors chose 3.1 as a z-score to "ensure conservatism", but since they are essentially trying to prove the null hypothesis that there is no increased STN activity on SS trials, I would suggest erring on the side of a more lenient threshold to avoid type-2 error. 

      We have used minimum FDR-corrected thresholds for each contrast now, instead of using a blanket conservative threshold of 3.1 over all contrasts. The new thresholds for each contrast are shown in text. Please see below (page 12):

      “The thresholds for each contrast are as follows: 3.01 for FS > GO, 2.26 for FS > SS and 3.1 for SS > GO.”
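
      As an illustration of how a per-contrast threshold of this kind can arise, the following is a minimal sketch of a Benjamini-Hochberg step-up procedure applied to a list of z-values. It is not the authors' code: the function name `fdr_z_threshold`, the one-sided test, and the level q = 0.05 are assumptions made for the example, and the actual analysis software may differ in these choices.

```python
from statistics import NormalDist


def fdr_z_threshold(z_values, q=0.05):
    """Return the smallest z-value that survives Benjamini-Hochberg FDR
    control at level q (one-sided), or None if nothing survives.

    Hypothetical helper for illustration only.
    """
    norm = NormalDist()
    # Convert each z-value to a one-sided p-value and sort by ascending p.
    pairs = sorted((1.0 - norm.cdf(z), z) for z in z_values)
    m = len(pairs)
    threshold = None
    for rank, (p, z) in enumerate(pairs, start=1):
        # BH criterion: keep the largest rank k with p_(k) <= q * k / m.
        if p <= q * rank / m:
            threshold = z
    return threshold


# Toy map with three strong values among mostly null values.
toy_z = [4.0, 3.5, 3.0, 1.2, 1.0, 0.5, 0.2, 0.1, 0.0, -0.5]
print(fdr_z_threshold(toy_z))
```

      Because the cut-off depends on the distribution of values within each map, different contrasts can end up with different thresholds, as in the values quoted above.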

      - The authors state that "The results presented here add to a growing literature exposing inconsistencies in our understanding of the networks underlying successful response inhibition". It would be helpful if the authors cited these studies and what those inconsistencies are. 

      We thank reviewer 1 for their detailed and thorough evaluation of our paper. Overall, we agree that there is substantial direct and indirect evidence for the involvement of the cortico-basal-ganglia pathways in response inhibition. We have taken the vast constructive criticism on board and agree with the reviewer that the paper should be reframed. We would like to thank the reviewer for the thoroughness of their helpful comments aiding the revising of the paper.

      (1) I would suggest reframing the study, abstract, discussion, and title to reflect the fact that the study shows that fMRI is unsuitable to study subcortical activity in the SST, rather than the fact that we need to question the subcortical model of inhibition, given the reasons in my public review.

      We agree with the reviewer that the article should be reframed and not taken as direct evidence against the large sum of literature pointing towards the involvement of the cortico-basal-ganglia pathway in response inhibition. We have significantly rewritten the article in light of this.

      (2) I suggest combining the datasets that provide the best imaging parameters and then analyzing the subcortical ROIs with a more lenient threshold and with regressors time-locked to the stop-signals (if that's not already the case). This would make the claim of a null finding much more impactful. Some sort of power analysis and/or Bayes factor analysis of evidence for the null would also be appreciated. 

      Rather than using a blanket conservative threshold of 3.1, we used only FDR-corrected thresholds. The threshold level is therefore different for each contrast and is noted in the figures. We have also added supplementary figures including the group-level SPMs and ROI analyses when the FS and SS trials were time-locked to the stop signal instead of the GO signal (Supplementary Figs 4 & 5). But as mentioned above, due to the difference in time points when contrasting, we believe that time-locking to the GO signal for all trial types makes more sense for the main analysis.

      We have now also computed BFs on the first-level ROI beta estimates for all contrasts using the BayesFactor package as implemented in R. We added the following section to the methods and updated the results section accordingly (page 8):

      “In addition to the frequentist analysis we also opted to compute Bayes Factors (BFs) for each contrast per ROI per hemisphere. To do this, we extracted the beta weights for each individual trial type from our first level model. We then compared the beta weights from each trial type to one another using the ‘BayesFactor’ package as implemented in R (Morey & Rouder, 2015). We compared the full model comprising trial type, dataset and subject as predictors to the null model comprising only the dataset and subject as predictors. The datasets and subjects were modeled as random factors. We divided the resultant BFs from the full model by the null model to provide evidence for or against a significant difference in beta weights for each trial type. To interpret the BFs, we used a modified version of Jeffreys’ scale (Jeffreys, 1939; Lee & Wagenmakers, 2014).”

      (3) I suggest calculating SSRT using the integration method with the replacement of Go omissions, as per the most recent recommendation (Verbruggen et al., eLife 2019).

      We agree we should have used a more optimal method for SSRT estimation. We have replaced our original estimates with those from the integration method with replacement of go omissions, as suggested, and adapted the results in Table 3.

      We have also replaced text in the methods sections to reflect this (page 5):

      “For each participant, the SSRT was calculated using the mean method, estimated by subtracting the mean SSD from median go RT (Aron & Poldrack, 2006; Logan & Cowan, 1984).”

      Now reads:

      “For each participant, the SSRT was calculated using the integration method with replacement of go omissions (Verbruggen et al., 2019), estimated by integrating the RT distribution and calculating the point at which the integral equals p(respond|signal). The completion time of the stop process aligns with the nth RT, where n equals the number of RTs in the RT distribution of go trials multiplied by the probability of responding to a signal.”
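
      The procedure described in the revised text can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `ssrt_integration` and the input format (go RTs in ms with `None` marking omissions) are assumptions, and go omissions are replaced with the participant's maximum observed go RT, following the Verbruggen et al. (2019) recommendation.

```python
import math


def ssrt_integration(go_rts, stop_responded, ssds):
    """Estimate SSRT (ms) with the integration method, replacing go omissions.

    go_rts         : RT in ms for every go trial, None for an omission
    stop_responded : one bool per stop trial, True if a response was made
    ssds           : stop-signal delay in ms for every stop trial
    Hypothetical helper for illustration only.
    """
    observed = [rt for rt in go_rts if rt is not None]
    if not observed or not stop_responded or not ssds:
        raise ValueError("need at least one go RT and one stop trial")

    # Replace each go omission with the maximum observed go RT.
    max_rt = max(observed)
    filled = sorted(max_rt if rt is None else rt for rt in go_rts)

    # p(respond | signal): proportion of stop trials with a response.
    p_respond = sum(stop_responded) / len(stop_responded)

    # nth RT, where n = number of go RTs * p(respond | signal).
    n = math.ceil(p_respond * len(filled))
    n = min(max(n, 1), len(filled))
    nth_rt = filled[n - 1]

    # SSRT = nth RT minus the mean stop-signal delay.
    return nth_rt - sum(ssds) / len(ssds)


go = [400, 450, 500, 550, None, 600, 650, 700, 480, 520]
print(ssrt_integration(go, [True, False, True, False], [200, 250, 200, 250]))
```

      In this toy example the omission is filled with 700 ms, p(respond|signal) is 0.5, the fifth of ten sorted RTs is 520 ms, and the mean SSD is 225 ms, giving an SSRT of 295 ms.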

      Reviewer #2:

      This work aggregates data across 5 openly available stopping studies (3 at 7 tesla and 2 at 3 tesla) to evaluate activity patterns across the common contrasts of Failed Stop (FS) > Go, FS > stop success (SS), and SS > Go. Previous work has implicated a set of regions that tend to be positively active in one or more of these contrasts, including the bilateral inferior frontal gyrus, preSMA, and multiple basal ganglia structures. However, the authors argue that upon closer examination, many previous papers have not found subcortical structures to be more active on SS than FS trials, bringing into question whether they play an essential role in (successful) inhibition. In order to evaluate this with more data and power, the authors aggregate across five datasets and find many areas that are *more* active for FS than SS, specifically bilateral preSMA, caudate, GPE, thalamus, and VTA, and unilateral M1, GPi, putamen, SN, and STN. They argue that this brings into question the role of these areas in inhibition, based upon the assumption that areas involved in inhibition should be more active on successful stop than failed stop trials, not the opposite as they observed. 

      As an empirical result, I believe that the results are robust, but this work does not attempt a new theoretical synthesis of the neuro-cognitive mechanisms of stopping. Specifically, if these many areas are more active on failed stop than successful stop trials, and (at least some of) these areas are situated in pathways that are traditionally assumed to instantiate response inhibition like the hyperdirect pathway, then what function are these areas/pathways involved in? I believe that this work would make a larger impact if the authors endeavored to synthesize these results into some kind of theoretical framework for how stopping is instantiated in the brain, even if that framework may be preliminary. 

      I also have one main concern about the analysis. The authors use the mean method for computing SSRT, but this has been shown to be more susceptible to distortion from RT slowing (Verbruggen, Chambers & Logan, 2013 Psych Sci), and goes against the consensus recommendation of using the integration with replacement method (Verbruggen et al., 2019). Therefore, I would strongly recommend replacing all mean SSRT estimates with estimates using the integration with replacement method. 

      I found the paper clearly written and empirically strong. As I mentioned in the public review, I believe that the main shortcoming is the lack of theoretical synthesis. I would encourage the authors to attempt to synthesize these results into some form of theoretical explanation. I would also encourage replacing the mean method with the integration with replacement method for computing SSRT. I also have the following specific comments and suggestions (in the approximate order in which they appear in the manuscript) that I hope can improve the manuscript: 

      We would like to thank reviewer 2 for their insightful and interesting comments. We have adapted our paper to reflect these comments. Please see direct responses to your comments below. We agree with the reviewer that some type of theoretical synthesis would help with the interpretability of the article. We have substantially reworked the discussion and included theoretical considerations behind the newer narrative. Please see all changes to the article indicated by red text.

      (1) The authors say "performance on successful stop trials is quantified by the stop signal reaction time". I don't think this is technically accurate. SSRT is a measure of the average latency of the stop process for all trials, not just for the trials in which subjects successfully stop. 

      Thank you for pointing out this technically incorrect statement. We have replaced the above sentence with the following (page 1):

      “Inhibition performance in the SST as a whole is quantified by the stop signal reaction time (SSRT), which estimates the speed of the latent stopping process (Verbruggen et al., 2019).”

      (2) The authors say "few studies have detected differences in the BOLD response between FS and SS trials", but then do not cite any papers that detected differences until several sentences later (de Hollander et al., 2017; Isherwood et al., 2023; Miletic et al., 2020). If these are the only ones, and they only show greater FS than SS, then I think this point could be made more clearly and directly. 

      We have moved the citations to the correct place in the text to be clearer. We have also rephrased this part of the introduction to make the points more direct (page 2).

      “In the subcortex, functional evidence is relatively inconsistent. Some studies have found an increase in BOLD response in the STN in SS > GO contrasts (Aron & Poldrack, 2006; Coxon et al., 2016; Gaillard et al., 2020; Yoon et al., 2019), but others have failed to replicate this (Bloemendaal et al., 2016; Boehler et al., 2010; Chang et al., 2020; B. Xu et al., 2015). Moreover, some studies have actually found higher STN, SN and thalamic activation in failed stop trials, not successful ones (de Hollander et al., 2017; Isherwood et al., 2023; Miletić et al., 2020).”

      (3) Unless I overlooked it, I don't believe that the authors specified the criterion that any given subject was excluded based upon. Given some studies have significant exclusions (e.g., Poldrack_3T), I think being clear about how many subjects violated each criterion would be useful. 

      This is indeed interesting and important information to include. We have added the number of participants who were excluded for each criterion. Please see added text below (page 4):

      “Based on these criteria, no subjects were excluded from the Aron_3T dataset. 24 subjects were excluded from the Poldrack_3T dataset (3 based on criterion 1, 9 on criterion 2, 11 on criterion 3, and 8 on criterion 4). Three subjects were excluded from the deHollander_7T dataset (2 based on criterion 1 and 1 on criterion 2). Five subjects were excluded from the Isherwood_7T dataset (2 based on criterion 1, 1 on criterion 2, and 2 on criterion 4). Two subjects were excluded from the Miletic_7T dataset (1 based on criterion 2 and 1 on criterion 4). Note that some participants in the Poldrack_3T study failed to meet multiple inclusion criteria.”

      (4) The Method section included very exhaustive descriptions of the neuroimaging processing pipeline, which was appreciated. However, it seems that much of what is presented is not actually used in any of the analyses. For example, it seems that "functional data preprocessing" section may be fMRIPrep boilerplate, which again is fine, but I think it would help to clarify that much of the preprocessing was not used in any part of the analysis pipeline for any results. For example, at first blush, I thought the authors were using global signal regression, but after a more careful examination, I believe that they are only computing global signals but never using them. Similarly with tCompCor seemingly being computed but not used. If possible, I would recommend that the authors share code that instantiates their behavioral and neuroimaging analysis pipeline so that any confusion about what was actually done could be programmatically verified. At a minimum, I would recommend more clearly distinguishing the pipeline steps that actually went into any presented analyses.

      We thank the reviewer for finding this inconsistency. The methods section indeed uses the fMRIPrep boilerplate text, which we included so as to be as accurate as possible when describing the preprocessing steps taken. While we believe leaving the exact boilerplate text that fMRIPrep gives us is the most accurate way to show our preprocessing, we have adapted some of the text to clarify which computations were not used in the subsequent analysis. As a side note, for future reference, we’d like to add that the fMRIPrep authors expressly recommend that users report the boilerplate completely and unaltered, and as such, we believe this may become a recurring issue (page 7).

      “While many regressors were computed in the preprocessing of the fMRI data, not all were used in the subsequent analysis. The exact regressors used for the analysis can be found above. For example, tCompCor and global signals were calculated in our generic preprocessing pipeline but not part of the analysis. The code used for preprocessing and analysis can be found in the data and code availability statement.”

      (5) What does it mean for the Poldrack_3T to have N/A for SSD range? Please clarify. 

      Thank you for pointing out this omission. We had not yet found the possible SSD range for this study. We have replaced this value with the correct value (0 – 1000 ms).

      (6) The SSD range of 0-2000ms for deHollander_7T and Miletic_7T seems very high. Was this limit ever reached or even approached? SSD distributions could be a useful addition to the supplement. 

      Thank you for also bringing this mistake to light. We had accidentally placed the max trial duration in these fields instead of the max allowable SSD value. We have replaced it with the correct value (0 – 900 ms).

      (7) The authors say "In addition, median go RTs did not correlate with mean SSRTs within datasets (Aron_3T: r = .411, p = .10, BF = 1.41; Poldrack_3T: r = .011, p = .91, BF = .23; deHollander_7T: r = -.30, p = .09, BF = 1.30; Isherwood_7T: r = .13, p = .65, BF = .57; Miletic_7T: r = .37, p = .19, BF = 1.02), indicating independence between the stop and go processes, an important assumption of the horse-race model (Logan & Cowan, 1984)." However, the independent race model assumes context independence (the finishing time of the go process is not affected by the presence of the stop process) and stochastic independence (the duration of the go and stop processes are independent on a given trial). This analysis does not seem to evaluate either of these forms of independence, as it correlates RT and SSRT across subjects, so it was unclear how this analysis evaluated either of the types of independence that are assumed by the independent race model. Please clarify or remove. 

      Thank you for this comment. We realize that this analysis indeed does not evaluate either context or stochastic independence and therefore we have removed this from the manuscript.

      (8) The RTs in Isherwood_7T are considerably slower than the other studies, even though the go stimulus+response is the same (very simple) stimulus-response mapping from arrows to button presses. Is there any difference in procedure or stimuli that might explain this difference? It is the only study with a visual stop signal, but to my knowledge, there is no work suggesting visual stop signals encourage more proactive slowing. If possible, I think a brief discussion of the unusually slow RTs in Isherwood_7T would be useful. 

      We have included the following text in the manuscript to reflect this observed difference in RT between the Isherwood_7T dataset and the other datasets (page 9).

      “Longer RTs were found in the Isherwood_7T dataset in comparison to the four other datasets. The only difference in procedure in the Isherwood_7T dataset is the use of a visual stop signal as opposed to an auditory stop signal. This RT difference is consistent with previous research, where auditory stop signals and visual go stimuli have been associated with faster RTs compared to unimodal visual presentation (Carrillo-de-la-Peña et al., 2019; Weber et al., 2024). The mean SSRTs and probability of stopping are within normal range, indicating that participants understood the task and responded in the expected manner.”

      (9) When the authors included both 3T and 7T data, I thought they were preparing to evaluate the effect of magnet strength on stop networks, but they didn't do this analysis. Is this because the authors believe there is insufficient power? It seems that this could be an interesting exploratory analysis that could improve the paper.

      We thank the reviewer for this interesting comment. As our dataset sample contains only two 3T and three 7T datasets we indeed believe there is insufficient power to warrant such an analysis. In addition, we wanted the focus of this paper to be how fMRI examines the SST in general, and not differences between acquisition methods. With a greater number of datasets with different imaging parameters (especially TE or resolution) in addition to field strength, we agree such an analysis would be interesting, although beyond the scope of this article.

      (10) The authors evaluate smoothing and it seems that the conclusion that they want to come to is that with a larger smoothing kernel, the results in the stop networks bleed into surrounding areas, producing false positive activity. However, in the absence of a ground truth of the true contributions of these areas, it seems that an alternative interpretation of the results is that the denser maps when using a larger smoothing kernel could be closer to "true" activation, with the maps using a smaller smoothing kernel missing some true activity. It seems worth entertaining these two possible interpretations for the smoothing results unless there is clear reason to conclude that the smoothed results are producing false positive activity. 

      We agree with the view of the reviewer on the interpretation of the smoothing results. We indeed cannot rule this out as a possible interpretation of the results, due to a lack of ground truth. We have added text to the article to reflect this view and discuss the types of errors we can expect for both smaller and larger smoothing kernels (page 15).

      “In the absence of a ground truth, we are not able to fully justify the use of either larger or smaller kernels to analyse such data. On the one hand, aberrantly large smoothing kernels could lead to false positives in activation profiles, due to bleeding of observed activation into surrounding tissues. On the other hand, too little smoothing could lead to false negatives, missing some true activity in surrounding regions. While we cannot concretely validate either choice, it should be noted that there is lower spatial uncertainty in the subcortex compared to the cortex, due to the lower anatomical variability. False positives from smoothing spatially unmatched signal are more likely than false negatives. It may be more prudent for studies to use a range of smoothing kernels, to assess the robustness of their fMRI activation profiles.”

    1. eLife assessment

      This important study provides a new perspective on why preparatory activity occurs before the onset of movement. The authors report that when there is a cost on the inputs, the optimal inputs should start before the desired network output for a wide variety of recurrent networks. The authors present convincing evidence by combining mathematically tractable analyses in linear networks and numerical simulation in nonlinear networks.

    2. Reviewer #1 (Public Review):

      In this work, the authors investigate an important question - under what circumstances should a recurrent neural network optimised to produce motor control signals receive preparatory input before the initiation of a movement, even though it is possible to use inputs to drive activity just-in-time for movement?

      This question is important because many studies across animal models have shown that preparatory activity is widespread in neural populations close to motor output (e.g. motor cortex / M1), but it isn't clear under what circumstances this preparation is advantageous for performance, especially since preparation could cause unwanted motor output during a delay.

      They show that networks optimised under reasonable constraints (speed, accuracy, lack of pre-movement) will use input to seed the state of the network before movement, and that these inputs reduce the need for ongoing input during the movement. By examining many different parameters in simplified models they identify a strong connection between the structure of the network and the amount of preparation that is optimal for control - namely, that preparation has the most value when nullspaces are highly observable relative to the readout dimension and when the controllability of readout dimensions is low. They conclude by showing that their model predictions are consistent with the observation in monkey motor cortex that even when a sequence of two movements is known in advance, preparatory activity only arises shortly before movement initiation.

      Overall, this study provides valuable theoretical insight into the role of preparation in neural populations that generate motor output, and by treating input to motor cortex as a signal that is optimised directly this work is able to sidestep many of the problematic questions relating to estimating the potential inputs to motor cortex.

    3. Reviewer #2 (Public Review):

      This work clarifies neural mechanisms that can lead to a phenomenology consistent with motor preparation in its broader sense. In this context, motor preparation refers to activity that occurs before the corresponding movement. Another property often associated with preparatory activity is a correlation with global movement characteristics such as reach speed (Churchland et al., Neuron 2006), reach angle (Sun et al., Nature 2022), or grasp type (Meirhaeghe et al., Cell Reports 2023). Such activity has notably been observed in premotor and primary motor cortices, and it has been hypothesized to serve as an input to a motor execution circuit. The timing and mechanisms by which such 'preparatory' inputs are made available to motor execution circuits remain, however, unclear in general, especially in light of the presence of a 'trigger-like' signal that appears to relate to the transition from preparatory dynamics to execution activity (Kaufman et al. eNeuron 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021).

      The preparatory inputs have been hypothesized to fulfill one or several (non-mutually-exclusive) possible objectives. Two notable hypotheses are that these inputs could be shaped to maximize output accuracy under regularization of the input magnitude; or that they may help the flexible re-use of the neural machinery involved in the control of movements in different contexts.

      Here, the authors investigate in detail how the former hypothesis may be compatible with the presence of early inputs in recurrent network models driving arm movements, and compare models to data.

      Strengths:

      The authors are able to deploy an in-depth evaluation of inputs that are optimized for producing an accurate output at a pre-defined time while using a regularization term on the input magnitude, in the case of movements that are thought to be controlled in a quasi-open loop fashion such as reaches.

      First, the authors have identified that optimal control theory is a great framework to study this question as it provides methods to find and analyze exact solutions to this cost function in the case of models with linear dynamics. The authors not only use this framework to get an exact assessment of how much pre-movement input arises in large recurrent networks, but also give insight into the mechanisms by which it happens by dissecting in detail low-dimensional networks. The authors find that two key network properties - observability of the readout's nullspace and limited controllability - give rise to optimal inputs that are large before the start of the movement (while the corresponding network activity lies in the nullspace of the readout). Further, the authors numerically investigate the timing of optimized inputs in models with nonlinear dynamics, and find that pre-movement inputs can also arise in these more general networks. The authors also explore how some variations on their model's constraints - such as penalizing the input roughness or changing task contingencies about the go cue timing - affect their results. Finally, the authors point out some coarse-grained similarities between the pre-movement activity driven by the optimized inputs in some of the models they studied, and the phenomenology of preparation observed in the brain during single reaches and reach sequences. Overall, the authors deploy an impressive arsenal of tools and a very in-depth analysis of their models.

      Limitations:

      (1) Though the optimal control theory framework is ideal to determine inputs that minimize output error while regularizing the input norm or other simple input features, it cannot easily account for some other varied types of objectives - especially those that may lead to a complex optimization landscape. For instance, the reusability of parts of the circuit, sparse use of additional neurons when learning many movements, and ease of planning (especially under uncertainty about when to start the movement), may be alternative or additional reasons that could help explain the preparatory activity observed in the brain. It is interesting to note that inputs that optimize the objective chosen by the authors arguably lead to a trade-off in terms of other desirable objectives. Specifically, the inputs the authors derive are time-dependent, so a recurrent network would be needed to produce them and it may not be easy to interpolate between them to drive new movement variants. In addition, these inputs depend on the desired time of output and therefore make it difficult to plan, e.g. in circumstances when timing should be decided depending on sensory signals. Finally, these inputs are specific to the full movement chain that will unfold, so they do not permit reuse of the inputs e.g. in movement sequences of different orders. Of note, the authors have pointed out in the discussion how their framework may be extended in future work to account for some additional objectives, such as inputs' temporal smoothness or some strategies for dealing with go cue timing uncertainty.

      (2) Relatedly, if the motor circuits were to balance different types of objectives, the activity and inputs occurring before each movement may be broken down into different categories that may each specialize into their own objective. For instance, previous work (Kaufman et al. eNeuron 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021) has suggested that inputs occurring before the movement could be broken down into preparatory inputs 'stricto sensu' - relating to the planned characteristics of the movement - and a trigger signal, relating to the transition from planning to execution - irrespective of whether the movement is internally timed or triggered by an external event. The current work does not address which type(s) of early input may be labeled as 'preparatory' or may be thought of as a part of 'planning' computations, or whether these inputs may come from several different source circuits.

      (3) While the authors rightly point out some similarities between the inputs that they derive and observed preparatory activity in the brain, notably during motor sequences, there are also some differences. For instance, while both the derived inputs and the data show two peaks during sequences, the data reproduced from Zimnik and Churchland show preparatory inputs that have a very asymmetric shape that really plummets before the start of the next movement, whereas the derived inputs have larger amplitude during the movement period - especially for the second movement of the sequence. In addition, the data show trigger-like signals before each of the two reaches. Finally, while the data show a very high correlation between the pattern of preparatory activity of the second reach in the double reach and compound reach conditions, the derived inputs appear to be more different between the two conditions. Note that the data would be consistent with separate planning of the two reaches even in the compound reach condition, as well as the re-use of the preparatory input between the compound and double reach conditions. Therefore, different motor sequence datasets - notably, those that would show even more coarticulation between submovements - may be more promising to find a tight match between the data and the authors' inputs. Further analyses in these datasets could help determine whether the coarticulation could be due to simple filtering by the circuits and muscles downstream of M1, planning of movements with adjusted curvature to mitigate the work performed by the muscles while permitting some amount of re-use across different sequences, or - as suggested by the authors - inputs fully tailored to one specific movement sequence that maximize accuracy and minimize the M1 input magnitude.

      (4) Though iLQR is a powerful optimization method to find inputs optimizing the authors' cost function, it also has some limitations. First, given that it relies on a linearization of the dynamics at each timestep, it has a limited ability to leverage potential advantages of nonlinearities in the dynamics. Second, the iLQR algorithm is not a biologically plausible learning rule and therefore it might be difficult for the brain to learn to produce the inputs that it finds. Therefore, when observing differences between model and data, this can confound the question of whether it comes from a difference of assumed objective or a difference of optimization procedure. It remains unclear whether using alternative algorithms with different limitations - for instance, using variants of BPTT to train a separate RNN to produce the inputs in question - could impact some of the results.

      (5) Under the objective considered by the authors, the amount of input occurring before the movement might be impacted by the presence of online sensory signals for closed-loop control. Even if considering that the inputs could include some sensory activity and/or that the RNN activity could represent general variables whose states can be decoded from M1, the model would not include mechanisms that process imperfect (delayed, noisy) sensory feedback to adapt the output in a trial-specific manner. It is therefore an open question whether the objective and network characteristics suggested by the authors could also explain the presence of preparatory activity before e.g. grasping movements that are thought to be more sensory-driven (Meirhaeghe et al., Cell Reports 2023).

    4. Reviewer #3 (Public Review):

      I remain enthusiastic about this study. The manuscript is well-written, logical, and conceptually clear. To my knowledge, no prior modeling study has tackled the question of 'why prepare before executing, why not just execute?' Prior studies have simply assumed, to emulate empirical findings, that preparatory inputs precede execution. They never asked why. The authors show that, when there are constraints on inputs, preparation becomes a natural strategy. In contrast, with no constraint on inputs, there is no need for preparation as one could get anything one liked just via the inputs during movement. For the sake of tractability, the authors use a simple magnitude constraint: the cost function punishes the integral of the squared inputs. Thus, if small inputs before movement can reduce the size of the inputs needed during movement, preparation is a good strategy. This occurs if (and only if) the network has strong dynamics (otherwise feeding it preparatory activity would not produce anything interesting). All of this is sensible and clarifying.
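
      For readers who prefer to see the constraint written out, a schematic form of such a cost is given below. This is only a sketch consistent with the verbal description above, not the exact cost function used in the manuscript, which may contain additional terms. All symbols are assumptions introduced here: u(t) is the input to the network, y(t) the network output, y*(t) the target output, λ a weight on input magnitude, and t_0 a start time that is allowed to precede movement onset.

```latex
J(u) \;=\; \int_{t_0}^{T} \bigl\lVert y(t) - y^{*}(t) \bigr\rVert^{2} \, dt
\;+\; \lambda \int_{t_0}^{T} \bigl\lVert u(t) \bigr\rVert^{2} \, dt
```

      Under a cost of this form, small inputs delivered before movement are worthwhile whenever they reduce the second term during movement by more than they add to it beforehand.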

      As discussed in the prior round of reviews, the central constraint that the authors use is a mathematically tractable stand-in for a range of plausible (but often trickier to define and evaluate) constraints, such as simplicity of inputs (or inputs being things that other areas could provide). The manuscript now embraces this fact more explicitly, and also gives some results showing that other constraints (such as on the derivative of activity, which is one component of complexity) can have the same effect. The manuscript also now discusses and addresses a modest weakness of the previous manuscript: the preparatory activity in their simulations is often overly complex temporally, lacking the (rough) plateau typically seen for data. Depending on your point of view, this is simply 'window dressing', but from my perspective it was important to know that their approach could yield more realistic-looking preparatory activity. Both these additions (the new constraint, and the more realistic temporal profile of preparatory activity) are added simply as supplementary figures rather than in the main text, and are brought up only in the Discussion. At first this struck me as slightly odd, but in the end I think this is appropriate. These are really Discussion-type issues, and dealing with them there makes sense. The 'different constraints' issue in particular is deep, tricky to explore for technical reasons, and could thus support a small research program. I think it is fair to talk about it thoughtfully (as the Discussion now does) and then just mention some simple results.

      My remaining comments largely pertain to some subtle (but to me important) nuances at a few locations in the text. These should be easy for the authors to address, in whatever way they see fit.

      Specific comments:

      (1) The authors state the following on line 56: "For preparatory processes to avoid triggering premature movement, any pre-movement activity in the motor and dorsal pre-motor (PMd) cortices must carefully exclude those pyramidal tract neurons."

      This constraint is overly restrictive. PT neurons absolutely can change their activity during preparation in principle (and appear to do so in practice). The key constraint is looser: those changes should have no net effect on the muscles. E.g., if d is the vector of changes in PT neuron firing rates, and b is the vector of weights, then the constraint is that b'd = 0. d = 0 is one good way of doing this, but only one. Half the d's could go up and half could go down. Or they all go up, but half the b's are negative. Put differently, there is no reason the null space has to be upstream of the PT neurons. It could be partly, or entirely, downstream.

      In the end, this doesn't change the point the authors are making. It is still the case that d has to be structured to avoid causing muscle activity, which raises exactly the point the authors care about: why risk this unless preparation brings benefits? However, this point can be made with a more accurate motivation. This matters, because people often think that a null-space is a tricky thing to engineer, when really it is quite natural. With enough neurons, preparing in the null space is quite simple.

      (2) Line 167: 'near-autonomous internal dynamics in M1'.

      It would be good if such statements, early in the paper, could be modified to reflect the fact that the dynamics observed in M1 may depend on recurrence that is NOT purely internal to M1. A better phrase might be 'near-autonomous dynamics that can be observed in M1'. A similar point applies on line 13. This issue is handled very thoughtfully in the Discussion, starting on line 713. Obviously it is not sensible to also add multiple sentences making the same point early on. However, it is still worth phrasing things carefully, otherwise the reader may have the wrong impression up until the Discussion (i.e. they may think that both the authors, and prior studies, believe that all the relevant dynamics are internal to M1). If possible, it might also be worth adding one sentence, somewhere early, to keep readers from falling into this hole (and then being stuck there till the Discussion digs them out).

      (3) The authors make the point, starting on line 815, that transient (but strong) preparatory activity empirically occurs without a delay. They note that their model will do this but only if 'no delay' means 'no external delay'. For their model to prepare, there still needs to be an internal delay between when the first inputs arrive and when movement generating inputs arrive.

      This is not only a reasonable assumption, but is something that does indeed occur empirically. This can be seen in Figure 8c of Lara et al. Similarly, Kaufman et al. 2016 noted that "the sudden change in the CIS [the movement triggering event] occurred well after (~150 ms) the visual go cue... (~60 ms latency)" Behavioral experiments have also argued that internal movement-triggering events tend to be quite sluggish relative to the earliest they could be, causing RTs to be longer than they should be (Haith et al. Independence of Movement Preparation and Movement Initiation). Given this empirical support, the authors might wish to add a sentence indicating that the data tend to justify their assumption that the internal delay (separating the earliest response to sensory events from the events that actually cause movement to begin) never shrinks to zero.

      While on this topic, the Haith and Krakauer paper mentioned above is good to cite because it does ponder the question of whether preparation is really necessary. By showing that they could get RTs to shrink considerably before behavior became inaccurate, they showed that people normally (when not pressured) use more preparation time than they really need. Given Lara et al, we know that preparation does always occur, but Haith and Krakauer were quite right that it can be very brief. This helped -- along with neural results -- change our view of preparation from something more cognitive that had to occur, to something more mechanical that was simply a good network strategy, which is indeed the authors' current point. Working a discussion of this into the current paper may or may not make sense, but if there is a place where it is easy to cite, it would be appropriate.
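
      The null-space condition in specific comment (1), b'd = 0, can be illustrated with a few lines of arithmetic. The sketch below is only an illustration under assumed values: the four-neuron weight vectors and firing-rate changes are made-up numbers, and the helper names are hypothetical. It checks the three cases named in the comment (d = 0, half the d's up and half down, all d's up with half the b's negative) and also shows that an arbitrary d can be made output-null by removing its component along b.

```python
# Illustrative sketch only: b and d are made-up numbers, not data from the paper.

def net_muscle_effect(b, d):
    """Return b'd, the summed effect of firing-rate changes d through weights b."""
    return sum(bi * di for bi, di in zip(b, d))

def project_to_null_space(b, d):
    """Remove from d the component along b, so that the result satisfies b'd = 0."""
    scale = net_muscle_effect(b, d) / net_muscle_effect(b, b)
    return [di - scale * bi for bi, di in zip(b, d)]

b_uniform = [1.0, 1.0, 1.0, 1.0]    # all weights positive
b_mixed = [1.0, 1.0, -1.0, -1.0]    # half the weights negative

d_silent = [0.0, 0.0, 0.0, 0.0]     # no PT neuron changes its rate
d_balanced = [2.0, 2.0, -2.0, -2.0] # half go up, half go down
d_all_up = [2.0, 2.0, 2.0, 2.0]     # every neuron increases its rate
d_arbitrary = [3.0, 1.0, 4.0, 2.0]  # unstructured change, not output-null
d_projected = project_to_null_space(b_uniform, d_arbitrary)

print(net_muscle_effect(b_uniform, d_silent))
print(net_muscle_effect(b_uniform, d_balanced))
print(net_muscle_effect(b_mixed, d_all_up))
print(net_muscle_effect(b_uniform, d_arbitrary))
print(d_projected, net_muscle_effect(b_uniform, d_projected))
```

      With N neurons and a single readout, the set of output-null changes has N - 1 dimensions, which is one way to see why such a null space is easy to come by when neurons are plentiful.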

    5. Author response:

      The following is the authors’ response to the original reviews.

      General response:

      We thank all the reviewers for their detailed reviews.

      All reviewers made a number of valuable comments, in particular by highlighting several points that would benefit from additional clarifications and discussion. We really appreciate the time and effort that went into the reviews. We have updated the paper to reflect the changes we have made in response to the reviewers' comments (largely by including more discussion regarding the model limitations and the effect of various modeling choices). We have also included several new supplementary figures (S7, S8, S9, S10) that provide further details of the model behavior, and show the effect of changing some of the terms in the cost. Below, we go through the individual comments, and highlight the places in which we have made changes to address the reviewers’ comments.

      Reviewer 1:

      Thank you for your review and pointing out multiple things to be discussed and clarified! Below, we go through the various limitations you pointed out and refer to the places where we have tried to address them.

      (1) It's important to keep in mind that this work involves simplified models of the motor system, and often the terminology for 'motor cortex' and 'models of motor cortex' are used interchangeably, which may mislead some readers. Similarly, the introduction fails in many cases to state what model system is being discussed (e.g. line 14, line 29, line 31), even though these span humans, monkeys, mice, and simulations, which all differ in crucial ways that cannot always be lumped together.

      That is a good point. We have clarified this in the text (Introduction and Discussion), to highlight the fact that our model isn’t necessarily meant to just capture M1. We have also updated the introduction to make it more clear which species the experiments which motivate our investigation were performed in.

      (2) At multiple points in the manuscript thalamic inputs during movement (in mice) are used as a motivation for examining the role of preparation. However, there are other more salient motivations, such as delayed sensory feedback from the limb and vision arriving in the motor cortex, as well as ongoing control signals from other areas such as the premotor cortex.

      Yes – the motivation for thalamic inputs came from the fact that those have specifically been shown to be necessary for accurate movement generation in mice. However, it is true that the inputs in our model are meant to capture any signals external to the dynamical system modeled, and as such are likely to represent a mixture of sensory signals, and feedback from other areas. We have clarified this in the Discussion, and have added this additional motivation in the Introduction.

      (3) Describing the main task in this work as a delayed reaching task is not justified without caveats (by the authors' own admission: line 687), since each network is optimized with a fixed delay period length. Although this is mentioned to the reader, it's not clear enough that the dynamics observed during the delay period will not resemble those in the motor cortex for typical delayed reaching tasks.

      Yes, we completely agree that the terminology might be confusing. While the task we are modeling is a delayed reaching task, it does differ from the usual setting since the network has knowledge of the delay period, and that is indeed a caveat of the model. We have added a brief paragraph just after the description of the optimal control objective to highlight this limitation.

      We have also performed additional simulations using two different variants of a model-predictive control approach that allow us to relax the assumption that the go-cue time is known in advance. We show that these modifications of the optimal controller yield results that remain consistent with our main conclusions, and can in fact in some settings lead to preparatory activity plateaus during the preparation epoch as often found in monkey M1 (e.g. in Elsayed et al. 2016). We have modified the Discussion to explain these results and their limitations, which are summarized in a new Supplementary Figure (S9).

      (4) A number of simplifications in the model may have crucial consequences for interpretation.

      a) Even following the toy examples in Figure 4, all the models in Figure 5 are linear, which may limit the generalisability of the findings.

      While we agree that linear models may be too simplistic, many prior analyses of M1 data suggest that they are often good enough to capture key aspects of M1 dynamics; for example, the generative model underlying jPCA is linear, and Sussillo et al. (2015) showed that the internal activity of nonlinear RNN models trained to reproduce EMG data aligned best with M1 activity when heavily regularized; in this regime, the RNN dynamics were close to linear. Nevertheless, this linearity assumption is indeed convenient from a modeling viewpoint: the optimal control problem is more easily solved for linear network dynamics and the optimal trajectories are more consistent across networks. Indeed, we had originally attempted to perform the analyses of Figure 5 in the nonlinear setting, but found that while the results were overall similar to what we report in the linear regime, iLQR was occasionally trapped in local minima, resulting in more variable results, especially for inhibition-stabilized networks at the strongly connected end of the spectrum. Finally, Figure 5 is primarily meant to explore to what extent motor preparation can be predicted from basic linear control-theoretic properties of the Jacobian of the dynamics; in this regard, it made sense to work with linear RNNs (for which the Jacobian is constant).

      b) Crucially, there is no delayed sensory feedback in the model from the plant. Although this simplification is in some ways a strength, this decision allows networks to avoid having to deal with delayed feedback, which is a known component of closed-loop motor control and of motor cortex inputs and will have a large impact on the control policy.

      This comment resonates well with Reviewer 3's remark regarding the autonomous nature (or not) of M1 during movement. Rather than thinking of our RNN models as anatomically confined models of M1 alone, we think of them as models of the dynamics which M1 implements possibly as part of a broader network involving “inter-area loops and (at some latency) sensory feedback”, and whose state appears to be near-fully decodable from M1 activity alone. We have added a paragraph of Discussion on this important point.

      (5) A key feature determining the usefulness of preparation is the direction of the readout dimension. However, all readouts had a similar structure (random Gaussian initialization). Therefore, it would be useful to have more discussion regarding how the structure of the output connectivity would affect preparation, since the motor cortex certainly does not follow this output scheme.

      We agree with this limitation of our model — indeed one key message of Figure 4 is that the degree of reliance on preparatory inputs depends strongly on how the dynamics align with the readout. However, this strong dependence is somewhat specific to low-dimensional models; in higher-dimensional models (most of our paper), one expects that any random readout matrix C will pick out activity dimensions in the RNN that are sufficiently aligned with the most controllable directions of the dynamics to encourage preparation.

      We did consider optimizing C away (which required differentiating through the iLQR optimizer, which is possible but very costly), but the question inevitably arises what exactly should C be optimized for, and under what constraints (e.g. fixed norm or not). One possibility is to optimize C with respect to the same control objective that the control inputs are optimized for, and constrain its norm (otherwise, inputs to the M1 model, and its internal activity, could become arbitrarily small as C can grow to compensate). We performed this experiment (new Supplementary Figure S7) and obtained a similar preparation index; there was one notable difference, namely that the optimized readout modes led to greater observability compared to a random readout; thus, the same amount of “muscle energy” required for a given movement could now be produced by a smaller initial condition. In turn, this led to smaller control inputs, consistent with a lower control cost overall.

      Whilst we could have systematically optimized C away, we reasoned that (i) it is computationally expensive, and (ii) the way M1 affects downstream effectors is presumably “optimized” for much richer motor tasks than simple 2D reaching, such that optimizing C for a fixed set of simple reaches could lead to misleading conclusions. We therefore decided to stick with random readouts.

      Additional comments :

      (1) The choice of cost function seems very important. Is it? For example, penalising the square of u(t) may produce very different results than penalising the absolute value.

      Yes, the choice of cost function does affect the results, at least qualitatively. The absolute value of the inputs is a challenging cost to use, as iLQR relies on a local quadratic approximation of the cost function. However, we have included additional experiments in which we penalized the squared derivative of the inputs (Supplementary Figure S8; see also our response to Reviewer 3's suggestion on this topic), and we do see differences in the qualitative behavior of the model (though the main takeaway, i.e. the reliance on preparation, continues to hold). This is now referred to and discussed in the Discussion section.

      (2) In future work it would be useful to consider the role of spinal networks, which are known to contribute to preparation in some cases (e.g. Prut and Fetz, 1999).

      (3) The control signal magnitude is penalised, but not the output torque magnitude, which highlights the fact that control in the model is quite different from muscle control, where co-contraction would be a possibility and therefore a penalty of muscle activation would be necessary. Future work should consider the role of these differences in control policy.

      Thank you for pointing us to this reference! Regarding both of these concerns, we agree that the model could be greatly improved and made more realistic in future work (another avenue for this would be to consider a more realistic biophysical model, e.g. using the MotorNet library). We hope that the current Discussion, which highlights the various limitations of our modeling choices, makes it clear that a lot of these choices could easily be modified depending on the specific assumptions/investigation being performed.
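
      On additional comment (1) above, one way to see why an absolute-value penalty is awkward for a method that relies on local quadratic approximations of the cost is to compare the curvature of the two penalties. The sketch below is a toy illustration rather than the authors' code; the helper name and step size are assumptions. It estimates the second derivative by finite differences: for u^2 the curvature is the same everywhere, so a quadratic model of the cost is exact, whereas for |u| the curvature vanishes away from zero and grows without bound at the kink as the step size shrinks.

```python
# Toy illustration (not the authors' implementation): finite-difference curvature
# of a squared penalty versus an absolute-value penalty on a scalar input u.

def second_difference(f, u, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at u."""
    return (f(u + h) - 2.0 * f(u) + f(u - h)) / (h * h)

def squared_penalty(u):
    return u * u

def absolute_penalty(u):
    return abs(u)

for u in (-1.0, 0.0, 1.0):
    print(
        u,
        second_difference(squared_penalty, u),
        second_difference(absolute_penalty, u),
    )
```

      Away from zero the absolute-value penalty offers no curvature for a quadratic model to use, and at zero the estimate scales as 2/h, which is consistent with the authors' remark that this cost is challenging to use with iLQR.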

      Reviewer 2:

      Thank you for your positive review! We very much agree with the limitations you pointed out, some of which overlapped with the comments of the other reviewers. We have done our best to address them through additional discussion and new supplementary figures. We briefly highlight below where those changes can be found.

      (1) Though the optimal control theory framework is ideal to determine inputs that minimize output error while regularizing the input norm, it however cannot easily account for some other varied types of objectives especially those that may lead to a complex optimization landscape. For instance, the reusability of parts of the circuit, sparse use of additional neurons when learning many movements, and ease of planning (especially under uncertainty about when to start the movement), may be alternative or additional reasons that could help explain the preparatory activity observed in the brain. It is interesting to note that inputs that optimize the objective chosen by the authors arguably lead to a trade-off in terms of other desirable objectives. Specifically, the inputs the authors derive are time-dependent, so a recurrent network would be needed to produce them and it may not be easy to interpolate between them to drive new movement variants. In addition, these inputs depend on the desired time of output and therefore make it difficult to plan, e.g. in circumstances when timing should be decided depending on sensory signals. Finally, these inputs are specific to the full movement chain that will unfold, so they do not permit reuse of the inputs e.g. in movement sequences of different orders.

      Yes, that is a good point! We have incorporated further Discussion related to this point. We have additionally included a new example in which we regularize the temporal complexity of the inputs (see also our response to Reviewer 3's suggestion on this topic), which leads to more slowly varying inputs, and may indeed represent a more realistic constraint and lead to simpler inputs that can more easily be interpolated between. We also agree that uncertainty about the upcoming go cue may play an important role in the strategy adopted by the animals. While we have not performed an extensive investigation of the topic, we have included a Supplementary Figure (S9) in which we used Model Predictive Control to investigate the effect of planning under uncertainty about the go cue arrival time. We hope that this will give the reader a better sense of what sort of model extensions are possible within our framework.

      (2) Relatedly, if the motor circuits were to balance different types of objectives, the activity and inputs occurring before each movement may be broken down into different categories that may each specialize into one objective. For instance, previous work (Kaufman et al. eNeuron 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021) has suggested that inputs occurring before the movement could be broken down into preparatory inputs 'stricto sensu' - relating to the planned characteristics of the movement - and a trigger signal, relating to the transition from planning to execution - irrespective of whether the movement is internally timed or triggered by an external event. The current work does not address which type(s) of early input may be labeled as 'preparatory' or may be thought of as a part of 'planning' computations.

      Yes, our model does indeed treat inputs in a very general way, and does not distinguish between the different types of processes they may be composed of. This is partly because we do not explicitly model where the inputs come from, such that our inputs likely encompass multiple processes. We have added discussion related to this point.

      (3) While the authors rightly point out some similarities between the inputs that they derive and observed preparatory activity in the brain, notably during motor sequences, there are also some differences. For instance, while both the derived inputs and the data show two peaks during sequences, the data reproduced from Zimnik and Churchland show preparatory inputs that have a very asymmetric shape that really plummets before the start of the next movement, whereas the derived inputs have larger amplitude during the movement period - especially for the second movement of the sequence. In addition, the data show trigger-like signals before each of the two reaches. Finally, while the data show a very high correlation between the pattern of preparatory activity of the second reach in the double reach and compound reach conditions, the derived inputs appear to be more different between the two conditions. Note that the data would be consistent with separate planning of the two reaches even in the compound reach condition, as well as the re-use of the preparatory input between the compound and double reach conditions. Therefore, different motor sequence datasets - notably, those that would show even more coarticulation between submovements - may be more promising to find a tight match between the data and the authors' inputs. Further analyses in these datasets could help determine whether the coarticulation could be due to simple filtering by the circuits and muscles downstream of M1, planning of movements with adjusted curvature to mitigate the work performed by the muscles while permitting some amount of re-use across different sequences, or - as suggested by the authors - inputs fully tailored to one specific movement sequence that maximize accuracy and minimize the M1 input magnitude.

      Regarding the exact shape of the occupancy plots, it is important to note that some of the more qualitative aspects (e.g. the relative height of the two peaks) will change if we change the parameters of the cost function. Right now, we have chosen the parameters to ensure that both reaches would be performed at roughly the same speed (as a way to very loosely constrain the parameters based on the observed behavior). However, small changes to the hyperparameters can lead to changes in the model output (e.g. one of the two consecutive reaches being performed using greater acceleration than the other), and since our biophysical model is fairly simple, changes in the behavior are directly reflected in the network activity. Essentially, what this means is that while the double occupancy is a consistent feature of the model, the exact shape of the peaks is more sensitive to hyperparameters, and we do not wish to draw any strong conclusions from them, given the simplicity of the biophysical model. However, we do agree that our model exhibits some differences with the data. As discussed above, we have included additional discussion regarding the potential existence of separate inputs for planning vs triggering the movement in the context of single reaches.

      Overall, we are excited about the suggestions made by the Reviewer here about using our approach to analyze other motor sequence datasets, but we think that in order to do this properly, one would need to adopt a more realistic musculo-skeletal model (such as one provided by MotorNet).

      (4) Though iLQR is a powerful optimization method to find inputs optimizing the authors' cost function, it also has some limitations. First, given that it relies on a linearization of the dynamics at each timestep, it has a limited ability to leverage potential advantages of nonlinearities in the dynamics. Second, the iLQR algorithm is not a biologically plausible learning rule and therefore it might be difficult for the brain to learn to produce the inputs that it finds. It remains unclear whether using alternative algorithms with different limitations - for instance, using variants of BPTT to train a separate RNN to produce the inputs in question - could impact some of the results.

      We agree that our choice of iLQR has limitations: while it offers the advantage of convergence guarantees, it does indeed restrict the choice of cost function and dynamics that we can use. We have now included extensive discussion of how the modeling choices affect our results.

      We do not view the lack of biological plausibility of iLQR as an issue, as the results are agnostic to the algorithm used for optimization. However, we agree that any structure imposed on the inputs (e.g. by enforcing them to be the output of a self-contained dynamical system) would likely alter the results. A potentially interesting extension of our model would be to do just what the reviewer suggested, and try to learn a network that can generate the optimal inputs. However, this is outside the scope of our investigation, as it would then lead to new questions (e.g. what brain region would that other RNN represent?).

      (5) Under the objective considered by the authors, the amount of input occurring before the movement might be impacted by the presence of online sensory signals for closed-loop control. It is therefore an open question whether the objective and network characteristics suggested by the authors could also explain the presence of preparatory activity before e.g. grasping movements that are thought to be more sensory-driven (Meirhaeghe et al., Cell Reports 2023).

      It is true that we aren’t currently modeling sensory signals explicitly. However, some of the optimal inputs we infer may be capturing upstream information which could encompass some sensory information. This is currently unclear, and would likely depend on how exactly the model is specified. We have added new discussion to emphasize that our dynamics should not be understood as just representing M1, but more general circuits whose state can be decoded from M1.

      Reviewer #2 (Recommendations For The Authors):

      Additionally, thank you for pointing out various typos in the manuscript, we have fixed those!

      Reviewer 3:

      Thank you very much for your review, which makes a lot of very insightful points, and raises several interesting questions. In summary, we very much agree with the limitations you pointed out. In particular, the choice of input cost is something we had previously discussed, but we had found it challenging to decide on what a reasonable cost for “complexity” could be. Following your comment, we have however added a first attempt at penalizing “temporal complexity”, which shows promising behavior. We have only included those additional analyses as supplementary figures, and we have included new discussion, which hopefully highlights what we meant by the different model components, and how the model behavior may change as we vary some of our choices. We hope this can be informative for future models that may use a similar approach. Below, we highlight the changes that we have made to address your comments.

      The main limitation of the study is that it focuses exclusively on one specific constraint - magnitude - that could limit motor-cortex inputs. This isn't unreasonable, but other constraints are at least as likely, if less mathematically tractable. The basic results of this study will probably be robust with regard to such issues - generally speaking, any constraint on what can be delivered during execution will favor the strategy of preparing - but this robustness cuts both ways. It isn't clear that the constraint used in the present study - minimizing upstream energy costs - is the one that really matters. Upstream areas are likely to be limited in a variety of ways, including the complexity of inputs they can deliver. Indeed, one generally assumes that there are things that motor cortex can do that upstream areas can't do, which is where the real limitations should come from. Yet in the interest of a tractable cost function, the authors have built a system where motor cortex actually doesn't do anything that couldn't be done equally well by its inputs. The system might actually be better off if motor cortex were removed. About the only thing that motor cortex appears to contribute is some amplification, which is 'good' from the standpoint of the cost function (inputs can be smaller) but hardly satisfying from a scientific standpoint.

      The use of a term that punishes the squared magnitude of control signals has a long history, both because it creates mathematical tractability and because it (somewhat) maps onto the idea that one should minimize the energy expended by muscles and the possibility of damaging them with large inputs. One could make a case that those things apply to neural activity as well, and while that isn't unreasonable, it is far from clear whether this is actually true (and if it were, why punish the square if you are concerned about ATP expenditure?). Even if neural activity magnitude is an important cost, any costs should pertain not just to inputs but to motor cortex activity itself. I don't think the authors really wish to propose that squared input magnitude is the key thing to be regularized. Instead, this is simply an easily imposed constraint that is tractable and acts as a stand-in for other forms of regularization / other types of constraints. Put differently, if one could write down the 'true' cost function, it might contain a term related to squared magnitude, but other regularizing terms would be very likely to dominate. Using only squared magnitude is a reasonable way to get started, but there are also ways in which it appears to be limiting the results (see below).

      I would suggest that the study explore this topic a bit. Is it possible to use other forms of regularization? One appealing option is to constrain the complexity of inputs; a long-standing idea is that the role of motor cortex is to take relatively simple inputs and convert them to complex time-evolving inputs suitable for driving outputs. I realize that exploring this idea is not necessarily trivial. The right cost-function term is not clear (should it relate to low-dimensionality across conditions, or to smoothness across time?) and even if it were, it might not produce a convex cost function. Yet while exploring this possibility might be difficult, I think it is important for two reasons.

      First, this study is an elegant exploration of how preparation emerges due to constraints on inputs, but at present that exploration focuses exclusively on one constraint. Second, at present there are a variety of aspects of the model responses that appear somewhat unrealistic. I suspect most of these flow from the fact that while the magnitude of inputs is constrained, their complexity is not (they can control every motor cortex neuron at both low and high frequencies). Because inputs are not complexity-constrained, preparatory activity appears overly complex and never 'settles' into the plateaus that one often sees in data. To be fair, even in data these plateaus are often imperfect, but they are still a very noticeable feature in the response of many neurons. Furthermore, the top PCs usually contain a nice plateau. Yet we never get to see this in the present study. In part this is because the authors never simulate the situation of an unpredictable delay (more on this below) but it also seems to be because preparatory inputs are themselves strongly time-varying. More realistic forms of regularization would likely remedy this.

      That is a very good point, and it mirrors several concerns that we had in the past. While we did focus on the input norm for the sake of simplicity, and because it represents a very natural way to regularize our control solutions, we agree that a “complexity cost” may be better suited to models of brain circuits. We have addressed this in a supplementary investigation. We chose to focus on a cost that penalizes the temporal complexity of the inputs, as ||u(t+1) - u(t)||^2. Note that this required augmenting the state of the model, making the computations quite a bit slower; while it is doable if we only penalize the first temporal derivative, it would not scale well to higher orders.
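
      The state augmentation mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes linear dynamics x(t+1) = A x(t) + B u(t) (the actual model is optimized with iLQR and need not be linear), it assumes the input before the first time step is zero, and the names `augment`, `smoothness_cost_matrices`, and `rollout_cost` are hypothetical. The idea is to append the previous input to the state, so that the penalty on ||u(t) - u(t-1)||^2 becomes an ordinary quadratic stage cost in the augmented state and the current input.

```python
import numpy as np

def augment(A, B):
    """Build augmented dynamics for z(t) = [x(t); u(t-1)] (assumed linear model)."""
    n, m = B.shape
    A_aug = np.block([[A, np.zeros((n, m))],
                      [np.zeros((m, n)), np.zeros((m, m))]])
    B_aug = np.vstack([B, np.eye(m)])  # the new input is stored in the state
    return A_aug, B_aug

def smoothness_cost_matrices(n, m):
    """Matrices such that ||u - u_prev||^2 = z'Qz + 2 z'Nu + u'Ru."""
    S = np.hstack([np.zeros((m, n)), np.eye(m)])  # selects u_prev from z
    Q = S.T @ S
    N = -S.T
    R = np.eye(m)
    return Q, N, R

def rollout_cost(A, B, x0, U):
    """Accumulate the input-difference penalty along a trajectory."""
    n, m = B.shape
    A_aug, B_aug = augment(A, B)
    Q, N, R = smoothness_cost_matrices(n, m)
    z = np.concatenate([x0, np.zeros(m)])  # assumption: u(-1) = 0
    total = 0.0
    for u in U:
        total += z @ Q @ z + 2 * z @ N @ u + u @ R @ u
        z = A_aug @ z + B_aug @ u
    return total, z
```

      The augmented state has dimension n + m rather than n, which is one way to see why the computations become slower; penalizing a k-th order temporal difference in the same manner would require storing k previous inputs.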

      Interestingly, we did find that the activity in that setting was somewhat more realistic (see new Supplementary Figure S8), with more sustained inputs and plateauing activity. While we have kept the original model for most of the investigations, the somewhat more realistic nature of the results under that setting suggests that further exploration of penalties of that sort could represent a promising avenue to improve the model.

      We also found the idea of a cost that would ensure low-dimensionality of the inputs across conditions very interesting. However, it is challenging to investigate with iLQR as we perform the optimization separately for each condition; nevertheless, it could be investigated using a different optimizer.

      At present, it is also not clear whether preparation always occurs even with no delay. Given only magnitude-based regularization, it wouldn't necessarily have to be. The authors should perform a subspace-based analysis like that in Figure 6, but for different delay durations. I think it is critical to explore whether the model, like monkeys, uses preparation even for zero-delay trials. At present it might or might not. If not, it may be because of the lack of more realistic constraints on inputs. One might then either need to include more realistic constraints to induce zero-delay preparation, or propose that the brain basically never uses a zero delay (it always delays the internal go cue after the preparatory inputs) and that this is a mechanism separate from that being modeled.

      I agree with the authors that the present version of the model, where optimization knows the exact time of movement onset, produces a reasonably realistic timecourse of preparation when compared to data from self-paced movements. At the same time, most readers will want to see that the model can produce realistic looking preparatory activity when presented with an unpredictable delay. I realize this may be an optimization nightmare, but there are probably ways to trick the model into optimizing to move soon, but then forcing it to wait (which is actually what monkeys are probably doing). Doing so would allow the model to produce preparation under the circumstances where most studies have examined it. In some ways this is just window-dressing (showing people something in a format they are used to and can digest) but it is actually more than that, because it would show that the model can produce a reasonable plateau of sustained preparation. At present it isn't clear it can do this, for the reasons noted above. If it can't, regularizing complexity might help (and even if this can't be shown, it could be discussed).

      In summary, I found this to be a very strong study overall, with a conceptually timely message that was well-explained and nicely documented by thorough simulations. I think it is critical to perform the test, noted above, of examining preparatory subspace activity across a range of delay durations (including zero) to see whether preparation endures as it does empirically. I think the issue of a more realistic cost function is also important, both in terms of the conceptual message and in terms of inducing the model to produce more realistic activity. Conceptually it matters because I don't think the central message should be 'preparation reduces upstream ATP usage by allowing motor cortex to be an amplifier'. I think the central message the authors wish to convey is that constraints on inputs make preparation a good strategy. Many of those constraints likely relate to the fact that upstream areas can't do things that motor cortex can do (else you wouldn't need a motor cortex) and it would be good if regularization reflected that assumption. Furthermore, additional forms of regularization would likely improve the realism of model responses, in ways that matter both aesthetically and conceptually. Yet while I think this is an important issue, it is also a deep and tricky one, and I think the authors need considerable leeway in how they address it. Many of the cost-function terms one might want to use may be intractable. The authors may have to do what makes sense given technical limitations. If some things can't be done technically, they may need to be addressed in words or via some other sort of non-optimization-based simulation.

      Specific comments

      As noted above, it would be good to show that preparatory subspace activity occurs similarly across delay durations. It actually might not, at present. For a zero ms delay, the simple magnitude-based regularization may be insufficient to induce preparation. If so, then the authors would either have to argue that a zero delay is actually never used internally (which is a reasonable argument) or show that other forms of regularization can induce zero-delay preparation.

      Yes, that is a very interesting analysis to perform, which we had not considered before! When investigating this, we found that the zero-delay strategy does not rely on preparation in the same way as is seen in the monkeys. This seems to be a reflection of the fact that our “Go cue” corresponds to an “internal” go cue which would likely come after the true, “external go cue” – such that we would indeed never actually be in the zero delay setting. This is not something we had addressed (or really considered) before, although we had tried to ensure we referred to “delta prep” as the duration of the preparatory period but not necessarily the delay period. We have now included more discussion on this topic, as well as a new Supplementary Figure S10.

      I agree with the authors that prior modeling work was limited by assuming the inputs to M1, which meant that prior work couldn't address the deep issue (tackled here) of why there should be any preparatory inputs at all. At the same time, the ability to hand-select inputs did provide some advantages. A strong assumption of prior work is that the inputs are 'simple', such that motor cortex must perform meaningful computations to convert them to outputs. This matters because if inputs can be anything, then they can just be the final outputs themselves, and motor cortex would have no job to do. Thus, prior work tried to assume the simplest inputs possible to motor cortex that could still explain the data. Most likely this went too far in the 'simple' direction, yet aspects of the simplicity were important for endowing responses with realistic properties. One such property is a large condition-invariant response just before movement onset. This is a very robust aspect of the data, and is explained by the assumption of a simple trigger signal that conveys information about when to move but is otherwise invariant to condition. Note that this is an implicit form of regularization, and one very different from that used in the present study: the input is allowed to be large, but constrained to be simple. Preparatory inputs are similarly constrained to be simple in the sense that they carry only information about which condition should be executed, but otherwise have little temporal structure. Arguably this produces slightly too simple preparatory-period responses, but the present study appears to go too far in the opposite direction. I would suggest that the authors do what they can to address these issues via simulations and/or discussion. I think it is fine if the conclusion is that there exist many constraints that tend to favor preparation, and that regularizing magnitude is just one easy way of demonstrating that. Ideally, other constraints would be explored. 
But even if they can't be, there should be some discussion of what is missing - preparatory plateaus, a realistic condition-invariant signal tied to movement onset - under the present modeling assumptions.

      As described above, we have now included two additional figures. In the first one (S8, already discussed above), we used a temporal smoothness prior, and we indeed get slightly more realistic activity plateaus. In a second supplementary figure (S9), we have also considered using model predictive control (MPC) to optimize the inputs under an uncertain go cue arrival time. There, we found that removing the assumption that the delay period is known came with new challenges: in particular, it requires the specification of a “mental model” of when the Go cue will arrive. While it is reasonable to expect that monkeys will have a prior over the go cue arrival time that will be shaped by the design of the experiment, some assumptions must be made about the utility functions that should be used to weigh this prior. For instance, if we imagine that monkeys carry a model of the possible arrival time of the go cue that is updated online, they could nonetheless act differently based on this information, for instance by either preparing so as to be ready for the earliest go cue possible or alternatively to be ready for the average go cue. This will likely depend on the exact task design and reward/penalty structure. Here, we added simulations with those two cases (making simplifying assumptions to make the problem tractable/solvable using model predictive control), and found that the “earliest preparation” strategy gives rise to more realistic plateauing activity, while the model where planning is done for the “most likely go time” does not. We suspect that more realistic activity patterns could be obtained by, e.g., combining this framework with the temporal smoothness cost. However, the main point we wished to make with this new supplementary figure is that it is possible to model the task in a slightly more realistic way (although here it comes at the cost of additional model assumptions). We have now added more discussion related to those points. 
Note that we have kept our analyses on these new models to a minimum, as the main takeaway we wish to convey from them is that most components of the model could be modified/made more realistic. This would impact the qualitative behavior of the system and match to data but – in the examples we have so far considered – does not appear to modify the general strategy of networks relying on preparation.

      On line 161, and in a few other places, the authors cite prior work as arguing for "autonomous internal dynamics in M1". I think it is worth being careful here because most of that work specifically stated that the dynamics are likely not internal to M1, and presumably involve inter-area loops and (at some latency) sensory feedback. The real claim of such work is that one can observe most of the key state variables in M1, such that there are periods of time where the dynamics are reasonably approximated as autonomous from a mathematical standpoint. This means that you can estimate the state from M1, and then there is some function that predicts the future state. This formal definition of autonomous shouldn't be conflated with an anatomical definition.

      Yes, that is a good point, thank you for making it so clearly! Indeed, as in previous work, we do not think of our “M1 dynamics” as being internal to M1; they may instead include sensory feedback and inter-area loops, which we summarize in the connectivity. We chose this connectivity so that the resulting dynamics qualitatively resemble data. We have now incorporated more discussion regarding what exactly the dynamics in our model represent.

    1. eLife assessment

      The valuable findings by Dasgupta et al demonstrate the role of Sema7a in fine tuning the morphology of the microcircuit between afferent axons and sensory hair cells in the lateral line organ. The loss and gain of function evidence provides solid support for a role for Sema7a in this process. Additional work is needed to determine the role for different isoforms in Sema7a-mediated synapse formation and chemoattraction as well as cell type specificity.

    2. Reviewer #1 (Public Review):

      Dasgupta et al. have dissected the role of Sema7a in fine tuning of a sensory microcircuit in the posterior lateral line organ of zebrafish. They attempt to also outline the different roles of a secreted versus membrane-bound form of Sema7a in this process. Using genetic perturbations and axonal network analysis, the authors show that loss of both Sema7a isoforms causes abnormal axon terminal structure with more bare terminals and fewer loops in contact with presynaptic sensory hair cells. Further, they show that loss of Sema7a causes decreased number and size of both the pre- and post-synapse. Finally, they show that overexpression of the secreted form of Sema7a specifically can elicit axon terminal outgrowth to an ectopic Sema7a expressing cell. Together, the analysis of Sema7a loss of function and overexpression on axon arbor structure is fairly thorough and revealed a novel role for Sema7a in axon terminal structure.

    3. Reviewer #2 (Public Review):

      In this work, Dasgupta et al. investigate the role of Sema7a in the formation of a peripheral sensory circuit in the lateral line system of zebrafish. They show that Sema7a protein is present during neuromast maturation and localized, in part, to the base of hair cells (HCs). This would be consistent with pre-synaptic Sema7a mediating formation and/or stabilization of the synapse. They use a sema7a loss-of-function strain to show that lateral line sensory terminals display abnormal arborization. They provide highly quantitative analysis of the lateral line terminal arborization to show that a number of specific topological parameters are affected in mutants. Next, they ectopically express a secreted form of Sema7a to show that lateral line terminals can be ectopically attracted to the source. Finally, they also demonstrate that the synaptic assembly is impaired in the sema7a mutant. Overall, the data are of high quality and properly controlled. The availability of a Sema7a antibody is a big plus, as it allows the authors to address the endogenous protein localization as well as to show the signal absence in the sema7a mutant. The quantification of the arbor topology should be useful to people in the field who are looking at the lateral line as well as other axonal terminals.

    4. Reviewer #3 (Public Review):

      The data reported here demonstrate that Sema7a defines the local behavior of growing axons in the developing zebrafish lateral line. The analysis is sophisticated and convincingly demonstrates effects on axon growth and synapse architecture. Collectively, the findings point to the idea that the diffusible form of sema7a may influence how axons grow within the neuromast and that the GPI-linked form of sema7a may subsequently impact how synapses form, though additional work is needed to strongly link each form to its proposed effect on circuit assembly.

      Comments on latest version:

      The authors comprehensively and appropriately addressed most of the reviewers' concerns. In particular, they added evidence that hair cells express both Sema7A isoforms, showed that membrane bound Sema7A does not have long range effects on guidance, demonstrated how axons behave close to ectopic Sema7A, and analyzed other features of the hair cells that revealed no strong phenotypes. The authors also softened the language in many, but not all places. Overall, I am satisfied with the study as a whole.

    5. Reviewer #4 (Public Review):

      This study provides direct evidence showing that Sema7a plays a role in the axon growth during the formation of peripheral sensory circuits in the lateral-line system of zebrafish. This is a valuable finding because the molecules for axon growth in hair-cell sensory systems are not well understood. The majority of the experimental evidence is convincing, and the analysis is rigorous. The evidence supporting Sema7a's juxtacrine vs. secreted role and involvement in synapse formation in hair cells is less conclusive. The study will be of interest to cell, molecular and developmental biologists, and sensory neuroscientists.

    6. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment 

      Dasgupta and colleagues make a valuable contribution to understanding how the guidance factor Sema7a promotes connections between mechanosensory hair cells and afferent neurons of the zebrafish lateral line system. The authors provide solid evidence that loss of Sema7a function results in fewer contacts between hair cells and afferents through comprehensive quantitative analysis. Additional work is needed to distinguish the effects of different isoforms of Sema7a to determine whether there are specific roles of secreted and membrane bound forms. 

      Public Reviews:

      Reviewer #1 (Public Review):

      Dasgupta et al. have dissected the role of Sema7a in fine tuning of a sensory microcircuit in the posterior lateral line organ of zebrafish. They attempt to also outline the different roles of a secreted versus membrane-bound form of Sema7a in this process. Using genetic perturbations and axonal network analysis, the authors show that loss of both Sema7a isoforms causes abnormal axon terminal structure with more bare terminals and fewer loops in contact with presynaptic sensory hair cells. Further, they show that loss of Sema7a causes decreased number and size of both the pre- and post-synapse. Finally, they show that overexpression of the secreted form of Sema7a specifically can elicit axon terminal outgrowth to an ectopic Sema7a expressing cell. Together, the analysis of Sema7a loss of function and overexpression on axon arbor structure is fairly thorough and revealed a novel role for Sema7a in axon terminal structure. However, the connection between different isoforms of Sema7a and the axon arborization needs to be substantiated. Furthermore, the effect of loss of Sema7a on the presynaptic cell is not ruled out as a contributing factor to the synaptic and axon structure phenotypes. These issues weaken the claims made by the authors including the statement that they have identified dual roles for the GPI-anchored versus secreted forms of Sema7a on synapse formation and as a chemoattractant for axon arborization respectively. 

      Reviewer #2 (Public Review):

      In this work, Dasgupta et al. investigate the role of Sema7a in the formation of a peripheral sensory circuit in the lateral line system of zebrafish. They show that Sema7a protein is present during neuromast maturation and localized, in part, to the base of hair cells (HCs). This would be consistent with pre-synaptic Sema7a mediating formation and/or stabilization of the synapse. They use a sema7a loss-of-function strain to show that lateral line sensory terminals display abnormal arborization. They provide highly quantitative analysis of the lateral line terminal arborization to show that a number of specific topological parameters are affected in mutants. Next, they ectopically express a secreted form of Sema7a to show that lateral line terminals can be ectopically attracted to the source. Finally, they also demonstrate that the synaptic assembly is impaired in the sema7a mutant. Overall, the data are of high quality and properly controlled. The availability of a Sema7a antibody is a big plus, as it allows the authors to address the endogenous protein localization as well as to show the signal absence in the sema7a mutant. The quantification of the arbor topology should be useful to people in the field who are looking at the lateral line as well as other axonal terminals. I think some results are overinterpreted though. The authors state: "Our findings demonstrate that Sema7A functions both as a juxtracrine and as a secreted cue to pattern neural circuitry during sensory organ development." However, they have not actually demonstrated which isoform functions in HCs (also see comments below). In addition, they have to be careful in interpreting their topology analysis, as they cannot separate individual axons. Thus, such analysis can generate artifacts. They can perform additional experiments to address these issues or adjust their interpretations. 

      Reviewer #3 (Public Review):

      The data reported here demonstrate that Sema7a defines the local behavior of growing axons in the developing zebrafish lateral line. The analysis is sophisticated and convincingly demonstrates effects on axon growth and synapse architecture. Collectively, the findings point to the idea that the diffusible form of sema7a may influence how axons grow within the neuromast and that the GPI-linked form of sema7a may subsequently impact how synapses form, though additional work is needed to strongly link each form to its proposed effect on circuit assembly. 

      The revised manuscript is significantly improved. The authors comprehensively and appropriately addressed most of the reviewers' concerns. In particular, they added evidence that hair cells express both Sema7A isoforms, showed that membrane bound Sema7A does not have long range effects on guidance, demonstrated how axons behave close to ectopic Sema7A, and analyzed other features of the hair cells that revealed no strong phenotypes. The authors also softened the language in many, but not all places. Overall, I am satisfied with the study as a whole. 

      Reviewer #4 (Public Review):

      This study provides direct evidence showing that Sema7a plays a role in the axon growth during the formation of peripheral sensory circuits in the lateral-line system of zebrafish. This is a valuable finding because the molecules for axon growth in hair-cell sensory systems are not well understood. The majority of the experimental evidence is convincing, and the analysis is rigorous. The evidence supporting Sema7a's juxtacrine vs. secreted role and involvement in synapse formation in hair cells is less conclusive. The study will be of interest to cell, molecular and developmental biologists, and sensory neuroscientists. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      In their revised manuscript, Dasgupta et al. have provided further experiments to address the role of Sema7a (sec and GPI-anchored) in regulating axon guidance in the lateral line system. Specifically, the inclusion of the heat shock controls and FM labeling to show hair cell mechanotransduction were crucial to interpretation of the results. However, there are still concerns about the specificity of the results. My primary concern is if the change in axon patterning is specifically due to loss of Sema7a in the mutant hair cells. These animals are morphologically very abnormal and, in the rebuttal, the authors state that hair cell number is reduced. This is not quantified in the manuscript and should be included. 

      Thank you for this suggestion. We have included the data in the manuscript in lines 137-139, in Figure 2—figure supplement 1B, and in the source data for Figure 2 and Figure 2-figure supplements.

      If there is not a function for Sema7a in hair cells themselves, why is the number reduced? 

      The sema7a-/- homozygous mutants are not viable and die by 6 dpf. The loss of Sema7A protein produces other developmental defects, including brain edema and a curved body axis. We believe a slight but not significant decrease in hair cell number may arise from a minute developmental delay in the morphogenesis of the neuromast. We have accordingly quantified our data at three distinct developmental stages (2 dpf, 3 dpf, and 4 dpf) and have incorporated them in the revised manuscript.

      Additionally, FM data should be quantified and presented in animals without a transgene in the same excitation/emission spectra for clearer interpretation of the staining.

      We have quantified the intensities of labeling with FM 4-64 styryl dye from the control and the sema7a-/- mutant larvae and incorporated the data in lines 139-146, in Figure 2—figure supplement 1D, and in source data for Figure 2 and Figure 2-figure supplements. We kept the transgenes to concurrently show the arborization phenotype, hair cell morphology, and the FM 4-64 incorporation between the genotypes. 

      Rescue analysis using the myo6d promoter would allow the authors to ensure that the axon deficits can be rescued by putting Sema7a back into the sensory hair cells. Transient transgenesis could be useful for this approach and would not require the creation of a stable line. This could be done with both forms of Sema7a, allowing the true assessment of whether or not the secreted and GPI-anchored forms have disparate functions as claimed in lines 418-424. 

      Although we recognize the importance of the rescue of the sema7a-/- mutant phenotype with the sema7asec and the sema7aGPI transcripts, it is not possible for us to perform that experiment at the moment, for the first author will leave the lab next week.  However, he plans to continue work on this project as an independent investigator to dissect the individual roles of the transcript variants in specifying the pattern of sensory arborization, a project that includes generation of transcript-specific knockout animals and rescue experiments with stable transgenic fish lines. 

      Other concerns:

      (1) The timeline of the heat shock experiment is confusing to me and, therefore, it makes me question the specificity of those results. Based on the speed of axon outgrowth and the time necessary for transcription and translation after heat shock induction of the transgene, it is unclear to me how the axon growth defects could occur in the timeline provided. Imaging two hours after the start of the heat shock is very rapid and speaks to either an indirect effect of the transgenesis on the axon growth or a leaky promoter/induction paradigm. It is possible I am just misunderstanding the setup but, from what I could gather, the imaging is being done 2 hrs after the start of the heat shock. This should be clarified. 

      The axons of the zebrafish posterior lateral line migrate relatively fast. The pioneering axons migrate at around 120 μm/hour and the follower axons migrate at approximately 30-80 μm/hour (Sato et al., 2010). The heat-shock promoter that we have utilized, hsp70l, is highly effective in inducing gene expression and subsequent protein formation within 30 to 60 minutes. We believe an hour of heat shock and an hour of incubation post heat shock is sufficient to induce directed axon migration to a distance that spans from 27 μm to 140 μm. 

      We strongly believe that the directed arborization of the sensory axons towards the Sema7Asec source is not due to an indirect effect of transgenesis or leaky promoter induction, as in all 18 of the injected but not heat-shocked control larvae we did not observe ectopic Sema7Asec expression, and no aberrant projection was formed from the sensory arbor network. We highlight this observation in lines 297-299 and in Figure 4E.

      Sato et al., 2010: Single-cell analysis of somatotopic map formation in the zebrafish lateral line system. Developmental Dynamics 239:2058–2065, 2010.

      Similarly, it would help to clarify if t(0) in the figure is the onset of the heat shock or onset of imaging two hours after the heat shock is started. 

      The t = 0 hour time point in Figure 4I denotes the onset of imaging, two hours after the heat shock began. We have clarified this in the manuscript in lines 1155-1156.

      (2) In the rebuttal, the line numbers cited do not match up with the appropriate text, I believe.

      We have corrected this and updated the manuscript.

      (3) Some of the supplemental figures are not mentioned in the text, or I could not find them. For example: Figure 1 supplement 2J. 

      Thank you for pointing this out. We have corrected the manuscript, and the new information is added in line 114.  

      (4) Table 1 statistics: were these adjusted for multiple comparisons using a Bonferroni correction or something similar? This is necessary for statistical significance to be meaningful. 

      We did not adjust the p-values for multiple comparisons because the values correspond to only three or four statistical tests per experiment, strongly indicating the unlikelihood of erroneous significance due solely to multiple tests.
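
      For readers unfamiliar with the correction the reviewer refers to, the following is a minimal sketch of a Bonferroni adjustment. It is illustrative only: the function names are hypothetical, the significance level of 0.05 is an assumption, and no values from Table 1 are used. The second function gives the probability of at least one false positive across m tests under the additional assumption that the tests are independent.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]  # cap at 1
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject

def familywise_error_rate(m, alpha=0.05):
    """Chance of at least one false positive in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m

# Hypothetical p-values, chosen only to illustrate the mechanics.
adjusted, reject = bonferroni([0.01, 0.02, 0.04])
```

      Under these assumptions, the uncorrected family-wise error rate is about 14% for three tests and about 19% for four tests at a per-test level of 0.05.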

      (5) Figure 1I and 1-S3 - The legend states a positive correlation between axonal signal and sema7A signal. Correlations are 0.5, 0.6, and 0.4 (2, 3, and 4 dpf). This is not a convincing positive correlation. At best, this ranges from no correlation to a very weak positive correlation. 

      In lines 122-126 we mention that the basal association of the sensory arbors shows a positive correlation with Sema7A accumulation. We never emphasize the strength of the correlation. However, a consistent positive correlation at three different developmental stages suggests that progressive Sema7A accumulation at the base of the hair cells may guide the sensory arbors to increasingly associate themselves with the hair cells.    

      Reviewer #2 (Recommendations For The Authors):

      I am a bit disappointed that the authors elected not to experimentally address the issue raised by all reviewers: whether the secreted or membrane bound isoform is active in hair cells. They rather decided to change their interpretation in the text. It is fine, given the eLife review structure. However, that would make the manuscript much stronger. Other issues were adequately addressed through textual changes as well. 

      Although we recognize the importance of the rescue of the sema7a-/- mutant phenotype with the sema7asec and the sema7aGPI transcripts, it is not possible for us to perform that experiment at the moment, for the first author will leave the lab next week.  However, he plans to continue work on this project as an independent investigator to dissect the individual roles of the transcript variants in specifying the pattern of sensory arborization, a project that includes generation of transcript-specific knockout animals and rescue experiments with stable transgenic fish lines. 

      Reviewer #3 (Recommendations For The Authors):

      Overall, I am satisfied with the study as a whole and just have a few minor comments that remain to be addressed. 

      (1) Although the authors say that they added appropriate no plasmid/heatshock-only and plasmid-only/no heatshock controls, these results need to be presented more clearly, as they are separated in the paper and only one was quantified (i.e. 100% of embryos showed no defect). Please just make it clear that no defects were observed in either control for either experiment (both secreted and membrane bound ectopic expression). 

      We have clearly stated this information in lines 297-299 and 343-345.

      (2) Please add a compass to Fig. 1A to indicate the orientation of the neuromast. It would also be helpful to add labels for developmental ages to all of the figures, rather than making the reader look it up in the legend. 

      We have updated Figure 1A and the corresponding figure legend in lines 882-883. We have denoted the larval age in the figure legends to keep the individual images uncluttered.  

      (3) For the RT-PCR experiments in Figure 1, no negative control was included to show that supporting cell or neuronal genes are not detected in the purified hair cells and v.v. that neither isoform is detected in supporting cells or neurons. I ask only because there is a lot of immune-signal outside of the hair cells and I am curious whether that is secreted or might come from other cell types. For neurons and supporting cells, simply demonstrating absence of Sema7a overall would suffice. 

      We have utilized the transgenic line Tg(myo6b:actb1-EGFP) that expresses the fluorophore GFP specifically in the hair cells of the neuromast. Unfortunately, we do not possess a transgenic line that reliably and specifically labels the support cells in the neuromast. Hence, in our sorting experiment the GFP-negative cells that are collected from the trunk segments of the larvae contain all the non-hair cells including epidermal cells, neuronal cells, and immune cells etc. Such a mixture of varied cellular identity may not serve as a reliable negative control. 

      In Figure 7, we have plotted the normalized expression values of the sema7a gene in the neuromast. The plot clearly depicts that the source of Sema7A is the young and the mature hair cells, not the support cells. We further confirm this observation by immunostaining where the Sema7A signal is highly restricted to the hair cells and not in any other cell in the neuromast (Figure 1E). Immunostaining further demonstrates that the lateral line sensory arbors also do not produce the Sema7A protein (Figure 1H; Video 1).

      We agree with the reviewer that there are diverse immune cells, including macrophages, in and around the neuromast. These macrophages are dynamic and possess a highly ramified structure (Denans et al., 2022). In all our Sema7A immunostainings, we never observed structures that resemble macrophages. Although we cannot confirm that Sema7A is not expressed in a distant immune cell, we highly doubt that signal coming from immune cells is impacting hair cell innervation by the sensory arbors during homeostatic development.

      Denans et al., 2022: Nature Communications volume 13, Article number: 5356 (2022).

      (4) In Figure 1, Supplement 4, I do not see the immunogen labeled in blue. 

      We have corrected the figure legend. The immunogenic region of the Sema7A protein is now clearly denoted in the figure legend of Figure 1—figure supplement 4.

      (5) In Figure 2, please add a control image as requested, as that enables direct comparison. There is ample room in the figure. 

      We have updated Figure 2 and made the suggested change.

      (6) In Figure 2, Supplement 1, the FM4-64 data are not presented in a quantified fashion. Please report at least how many embryos showed reliable uptake and preferably how many hair cells per embryo showed reliable uptake. 

      We have quantified the FM 4-64 intensities in control and sema7a-/- mutant larvae. The new data have been added to the manuscript in lines 142-146 and 577-579, and in Figure 2—figure supplement 1D.

      (7) In Figure 3, there seems to be a typo in the figure legend: "mutants in the same larvae" does not make sense to me. 

      We have corrected the error. The modified statement appears in lines 1067-1068.

      (8) The text should refer more explicitly to the statistical tests reported in Table 1, i.e. as the results are presented. 

      In lines 1105 and 1109, we clearly state the statistical tests that were performed.

      (9) In Figure 6, Supplement 1, please show the raw data points not just the bar graphs

      We have updated Figure 6—figure supplement 1.

      (10) Minor point: the authors state that they addressed the distance over which secreted Sema7A may act, but this was not evident to me in the text. Please make this finding clearer.

      We have clarified this information in lines 310-311.

      (11) Finally, the discussion contains a statement that is not supported by the data: "We have discovered dual modes of Sema7A function in vivo." They have discovered evidence that there are two isoforms, that loss of both disrupts connectivity, and that overexpression of only the secreted form can elicit growth from a distance. However, there is no direct evidence that the membrane-bound form is responsible for local effects. It is formally possible still that the phenotypes are a result of dual roles for the secreted form. It is clear that another manuscript is forthcoming that will expand on the role of the transmembrane form, but for this manuscript, the authors should make firm conclusions only about the data presented herein.

      Thank you for this suggestion. We have modified the manuscript in lines 425-434.

      Reviewer #4 (Recommendations For The Authors):

      The authors have made significant changes to the manuscript based on the comments of the reviewers. It is now suitable for publication.

    1. eLife assessment

      This important work provides convincing data on neuronal heterogeneity in the dorsal raphe nucleus (DRN), focusing on their electrophysiological properties, morphology, and susceptibility to the neurodegeneration of noradrenaline and dopamine systems in the Parkinsonian state. These findings suggest a significant interplay between catecholaminergic systems in healthy and parkinsonian conditions, as well as neuronal structure and function. Such findings provide a strong foundation for basic scientists as well as pre-clinical researchers interested in the role of dorsal raphe neurons in Parkinson's disease.

    2. Reviewer #1 (Public Review):

      Summary:

      People with Parkinson's disease often experience a variety of nonmotor symptoms, the biological bases of which remain poorly understood. Johansson et al began to study potential roles of the dorsal raphe nucleus (DRN) degeneration in the pathophysiology of neuropsychiatric symptoms in PD.

      Strengths:

      Boi et al validated a transgenic reporter mouse line that can reliably label dopaminergic neurons in the DRN. This brain region shows severe neurodegeneration and has been proposed to contribute to the manifestation of neuropsychiatric symptoms in PD. Using this mouse line (and others), Boi and colleagues characterized electrophysiological and morphological phenotypes of dopaminergic and serotoninergic neurons in the raphe nucleus. This study involved very careful topographical registration of recorded neurons to brain slices for post hoc immunohistochemical validation of cell identity, making it an elegant and thorough piece of work.

      Of relevance to PD pathophysiology, the authors evaluated the physiological and morphological changes of DRN serotoninergic and dopaminergic neurons after a partial loss of nigrostriatal dopamine neurons, which serves as a mouse model of early parkinsonian pathology. Moreover, the authors identified a series of physiological and morphological changes of subtypes of DRN neurons that depend on nigral dopaminergic neurodegeneration, LC noradrenergic neurodegeneration, or both. Indeed this work highlights the importance of LC noradrenergic degeneration in PD pathophysiology.

      Overall, this is a well-designed study with high significance to the Parkinson's research field.

    3. Reviewer #2 (Public Review):

      In this paper, Boi et al. thoroughly classified the electrophysiological and morphological characteristics of serotonergic and dopaminergic neurons in the DRN and examined the alterations of these neurons in the 6-OHDA-induced mouse PD model. Using whole-cell patch clamp recording, they found that 5-HT and dopamine (DA) neurons in the DRN are electrophysiologically distinct from each other. Additionally, they characterized distinct morphological features of 5-HT and DA neurons in the DRN. Notably, these specific features of 5-HT and DA neurons in the DRN exhibited different changes in the 6-OHDA-induced PD model. Then the authors utilized desipramine (DMI) to separate the effects of nigrostriatal DA depletion and noradrenaline (NA) depletion induced by 6-OHDA. Interestingly, protection from NA depletion by DMI pretreatment reversed the changes in 5-HT neurons, while having a minor impact on the changes in DA neurons in the DRN. These data indicate that the role of NA lesion in the altered properties of DRN 5-HT neurons by 6-OHDA is more critical than that of DA lesions.

      Overall, this study provides foundational data on the 5-HT and DA neurons in the DRN and their potential involvement in PD symptoms. Given the deficits of the DRN in PD, this paper may offer insights into the cellular mechanisms underlying non-motor symptoms associated with PD.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I have no further experiments to request, but the following errors should be corrected first.

      (1) L. 183-198: Figure 3 panels were erroneously referred in several places.

      This has been corrected.

      (2) L.182-183: description of active/total cell numbers in main text does not match numbers in Figure 3B

      This has been corrected.

      (3) L.185-187: Figure 3C indicates significant changes of rheobase only between the DMI+6-OHDA and 6-OHDA groups. A statistical comparison between sham and DMI+6-OHDA was not provided, which may change the interpretation of the data in Figure 3B, C: "...these findings suggest that the 6-OHDA induced lesion of midbrain dopaminergic neurons evoked the increased firing of DRN5-HT neurons" (L.185-187).

      We thank the reviewer for highlighting this point. Indeed, a Kruskal-Wallis test comparing all three groups revealed a significantly lower rheobase in DMI + 6-OHDA mice compared to Sham while the 6-OHDA injected group was not affected. Therefore, the increased firing of DRN5-HT neurons recorded in 6-OHDA injected mice pretreated with DMI also critically involves the noradrenergic system. This is now included in the revised results section of the manuscript (lines 190-197).

      (4) L. 188: The description of "While the excitability of DRN5-HT neurons was not affected in 6-OHDA mice..." does not match the clearly increased cellular excitability shown in Figure 3G-I.

      This has been corrected and we are now referring more specifically to the rheobase, which is not affected in 6-OHDA mice.

      (5) Mann-Whitney tests were inappropriately used for statistics in Figures 3-6: Comparisons across multiple groups (>=3) should be performed with one-way ANOVA, or with the Kruskal-Wallis test for nonparametric data.

      We thank the reviewer for the comment. We have now applied one-way ANOVA/Kruskal-Wallis tests, and the text has been modified accordingly.
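      The three-group comparison requested here can be illustrated with a minimal sketch of the Kruskal-Wallis test, assuming exactly three groups (sham, 6-OHDA, DMI+6-OHDA) so that the chi-square approximation has 2 degrees of freedom and its tail probability reduces to exp(-H/2). This is not the authors' analysis code: the function names and the rheobase values in the usage lines are hypothetical, and the approximation is only asymptotic, so a statistics package would normally be preferred for small samples.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_3(group_a, group_b, group_c):
    """Tie-corrected Kruskal-Wallis H and approximate p-value for 3 groups."""
    groups = [group_a, group_b, group_c]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over the groups.
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction (undefined if every value is identical).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    h /= correction

    # Chi-square survival function with 2 degrees of freedom.
    p = math.exp(-h / 2)
    return h, p


# Hypothetical rheobase values (pA), for illustration only.
sham = [40.0, 55.0, 60.0, 45.0, 50.0]
ohda = [42.0, 58.0, 52.0, 47.0, 61.0]
dmi_ohda = [20.0, 25.0, 30.0, 35.0, 22.0]
h_stat, p_value = kruskal_wallis_3(sham, ohda, dmi_ohda)
```

      A significant omnibus result would still need post hoc pairwise comparisons with an appropriate correction to identify which groups differ.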

      (6) It seems that the data points in some panels of Figure 4C represented a cell, but others were averaged within a mouse (Figure 4D). This needs to be clarified or corrected.

      None of the data in Figure 4 were averaged within a mouse. In the chosen graph type (aligned dot plot), data points with equal values overlap.

      Reviewer #2 (Recommendations For The Authors):

      The authors' revised manuscript has addressed most of my concerns. However, I'm not convinced by the authors' claim regarding Figure 5B. It would be great if the authors at least discussed in their manuscript why the DMI pretreatment group alone, not the 6-OHDA group, shows a significantly lower firing rate of DRN (DA) and a higher Erest of DRN (DA), compared to the sham-lesion group. These statistically significant data are not explained at all in the revised manuscript (can this effect be explained by the neuroprotection of NA neurons from 6-OHDA toxicity?).

      We thank the reviewer for this comment. Now that a one-way ANOVA or a Kruskal-Wallis test is used to compare the three groups (as suggested by reviewer 1), the changes previously shown in Figure 5B are no longer significant.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This manuscript represents a cleanly designed experiment for assessing biological motion processing in children (mean age = 9) with and without ADHD. The group differences concerning accuracy in global and local motion processing abilities are solid, but the analyses suggesting dissociable relationships between global and local processing and social skills, age, and IQ are inconclusive. The results are useful in terms of understanding ADHD and the ontogenesis of different components of the processing of biological motion.

      We thank the editors and reviewers for their valuable feedback and constructive comments. We have carefully considered each point raised by the reviewers and made the necessary revisions to the manuscript. Regarding the relationships between global and local BM processing, the accumulated evidence from previous studies has converged on the dissociation of the two BM components, e.g., while global BM processing is susceptible to learning and practice, local BM processing does not show a learning trend (Chang and Troje, 2009; Grossman et al., 2004), and the brain activations in response to local and global BM cues are different (Chang et al., 2018; Duarte et al., 2022). Nevertheless, we concurred with reviewers that the evidence for such dissociation from the current study by itself is not strong enough. Therefore, we have toned down on this point and no longer claimed the dissociation (including the title). Based on the current results, we focused our discussion on the different aspects of BM processing in children with and without ADHD.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper presents a nice study investigating the impairments of biological motion perception in individuals with ADHD in comparison with neurotypical controls. Motivated by the idea that there is a relationship between biological motion perception and social capabilities, the authors investigated the impairments of local and global (holistic) biological motion perception, the diagnosis status, and several additional behavioral variables that are affected in ADHD (IQ, social responsiveness, and attention / impulsivity). Both local and global biological motion perception are impaired in ADHD individuals. In addition, the study demonstrates a significant correlation between local biological motion perception skills and the social responsiveness score in the ADHD group, but not in controls. A path analysis in the ADHD group suggests that general performance in biological motion perception is influenced mainly by global biological motion perception performance and attentional and perceptual reasoning skills.

      Strengths:

      It is true that there exists not much work on biological motion perception and ADHD. Therefore, the presented study contributes an interesting new result to the biological motion literature, and adds potentially also new behavioral markers for this clinical group. The design of the study is straightforward and technically sound, and the drawn conclusions are supported by the presented results.

      Thanks for this positive assessment of our work.

      Weaknesses:

      Some of the claims about the relationship between genetic factors and ADHD and the components of biological motion processing have to remain speculative at this point because genetic influences were not explicitly tested in this paper. Specifically, the hypothesis that the perception of human social interaction is critically based on a local mechanism for the detection of asymmetry in foot trajectories of walkers (this is what 'BM-local' really measures), or on the detection of live agents in cluttered scenes, seems not very plausible.

      Thanks for these comments. We agree that the relationship between genetic factors and BM perception remains to be further examined, as we did not test the genetic influences in this study. We have deleted the relevant discussion about genetics. Based on our results, we discuss the possible mechanisms behind the relationship between local BM processing and social interaction in the revised manuscript as follows:

      “As mentioned above, we found a significant negative correlation between the SRS total score and the accuracy of local BM processing, specifically in the ADHD group. This could be due to decreased visual input related to atypical local BM processing, which further impairs global BM processing. According to the two-process theory of biological motion processing61, local BM cues guide visual attention towards BM stimuli55,62. Consequently, the visual input of BM stimuli increases, facilitating the development of the ability to process global BM cues through learning21,63. The latter is a prerequisite for attributing intentions to others and facilitating social interactions with other individuals20,64,65. Thus, atypical local BM processing may contribute to impaired social interaction through altered visual inputs. Further empirical studies are required to confirm these hypotheses.” (lines 417 - 428)

      Based on my last comments, the discussion has now been changed in a way that tries to justify the speculative claims by citing a lot of other speculative papers, which does not really address the problem. For example, the fact that chicks walk towards biological motion stimuli is interesting. To derive that this verifies a fundamental mechanism in human biological motion processing is extremely questionable, given that birds do not even have a cortex. Taking the argumentation of the authors seriously, one would have to assume that the 'Local BM' mechanism is probably located in the mesencephalon in humans, and then would have to interact in some way with social perception differences of ADHD children. To me all this seems to make very strong (over-)claims. I suggest providing a much more modest interpretation of the interesting experimental result, based on what has been really experimentally shown by the authors and closely related other data, rather than providing lots of far-reaching speculations.

      In the same direction, in my view, go claims like 'local BM is an intrinsic trait' (L. 448), which is not only imprecise (maybe better 'mechanisms of processing of local BM cues') but also rather questionable. Likely, this 'local processing of BM' is a lower-level mechanism, located probably in early and mid-levels of the visual cortex, with a possible influence of lower structures. It seems not really plausible that this is related to classical trait variables in the sense of psychology, like personality, as seems to be suggested here. Here, too, I suggest a much more moderate and less speculative interpretation of the results.

      We thank the reviewer for pointing out these issues. In response to these comments, we have carefully revised the discussion to avoid strong (over-)claims. We have deleted the example of chicks and substituted more empirical studies to explain our results. We agree that the local BM mechanism is probably located in subcortical regions in humans, as reported by some MRI studies (Chang et al., 2018; Hirai and Senju, 2020; Loula et al., 2005). We have added some evidence that atypical local BM processing may decrease visual inputs related to social information as follows:

      “According to the two-process theory of biological motion processing61, local BM cues guide visual attention towards BM stimuli55,62. Consequently, the visual input of BM stimuli increases, facilitating the development of the ability to process global BM cues through learning21,63. The latter is a prerequisite for attributing intentions to others and facilitating social interactions with other individuals20,64,65. Thus, atypical local BM processing may contribute to impaired social interaction through altered visual inputs.” (lines 421 - 427)

      We have also deleted the claim that 'local BM is an intrinsic trait' (originally L. 448) and the related discussion, as it was not conclusive based on the current study.

      Reviewer #2 (Public Review):

      Summary:

      Tian et al. aimed to assess differences in biological motion (BM) perception between children with and without ADHD, as well as relationships to indices of social functioning and possible predictors of BM perception (including demographics, reasoning ability and inattention). In their study, children with ADHD showed poorer performance relative to typically developing children in three tasks measuring local, global, and general BM perception. The authors further observed that across the whole sample, performance in all three BM tasks was negatively correlated with scores on the social responsiveness scale (SRS), whereas within groups a significant relationship to SRS scores was only observed in the ADHD group and for the local BM task. Local and global BM perception showed a dissociation in that global BM processing was predicted by age, while local BM perception was not. Finally, general (local & global combined) BM processing was predicted by age and global BM processing, while reasoning ability mediated the effect of inattention on BM processing.

      Strengths:

      Overall, the manuscript is presented in a clear fashion and methods and materials are presented with sufficient detail so the study could be reproduced by independent researchers. The study uses an innovative, albeit not novel, paradigm to investigate two independent processes underlying BM perception. The results are novel and have the potential to have wide-reaching impact on multiple fields.

      We appreciate the reviewer’s positive feedback very much.

      Weaknesses:

      The manuscript has greatly improved in clarity and methodological considerations in response to the review. There are only a few minor points which deserve the authors' attention:

      When outlining the motivation for the current study, results from studies in ADHD and ASD are used too interchangeably. The authors use a lack of evidence for contributing (psychological/developmental) factors on BM processing in ASD to motivate the present study and refer to evidence for differences between typical and non-typical BM processing using studies in both ASD and ADHD. While there are certainly overlapping features between the two conditions/neurotypes, they are not to be considered identical and may have distinct etiologies; therefore, the distinction between the two should be made clearer.

      We thank the reviewer for pointing out this issue. We have removed some unnecessary citations about ASD and referred to studies about social cognition in ADHD to elaborate the motivation of this study:

      “Further exploration of a diverse range of social cognitions (e.g., biological motion perception) can provide a fresh perspective on the impaired social function observed in ADHD. Moreover, recent studies have indicated that the social cognition in ADHD may vary depending on different factors at the cognitive, pathological, or developmental levels, such as general cognitive impairment5, symptom severity8, or age5. Nevertheless, understanding how these factors relate to social cognitive dysfunction in ADHD is still in its infancy. Bridging this gap is crucial as it can help depict the developmental trajectory of social cognition and identify effective interventions for impaired social interaction in individuals with ADHD.” (lines 53 - 62)

      In the first/main analysis, it is unclear to me why in the revised manuscript the authors changed the statistical method from ANOVA/ANCOVA to independent samples t-tests (unless the latter were only used for post-hoc comparisons, in which case this needs to be stated). Furthermore, although p-values look robust, for this analysis too it should be indicated whether and how multiple comparison problems were accounted for.

      Thanks for the reviewer’s comments. According to the suggestions from reviewer #3, it may be inappropriate to treat gender as a covariate in ANOVA, as this may violate the assumptions of ANCOVA. To ensure that gender does not influence the results, we first separated boys and girls on the plots using differently coloured individual data points; there are no signs of a gender effect in the TD group. Second, we used t-tests to examine the difference between the TD and ADHD groups. Finally, we conducted a subsampling analysis with balanced data, and the results remained consistent.

      In part 1 of the results, we aimed to compare the task accuracies between the TD and ADHD groups in three independent tasks, which assess the participants’ abilities to process three types of BM cues. We hypothesized that individuals with ADHD would show poorer performance in all three tasks compared to TD individuals. We therefore consider that correction for multiple comparisons may not be necessary.

      Reviewer #3 (Public Review):

      Strengths:

      The authors present differences between ADHD and TD children in biological motion processing, and this question has not received as much attention as equivalent processing capabilities in autism. They use a task that appears well controlled. They raise some interesting mechanistic possibilities for differences in local and global motion processing, which are distinctions worth exploring. The group differences will therefore be of interest to those studying ADHD, as well as other developmental conditions, and those examining biological motion processing mechanisms in general.

      We appreciate the reviewer’s positive assessment of this work.

      Weaknesses:

      The data are not strong enough to support claims about differences between global and local processing with respect to social communication skills and age. The mechanistic possibilities for why these abilities may dissociate in such a way are interesting, but the crucial tests of differences between correlations do not present a clear picture. Further empirical work would be needed to test the authors' claims. Specifics:

      The authors state frequently that it was the local BM task that related to social communication skills (SRS) and not the global tasks. However, the results section shows a correlation between SRS and all three tasks. The only difference is that when looking specifically within the ADHD group, the correlation is only significant for the local task. The supplementary materials demonstrate that tests of differences between correlations present an incomplete picture. Currently they have small samples for correlations, so this is unsurprising.

      Thanks for this comment. We agree with the reviewer that the relationship of local and global processing with social communication and age needs more empirical work. Our results alone indicate only possibly dissociable roles of local and global BM processing. The accumulated evidence from previous studies has converged on this dissociation, e.g., while global BM processing is susceptible to learning and practice, local BM processing does not show a learning trend (Chang and Troje, 2009; Grossman et al., 2004), and the brain activations in response to local and global BM cues are different (Chang et al., 2018; Duarte et al., 2022). We concur with the reviewers that the evidence for such dissociation from the current study by itself is not strong enough. Therefore, we have toned down this point and no longer emphasize the dissociation. Based on the current results, we focused our discussion on the different aspects of BM processing in children with and without ADHD. Future studies with larger sample sizes are needed to confirm this dissociable relationship.

      Theoretical assumptions. The authors make some statements about local vs global biological motion processing that should still be made more tentatively. They assume that local processing is specifically genetic whereas global processing is a product of experience. These data in newborn chicks are controversial and confounded - I cannot remember the specifics but I think there is an upper vs lower visual field complexity difference here.

      We appreciate the reviewer’s suggestion. We agree that the relationship between genetic factors and BM perception remains to be further examined, as we did not perform any genetic analysis in the current study. Some speculative papers have been removed, as has the statement about newborn chicks, given the controversial and confounded results. We have toned down our claims and provided a moderate interpretation of the results:

      “Sensitivity to local BM cues emerges early in life54,55 and involves rapid processing in the subcortical regions16,56-58. As a basic pre-attentive feature23, local BM cues can guide visual attention spontaneously59,60. In contrast, the ability to process global BM cues is related to slow cortical BM processing and is influenced by many factors such as attention25,26 and visual experience21,51. As mentioned above, we found a significant negative correlation between the SRS total score and the accuracy of local BM processing, specifically in the ADHD group. This could be due to decreased visual input related to atypical local BM processing, which further impairs global BM processing. According to the two-process theory of biological motion processing61, local BM cues guide visual attention towards BM stimuli55,62. Consequently, the visual input of BM stimuli increases, facilitating the development of the ability to process global BM cues through learning21,63. The latter is a prerequisite for attributing intentions to others and facilitating social interactions with other individuals20,64,65. Thus, atypical local BM processing may contribute to impaired social interaction through altered visual inputs.” (lines 413 - 427)

      “Few developmental studies have been conducted on local BM processing. The ability to process local BM cues remained stable and did not exhibit a learning trend21,25. A reasonable interpretation may be that local BM processing is a low-level mechanism, probably performed by the primary visual cortex and subcortical regions such as the superior colliculus, pulvinar, and ventral lateral nucleus14,56,61.” (lines 441- 446)

      Readability. The manuscript needs very careful proofreading and correction for grammar. There are grammatical errors throughout.

      We thank the reviewer for this feedback. We have performed thorough proofreading and corrected grammatical errors throughout the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I thank the authors for their revisions that address several of the minor points that I raised in my last review. A number of requests are still not sufficiently answered:

      L. 290 ff.: The model 'BM-local = age + gender etc.' is pretty sloppy notation. I think what is meant is that a GLM was used with the predictors gender etc., each multiplied by an appropriate beta_i value. These formulas should be corrected, or one just says that a GLM was run with the predictors gender

      The same criticism applies to these other models that follow.

      This was corrected.

      However, the corrected text remains sloppy. Example: 'BM-local = ...'. What exactly is 'BM-local': the accuracy? Here a precise notation should be given that clearly names which variables are used as predictors and which as target variables.

      We appreciate the reviewer’s suggestion. We clarified which variables are used in our models and gave them precise notations:

      “Three linear models were built to investigate the contributing factors: (a) ACClocal = β0 + β1 * age + β2 * gender + β3 * FIQ + β4 * QbInattention, (b) ACCglobal = β0 + β1 * age + β2 * gender + β3 * FIQ + β4 * QbInattention, and (c) ACCgeneral = β0 + β1 * age + β2 * gender + β3 * FIQ + β4 * QbInattention + β5 * ACClocal + β6 * ACCglobal. ACClocal, ACCglobal and ACCgeneral refer to the response accuracies of the three tasks in the ADHD group, and QbInattention is the standardised score for sustained attention function.” (lines 337 - 343)
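      To make the notation concrete, model (a) can be sketched as an ordinary least-squares fit. This is only an illustration under stated assumptions, not the authors' code: the function name is hypothetical, gender is assumed to be coded 0/1, the data rows are invented, and the solver uses the normal equations in plain Python rather than a statistics package.

```python
def fit_ols(predictor_rows, outcome):
    """Least-squares coefficients [b0, b1, ...] for outcome ~ 1 + predictors.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    design = [[1.0] + [float(v) for v in row] for row in predictor_rows]
    p = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = []
    for i in range(p):
        row = [sum(x[i] * x[j] for x in design) for j in range(p)]
        row.append(sum(x[i] * y for x, y in zip(design, outcome)))
        aug.append(row)
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


# Hypothetical rows: (age, gender coded 0/1, FIQ, QbInattention).
rows = [
    (8, 0, 100, 1.2), (9, 1, 110, 0.5), (10, 0, 95, 2.0), (11, 1, 120, 0.1),
    (9, 0, 105, 1.5), (12, 1, 98, 0.8), (7, 0, 115, 1.0),
]
# Hypothetical local-task accuracies, one per child.
acc_local = [0.61, 0.70, 0.58, 0.79, 0.63, 0.72, 0.66]
b0, b_age, b_gender, b_fiq, b_inattention = fit_ols(rows, acc_local)
```

      Models (b) and (c) follow the same pattern with a different outcome and, for (c), two additional predictor columns.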

      All these models assume linearity of the combination of the predictors. Was this assumption verified?

      We referred to previous studies of BM perception in children. They found that the main predictor variables, including IQ (Rutherford et al., 2012; Jones et al., 2011) and age (Annaz et al., 2010; van et al., 2016), have a linear relationship with BM processing ability.

      This answer is insufficient and not convincing. Because a variable Y depends linearly on predictors A and B in some other study, this does not imply that it is also linear in predictor C, or that it does not show interactions with such predictors in the present study.

      What is needed here is the testing of models with interaction terms and verifying that such models are not better predictors. If authors do not want to do this, they need at least to clearly point out that they made the strong assumption of linearity of their model, which might be wrong and thus be a substantial limitation of their analysis.

      Thanks for the suggestion. We compared each model with and without the relevant interaction terms. The results showed that the change in the coefficient of determination (R-squared, R2) between the two versions of each model was not statistically significant.
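
      The response does not state which test was used for the R2 change. One standard choice for nested linear models is the F-test on the R2 difference, sketched below under that assumption. The function name `r2_change_f` and all numbers (R2 values, sample size, predictor counts) are hypothetical and are not taken from the study.

```python
def r2_change_f(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic for the R-squared change between two nested linear models.

    n is the number of observations; k_reduced and k_full count the
    predictors in each model, excluding the intercept.
    """
    if k_full <= k_reduced:
        raise ValueError("the full model must add at least one predictor")
    df_num = k_full - k_reduced   # number of added terms (e.g. interactions)
    df_den = n - k_full - 1       # residual degrees of freedom of the full model
    if df_den <= 0:
        raise ValueError("not enough observations for the full model")
    f_stat = ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)
    return f_stat, df_num, df_den

# Hypothetical example: adding one interaction term raises R2 from 0.30 to 0.32.
f_stat, df_num, df_den = r2_change_f(0.30, 0.32, n=40, k_reduced=4, k_full=5)
```

      The resulting statistic would then be compared against an F distribution with (df_num, df_den) degrees of freedom; a non-significant result means the interaction terms do not measurably improve the fit, which is not the same as demonstrating linearity.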

      L. 296ff.: For model (b) it looks like general BM performance is strongly driven by the predictor global BM performance in the ADHD group. Does the same observation also apply to the controls?

      The same phenomenon was not observed in TD children. We have briefly discussed this point in the Discussion section of the revised manuscript (lines 449 - 459).

      Was such a path analysis also done for the TD subjects or not? If yes, was then also predicted that the variable BM-Global largely and directedly influences the variable BM-General? (The answer refers to the general discussion section, where no such analysis is presented, as far as I understand.)

      Thank you for your comment. We also conducted a path analysis in the TD group, similar to that in the ADHD group. There was no statistically significant mediation effect in the TD group. Please see Figure S3 for complete statistics.

      Reviewer #2 (Recommendations For The Authors):

      (1) Please add public access to the data repository so data availability can be assessed.

      The data analyzed during the study are available at https://osf.io/37p5s/.

      (2) Lines 119-115: The differences observed in ADHD participants in the studies referenced here were relative to what group? The last sentence here also refers to two groups, and it is difficult to gather which specific groups are meant, also because the two references relate to both ADHD and ASD samples. Please clarify.

      The suggestion is well taken. We have clarified the expressions accordingly:

      “Specifically, compared with the typically developing (TD) group, children with ADHD showed reduced activity of motion-sensitive components (N200) while watching biological and scrambled motions, although no behavioural differences were observed. Another study found that children with ADHD performed worse in BM detection with moderate noise ratios than the TD group32.” (lines 100 - 105)

      (3) Line 116: I'm not sure what is meant by 'despite initial indications' - please briefly specify/summarise here why the investigation into BM processing in ADHD is warranted.

      We thank the reviewer for pointing out this issue. We have rephrased this part and briefly specified “why the investigation into BM processing in ADHD is warranted”:

      “Despite initial findings about atypical BM perception in ADHD, previous studies on ADHD treated BM perception as a single entity, which may have led to misleading or inconsistent findings28. Hence, it is essential to deconstruct BM processing into multiple components and motion features.” (lines 108 - 111)

      (4) Lines 290-293: Please complete the sentence.

      We thank the reviewer for pointing out this issue. The sentence has been completed:

      “For Task 2 and 3, where children were asked to detect the presence or discriminate the facing direction of the target walker, TD group have higher accuracies than the ADHD group (Task 2 - TD: 0.70 ± 0.12, ADHD: 0.59 ± 0.12, t73 = 3.677, p < 0.001, Cohen's d = 0.861; Task 3 - TD: 0.79 ± 0.12, ADHD: 0.63 ± 0.17, t73 = 4.702, p < 0.001, Cohen's d = 1.100).” (lines 284 - 288)

      Reviewer #3 (Recommendations For The Authors):

      (1) Conclusions concerning differences between the local and global tasks wrt SRS and age (see above). I believe the authors need to reword throughout to reflect that the tests of differences between these crucial correlations did not present a clear picture.

      We have reworded throughout the paper to reflect the inconclusiveness with regard to the relationship between local and global processing with social communication based on this study only. Future studies with larger sample sizes are needed to confirm this conclusion. The mechanism for this dissociable relationship should be validated by more psychological tests in future studies.

      (2) I would again tone down the discussion of genetic specification of local processing, given it is highly controversial.

      We thank the reviewer for pointing out the issue. We agree the point about the genetic specification of local processing remains controversial. The interpretation of results about local BM processing has been rephrased. Please refer to our response to the point #2 mentioned.

      (3) The manuscript needs very careful proofreading and grammatical correction throughout.

      Thanks for the suggestion to check the grammar. We have carefully proofread the manuscript to correct grammatical errors.

    2. eLife assessment

      The authors use point light displays to measure biological motion (BM) perception in children (mean = 9 years) with and without ADHD, and relate it to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three BM tasks, but that those tasks loading more heavily on local processing relate to social interaction skills and those loading on global processing relate to age. There are still some elements of the results that are unclear, but nevertheless, the important and solid findings extend our limited knowledge of BM perception in ADHD, as well as biological motion processing mechanisms in general.

    3. Reviewer #2 (Public Review):

      Summary:

      Tian et al. aimed to assess differences in biological motion (BM) perception between children with and without ADHD, as well as relationships to indices of social functioning and possible predictors of BM perception (including demographics, reasoning ability and inattention). In their study, children with ADHD showed poorer performance relative to typically developing children in three tasks measuring local, global, and general BM perception. The authors further observed that across the whole sample, performance in all three BM tasks was negatively correlated with scores on the social responsiveness scale (SRS), whereas within groups a significant relationship to SRS scores was only observed in the ADHD group and for the local BM task. Local and global BM perception showed a dissociation in that global BM processing was predicted by age, while local BM perception was not. Finally, general (local & global combined) BM processing was predicted by age and global BM processing, while reasoning ability mediated the effect of inattention on BM processing.

      Strengths:

      Overall, the manuscript is presented in a clear fashion and methods and materials are presented with sufficient detail so the study could be reproduced by independent researchers. The study uses an innovative, albeit not novel, paradigm to investigate two independent processes underlying BM perception. The results are novel and have the potential to have wide-reaching impact on multiple fields.

      Weaknesses:

      The manuscript has improved in clarity and conceptual and methodological considerations in response to the last review. However, the reported results still provide incomplete support for the claims the authors make in the paper.

      In relation to other reviewers' earlier comments, the model notation used is still not consistent and model results are reported incompletely, which make it difficult to gain a full picture of the data and how they support the authors' secondary claims. For instance, across the models in the supplementary materials, ß coefficients are only reported selectively which makes it difficult to assess the model as a whole. Furthermore, different terms (task 1, task 2 vs. BM-Local, BM-global) are used to refer to the same levels of a variable, and it is unclear which levels of a dummy variable correspond to which task, making it overall very difficult to comprehend the modelling procedure.

    4. Reviewer #3 (Public Review):

      The authors presented point light displays of human walkers to children (mean = 9 years) with and without ADHD to compare their biological motion perception abilities, and relate them to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three biological motion tasks, but that those loading more heavily on local processing related to social interaction skills and global processing to age. The valuable and solid findings are informative for understanding this complex condition, as well as biological motion processing mechanisms in general. However, the correlations present a pattern that needs further examination in future studies because many of the differences between correlations are not significant.

      Strengths:

      The authors present differences between ADHD and TD children in biological motion processing, and this question has not received as much attention as equivalent processing capabilities in autism. They use a task that appears well controlled. They raise some interesting mechanistic possibilities for differences in local and global motion processing, which are distinctions worth exploring. The group differences will therefore be of interest to those studying ADHD, as well as other developmental conditions, and those examining biological motion processing mechanisms in general.

      Weaknesses:

      The data are not strong enough to support claims about differences between global and local processing wrt social communication skills and age. The mechanistic possibilities for why these abilities may dissociate in such a way are interesting, but the crucial tests of differences between correlations do not present a clear picture. Further empirical work would be needed to test this further. Specifics:

      The authors state frequently that it was the local BM task that related to social communication skills (SRS) and not the global tasks. However, the results section shows a correlation between SRS and all three tasks. The only difference is that when looking specifically within the ADHD group, the correlation is only significant for the local task. The supplementary materials demonstrate that tests of differences between correlations present an incomplete picture. Currently they have small samples for correlations, so this is unsurprising.

      Theoretical assumptions. The authors make some statements about local vs global biological motion processing that may have been made in previous studies, but would appear controversial and not definitive. E.g., that local BM processing does not improve with age and is uninfluenced by attention.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      Following synaptic vesicle fusion events at release sites, vesicle remnants will need to be cleared in order to allow new rounds of vesicle docking and fusion. This fundamental study of Mahapatra and Takahashi examines the role of release site clearance in synaptic transmission during repetitive activity in two types of central synapses, the giant calyx of Held and hippocampal CA1 synapses. The study uses pharmacological approaches to interfere with release site clearance by blocking membrane retrieval (endocytosis). They compare the effects on short-term plasticity with those obtained by pharmacologically inhibiting scaffold protein activity. The data presented make a compelling case for fast endocytosis as necessary for rapid site clearance and vesicle recruitment to active zones. The data reveal an unexpected, fast role for local site clearance in counteracting synaptic depression.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study examines the role of release site clearance in synaptic transmission during repetitive activity under physiological conditions in two types of central synapses, calyx of Held and hippocampal CA1 synapses. After acute block of endocytosis by pharmacology, deeper synaptic depression or less facilitation was observed in two types of synapses. Acute block of CDC42 and actin polymerization, which possibly inhibits the activity of Intersectin, affected synaptic depression at the calyx synapse, but not at CA1 synapses. The data suggest an unexpected, fast role of the site clearance in counteracting synaptic depression.

      Strengths:

      The study uses acute block of the molecular targets with pharmacology together with precise electrophysiology. The experimental results are clear cut and convincing. The study also examines the physiological roles of the site clearance using action potential-evoked transmission at physiological Ca and physiological temperature in mature animals. This condition has not been examined previously.

      Weaknesses:

      Pharmacology may have some off-target effects, though acute manipulation should be appreciated and the authors have tried several reagents to verify the overall conclusions.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Mahapatra and Takahashi report on the physiological consequences of pharmacologically blocking either clathrin and dynamin function during compensatory endocytosis or of the cortical actin scaffold both in the calyx of Held synapse and hippocampal boutons in acute slice preparations.

      Strengths:

      Although many aspects of these pharmacological interventions have been studied in detail during the past decades, this is a nice comprehensive and comparative study, which reveals some interesting differences between a fast synapse (Calyx of Held) tuned to reliably transmit at several 100 Hz and a slower hippocampal CA1 synapse. In particular, the authors find that acute disturbance of the synaptic actin network leads to a marked frequency-dependent enhancement of synaptic depression in the Calyx, but not in the hippocampal synapse. This striking difference between both preparations is the most interesting and novel finding.

      Weaknesses:

      Unfortunately, however, these findings concerning the different consequences of actin depolymerization are not sufficiently discussed in comparison to the literature. My only criticism concerns the interpretation of the ML 141 and Lat B data. With respect to the Calyx data, I am missing a detailed discussion of the effects observed here in light of the different RRP subpools SRP and FRP. This is very important since Lee et al. (2012, PNAS 109 (13) E765-E774) showed earlier that disruption of actin inhibits the rapid transition of SRP SVs to the FRP at the AZ. The whole literature on this important concept is missing. Likewise, the role of actin for the replacement pool at a cerebellar synapse (Miki et al., 2016) is only mentioned in half a sentence. There is quite some evidence that actin is important both at the AZ (SRP to FRP transition, activation of replacement pool) and at the peri-active zone for compensatory endocytosis and release site clearance. Both possible underlying mechanisms (SRP to FRP transition or release site clearance) should be better dissected.

      We dissected the latrunculin effect further by referring to the related literature within the scope of this study in the revised Discussion section (last paragraph).

      Reviewer #3 (Public Review):

      The manuscript by Mahapatra and Takahashi addresses the role of presynaptic release site clearance during sustained synaptic activity. The authors characterize the effects of pharmacologically interfering with SV endocytosis (pre-incubation with Dynasore or Pitstop-2) on synaptic short-term plasticity (STP) at two different CNS synapses (calyx of Held synapses and hippocampal SC to CA1 synapses) using patch-clamp recordings in acute slices under experimental conditions designed to closely mimic a physiological situation (37 °C and 1.3 mM external [Ca2+]). Endocytosis blocker-induced changes in STP and in the recovery from short-term depression (STD) are compared to those seen after pharmacologically inhibiting actin filament assembly (pre-incubation with Latrunculin-B or the selective Cdc42 GTPase inhibitor ML-141). Presynaptic capacitance (Cm) recordings in calyx terminals were used to establish the effects of the pharmacological maneuvers on SV endocytosis.

      Latrunculin-B and ML-141 affect neither SV endocytosis (assayed by Cm recordings) nor EPSC recovery following conditioning trains, but strongly enhance STD at calyx synapses. No changes in STP were observed at Latrunculin-B- or ML-141-treated SC to CA1 synapses.

      Dynasore and Pitstop-2 slow down endocytosis, limit the total amount of exocytosis in response to long stimuli, enhance STD in response to 100 Hz stimulation, but profoundly accelerate EPSC recovery following conditioning 100 Hz trains at calyx synapses. At SC to CA1 synapses, Dynasore and Pitstop-2 reduce the extent of facilitation and lower relative steady-state EPSCs, suggesting a change in the facilitation-depression balance in favor of the latter.

      The authors use state-of-the-art techniques, and their data, which are clearly presented, lead the authors to conclude that endocytosis is universally important for clearance of release sites while the importance of scaffold protein-mediated site clearance is limited to 'fast synapses'.

      Unfortunately, and perhaps not completely unexpected in view of the pharmacological tools chosen, there are several observations which remain difficult to understand:

      (1) Blocking site clearance affects release sites that have previously been used, i.e. sites at which SV fusion has occurred and which therefore need to be cleared. Calyces use at most 20% of all release sites during a single AP, likely fewer at 1.3 mM external [Ca2+]. Even if all those 20% of release sites become completely unavailable due to a block of release site clearance, the 2nd EPSC in a train should not be reduced by >20% because ~80% of the sites cannot be affected. However, ~50% EPSC reduction was observed (Fig. 2B1, lower right panel) raising the possibility that Dynasore does more than specifically interfering with SV endocytosis (and possibly Pitstop as well). Non-specific effects are also suggested by the observed two-fold increase in initial EPSC size in SC to CA1 synapses after Dynasore pre-incubation.

      This study compares different experimental conditions to conclude the physiological role of endocytosis on rapid neurotransmission at the large calyceal synapse in mice. A related study at the Drosophila neuromuscular junction (Kawasaki et al., Nat. Neuroscience 2000) reported similar findings in comparable experimental settings (physiological conditions and acute block of endocytosis).

      (2) More severe depression was observed at calyx synapses after blocking endocytosis which the authors attribute to a presynaptic mechanism affecting pool replenishment. When probing EPSC recovery after conditioning 100 Hz trains, a speed up was observed mediated by an "unknown mechanism" which is "masked in 2 mM [Ca2+]". These two observations, deeper synaptic depression during 100 Hz but faster recovery from depression following 100 Hz, are difficult to align and no attempt was made to find an explanation.

      By varying temperature (PT vs RT), calcium concentration (1.3 mM vs 2.0 mM), and stimulation frequency (10, 100, and 200 Hz; some data are not shown), the effects of endocytosis block on EPSC STD and on the kinetics of recovery from STD at the post-hearing calyx were compared in these settings: (PT, 1.3 mM [Ca2+]), (PT, 2.0 mM [Ca2+]), and (RT, 2.0 mM [Ca2+]), to dissect their respective roles.

      (3) To reconcile previous data reporting a block of Ca2+-dependent recovery (CDR) by Dynasore or Latrunculin (measured at 2 mM external [Ca2+]) with the data presented here (using 1.3 mM external [Ca2+]) reporting no effect or a speed up of recovery from depression, the authors postulate that "CDR may operate only when excessive Ca2+ enters during massive presynaptic activation" (page 10 line 244). While that is possible, such explanation ignores plenty of calyx studies demonstrating fiber stimulation-induced CDR and elucidating molecular pathways mediating fiber stimulation-induced CDR, and it also completely dismisses the strong change in recovery time course after 10 Hz conditioning (single exponential) as compared to 100 Hz conditioning (double exponential with a pronounced fast component).

      Strong presynaptic stimuli such as those illustrated in Figs. 1B,C induce massive exocytosis. The illustrated Cm increase of 2 to 2.5 pF represents fusion of 25,000 to 30,000 SVs (assuming a single SV capacitance of 80 aF) corresponding to a 12 to 15% increase in whole terminal membrane surface (assuming a mean terminal capacitance of ~16 pF). Capacitance measurements can only be considered reliable in the absence of marked changes in series and membrane conductance. Documentation of the corresponding conductance traces is therefore advisable for such massive Cm jumps and merely mentioning that the first 450 ms after stimulation were skipped during analysis or referring to previous publications showing conductance traces is insufficient.
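
      The arithmetic behind these estimates can be checked with a short sketch. It uses only the values stated in the comment (Cm jumps of 2 to 2.5 pF, 80 aF per SV, ~16 pF terminal capacitance); the function and constant names are illustrative.

```python
SINGLE_SV_AF = 80       # single-SV capacitance in attofarads (assumption stated in the comment)
TERMINAL_PF = 16        # mean whole-terminal capacitance in picofarads (same source)
AF_PER_PF = 1_000_000   # 1 pF = 10^6 aF

def vesicles_fused(delta_cm_pf):
    """Number of SVs whose fusion would account for a Cm jump of delta_cm_pf."""
    return delta_cm_pf * AF_PER_PF / SINGLE_SV_AF

def relative_membrane_increase(delta_cm_pf):
    """Cm jump as a fraction of the resting terminal capacitance."""
    return delta_cm_pf / TERMINAL_PF

low_svs = vesicles_fused(2.0)                 # 25000.0
high_svs = vesicles_fused(2.5)                # 31250.0
low_frac = relative_membrane_increase(2.0)    # 0.125
high_frac = relative_membrane_increase(2.5)   # 0.15625
```

      These values (25,000 to about 31,000 SVs, and 12.5% to about 15.6% of the terminal membrane) agree with the rounded figures quoted above.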

      All bar graphs in Figures 1 through 6 and Figures S3 through S6 compare three or even four (Fig. 5C) conditions, i.e. one control and at least two treatment data sets. It appears as if repeated t-tests were used to run multiple two-group comparisons (i.e. using the same control data twice for two different comparisons). Either a proper multiple comparison test should be used or a Bonferroni correction or similar multiple-comparison correction needs to be applied.

      We updated the statistical analysis of all data using one-way ANOVA and t-tests with the Bonferroni-Holm method of p-level correction, and rectified one analysis in Fig 1 and 3; all major conclusions are unchanged.
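
      For readers unfamiliar with the Bonferroni-Holm (Holm step-down) procedure named here, a minimal sketch follows. The function name and the p-values are hypothetical and do not come from the manuscript; the sketch illustrates the general procedure, not the authors' specific analysis pipeline.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest p-value is compared with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down rule: all remaining (larger) p-values are retained
    return reject

# Hypothetical p-values from three comparisons against the same control.
decisions = holm_bonferroni([0.020, 0.004, 0.060])  # [True, True, False]
```

      In this example a plain Bonferroni correction (threshold 0.05 / 3 for every test) would reject only the comparison with p = 0.004, which shows why the Holm variant is less conservative while still controlling the family-wise error rate.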

      Finally, the terminology of contrasting "fast-signaling" (calyx synapses) and "slow-plastic" (SC synapses) synapses seems to imply that calyx synapses lack plasticity, as does the wording "conventional bouton-type synapses involved in synaptic plasticity" (page 11, line 251). I assume, the authors primarily refer to the maximum frequencies these two synapse types typically transmit (fast-signaling vs slow-signaling)?

      The properties of these two synapses are described explicitly in the updated text, and they have been renamed fast and slow synapses.

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      'SV replenishment' and 'site clearance' should not be used synonymously as it seems to be done sometimes here.

      In this revision, we described them more explicitly.

      The data presented in Fig. S6 are detached from the rest of the manuscript, not relevant and should be removed. page 4 line 95 "... to ensure sufficient Ca2+ currents to induce exo-endocytosis." ICa is large enough to induce exocytosis also at 1.3 mM Ca2+. Please clarify.

      We updated the relevant section.

      page 5, line 108 "... this slow endocytosis showed a strongly prolonged time course without accompanied by the change of Cm or presynaptic Ca2+ currents" Please fix.

      Fixed.

      page 5, line 121 "Thus, at calyces of Held, bath-application of Dynasore or Pitstop-2 can block both fast and slow endocytosis without perturbing presynaptic intracellular milieu." Bath-application never perturbs the intracellular milieu. Please clarify.

      Rephrased.

      page 6 line 128 "... physiological aCSF" is a misnomer (= physiological artificial CSF). Please fix.

      In the introduction section, it is clearly described.

      page 11, line 252 "... from hippocampal SC-CA1 pyramidal neurons" There are no "SC-CA1 pyramidal neurons". Please fix.

      Fixed.

      page 12, line 285 "In acute slices optimized to physiological conditions" The conditions are optimized, not the slices. Please fix.

      Fixed.

      page 14, line 323 same as above

      Fixed.

      page 14, line 330 LTP at SC-CA1 synapses is postsynaptic. Please clarify.

      Rephrased.

      page 16, line 381 "had a series resistance of 3-4 MOhm" versus

      page 17, line 408 "The patch pipettes had a series resistance of 5-15 MOhm (less than 10 MOhm in most cells)" 3-4 is perhaps pipette resistance while 5-15 is perhaps series resistance? Please clarify.

      Fixed.

      page 17, line 398 "Cm traces were averaged at every 10 ms (for 10 Hz train stimulation) or 20 ms (for 5 ms single or 1 Hz train stimulation)." Do you mean to say that Cm traces were smoothed with a moving average using a window size of 10 or 20 ms duration? Please clarify.

      Rephrased to clarify better.
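
      The two readings of "averaged at every 10 ms" raised in the comment differ in practice. The sketch below contrasts them on a toy trace; which one the authors used is not stated here, and the function names and sample values are hypothetical.

```python
def block_average(trace, samples_per_bin):
    """Average consecutive non-overlapping bins; the output is shorter than the input."""
    return [
        sum(trace[i:i + samples_per_bin]) / samples_per_bin
        for i in range(0, len(trace) - samples_per_bin + 1, samples_per_bin)
    ]

def moving_average(trace, window):
    """Sliding-window mean; one output value per full window position."""
    return [
        sum(trace[i:i + window]) / window
        for i in range(len(trace) - window + 1)
    ]

# Toy Cm trace (arbitrary units), two samples per bin or window.
trace = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
binned = block_average(trace, 2)     # [0.0, 1.0, 2.0]
smoothed = moving_average(trace, 2)  # [0.0, 0.5, 1.0, 1.5, 2.0]
```

      Block averaging reduces the number of points (one per 10 or 20 ms bin), whereas a moving average keeps the sampling density and smooths transitions, so the two can give different apparent time courses for fast Cm changes.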

      page 18, "All values are given as mean ± SEM and significance of difference was evaluated by Student's unpaired t-test, unless otherwise noted." Please check. You cannot simply use repeated t-tests for multiple comparisons. Either a proper multiple comparison test should be used or a Bonferroni correction or similar multiple-comparison correction needs to be applied.

      All statistical analyses have been updated using one-way ANOVA and t-tests with the Bonferroni-Holm method of p-level correction, and one analysis has been rectified in Fig 1 and 3, with no change in the major conclusions.

    2. eLife assessment

      Following synaptic vesicle fusion events at release sites, vesicle remnants will need to be cleared in order to allow new rounds of vesicle docking and fusion. This fundamental study of Mahapatra and Takahashi examines the role of release site clearance in synaptic transmission during repetitive activity in two types of central synapses, the giant calyx of Held and hippocampal CA1 synapses. The study uses pharmacological approaches to interfere with release site clearance by blocking membrane retrieval (endocytosis). The results also show how pharmacological inhibition of scaffold proteins affects short-term plasticity. The data presented make a compelling case for fast endocytosis as necessary for rapid site clearance and vesicle recruitment to active zones. The data reveal an unexpected, fast role for local site clearance in counteracting synaptic depression.

    3. Joint Public Review:

      Mahapatra and Takahashi report on the physiological consequences of pharmacologically blocking either clathrin and dynamin function during compensatory endocytosis or of the cortical actin scaffold both in the calyx of Held synapse and hippocampal boutons in acute slice preparations.

      Although many aspects of these pharmacological interventions have been studied in detail during the past decades, this is a comprehensive and comparative study, which reveals some interesting differences between a fast synapse (Calyx of Held) tuned to reliably transmit at several 100 Hz and a slower hippocampal CA1 synapse. In particular, the authors find that acute disturbance of the synaptic actin network leads to a marked frequency-dependent enhancement of synaptic depression in the Calyx, but not in the hippocampal synapse. This striking difference between both preparations is the most interesting finding.

      Comments on latest version:

      The authors have done a great job revising the paper and only minor revisions are suggested to the Discussion of the paper.

      Two quite relevant and recent papers should be cited and briefly discussed because they relate directly to Pitstop2 effects and actin-myosin-scaffold proteins in the calyx of Held synapse.

      One is: Paksoy A et al, (2022) "Effects of the clathrin inhibitor Pitstop-2 on synaptic vesicle recycling at a central synapse in vivo." Front. Synaptic Neurosci. 14:1056308. doi: 10.3389/fnsyn.2022.1056308. This paper shows with EM that changes caused by PitStop2 perturbation of "clathrin function suggest that clathrin plays a role in SV recycling from both, the plasma membrane and large endosomes, under physiological activity patterns, in vivo."

      Second: A role for actin-myosin and MLCK in short-term plasticity has been shown by Srinivasan G., et al. (2008) "The Pool of Fast Releasing Vesicles Is Augmented by Myosin Light Chain Kinase Inhibition at the Calyx of Held Synapse." J Neurophysiol 99: 1810-1824, 2008. The data here suggests that MLCK plays a crucial role in determining the size of the pool of synaptic vesicles that undergo fast release but not the Pr of the synapse. In other words, MLCK inhibition augments super-priming of vesicles at the calyx of Held synapse.

    1. Reviewer #3 (Public Review):

      In this work, Jarc et al. describe a method to decouple the mechanisms supporting progenitor self-renewal and expansion from feed-forward mechanisms promoting their differentiation.

      The authors aimed at expanding pancreatic progenitor (PP) cells, strictly characterized as PDX1+/SOX9+/NKX6.1+ cells, for several rounds. This required finding the best cell culture conditions that allow sustaining PP cell proliferation along cell passages while avoiding their further differentiation. They achieve this by comparing the transcriptome of PP cells that can be expanded for several passages against the transcriptome of unexpanded (just differentiated) PP cells.

      The optimized culture conditions enabled the selection of PDX1+/SOX9+/NKX6.1+ PP cells and their consistent, 2000-fold, expansion over ten passages and 40-45 days. Transcriptome analyses confirmed the stabilization of PP identity and the effective suppression of differentiation. These optimized culture conditions consisted of substituting the Vitamin A containing B27 supplement with a B27 formulation devoid of vitamin A (to avoid retinoic acid (RA) signaling from an autocrine feed-forward loop), substituting A83-01 with the ALK5 II inhibitor (ALK5i II) that targets primarily ALK5, supplementation of medium with FGF18 (in addition to FGF2) and the canonical Wnt inhibitor IWR-1, and cell culture on vitronectin-N (VTN-N) as a substrate instead of Matrigel.

      The strength of this work relies on a clever approach to identify cell culture modifications that allow expansion of PP cells (once differentiated) while maintaining, if not reinforcing, PP cell identity. Throughout the work, it is emphasized that PP cell identity is associated with the co-expression of PDX1, SOX9 and NKX6.1. The optimized protocol is unique (among the other datasets used in the comparison shown here) at inducing a strong upregulation of GP2, a unique marker of human fetal pancreas progenitors. Importantly, GP2+-enriched hPS cell-derived PP cells differentiate more efficiently into pancreatic endocrine cells (Aghazadeh et al., 2022; Ameri et al., 2017).

      The unlimited expansion of PP cells reported here would allow scaling-up the generation of beta cells, for the cell therapy of diabetes, by eliminating a source of variability derived from the number of differentiation procedures to be carried out when starting at the hPS cell stage each time. The approach presented here would allow selection of the most optimally differentiated PP cell population for subsequent expansion and storage. Among other conditions optimized, the authors report a role for Vitamin A in activating retinoic acid signaling in an autocrine feed-forward loop, and the supplementation with FGF18 to reinforce FGF2 signaling.

      This is a relevant topic in the field of research, and some of the cell culture conditions reported here for PP expansion might have important implications in cell therapy approaches. Thus, the approach and results presented in this study could be of interest for researchers working in the field of in vitro pancreatic beta cell differentiation from hPSCs. Table S1 and Table S4 are clearly detailed and extremely instrumental to this aim.

    2. Reviewer #2 (Public Review):

      The paper presents a novel approach to expand iPSC-derived pdx1+/nkx6.1+ pancreas progenitors, making them potentially suitable for GMP-compatible protocols. This advancement represents a significant breakthrough for diabetes cell replacement therapies, as one of the current bottlenecks is the inability to expand PP without compromising their differentiation potential. The study employs a robust dataset and state-of-the-art methodology, unveiling crucial signaling pathways (e.g., TGF, Notch) responsible for sustaining pancreas progenitors while preserving their differentiation potential in vitro.

      The current version of the paper is improved, with greater clarity and clear responses to the comments raised regarding quantifications, functionality of the cells in vivo, etc.

      The discussion on challenges adds depth to the study and encourages future research to build upon these important findings.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      Summary:

      The authors aim to test the sensory recruitment theory of visual memory, which assumes that visual sensory areas are recruited for working memory, and that these sensory areas represent visual memories in a similar fashion to how perceptual inputs are represented. To test the overlap between working memory (WM) and perception, the authors use coarse stimulus (aperture) biases that are known to account for (some) orientation decoding in the visual cortex (i.e., stimulus energy is higher for parts of an image where a grating orientation is perpendicular to an aperture edge, and stimulus energy drives decoding). Specifically, the authors show gratings (with a given "carrier" orientation) behind two different apertures: one is a radial modulator (with maximal energy aligned with the carrier orientation) and the other an angular modulator (with maximal energy orthogonal to the carrier orientation). When the subject detects contrast changes in these stimuli (the perceptual task), orientation decoding only works when training and testing within each modulator, but not across modulators, showing the impact of stimulus energy on decoding performance. Instead, when subjects remember the orientation over a 12s delay, orientation decoding works irrespective of the modulator used. The authors conclude that representations during WM are therefore not "sensory-like", given that they are immune to aperture biases. This invalidates the sensory recruitment hypothesis, or at least the part assuming that when sensory areas are recruited during WM, they are recruited in a manner that resembles how these areas are used during perception.

      Strengths:

      Duan and Curtis very convincingly show that aperture effects that are present during perception do not appear to be present during the working memory delay. Especially when the debate about "why can we decode orientations from human visual cortex" was in full swing, many may have quietly assumed this to be true (e.g., "the memory delay has no stimuli, and ergo no stimulus aperture effects"), but it is definitely not self-evident and nobody ever thought to test it directly until now. In addition to the clear absence of aperture effects during the delay, Duan and Curtis also show that when stimulus energy aligns with the carrier orientation, cross-generalization between perception and memory does work (which could explain why perception-to-memory cross-decoding also works). All in all, this is a clever manipulation, and I'm glad someone did it, and did it well.

      Weaknesses:

      There seems to be a major possible confound that prohibits strong conclusions about "abstractions" into "line-like" representation, which is spatial attention. What if subjects simply attend to the endpoints of the carrier grating, or attend to the edge of the screen where the carrier orientation "intersects" in order to do the task? This may also result in reconstructions that have higher BOLD at areas close to the stimulus/screen edges along the carrier orientation. The question then would be if this is truly an "abstracted representation", or if subjects are merely using spatial attention to do the task.

      Alternatively (and this reaches back to the "fine vs coarse" debate), another argument could be that during memory, what we are decoding is indeed fine-scale inhomogeneous sampling of orientation preferences across many voxels. This is clearly not the most convincing argument, as the spatial reconstructions (e.g., Figure 3A and C) show higher BOLD for voxels with receptive fields that are aligned to the remembered orientation (which is in itself a form of coarse-scale bias), but could still play a role.

      To conclude that the spatial reconstruction from the data indeed comes from a line-like representation, you'd need to generate modeled reconstructions of all possible stimuli and representations. Yes, Figure 4 shows that a line results in a modeled spatial map that resembles the WM data, but many other stimuli might too, and some may better match the data. For example, the alternative hypothesis (attention to grating endpoints) may very well lead to a very comparable model output to the one from a line. However, testing this would not suffice, as there may be an inherent inverse problem (with multiple stimuli that can lead to the same visual field model).

      The main conclusion, and title of the paper, that visual working memories are abstractions of percepts, is therefore not supported. Subjects could be using spatial attention, for example. Furthermore, even if it is true that gratings are abstracted into lines, this form of abstraction would not generalize to any non-spatial feature (e.g., color cannot become a line, contrast cannot become a line, etc.), which means it has limited explanatory power.

      We thank the reviewer for bringing up these excellent questions.

      First, to test the alternative hypothesis of spatial attention, we fed a dot image into the image-computable model. We placed the dot where we suspect one might place their spatial attention, namely, at the edge of the stimulus that is tangent to the orientation of the grating. We generated the model response for three orientations and their combination by rotating and averaging. From Author response image 1 below, one can see that this model does not match the line-like representation we reported. Nonetheless, we would like to avoid making the argument that attention does not play a role. We strongly suspect that if one was attending to multiple places along a path that makes up a line, it would produce the results we observed. But there begins a circularity in the logic, where one cannot distinguish between attention to a line-like representation and a line of attention being the line-like representation.

      Author response image 1.

      Reconstruction maps for the dot image at the edge of 15°, 75°, 135°, and the combined across three orientation conditions.

      Second, we remain agnostic to the question of whether fine-scale inhomogeneous sampling of orientation-selective neurons may drive some of the decoding results we report here. It is possible that our line-like representations are driven by neurons tuned to the sample orientation that have receptive fields that lie along the line. Here, we instead focus on testing the idea that WM decoding does not depend on aperture biases.

      Finally, we agree with the reviewer that there is much more work to be done in this area. Our working hypothesis, that WM representations are abstractions of percepts, is admittedly based on Occam's razor and an appeal to efficient coding principles. We also agree that these results may not generalize to all forms of WM (e.g., color). As always, there is a tradeoff between interpretability (visual spatial formats in retinotopically organized maps) and generalizability. Frankly, we have no idea how one might be able to test these ideas when subjects might be using the most common type of memory reformatting - linguistic representations, which are incredibly efficient.

      Additional context:

      The working memory and perception tasks are rather different. In this case, the perception task does not require the subject to process the carrier orientation (which is largely occluded, and possibly not that obvious without paying attention to it), but attention is paid to contrast. In this scenario, stimulus energy may dominate the signal. In the WM task, subjects have to work out what orientation is shown to do the task. Given that the sensory stimulus in both tasks is brief (1.5s during memory encoding, and 2.5s total in the perceptual task), it would be interesting to look at decoding (and reconstructions) for the WM stimulus epoch. If abstraction (into a line) happens in working memory, then this perceptual part of the task should still be susceptible to aperture biases. It allows the authors to show that it is indeed during memory (and not merely the task or attentional state of the subject) that abstraction occurs.

      Again, this is an excellent question. We used a separate perceptual task instead of the stimulus epoch as a control mainly for two reasons. First, we used a control task in which participants had to process the contrast, not orientation, of the grating because we were concerned that participants would reformat the grating into a line-like representation to make the judgments. To avoid this, we used a task similar to the one used when previous researchers first found the stimulus vignetting effect (Roth et al., 2018). Again, our main goal was to try to focus on the bottom-up visual features. Second, because of the sluggishness of the BOLD response, combined with our task design (i.e., memory delay always followed the target stimulus), we cannot disentangle the visual and memory responses that co-exist at this epoch. Any result could be misleading.

      What's also interesting is what happens in the passive perceptual condition, and the fact that spatial reconstructions for areas beyond V1 and V2 (i.e., V3, V3AB, and IPS0-1) align with (implied) grating endpoints, even when an angular modulator is used (Figure 3C). Are these areas also "abstracting" the stimulus (in a line-like format)?

      We agree these findings are interesting and replicate what we found in our previous paper (Kwak & Curtis, Neuron, 2022). We believe that these results do imply that these areas indeed store a reformatted line-like WM representation that is not biased by vignetting. We would like to extend a note of caution, however, because the decoding results in the higher-order areas (V3AB, IPS0-1, etc.) are somewhat poor (especially in comparison to V1, V2, V3) (see Figure 2).

      Reviewer #2:

      Summary:

      According to the sensory recruitment model, the contents of working memory (WM) are maintained by activity in the same sensory cortical regions responsible for processing perceptual inputs. A strong version of the sensory recruitment model predicts that stimulus-specific activity patterns measured in sensory brain areas during WM storage should be identical to those measured during perceptual processing. Previous research casts doubt on this hypothesis, but little is known about how stimulus-specific activity patterns during perception and memory differ. Through clever experimental design and rigorous analyses, Duan & Curtis convincingly demonstrate that stimulus-specific representations of remembered items are highly abstracted versions of representations measured during perceptual processing and that these abstracted representations are immune to aperture biases that contribute to fMRI feature decoding. The paper provides converging evidence that neural states responsible for representing information during perception and WM are fundamentally different, and provides a potential explanation for this difference.

      Strengths:

      (1) The generation of stimuli with matching vs. orthogonal orientations and aperture biases is clever and sets up a straightforward test regarding whether and how aperture biases contribute to orientation decoding during perception and WM. The demonstration that orientation decoding during perception is driven primarily by aperture bias while during WM it is driven primarily by orientation is compelling.

      (2) The paper suggests a reason why orientation decoding during WM might be immune to aperture biases: by weighting multivoxel patterns measured during WM storage by spatial population receptive field estimates from a different task, the authors show that remembered (but not actively viewed) orientations form "line-like" patterns in retinotopic cortical space.

      We thank the reviewer for noting the strengths in our work.

      Weaknesses:

      (1) The paper tests a strong version of the sensory recruitment model, where neural states representing information during WM are presumed to be identical to neural states representing the same information during perceptual processing. As the paper acknowledges, there is already ample reason to doubt this prediction (see, e.g., earlier work by Kok & de Lange, Curr Biol 2014; Bloem et al., Psych Sci, 2018; Rademaker et al., Nat Neurosci, 2019; among others). Still, the demonstration that orientation decoding during WM is immune to aperture biases known to drive orientation decoding during perception makes for a compelling demonstration.

      We agree with the reviewer, and would add that the main problem with the sensory recruitment model of WM is that it remains underspecified. The work cited above and in our paper, and the results in this report, are only the beginning of efforts to fully detail what it means to recruit sensory mechanisms for memory.

      (2) Earlier work by the same group has reported line-like representations of orientations during memory storage but not during perception (e.g., Kwak & Curtis, Neuron, 2022). It's nice to see that result replicated during explicit perceptual and WM tasks in the current study, but I question whether the findings provide fundamental new insights into the neural bases of WM. That would require a model or explanation describing how stimulus-specific activation patterns measured during perception are transformed into the "line-like" patterns seen during WM, which the authors acknowledge is an important goal for future research.

      We agree with the reviewer that perhaps some might see the current results as an incremental step given our previous paper. However, we would point out that researchers have been decoding memorized orientation from the early visual cortex for 15 years, and not one of those highly impactful studies had ever done what we did here, which was to test if decoded WM representations are the product of aperture biases. Not only do our results indicate that decoding memorized orientation is immune to these biases, but they critically suggest a reason why one can decode orientation during WM.

      Reviewer #3:

      Summary:

      In this work, Duan and Curtis addressed an important issue related to the nature of working memory representations. This work is motivated by findings illustrating that orientation decoding performance for perceptual representations can be biased by the stimulus aperture (modulator). Here, the authors examined whether the decoding performance for working memory representations is similarly influenced by these aperture biases. The results provide convincing evidence that working memory representations have a different representational structure, as the decoding performance was not influenced by the type of stimulus aperture.

      Strengths:

      The strength of this work lies in the direct comparison of decoding performance for perceptual representations with working memory representations. The authors take a well-motivated approach and illustrate that perceptual and working memory representations do not share a similar representational structure. The authors test a clear question with a rigorous approach and provide convincing evidence. First, the presented oriented stimuli are carefully manipulated to create orthogonal biases introduced by the stimulus aperture (radial or angular modulator), regardless of the stimulus carrier orientation. Second, the authors implement advanced methods to decode the orientation information present in visual and parietal cortical regions when directly perceiving or holding an oriented stimulus in memory. The data illustrate that working memory decoding is not influenced by the type of aperture, while this is the case in perception. In sum, the main claims are important and shed light on the nature of working memory representations.

      We thank the reviewer for noting the strengths in our work.

      Weaknesses:

      I have a few minor concerns that, although they don't affect the main conclusion of the paper, should still be addressed.

      (1) Theoretical framing in the introduction: Recent work has shown that decoding of orientation during perception does reflect orientation selectivity, and it is not only driven by the stimulus aperture (Roth, Kay & Merriam, 2022).

      Excellent point, and similar to the point made by Reviewer 1. We now adjust our text and cite the paper in the Introduction.

      Below, we paste our response to Reviewer 1:

      “Second, we remain agnostic to the question of whether fine-scale inhomogeneous sampling of orientation-selective neurons may drive some of the decoding results we report here. It is possible that our line-like representations are driven by neurons tuned to the sample orientation that have receptive fields that lie along the line. Here, we instead focus on testing the idea that WM decoding does not depend on aperture biases.”

      (2) Figure 1C illustrates the principle of how the radial and angular modulators bias the contrast energy extracted by the V1 model, which in turn would influence orientation decoding. It would be informative if the carrier orientations used in the experiment were shown in this figure, or at a minimum it would be mentioned in the legend that the experiment used 3 carrier orientations (15°, 75°, 135°) clockwise from vertical. Relatedly, when trying to find more information regarding the carrier orientation, the 'Stimuli' section of the Methods incorrectly mentions that 180 orientations are used as the carrier orientation.

      We apologize for not clearly indicating the stimulus features in the figure. We have now added the information about the target orientations to the Figure 1C legend. We have also corrected the mistakes in the Methods section about the carrier orientation and the details of the task. Briefly, participants were asked to use a continuous report over 180 orientations. We now clarify that “We generated 180 orientations for the carrier grating to cover the whole orientation space during the continuous report task.”

      (3) The description of the image computable V1 model in the Methods is incomplete, and at times inaccurate. i) The model implements 6 orientation channels, which is inaccurately referred to as a bandwidth of 60° (should be 180/6 = 30°). ii) The steerable pyramid combines information across phase pairs to obtain a measure of contrast energy for a given stimulus. Here, it is only mentioned that the model contains different orientation and spatial scale channels. I assume there were also 2 phase pairs, and they were combined in some manner (squared and summed to create contrast energy). Currently, it is unclear what the model output represents. iii) The spatial scale channel with the maximal response differences between the 2 modulators was chosen as the final model output. What spatial frequency does this channel refer to, and how does this spatial frequency relate to the stimulus?

      (i) First, we thank the reviewer for pointing out this mistake, since the range of orientations should be 180° instead of 360°. We corrected this in the revised version.

      (ii) Second, we apologize for not being clear. In the second paragraph of the “Simulate model outputs” section, we wrote,

      “For both types of stimuli, we used three target orientations (15°, 75°, and 135° clockwise from vertical), which had two kinds of phases for both the carriers and the modulators. We first generated the model’s responses to each target image separately, then averaged the model responses across all phases for each orientation condition.”

      We have corrected this text by now writing,

      “For both types of stimuli, we used three target orientations (15°, 75°, and 135° clockwise from vertical), two phases for the carrier (0 or π), and two phases for the modulator (sine or cosine phase). We first generated the model responses to each phase condition separately, then averaged them across all phases for each orientation condition.”

      (iii) Third, and again, we apologize for the misunderstanding. Since both modulated gratings have the same spatial frequency, the channel with the largest response should be equal to the spatial frequency of the stimulus. We corrected this by now writing,

      “For the final predicted responses, we chose the subband with maximal responses (the 9th level), which corresponds to the spatial frequency of the stimulus (Roth, Heeger, and Merriam 2018).”
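
      Reviewer 3's point (ii) above assumes that the model squares and sums the outputs of a phase pair to obtain contrast energy. A minimal one-dimensional sketch of that idea follows. It is not the authors' steerable-pyramid implementation: the function names `grating` and `quadrature_energy`, the plain cosine/sine filter pair, and the 64-sample signal are all illustrative assumptions.

```python
import math


def grating(n_samples, cycles, phase):
    """Sinusoidal grating with a whole number of cycles (illustrative stimulus)."""
    return [math.cos(2 * math.pi * cycles * i / n_samples + phase)
            for i in range(n_samples)]


def quadrature_energy(signal, cycles):
    """Square and sum the responses of an even (cosine) and odd (sine) filter.

    The two filters are 90 degrees apart in phase, so the summed squares
    do not depend on the phase of a grating at the filters' frequency.
    """
    n = len(signal)
    even = sum(s * math.cos(2 * math.pi * cycles * i / n)
               for i, s in enumerate(signal))
    odd = sum(s * math.sin(2 * math.pi * cycles * i / n)
              for i, s in enumerate(signal))
    return even ** 2 + odd ** 2


# Two carrier phases (0 and pi), as in the quoted Methods text.
energy_phase_0 = quadrature_energy(grating(64, 4, 0.0), 4)
energy_phase_pi = quadrature_energy(grating(64, 4, math.pi), 4)
```

      Under these assumptions the two energies agree, which illustrates why an energy measure of this kind is insensitive to carrier phase.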

      (4) It is not clear from the Methods how the difficulty in the perceptual control task was controlled. How were the levels of task difficulty created?

      Apologies for not being clear. Task difficulty was set by the contrast difference between the two stimuli. The easiest level pairs the first and the last contrasts in the set, while the hardest level pairs two adjacent contrasts. We added these sentences:

      “The contrast for each stimulus was generated from a predefined set of 20 contrasts uniformly distributed between 0.5 and 1.0 (0.025 step size). We created 19 levels of task difficulty based on the contrast distance between the two stimuli. Thus, the difficulty ranged from choosing contrast pairs with the largest difference (0.5, easiest) to contrast pairs with the smallest difference (0.025, hardest). Task difficulty level changed based on an adaptive, 1-up-2-down staircase procedure (Levitt 1971) to maintain performance at approximately 70% correct.”
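
      The staircase rule quoted above can be sketched as follows. This is a minimal illustration and not the authors' task code: the class name `Staircase`, the helper `contrast_difference`, the starting level, and the convention that level 1 is hardest and level 19 easiest (one 0.025 contrast step per level) are assumptions.

```python
class Staircase:
    """1-up-2-down staircase over discrete difficulty levels.

    Assumed convention: level 1 is the hardest (smallest contrast
    difference) and n_levels is the easiest.
    """

    def __init__(self, n_levels=19, start_level=19):
        self.n_levels = n_levels
        self.level = start_level
        self.correct_streak = 0

    def update(self, correct):
        """Record one response and return the level for the next trial."""
        if correct:
            self.correct_streak += 1
            if self.correct_streak == 2:
                # Two correct responses in a row: make the task harder.
                self.level = max(1, self.level - 1)
                self.correct_streak = 0
        else:
            # Any error: make the task easier and reset the streak.
            self.level = min(self.n_levels, self.level + 1)
            self.correct_streak = 0
        return self.level


def contrast_difference(level, step=0.025):
    """Contrast difference for a level, assuming one step per level."""
    return level * step
```

      A rule of this form converges on roughly 70.7% correct (Levitt 1971), consistent with the "approximately 70% correct" target stated in the quoted text.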

      Recommendations For The Authors

      (Reviewer #1):

      (1) If the black circle (Fig 3A & C) is the stimulus size, and the stimulus (12º) is roughly half the size of the entire screen (24.8º), then how are spatial reconstructions generated for parts of the visual field that fall outside of the screen? I am asking because in Figure 3 the area over which spatial reconstructions are plotted has a diameter at least 3 times the diameter of that black circle (the stimulus). I'm guessing this is maybe possible when using a very liberal fitting approach to prf's, where the center of a prf can be outside of the screen (so you'd fit a circle to an elongated blob, assuming that blob is the edge of a circle, or something). Can you really reliably estimate that far out into visual space/ extrapolate prf's that exist in a part of the space you did not fully map (because it's outside of the screen)?

      We thank the reviewer for pointing out this confusing issue.

      First, the spatial reconstruction map has a diameter 3 times the diameter of the stimulus because we included voxels whose pRF eccentricities were within 20º in the reconstruction, the same as Kwak & Curtis, 2022. There are several reasons for doing so. First, while the height of the screen is 24.8º, the width of the screen is 44º. Thus, it is possible to have voxels whose pRF eccentricities are >20º. Second, for areas outside the height boundaries, there might not be pRF centers, but the whole pRF Gaussian distributions might still cover the area. Moreover, when creating the final map combined across three orientation conditions, we rotated them to be centered vertically, which then required a 20x20º square. Finally, inspecting the reconstruction maps, we noticed that the area that was twice the stimulus size (black circle) made very little contribution to the reconstructions. Therefore, the results depicted in Figure 3A&C are justified, but see the next comment and our response.
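
      The authors' second reason above, that a pRF centered elsewhere can still cover a given location, follows from how a pRF-weighted reconstruction is built. The sketch below shows one simple way to compute such a map. It assumes each voxel has a circular Gaussian pRF and that the map is an unnormalized, beta-weighted sum of pRFs; the actual procedure in Kwak & Curtis (2022) may differ. The function name `reconstruct_map` and the dictionary keys are hypothetical.

```python
import math


def reconstruct_map(voxels, grid, max_ecc=20.0):
    """Project voxel responses into visual-field coordinates.

    voxels: list of dicts with pRF center "x", "y", size "sigma"
            (all in degrees) and response amplitude "beta".
    grid:   1-D list of coordinates (degrees) used for both axes.
    Voxels whose pRF center lies beyond max_ecc are skipped.
    Returns a list of rows, indexed as recon[y_index][x_index].
    """
    recon = []
    for gy in grid:
        row = []
        for gx in grid:
            total = 0.0
            for v in voxels:
                if math.hypot(v["x"], v["y"]) > max_ecc:
                    continue  # eccentricity cutoff (20 deg in the text)
                dist_sq = (gx - v["x"]) ** 2 + (gy - v["y"]) ** 2
                # A Gaussian pRF contributes even away from its center.
                total += v["beta"] * math.exp(-dist_sq / (2 * v["sigma"] ** 2))
            row.append(total)
        recon.append(row)
    return recon
```

      In this sketch, a map location receives a nonzero value from any included voxel whose Gaussian extends over it, even when no pRF center falls at that location.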

      (2) Is the quantification in 3B/C justified? The filter line uses a huge part of visual space outside of the stimulus (and even the screen). For the angular modulator in the "perception" condition, this means that there is no peak at -90/90 degree. But if you were to only use a line that is about the size of the stimulus (a reasonable assumption), it would have a peak at -90/90 degree.

      This is an excellent question. We completely agree that it is more reasonable to use filter lines that have the same size (12º) as the stimulus instead of the whole map size (40º). Based on the feedback from the Reviewer, we redid the spatial reconstruction analyses and now include the following changes to Figure 3.

      (1) We fitted the lines using pixels only within the stimulus. In Figure 3A and Figure 3C, we now replaced the reconstruction maps.

      (2) We added the color bar in Figure 3A.

      (3) We regenerated the filtered responses and calculated the fidelity results by using line filters with the stimulus size. We replaced the filtered responses and fidelity results in Figure 3B and Figure 3D. With the new analysis, as anticipated by the Reviewer, we now found peaks at -90/90 degrees for the angular modulated gratings in the perceptual control task in V1 and V2. Thank you Reviewer 1!!!!

      (4) We also made corresponding changes in the Supplementary Figure S4 and S5, as well as the statistical results in Table S4 and S5.

      (5) In the “Methods” section, we added “within the stimulus size” for both “fMRI data analysis: Spatial reconstruction” and “Quantification and statistical analysis” subsections.

      (3) Figure 4 is nice, but not exactly quantitative. It does not address that the reconstructions from the perceptual task are hugging the stimulus edges much more closely compared to the modeled map. Conversely, the yellow parts of the reconstructions from the delay fan out much further than those of the model. The model also does not seem to dissociate radial/angular stimuli, while in the perceptual data the magnitude of perceptual reconstruction is clearly much weaker for angular compared to radial modulator.

      We thank the reviewer for this question. First, we admit that Figure 4 is more qualitative than quantitative. However, we see no alternative that better depicts the similarity between the model prediction and the fMRI results for the perceptual control and WM tasks. The figure clearly shows the orthogonal aperture bias. Second, we agree that aspects of the observed fMRI results are not perfectly captured by the model. This could have many causes, including fMRI noise, individual differences, etc. Importantly, different modulators induce orthogonal aperture biases in the perceptual but not the WM task, so these discrepancies do not have a major impact on the conclusions.

      (4) The working memory and perception tasks are rather different. In this case, the perception task does not require the subject to process the carrier orientation (which is largely occluded, and possibly not that obvious without paying attention to it), but attention is paid to contrast. In this scenario, stimulus energy may dominate the signal. In the WM task, subjects have to work out what orientation is shown to do the task. Given that the sensory stimulus in both tasks is brief (1.5s during memory encoding, and 2.5s total in the perceptual task), it would be interesting to look at decoding (and reconstructions) for the WM stimulus epoch. If abstraction (into a line) happens in working memory, then this perceptual part of the task should still be susceptible to aperture biases. It allows the authors to show that it is indeed during memory (and not merely the task or attentional state of the subject) that abstraction occurs.

      We addressed the same point in the response for Reviewer 1, “additional context” section.

      Recommendations for improving the writing:

      (1) The main text had too little information about the Methods. Of course, some things need not be there, but others are crucial to understanding the basics of what is being shown. For example, the main text does not describe how many orientations are used (well... actually the caption to Figure 1 says there are 2: horizontal and vertical, which is confusing), and I had to deduce from the chance level (1/3) that there must have been 3 orientations. Also, given how important the orthogonality of the carrier and modulator are, it would be good to have this explicit (I would even want an analysis showing that indeed the two are independent). A final example is the use of beta weights, and for delay period decoding only the last 6s (of the 12s delay) are modeled and used for decoding.

      We thank the reviewer for identifying aspects of the manuscript that were confusing. We made several changes to the paper to clarify these details.

      First, we added the information about the orientations we used in the caption for Figure 1 and made it clear that Figure 1C is just an illustration using vertical/horizontal orientations. Second, the carrier and the modulator are different in many ways. For example, the carrier is a grating with orientation and contrast information, while the modulator is the aperture that bounds the grating without these features. Their phases are orthogonal, and we added this in the second paragraph of the “Stimuli” section. Last, in the main text and the captions, we now denote “late delay” when writing about our procedures.

      (2) Right under Figure 3, the text reads "angular modulated gratings produced line-like representations that were orthogonal carrier orientation reflecting the influence of stimulus vignetting", but the quantification (Figure 3D) does not support this (there is no orthogonal "bump" in the filtered responses from V1-V3, and one aligned with the carrier orientation in higher areas).

      This point was addressed in the “recommendations for the authors (Reviewer 1), point 2” above.

      Minor corrections to text and figures:

      (1) Abstract: "are WM codes" should probably be "WM codes are".

      We prefer to keep “are WM codes” as it is grammatically correct.

      (2) Introduction: Second sentence 2nd paragraph: representations can be used to decode representations? Or rather voxel patterns can be used...

      Changed to “On the one hand, WM representations can be decoded from the activity patterns as early as primary visual cortex (V1)...”

      (3) Same paragraph: might be good to add more references to support the correlation between V1 decoding and behavior. There's an Ester paper, and Iamchinina et al. 2021. These are not trial-wise, but trial-wise can also be driven by fluctuating arousal effects, so across-subject correlations help fortify this point.

      We added these two papers as references.

      (4) Last paragraph: "are WM codes" should probably be "WM codes are".

      See (1) above.

      (5) Figure 1B & 2A caption: "stimulus presenting epoch" should probably be "stimulus presentation epoch".

      Changed to “stimulus epoch”.

      (6) Figure 1C: So this is very unclear, to say stimuli are created using vertical and horizontal gratings (when none of the stimuli used in the experiment are either).

      We solved and answered this point in response to Reviewer 3, point 2.

      (7) Figure 2B caption "cross" should probably be "across".

      We believe “cross” is fine since cross here means cross-decoding.

      (8) Figure 3A and C are missing a color bar, so it's unclear how these images are generated (are they scaled, or not) and what the BOLD values are in each pixel.

      All values in the map were scaled to be within -1 to 1. We added the color bar in both Figure 3 and Figure 4.

      (9) Figure 3B and D (bottom row) are missing individual subject data.

      We use SEM to indicate the variance across subjects.

      (10) Figure D caption: "early (V1 and V2)" should probably be "early areas (V1 and V2)".

      Corrected.

      (11) Methods, stimuli says "We generated 180 orientations for the carrier grating to cover the whole orientation space." But it looks like only 3 orientations were generated, so this is confusing.

      We solved and answered this point in response to Reviewer 3, point 2.

      (12) Further down (fMRI task) "random jitters" is probably "random jitter"

      Corrected.

    2. eLife assessment

      This paper provides valuable insights into the neural substrates of human working memory. Through clever experimental design and rigorous analyses, the paper provides compelling evidence that the working memory representation of stimulus orientation is a reformatted version of the presented stimulus, though more work is needed to establish more generally that visual working memories are abstractions of percepts. This work will be of broad interest to cognitive neuroscientists working on the neural bases of visual perception and memory.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors aim to test the sensory recruitment theory of visual memory, which assumes that visual sensory areas are recruited for working memory and that these sensory areas represent visual memories in a similar fashion to how perceptual inputs are represented. To test the overlap between working memory (WM) and perception, the authors use coarse stimulus (aperture) biases that are known to account for (some) orientation decoding in visual cortex (i.e., stimulus energy is higher for parts of an image where a grating orientation is perpendicular to an aperture edge, and stimulus energy drives decoding). Specifically, the authors show gratings (with a given "carrier" orientation) behind two different apertures: One is a radial modulator (with maximal energy aligned with the carrier orientation) and the other an angular modulator (with maximal energy orthogonal to the carrier orientation). When subjects detect contrast changes in these stimuli (the perceptual task), orientation decoding only works when training and testing within each modulator, but not across modulators, showing the impact of stimulus energy on decoding performance. Instead, when subjects remember the orientation over a 12s delay, orientation decoding works irrespective of the modulator used. The authors conclude that representations during WM are therefore not "sensory-like", given that they are immune to aperture biases. This invalidates the sensory recruitment hypothesis, or at least the part assuming that when sensory areas are recruited during WM, they are recruited in a manner that resembles how these areas are used during perception.

      Strengths:

      Duan and Curtis very convincingly show that aperture effects that are present during perception do not appear to be present during the working memory delay. Especially when the debate about "why can we decode orientations from human visual cortex" was in full swing, many may have quietly assumed this to be true (e.g., "the memory delay has no stimuli, and ergo no stimulus aperture effects"), but it is definitely not self-evident and nobody ever thought to test it directly until now. In addition to the clear absence of aperture effects during the delay, Duan and Curtis also show that when stimulus energy aligns with the carrier orientation, cross-generalization between perception and memory does work (which could explain why perception-to-memory cross decoding also works). All in all, this is a clever manipulation, and I'm glad someone did it, and did it well.

      Weaknesses:

      There seems to be a major possible confound that prohibits strong conclusions about "abstractions" into "line-like" representation, which is spatial attention. What if subjects simply attend the end points of the carrier grating, or attend to the edge of the screen where the carrier orientation "intersects" in order to do the task? This may also result in reconstructions that have higher BOLD at areas close to the stimulus/screen edges along the carrier orientation. The question then would be if this is truly an "abstracted representation", or if subjects are merely using spatial attention to do the task.

      Alternatively (and this reaches back to the "fine vs coarse" debate), another argument could be that during memory, what we are decoding is indeed fine-scale inhomogeneous sampling of orientation preferences across many voxels. This is clearly not the most convincing argument, as the spatial reconstructions (e.g., Figure 3A and C) show higher BOLD for voxels with receptive fields that are aligned to the remembered orientation (which is in itself a form of coarse scale bias), but could still play a role.

      To conclude that the spatial reconstruction from the data indeed comes from a line-like representation, you'd need to generate modeled reconstructions of all possible stimuli and representations. Yes, Figure 4 shows that a line results in a modeled spatial map that resembles the WM data, but many other stimuli might too, and some may better match the data. For example, the alternative hypothesis (attention to grating endpoints) may very well lead to a very comparable model output to the one from a line. But testing this would not suffice, as there may be an inherent inverse problem (with multiple stimuli that can lead to the same visual field model).

      The main conclusion, and title of the paper, that visual working memories are abstractions of percepts, is therefore not supported. Subjects could be using spatial attention, for example. Furthermore, even if it is true that gratings are abstracted into lines, this form of abstraction would not generalize to any non-spatial feature (e.g., color cannot become a line, contrast cannot become a line, etc.), which means it has limited explanatory power.

      Additional context:

      The working memory and perception tasks are rather different. In this case, the perception task does not require the subject to process the carrier orientation (which is largely occluded, and possibly not that obvious without paying attention to it), but attention is paid to contrast. In this scenario, stimulus energy may dominate the signal. In the WM task, subjects have to work out what orientation is shown to do the task. Given that the sensory stimulus in both tasks is brief (1.5s during memory encoding, and 2.5s total in the perceptual task), it would be interesting to look at decoding (and reconstructions) for the WM stimulus epoch. If abstraction (into a line) happens in working memory, then this perceptual part of the task should still be susceptible to aperture biases. It allows the authors to show that it is indeed during memory (and not merely the task or attentional state of the subject) that abstraction occurs.

      What's also interesting is what happens in the passive perceptual condition, and the fact that spatial reconstructions for areas beyond V1 and V2 (i.e., V3, V3AB, and IPS0-1) align with (implied) grating endpoints, even when an angular modulator is used (Figure 3C). Are these areas also "abstracting" the stimulus (in a line-like format)?

      Review after revision:

      (1) It's nice of the authors to simulate how a dot stimulus affects the image computable model, but this does not entirely address my concern about attention to endpoints. The assumption that attention can be used in the same manner as a physical stimulus to calculate stimulus energy is questionable. (Also, why would a dot at 15º lead to high stimulus energy tangential to that orientation?) This simulation also does not at all address my concern about model mimicry (many possible inputs can lead to a line-like output).

      (2) It's also nice that the authors agree that much more work needs to be done, and these results may not generalize to all forms of memory. Given this agreement, and until that "more work" is done, I strongly believe we should refrain from making hyperbolic claims that might preemptively imply all visual working memories are abstractions of percepts. Time (and much more work) will likely show things to be much more subtle and complex.

      The work presented in this paper is cool, but it uses a specific case: spatial stimuli (gratings) with the task to remember orientation. This limits possible conclusions for several reasons: (1) These results are specific to EVC, as visual maps are a prerequisite, meaning that these results will not hold up in other, non-retinotopic areas. (2) The fact that subjects are "focusing" along the main stimulus axis (attention or not) can simply be a strategy employed by the majority of (but not all) subjects - a strategy that may not be necessary to do the task, and therefore not a canonical method of Abstraction. It may be a "shared preferred strategy" or something. (3) If subjects had to (for example) remember contrast, and not orientation, results may have been entirely different (I would hypothesize there is no line-like abstraction in this case). Vice versa, if the perceptual task would have been on orientation (instead of contrast), the authors admit that "participants would reformat the grating into a line-like representation to make the judgments" (quote from the authors' response under "Additional context"). Thus, the results may be entirely about the task/cognitive state, and not about how perceptual information is abstracted into memory.

      Instead of unveiling *the* working memory Abstraction, this work (very nicely) shows a specific instance of possible abstraction. A more correct (but admittedly, less "sexy") conclusion may be "Visual working memories of orientation can be abstracted into a line in early visual cortex". As it stands, the authors still do not acknowledge any of the alternatives that I (see above) and the other reviewers have put forth, nor do they acknowledge recent work by Chunharas et al. (2023, bioRxiv), that directly applies principles of efficient coding to address the exact same question of working memory abstraction. The link between a "line-like" representation and efficient coding implied by the authors (in their response) is merely tentative to me, but it would be great if the authors could explain this further.

      These were, and remain, the major weaknesses in the original submission, that in my view have not been adequately addressed by the authors, as many overly broad conclusions about abstractions are currently still present in the manuscript (in for example the title).

    4. Reviewer #2 (Public Review):

      Summary:

      In this work, Duan and Curtis addressed an important issue related to the nature of working memory representations. This work is motivated by findings illustrating that orientation decoding performance for perceptual representations can be biased by the stimulus aperture (modulator). Here, the authors examined whether the decoding performance for working memory representations is similarly influenced by these aperture biases. The results provide convincing evidence that working memory representations have a different representational structure, as the decoding performance was not influenced by the type of stimulus aperture.

      Strengths:

      The strength of this work lies in the direct comparison of decoding performance for perceptual representations with working memory representations. The authors take a well-motivated approach and illustrate that perceptual and working memory representations do not share a similar representational structure. The authors test a clear question with a rigorous approach and provide compelling evidence. First, the presented oriented stimuli are carefully manipulated to create orthogonal biases introduced by the stimulus aperture (radial or angular modulator), regardless of the stimulus carrier orientation. Second, the authors implement advanced methods to decode the orientation information, in visual and parietal cortical regions, when directly perceiving or holding an oriented stimulus in memory. The data illustrate that working memory decoding is not influenced by the type of aperture, while this is the case in perception. In sum, the main claims are important and shed light on the nature of working memory representations.

      Weaknesses:

      After the authors revised the original manuscript, a few of my initial concerns remain.

      (1) Theoretical framing in the introduction. The introduction proposes that decoding of orientation information during perception does not reflect orientation selectivity, and it is instead driven by coarse scale biases. This is an overstatement. Recent work shows that orientation decoding is indeed influenced by coarse biases, but also reflects orientation selectivity (Roth, Kay & Merriam, 2022).

      (2) The description of the image computable V1 model remains incomplete. The steerable pyramid is a model that simulates the responses of V1 neurons. To do so, it incorporates a set of linear receptive fields with varying orientation and spatial frequency tuning. However, the information that is lacking in the Methods is whether the implemented pyramid also included two quadrature phase pairs (odd and even phase Gabor filters making the output phase invariant). The sum of the squares of the responses to these offset phase filters computes the stimulus energy within each orientation and spatial frequency channel. Without this description, it is unclear what the model output represents.
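
      The quadrature-pair energy computation described in point (2) can be sketched as follows. This is a minimal NumPy illustration, not the authors' steerable-pyramid implementation: the Gabor parameters, the helper names `gabor`, `grating`, and `channel_energy`, and the test grating are all assumptions chosen for illustration. It shows why summing the squared responses of an even-phase and an odd-phase filter gives an output that is nearly unchanged by stimulus phase, while remaining selective for orientation.

```python
import numpy as np

def _grid(size):
    # Pixel coordinates centred on zero; `size` is assumed to be odd.
    half = size // 2
    return np.mgrid[-half:half + 1, -half:half + 1]

def gabor(size, sf, theta, phase, sigma):
    """Hypothetical Gabor filter: a sinusoid (sf in cycles/pixel) that
    varies along direction `theta`, under an isotropic Gaussian envelope."""
    y, x = _grid(size)
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * sf * xr + phase)

def grating(size, sf, theta, phase):
    """Full-field sinusoidal grating used as a toy stimulus."""
    y, x = _grid(size)
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.cos(2 * np.pi * sf * xr + phase)

def channel_energy(image, sf, theta, sigma=5.0):
    """Energy of one orientation/spatial-frequency channel:
    squared even-phase response plus squared odd-phase response."""
    size = image.shape[0]
    even = gabor(size, sf, theta, 0.0, sigma)        # cosine-phase filter
    odd = gabor(size, sf, theta, np.pi / 2, sigma)   # quadrature partner
    r_even = np.sum(image * even)
    r_odd = np.sum(image * odd)
    return r_even**2 + r_odd**2

if __name__ == "__main__":
    for stim_phase in (0.0, np.pi / 3, np.pi / 2):
        img = grating(41, 0.1, 0.0, stim_phase)
        matched = channel_energy(img, 0.1, 0.0)
        orthogonal = channel_energy(img, 0.1, np.pi / 2)
        print(f"phase={stim_phase:.2f}  matched={matched:.1f}  orthogonal={orthogonal:.5f}")
```

      Under these assumptions, a single (even-only) filter would respond strongly or weakly depending on where the grating bars fall, whereas the energy output depends on orientation and spatial frequency but hardly on phase. Whether the implemented pyramid includes such pairs is exactly the detail the reviewer asks the Methods to state.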

    1. eLife assessment

      Despite the well-known facilitatory effect that integration across the senses has on behavioural measures, standard neuroimaging approaches have not yet produced reliable and precise neural correlates. In this paper, Buhman et al. harness the decoding of EEG responses, beyond univariate approaches, to capture these correlates in a robust, clear fashion. If confirmed, this approach could be important for estimating multisensory integration in humans across a wide range of different domains. However, the strength of evidence to support these claims is still incomplete because of the potentially confounding factor of eye movements, which the authors themselves identify in their data, and because of the discrepancies between the behavioural and EEG data.

    2. Reviewer #1 (Public Review):

      This study presents a novel application of the inverted encoding (i.e., decoding) approach to detect the correlates of crossmodal integration in the human EEG (electrophysiological) signal. The method is successfully applied to data from a group of 41 participants, performing a spatial localization task on auditory, visual, and audio-visual events. The analyses clearly show a behavioural superiority for audio-visual localization. Like previous studies, the results when using traditional univariate ERP analyses were inconclusive, showing once more the need for alternative, more sophisticated approaches. Instead, the principal approach of this study, harnessing the multivariate nature of the signal, captured clear signs of super-additive responses, considered by many as the hallmark of multisensory integration. Unfortunately, the manuscript lacks many important details in the descriptions of the methodology and analytical pipeline. Although some of these details can eventually be retrieved from the scripts that accompany this paper, the main text should be self-contained and sufficient to gain a clear understanding of what was done. (A list of some of these is included in the comments to the authors). Nevertheless, I believe the main weakness of this work is that the positive results obtained and reported in the results section are conditioned upon eye movements. When artifacts due to eye movements are removed, then the outcomes are no longer significant.

      Therefore, whether the authors finally achieved the aims and showed that this method of analysis is truly a reliable way to assess crossmodal integration, does not stand on firm ground. The worst-case scenario is that the results are entirely accounted for by patterns of eye movements in the different conditions. In the best-case scenario, the method might truly work, but further experiments (and/or analyses) would be required to confirm the claims in a conclusive fashion.

      If finally successful, this approach could bring important advances in the many fields where multisensory integration has been shown to play a role, by providing a way to bring much-needed coherence across levels of analysis, from behaviour to single-cell electrophysiology. To achieve this, one would have to make sure that the pattern of super-additive effects, the standard self-imposed by the authors as a proxy for multisensory integration, shows up reliably regardless of eye movement or artifact corrections. One first step toward this goal would be, perhaps, to facilitate the understanding of results in context by reporting both the uncorrected and corrected analyses in the main results section. Second, one could try to support the argument given in the discussion, pointing out the origin of the super-additive effects in posterior electrode sites, by also modelling frontal electrode clusters and showing they aren't informative as to the effect of interest.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript seeks to reconcile observations in multisensory perception - from behavior and neural responses. It is intuitively obvious that perceiving a stimulus via two senses results in better performance than one alone. In fact, it is not uncommon to observe that for a perceptual task, the percentage of correct responses seen with two senses is higher than the sum of the percentage correct obtained with each modality individually, i.e., the gains are "superadditive". The gains of adding a second sense are typically larger when the performance with the first sense is relatively poor - this effect is often called the principle of inverse effectiveness. More generally, what this tells us is that performance in a multisensory perceptual task is a non-linear sum of performance for each sensory modality alone.

      Despite this abundant evidence of behavioral non-linearity in multisensory integration, evoked responses (EEG) to such sensory stimuli often show little evidence of it - and this is the problem this manuscript tackles. The key assertion made is that univariate analysis of the EEG signal is likely to average out the non-linear effects of integration. This is a reasonable assertion, and their analysis does indeed provide evidence that a multivariate approach can reveal non-linear interactions in the evoked responses.

      Strengths:

      It is of great value to understand how the process of multisensory integration occurs, and despite a wealth of observations of the benefits of perceiving the world with multiple senses, we still lack a reasonable understanding of how the brain integrates information. For example - what underlies the large individual differences in the benefits of two senses over one? One way to tackle this is via brain imaging, but this is problematic if important features of the processing - such as non-linear interactions - are obscured by the lack of specificity of the measurements. The approach they take to the analysis of the EEG data allows the authors to look in more detail at the variation in activity across EEG electrodes, which averaging across electrodes cannot.

      This version of the manuscript is well-written and for the most part clear. It shows a good understanding of the non-linear effects described above (where many studies show a poor understanding of "superadditivity" of perceptual performance) and the report of non-linear summation of neural responses is convincing.

      A particular strength of the paper is their use of a statistical model of multisensory integration as their "null" model of neural responses, and the "inverted-encoder" which infers an internal representation of the stimulus which can explain the EEG responses. This encoder generates a prediction of decoding performance, which can be used to generate predictions of multisensory decoding from unisensory decoding, or from a sum of the unisensory internal representations.

      In behavioural performance, it is frequently observed that the performance increase from two senses is close to what is expected from the optimal integration of information across the senses, in a statistical sense. It can be plausibly explained by assuming that people are able to weigh sensory inputs according to their reliability - and somewhat optimally. Critically the apparent "superadditive" effect on performance described above does not require any non-linearity in the sum of information across the senses but can arise from correctly weighting the information according to reliability.
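
      The reliability-weighted integration referred to above can be made concrete with a short sketch. It assumes the textbook maximum-likelihood case of two independent cues with Gaussian noise; the function names `mle_combine` and `mle_dprime` and the example numbers are illustrative assumptions, not values or code from the manuscript under review.

```python
import math

def mle_combine(est_a, var_a, est_v, var_v):
    """Combine two independent Gaussian estimates, weighting each by its
    reliability (inverse variance). Returns (estimate, variance, w_a, w_v)."""
    rel_a, rel_v = 1.0 / var_a, 1.0 / var_v
    w_a = rel_a / (rel_a + rel_v)
    w_v = rel_v / (rel_a + rel_v)
    combined_est = w_a * est_a + w_v * est_v
    combined_var = (var_a * var_v) / (var_a + var_v)
    return combined_est, combined_var, w_a, w_v

def mle_dprime(d_a, d_v):
    """Predicted bimodal sensitivity under the same assumptions:
    sensitivities add in quadrature."""
    return math.hypot(d_a, d_v)

if __name__ == "__main__":
    # Hypothetical auditory and visual location estimates (degrees) and variances.
    est, var, w_a, w_v = mle_combine(est_a=10.0, var_a=4.0, est_v=6.0, var_v=1.0)
    print(f"weights: auditory={w_a:.2f}, visual={w_v:.2f}")
    print(f"combined estimate={est:.2f}, combined variance={var:.2f}")
    print(f"predicted bimodal d'={mle_dprime(0.3, 0.4):.2f}")
```

      In this sketch the combined variance is always below the smaller unisensory variance, so performance improves even though the combination rule is a plain weighted sum of the two estimates.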

      The authors apply a similar model to predict the neural responses expected to audiovisual stimuli from the neural responses to audio and visual stimuli alone, assuming optimal statistical integration of information. The neural responses to audiovisual stimuli exceed the predictions of this model and this is the main evidence supporting their conclusion, and it is convincing.

      Weaknesses:

      The main weakness of the manuscript is that their behavioural data show no evidence of performance that exceeds the predictions of these statistical models. In fact, the models predict multisensory performance from unisensory performance pretty well. So this manuscript presents the opposite problem to that which motivated the study - neural interactions across the senses which appear to be more non-linear than perception. This makes it hard to interpret their results, as surely if these nonlinear neural interactions underlie the behaviour, then we should be able to see evidence of it in the behaviour? I cannot offer an easy explanation for this.

      Overall, therefore, I applaud the motivation and the sophistication of the analysis method and think it shows great promise for tackling these problems, but the manuscript unfortunately brushes over an important problem specific to the results. It appeals to the higher-level reasoning - that non-linearity is a behavioural hallmark of integration and therefore we should see it in neural responses. Yet it ignores the fact that the behaviour observed here does not exceed the predictions of the "null" model applied to the neural response.

      Part of the problem, I think, is that the authors never explain the difference between superadditivity of perceptual performance (proportion correct) and superadditivity of the underlying processing, which is implied by the EEG results but not their behavior. This is of course a difficult matter to describe succinctly or clearly (I somehow doubt I have). It is however worth addressing. The literature is full of confusing claims of superadditivity. I believe these authors understand this distinction and have an opportunity to represent it clearly for the benefit of all.
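
      One way to illustrate the distinction drawn above is a toy signal-detection calculation. It is only a sketch under stated assumptions: an equal-variance Gaussian yes/no detection model, a fixed decision criterion, and unisensory sensitivities combined by the optimal quadrature rule. The helper names `phi` and `hit_rate` and all numbers are hypothetical and do not come from the manuscript or its data.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def hit_rate(d_prime, criterion):
    """Probability that signal-plus-noise exceeds the criterion,
    with the criterion measured from the mean of the noise distribution."""
    return phi(d_prime - criterion)

def compare(d_a, d_v, criterion):
    """Return (sum of unisensory hit rates, bimodal hit rate)."""
    d_av = math.hypot(d_a, d_v)  # optimal combination, no extra non-linearity
    unisensory_sum = hit_rate(d_a, criterion) + hit_rate(d_v, criterion)
    return unisensory_sum, hit_rate(d_av, criterion)

if __name__ == "__main__":
    # Weak stimuli (strict criterion) versus strong stimuli (lenient criterion).
    for label, criterion in (("weak", 3.0), ("strong", 0.0)):
        uni_sum, bimodal = compare(1.0, 1.0, criterion)
        print(f"{label}: sum of unisensory hit rates={uni_sum:.3f}, bimodal hit rate={bimodal:.3f}")
```

      Under these assumptions the bimodal hit rate exceeds the sum of the unisensory hit rates when unisensory performance is poor, but not when it is good, although the underlying sensitivities are combined by the same rule in both cases. Apparent superadditivity of proportion correct can therefore arise from the non-linear mapping between sensitivity and response proportion rather than from superadditive processing.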

    4. Author response:

      Response to Reviewer #1 (Public Review):

      We thank the reviewer for their constructive criticism of our study, their proposed solutions, and for highlighting areas of the methodology and analytical pipeline where explanations were unclear or unsatisfactory. We will take the reviewer’s feedback into account to improve the clarity and readability of the revised manuscript. We acknowledge the importance of ruling out eye movements as a potential confound. We address these concerns briefly below, but a more detailed explanation (and a full breakdown of the relevant analyses, including the corrected and uncorrected results) will be provided in the revised manuscript.

      First, the source of EEG activity recorded from the frontal electrodes is often unclear. Without an external reference, it is challenging to resolve the degree to which frontal EEG activity represents neural or muscular responses (1). Thus, as a preventative measure against the potential contribution of eye movement activity, for all our EEG analyses, we only included activity from occipital, temporal, and parietal electrodes (the selected electrodes can be seen in the final inset of Figure 3).

      Second, as suggested by the reviewer, we re-ran our analyses using the activity measured from the frontal electrodes alone. If the source of the nonlinear decoding accuracy in the AV condition was muscular activity produced by eye movements, we would expect to observe better decoding accuracy from sensors closer to the source. Instead, we found that decoding accuracy from the frontal electrodes (peak d' = 0.08) was less than half that of decoding accuracy from the more posterior electrodes (peak d' = 0.18). These results suggest that the source of neural activity containing information about stimulus position was located over occipito-parietal areas, consistent with our topographical analyses (inset of Figure 4).

      Third, we compared the average eye movements between the three main sensory conditions (auditory, visual, and audiovisual). In the visual condition, there was little difference in eye movements corresponding to the five stimulus locations, likely because the visual stimuli were designed to be spatially diffuse. For the auditory and audiovisual conditions, there was more distinction between eye movements corresponding to the stimulus locations. However, these appeared to be the same between auditory and audiovisual conditions. If consistent saccades to audiovisual stimuli had been responsible for the nonlinear decoding we observed, we would expect to find a higher positive correlation between horizontal eye position and stimulus location in the audiovisual condition than in the auditory or visual conditions. Instead, we found no difference in correlation between audiovisual and auditory stimuli, indicating that eye movements were equivalent in these conditions and unlikely to explain better decoding accuracy for audiovisual stimuli.

      Finally, we note that the stricter eye movement criterion acknowledged in the Discussion section of the original manuscript resulted in significantly better audiovisual d' than the MLE prediction, but this difference did not survive cluster correction. This is an important distinction to make as, when combined with the results described above, it seems to support our original interpretation that the stricter criterion combined with our conservative measure of (mass-based) cluster correction (2) led to a type 2 error.

      References

      (1) Roy, R. N., Charbonnier, S., & Bonnet, S. (2014). Eye blink characterization from frontal EEG electrodes using source separation and pattern recognition algorithms. Biomedical Signal Processing and Control, 14, 256–264.

      (2) Pernet, C. R., Latinus, M., Nichols, T. E., & Rousselet, G. A. (2015). Cluster-based computational methods for mass univariate analyses of event-related brain potentials/fields: A simulation study. Journal of Neuroscience Methods, 250, 85–93.

      Response to Reviewer #2 (Public Review):

      We thank the reviewer for their insight and constructive feedback. As emphasized in the review, an interesting question that arises from our results is that, if the neural data exceeds the optimal statistical decision (MLE d'), why doesn’t the behavioural data? We agree with the reviewer’s suggestion that more attention should be devoted to this question, and plan to provide a deeper discussion of the relationship between behavioural and neural super-additivity in the revised manuscript. We also note that while this discrepancy remains unexplained, our results are consistent with the literature. That is, both non-linear neural responses (single-cell recordings) and behavioural responses that match MLE are reliable phenomena in multisensory integration (1-4).

      One possible explanation for this puzzling discrepancy is that behavioural responses occur sometime after the initial neural response to sensory input. There are several subsequent neural processes between perception and a behavioural response (5), all of which introduce additional noise that may obscure super-additive perceptual sensitivity. In particular, the mismatch between neural and behavioural accuracy may be the result of additional neural processes that translate sensory activity into a motor response to perform the behavioural task.

      Our measure of neural super-additivity (exceeding optimally weighted linear summation) differs from how it is traditionally assessed (exceeding summation of single neuron responses) (2). However, neither method has yet fully explained how this neural activity translates to behavioural responses, and we think that more work is needed to resolve the abovementioned discrepancy. Our method will nonetheless facilitate this work by providing a reliable way of measuring neural super-additivity in humans, using non-invasive recordings.

      References

      (1) Alais, D., & Burr, D. (2004). The ventriloquist effect results from near-optimal bimodal integration. Current Biology, 14(3), 257–262.

      (2) Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415(6870), 429–433.

      (3) Meredith, M. A., & Stein, B. E. (1983). Interactions among converging sensory inputs in the superior colliculus. Science, 221, 389–391.

      (4) Stanford, T. R., & Stein, B. E. (2007). Superadditivity in multisensory integration: putting the computation in context. Neuroreport, 18, 787–792.

      (5) Heekeren, H., Marrett, S. & Ungerleider, L. (2008). The neural systems that mediate human perceptual decision making. Nature Reviews Neuroscience, 9, 467–479.

    1. eLife assessment

      This useful experiment seeks to better understand how memory interacts with incoming visual information to effectively guide human behavior. Using several methods, the authors report two distinct pathways relating visual processing to the default mode network: one that emphasizes "semantic" cognition, and the other, spatial cognition. Despite the impressive array of methods employed, the evidence supporting a clear distinction is currently incomplete.

    2. Reviewer #1 (Public Review):

      In this study, Gonzalez Alam et al. report a series of functional MRI results about the neural processing from the visual cortex to high-order regions in the default-mode network (DMN), compiling evidence from task-based functional MRI, resting-state connectivity, and diffusion-weighted imaging. Their participants were first trained to learn the association between objects and rooms/buildings in a virtual reality experiment; after the training was completed, in the task-based MRI experiment, participants viewed the objects from the earlier training session and judged if the objects were in the semantic category (semantic task) or if they were previously shown in the same spatial context (spatial context task). Based on the task data, the authors utilised resting-state data from their previous studies, visual localiser data also from previous studies, as well as structural connectivity data from the Human Connectome Project, to perform various seed-based connectivity analyses. They found that the semantic task causes more activation of various regions involved in object perception while the spatial context task causes more activation in various regions for place perception. They further showed that those object perception regions are more connected with the frontotemporal subnetwork of the DMN while those place perception regions are more connected with the medial-temporal subnetwork of the DMN. Based on these results, the authors argue that there are two main pathways connecting the visual system to high-level regions in the DMN, one linking object perception regions (e.g., LOC) leading to semantic regions (e.g., IFG, pMTG), the other linking place perception regions (e.g., parahippocampal gyri) to the entorhinal cortex and hippocampus.

      Below I provide my takes on (1) the significance of the findings and the strength of evidence, (2) my guidance for readers regarding how to interpret the data, as well as several caveats that apply to their results, and finally (3) my suggestions for the authors.

      (1) Significance of the results and strength of the evidence

      I would like to praise the authors for, first of all, trying to associate visual processing with high-order regions in the DMN. While many vision scientists focus specifically on the macroscale organisation of the visual cortex, relatively few efforts are made to unravel how neural processing in the visual system goes on to engage representations in regions higher up in the hierarchy (a nice precedent study that looks at this issue is by Konkle and Caramazza, 2017). We all know that visual processing goes beyond the visual cortex, potentially further into the DMN, but there's no direct evidence. So, in this regard, the authors made a nice try to look at this issue.

      Having said this, the authors' characterisation of the organisation of the visual cortex (object perception/semantics vs. place perception/spatial contexts) does not go beyond what has been known for many decades by vision neuroscience. Specifically, over the past two decades, numerous proposals have been put forward to explain the macroscale organisation of the visual system, particularly the ventrolateral occipitotemporal cortex. A lateral-medial division has been reliably found in numerous studies. For example, some researchers found that the visual cortex is organised along the separation of foveal vision (lateral) vs. peripheral vision (medial), while others found that it is structured according to faces (lateral) vs. places (medial). Such a bipartite division is also found in animate (lateral) vs. inanimate (medial), small objects (lateral) vs. big objects (medial), as well as various cytoarchitectonic and connectomic differences between the medial side and the lateral side of the visual cortex. Some more recent studies even demonstrate a tripartite division (small objects, animals, big objects; see Konkle and Caramazza, 2013). So, in terms of their characterisation of the visual cortex, I think Gonzalez Alam et al. do not add any novel evidence to what the community of neuroscience has already known.

      However, the authors' effort to link visual processing with various regions of the DMN is certainly novel, and their attempt to gather converging evidence with different methodologies is commendable. The authors are able to show that, in an independent sample of resting-state data, object-related regions are more connected with semantic regions in the DMN while place-related regions are more connected with navigation-related regions in the DMN. Such patterns reveal a consistent spatial overlap with their Kanwisher-type face/house localiser data and also concur with the HCP white-matter tractography data. Overall, I think the two-pathway explanation that the authors seek to argue is backed by converging evidence. The lack of a travelling-wave type of analysis to show the spatiotemporal dynamics across the cortex from the visual cortex to high-level regions is disappointing, though, because I was expecting this type of analysis to provide the most convincing evidence of a 'pathway' going from one point to another. Dynamic causal modelling or Granger causality may also buttress the authors' claim of a pathway, because many readers, like me, would feel that there is not enough evidence to convincingly prove the existence of a 'pathway'.

      (2) Guidance to the readers about interpretation of the data

      The organisation of the visual cortex and the organisation of the DMN historically have been studied in parallel with little crosstalk between different communities of researchers. Thus, the work by Gonzalez Alam et al. has made a nice attempt to look at how visual processing goes beyond the realm of the visual cortex and continues into different subregions of the DMN.

      While the authors of this study have utilised multiple methods to obtain converging evidence, there are several important caveats in the interpretation of their results:

      (1) While the authors choose to use the term 'pathway' to call the inter-dependence between a set of visual regions and default-mode regions, their results have not convincingly demonstrated a definitive route of neural processing or travelling. Instead, the findings reveal a set of DMN regions are functionally more connected with object-related regions compared to place-related regions. The results are very much dependent on masking and thresholding, and the patterns can change drastically if different masks or thresholds are used.

      (2) Ideally, if the authors could demonstrate the dynamics between the visual cortex and DMN in the primary task data, it would be very convincing evidence for characterising the journey from the visual cortex to DMN. Instead, the current connectivity results are derived from a separate set of resting state data. While the advantage of the authors' approach is that they are able to verify certain visual regions are more connected with certain DMN regions even under a task-free situation, it falls short of explaining how these regions dynamically interact to convert vision into semantic/spatial decision.

      (3) There are several results that are difficult to interpret, such as their psychophysiological interactions (PPI), representational similarity analysis, and gradient analysis. For example, typically for PPI analysis, researchers interrogate the whole brain to look for PPI connectivity. Their use of targeted ROI is unusual, and their use of spatially extensive clusters that encompass fairly large cortical zones in both occipital and temporal lobes as the PPI seeds is also an unusual approach. As for the gradient analysis, the argument that the semantic task is higher on Gradient 1 than the spatial task based on the statistics of p-value = 0.027 is not a very convincing claim (unhelpfully, the figure on the top just shows quite a few blue 'spatial dots' on the hetero-modal end which can make readers wonder if the spatial context task is really closer to the unimodal end or it is simply the authors' statistical luck that they get a p-value under 0.05). While it is statistically significant, it is weak evidence (and it is not pertinent to the main points the authors try to make).

      (3) My suggestion for the authors

      There are several conceptual-level suggestions that I would like to offer to the authors:

      (1) If the pathway explanation is the key argument that you wish to convey to the readers, an effective connectivity type of analysis, such as Granger causality or dynamic causal modelling (DCM), would be helpful in revealing that there is a starting point and an end point in the pathway, as well as revealing the directionality of neural processing. While both of these methods have their issues (e.g., Granger causality is not suitable for haemodynamic data, DCM's selection of seeds is susceptible to bias, etc.), they can help you get started in testing whether the path during task performance does exist. Alternatively, a travelling-wave type of analysis (such as the results by Raut et al. 2021 published in Science Advances) can also be useful to support your claims of the pathway.

      (2) I think the thresholding for resting state data needs to be explained - by the look of Figure 2E and 3E, it looks like whole-brain un-thresholded results, and then you went on to compute the conjunction between these un-thresholded maps with network templates of the visual system and DMN. This does not seem statistically acceptable, and I wonder if the conjunction that you found would disappear and reappear if you used different thresholds. Thus, for example, if the left IFG cluster (which you have shown to be connected with the visual object regions) would disappear when you apply a conventional threshold, this means that you need to seriously consider the robustness of the pathway that you seek to claim... it may be just a wild goose that you are chasing.

      (3) There are several analyses that are hard to interpret and you can consider only reporting them in the supplementary materials, such as the PPI results and representational similarity analysis, as none of these are convincing. These analyses do not seem to add much value to make your argument more convincing and may elicit more methodological critiques, such as statistical issues, the set-up of your representational theory matrix, and so on.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Alam et al. sought to understand how memory interacts with incoming visual information to effectively guide human behavior by using a task that combines spatial contexts (houses) with objects of one or multiple semantic categories. Three additional datasets (all from separate participants) were also employed: one that functionally localized regions of interest (ROIs) based on subtractions of different visually presented category types (in this case, scenes, objects, and scrambled objects); another consisting of resting-state functional connectivity scans, and a section of the Human Connectome Project that employed DTI data for structural connectivity analysis. Across multiple analyses, the authors identify dissociations between regions preferentially activated during scene or object judgments, between the functional connectivity of regions demonstrating such preferences, and in the anatomical connectivity of these same regions. The authors conclude that the processing streams that take in visual information and support semantic or spatial processing are largely parallel and distinct.

      Strengths:

      (1) Recent work has reconceptualized the classic default mode network as two parallel and interdigitated systems (e.g., Braga & Buckner, 2017; DiNicola et al., 2021). The current manuscript is timely in that it attempts to describe how information is differentially processed by two streams that appear to begin in visual cortex and connect to different default subnetworks. Even at a group level where neuroanatomy is necessarily blurred across individuals, these results provide clear evidence of stimulus-based dissociation.

      (2) The manuscript contains a large number of analyses across multiple independent datasets. It is therefore unlikely that a single experimenter choice in any given analysis would spuriously produce the overall pattern of results reported in this work.

      Weaknesses:

      (1) Throughout the manuscript, a strong distinction is drawn between semantic and spatial processing. However, given that only objects and spatial contexts were employed in the primary experiment, it is not clear that a broader conceptual distinction is warranted between "semantic" and "spatial" cognition. There are multiple grounds for concern regarding this basic premise of the manuscript.

      a. One can have conceptual knowledge of different types of scenes or spatial contexts. A city street will consistently differ from a beach in predictable ways, and a kitchen context provides different expectations than a living room. Such distinctions reflect semantic knowledge of scene-related concepts, but in the present work spatial and "all other" semantic information are considered and discussed as distinct and separate.

      b. As a related question, are scenes uniquely different from all other types of semantic/category information? If faces were used instead of scenes, could one expect to see different regions of the visual cortex coupling with task-defined face > object ROIs? The current data do not speak to this possibility, but as written the manuscript suggests that all (non-spatial) semantic knowledge should be processed by the FT-DMN.

      c. Recent precision fMRI studies characterizing networks corresponding to the FT-DMN and MTL-DMN have associated the former with social cognition and the latter with scene construction/spatial processing (DiNicola et al., 2020; 2021; 2023). This is only briefly mentioned by the authors in the current manuscript (p. 28), and when discussed, the authors draw a distinction between semantic and social or emotional "codes" when noting that future work is necessary to support the generality of the current claims. However, if generality is a concern, then emphasizing the distinction between object-centric and spatial cognition, rather than semantic and spatial cognition, would represent a more conservative and better-supported theoretical point in the current manuscript.

      (2) Both the retrosplenial/parieto-occipital sulcus and parahippocampal regions are adjacent to the visual network as defined using the Yeo et al. atlas, and spatial smoothness of the data could be impacting connectivity metrics here in a way that qualitatively differs from the (non-adjacent) FT-DMN ROIs. Although this proximity is a basic property of network locations on the cortical surface, the authors have several tools at their disposal that could be employed to help rule out this possibility. They might, for instance, reduce the smoothing in their multi-echo data, as the current 5 mm kernel is larger than the kernel used in Experiment 2's single-echo resting-state data. Spatial smoothing is less necessary in multi-echo data, as thermal noise can be attenuated by averaging over time (echoes) instead of space (see Gonzalez-Castillo et al., 2016 for discussion). Some multi-echo users have eschewed explicit spatial smoothing entirely (e.g., Ramot et al., 2021), just as the authors of the current paper did for their RSA analysis. Less smoothing of E1 data, combined with a local erosion of either the MTL-DMN and VIS masks (or both) near their points of overlap in the RSFC data, would improve confidence that the current results are not driven, at least in part, by spatial mixing of otherwise distinct network signals.

      (3) The authors identify a region of the right angular gyrus as demonstrating a "potential role in integrating the visual-to-DMN pathways." This would seem to imply that lesion damage to right AG should produce difficulties in integrating "semantic" and "spatial" knowledge. Are the authors aware of such a literature? If so, this would be an important point to make in the manuscript as it would tie in yet another independent source of information relevant to the framework being presented. The closest of which I am aware involves deficits in cued recall performance when associates consisted of auditory-visual pairings (Ben-Zvi et al., 2015), but that form of multi-modal pairing is distinct from the "spatial-semantic" integration forwarded in the current manuscript.

    1. Author response:

      Thanks for the eLife assessment

      “This study employed a comprehensive approach to examining how the MT+ region integrates into a complex cognition system in mediating human visuo-spatial intelligence. While the findings are useful, the experimental evidence is incomplete and the study design, hypothesis, analyses, writing, and presentation need to be improved.” We plan to revise the manuscript according to the comments of Public Reviews.

      We are grateful for the excellent and very helpful comments, and we provide our provisional responses below.

      Reviewer #1 (Public Review):

      Summary:

      The study of human intelligence has been the focus of cognitive neuroscience research, and finding some objective behavioral or neural indicators of intelligence has been an ongoing problem for scientists for many years. Melnick et al, 2013 found for the first time that the phenomenon of spatial suppression in motion perception predicts an individual's IQ score. This is because IQ is likely associated with the ability to suppress irrelevant information. In this study, a high-resolution MRS approach was used to test this theory. In this paper, the phenomenon of spatial suppression in motion perception was found to be correlated with the visuo-spatial subtest of gF, while both variables were also correlated with the GABA concentration of MT+ in the human brain. In addition, there was no significant relationship with the excitatory transmitter Glu. At the same time, SI was also associated with MT+ and several frontal cortex FCs.

      Strengths:

      (1) 7T high-resolution MRS is used.

      (2) This study combines the behavioral tests, MRS, and fMRI.

      Weaknesses:

      (1) In the intro, it seems to me that the multiple-demand (MD) regions are the key in this study. However, I didn't see any results associated with the MD regions. Did I miss something??

      We thank the reviewer for pointing this out. After careful consideration, we agree with your point of view. According to the results of Melnick et al. (2013), motion surround suppression (SI) and the time thresholds for small and large gratings, which reflect hMT+ function, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. This suggests that hMT+ has the potential to be a core of the MD system. However, because our results address only “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated by reverberation with frontal cortex”, they are not yet sufficient to show that hMT+ is a core node of the MD system. We will therefore adjust the explanatory logic of the article, emphasizing the role of hMT+ in reducing redundancy and improving information-processing efficiency in visuo-spatial intelligence, while de-emphasizing the significance of hMT+ in the MD system.

      (2) How was the sample size determined? Is it sufficient??

      We thank the reviewer for pointing this out. We used G*Power to determine our sample size. Melnick et al. (2013) reported a medium effect between SI and the Perceptual Reasoning sub-ability (r = 0.47). We used this value as the expected correlation coefficient (ρ H1), setting the power at the commonly used threshold of 0.8 and the alpha error probability at 0.05. The required sample size was calculated to be 26, which ensures that our study has adequate power to yield valid statistical results. Furthermore, compared with earlier within-subject studies such as Schallmo et al. (2018), which used 22 datasets to examine GABA levels in MT+ and the early visual cortex (EVC), our study includes a larger dataset.
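
      As a rough cross-check of this kind of calculation, the required sample size for detecting a correlation can be sketched with the textbook Fisher z approximation. This is not the G*Power procedure itself, and an exact calculation may differ by a participant or so. The response does not state whether the test was one- or two-tailed, so both are computed; the function and variable names are illustrative.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate sample size needed to detect a correlation r.

    Uses the Fisher z approximation, n = ((z_alpha + z_beta) / atanh(r))^2 + 3.
    This is an approximation, not G*Power's exact calculation.
    """
    z = NormalDist()  # standard normal distribution
    # Critical value for the chosen alpha (split across tails if two-tailed)
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    # Quantile corresponding to the desired power
    z_beta = z.inv_cdf(power)
    # Fisher z-transform of the expected correlation
    effect = math.atanh(r)
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


# Inputs quoted in the response: r = 0.47, alpha = 0.05, power = 0.8
n_one_tailed = n_for_correlation(0.47, two_tailed=False)
n_two_tailed = n_for_correlation(0.47, two_tailed=True)
print(n_one_tailed, n_two_tailed)
```

      Under this approximation, the one-tailed requirement lands close to the reported 26, while the two-tailed requirement is several participants higher, so it would help to state which test was assumed.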

      (3) In Schallmo et al. (eLife, 2018), there was no correlation between GABA concentration and SI. How can we justify the different results here?

      We thank the reviewer for pointing this out. There are several differences between the two studies:

      a. While the earlier study by Schallmo et al. (2018) employed 3T MRS, we utilize 7T MRS, enhancing our ability to detect and measure GABA with greater accuracy.

      b. Schallmo et al. (2018) used bilateral hMT+ as the MRS measurement region, whereas we used left hMT+. Our reasons for focusing on left hMT+ are described in our response to Reviewer #1, point (6). Briefly, the use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      c. The MRS voxel in Schallmo et al. (2018) was 3 cm isotropic, whereas ours is 2 cm isotropic. This helps us locate hMT+ more precisely and exclude more white-matter signal.

      (4) Basically, this study contains data on SI, BDT, GABA in MT+ and V1, and Glu in MT+ and V1: six measurements in all. There should be 6 × 5 / 2 = 15 pairwise correlations. However, not all of these results are included in Figure 1 and supplementary 1-3. I understand that it is not necessary to include all figures, but I suggest reporting all values in one table.

      We thank the reviewer for the good suggestion. We plan to include a correlation matrix reporting all values.

      (5) In Melnick (2013), the IQ scores were measured by the full set of WAIS-III, including all subtests. However, this study only used the visual spatial domain of gF. I wonder why only the visuo-spatial subtest was used not the full WAIS-III?

      We thank the reviewer for pointing this out. The decision was informed by Melnick’s findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well established that the hMT+ region of the brain is a sensory cortex involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activities. Given this context, the Perceptual Reasoning sub-ability was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.

      (6) In the functional connectivity part, there is no explanation as to why only the left MT+ was set to the seed region. What is the problem with the right MT+?

      We thank the reviewer for pointing this out. The main reason is that our MRS ROI is the left hMT+, and we would like to keep the ROI consistent across the different models. The use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011). In addition, we will check the results of our localizer to confirm whether similar findings are consistently replicated.

      (7) In Melnick (2013), the authors also reported the correlation between IQ and absolute duration thresholds of small and large stimuli. Please include these analyses as well.

      We thank the reviewer for the good advice. Including these results will help researchers compare our findings with those of Melnick et al. (2013). We plan to add a corresponding figure in the revised version.

      Reviewer #2 (Public Review):

      Summary:

      Recent studies have identified specific regions within the occipito-temporal cortex as part of a broader fronto-parietal, domain-general, or "multiple-demand" (MD) network that mediates fluid intelligence (gF). According to the abstract, the authors aim to explore the mechanistic roles of these occipito-temporal regions by examining GABA/glutamate concentrations. However, the introduction presents a different rationale: investigating whether area MT+ specifically, could be a core component of the MD network.

      Strengths:

      The authors provide evidence that GABA concentrations in MT+ and its functional connectivity with frontal areas significantly correlate with visuo-spatial intelligence performance. Additionally, serial mediation analysis suggests that inhibitory mechanisms in MT+ contribute to individual differences in a specific subtest of the Wechsler Adult Intelligence Scale, which assesses visuo-spatial aspects of gF.

      Weaknesses:

      (1) While the findings are compelling and the analyses robust, the study's rationale and interpretations need strengthening. For instance, Assem et al. (2020) have previously defined the core and extended MD networks, identifying the occipito-temporal regions as TE1m and TE1p, which are located more rostrally than MT+. Area MT+ might overlap with brain regions identified previously in Fedorenko et al., 2013; however, the authors attribute these activations to attentional enhancement of visual representations in the more difficult conditions of their tasks. For the aforementioned reasons, it is unclear why the authors chose MT+ as their focus. A stronger rationale for this selection is necessary, along with an account of how it fits with the core/extended MD networks.

      We really appreciate the reviewer’s comments. We focus on hMT+ for the following reasons. According to the results of Melnick et al. (2013), motion surround suppression (SI) and the time thresholds for small and large gratings, which reflect hMT+ function, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with high correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. In addition, in Fedorenko et al. (2013), the averaged MD activity region appears to overlap with hMT+. Based on these findings, we assumed that hMT+ has the potential to be a core of the MD system.

      (2) Moreover, although the study links MT+ inhibitory mechanisms to a visuo-spatial component of gF, this evidence alone may not suffice to position MT+ as a new core of the MD network. The MD network's definition typically encompasses a range of cognitive domains, including working memory, mathematics, language, and relational reasoning. Therefore, the claim that MT+ represents a new core of MD needs to be supported by more comprehensive evidence.

      We thank the reviewer for pointing this out. After careful consideration, we agree with your point of view. Because our results address only visuo-spatial intelligence, they are not yet sufficient to show that hMT+ is a core node of the MD system. We will adjust the explanatory logic of the article, emphasizing the role of hMT+ in reducing redundancy and improving information-processing efficiency in visuo-spatial intelligence, while de-emphasizing the significance of hMT+ in the MD system.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript aims to understand the role of GABA-ergic inhibition in the human MT+ region in predicting visuo-spatial intelligence through a combination of behavioral measures, fMRI (for functional connectivity measurement), and MRS (for GABA/glutamate concentration measurement). While this is a commendable goal, it becomes apparent that the authors lack fundamental understanding of vision, intelligence, or the relevant literature. As a result, the execution of the research is less coherent, dampening the enthusiasm of the review.

      Strengths:

      (1) Comprehensive Approach: The study adopts a multi-level approach, i.e., neurochemical analysis of GABA levels, functional connectivity, and behavioral measures to provide a holistic understanding of the relationship between GABA-ergic inhibition and visuo-spatial intelligence.

      (2) Sophisticated Techniques: The use of ultra-high field magnetic resonance spectroscopy (MRS) technology for measuring GABA and glutamate concentrations in the MT+ region is a recent development.

      Weaknesses:

      Study Design and Hypothesis

      (1) The central hypothesis of the manuscript posits that "3D visuo-spatial intelligence (the performance of BDT) might be predicted by the inhibitory and/or excitation mechanisms in MT+ and the integrative functions connecting MT+ with the frontal cortex." However, several issues arise:

      (1.1) The Suppression Index depicted in Figure 1a, labeled as the "behavior circle," appears irrelevant to the central hypothesis.

      We thank the reviewer for pointing this out. In our study, the inhibitory mechanisms in hMT+ are conceptualized through two models: the neurotransmitter model and the behavior model. The Suppression Index is essential for elucidating the local inhibitory mechanisms within the behavior model. However, we acknowledge that our initial presentation in the introduction may not have clearly articulated our hypothesis, potentially leading to misunderstandings. We plan to revise the introduction to clarify these connections and make the relevance of the Suppression Index clear.

      (1.2) The construct of 3D visuo-spatial intelligence, operationalized as the performance in the Block Design task, is inconsistently treated as another behavioral task throughout the manuscript, leading to confusion.

      We thank the reviewer for pointing this out. We acknowledge that our manuscript may have presented this construct inconsistently across sections, causing confusion. To address this, we plan to ensure a consistent description of 3D visuo-spatial intelligence in both the introduction and the discussion sections. However, we would like to retain 'Block Design task score' in the results section to make clear to readers which subtest we used.

      (1.3) The schematics in Figure 1a and Figure 6 appear too high-level to be falsifiable. It is suggested that the authors formulate specific and testable hypotheses and preregister them before data collection.

      We thank the reviewer for pointing this out. We plan to revise Figure 1a to make it less abstract and more logical. As for Figure 6, the schematic represents our theoretical framework of how hMT+ contributes to 3D visuo-spatial intelligence. We believe the elements within this framework are grounded in related theories and supported by evidence discussed in our results and discussion sections, making them specific and testable.

      (2) Central to the hypothesis and design of the manuscript is a misinterpretation of a prior study by Melnick et al. (2013). While the original study identified a strong correlation between WAIS (IQ) and the Suppression Index (SI), the current manuscript erroneously asserts a specific relationship between the block design test (from WAIS) and SI. It should be noted that in the original paper, WAIS comprises Similarities, Vocabulary, Block design, and Matrix reasoning tests in Study 1, while the complete WAIS is used in Study 2. Did the authors conduct other WAIS subtests other than the block design task?

      Thanks for pointing this out. Reviewer #1 also asked this question, so we repeat our answer here: “The decision was informed by Melnick’s findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well established that the hMT+ region of the brain is a sensory cortex involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activities. Given this context, the Perceptual Reasoning sub-ability was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.”

      (3) Additionally, there are numerous misleading references and unsubstantiated claims throughout the manuscript. As an example of misleading reference, "the human MT ... a key region in the multiple representations of sensory flows (including optic, tactile, and auditory flows) (Bedny et al., 2010; Ricciardi et al., 2007); this ideally suits it to be a new MD core." The two references in this sentence are claims about plasticity in the congenitally blind with sensory deprivation from birth, which is not really relevant to the proposal that hMT+ is a new MD core in healthy volunteers.

      Thanks for pointing this out. We have carefully read the corresponding references, considered the corresponding theories, and agree with these comments. Because our results address only “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated by reverberation with frontal cortex”, they are not yet sufficient to show that hMT+ is a core node of the MD system. We will adjust the explanatory logic of the article, emphasizing the role of hMT+ in reducing redundancy and improving information-processing efficiency in visuo-spatial intelligence, while de-emphasizing the significance of hMT+ in the MD system. In addition, regarding the potential central role of hMT+ in the MD system, we agree with your view that research on hMT+ as a multisensory integration hub mainly focuses on developmental processes. Meanwhile, in adults, the MST region of hMT+ is considered a multisensory integration area for visual and vestibular inputs, which potentially supports the role of hMT+ in multitasking multisensory systems (Gu et al., J. Neurosci, 26(1), 73–85, 2006; Fetsch et al., Nat. Neurosci, 15, 146–154, 2012). Further research could explore how other intelligence sub-abilities, such as working memory and language comprehension, are facilitated by hMT+'s features.

      Another example of unsubstantiated claim: the rationale for selecting V1 as the control region is based on the assertion that "it mediates the 2D rather than 3D visual domain (Born & Bradley, 2005)". That's not the point made in the Born & Bradley (2005) paper on MT. It's crucial to note that V1 is where the initial binocular convergence occurs in cortex, i.e., inputs from both the right and left eyes to generate a perception of depth.

      Thank you for pointing this out. We acknowledge the inappropriate citation of "Born & Bradley, 2005," which focuses solely on the structure and function of the visual area MT. However, we believe that choosing hMT+ as the domain for 3D visual analysis and V1 as the control region is justified. Cumming and DeAngelis (Annu Rev Neurosci, 24:203–238, 2001) state that binocular disparity provides the visual system with information about the three-dimensional layout of the environment, and that the link between perception and neuronal activity is stronger in the extrastriate cortex (especially MT) than in the primary visual cortex (V1). This supports our choice and emphasizes the relevance of MT+ in our study. We will correct the reference in the revised version.

      Results & Discussion

      (1) The missing correlation between SI and BDT is crucial to the rest of the analysis. The authors should discuss whether they replicated the pattern of results from Melnick et al. (2013) despite using only one WAIS subtest.

      We thank the reviewer for the suggestion. The correlation result is currently in the supplemental material; we will move it back into the main text.

      (2) ROIs: can the authors clarify if the results are based on bilateral MT+/V1 or just those in the left hemisphere? Can the authors plot the MRS scan area in V1? I would be surprised if it's precise to V1 and doesn't spread to V2/3 (which is fine to report as early visual cortex).

      We thank the reviewer for the suggestion. We plan to plot the MRS scanning area for the V1 ROI and use a visual template to check whether it extends into V2/3. If it does, we will refer to it as the early visual cortex rather than specifically V1 in our reporting.

      (3) Did the authors examine V1 FC with either the frontal regions and/or whole brain, as a control analysis? If not, can the authors justify why V1 serves as the control region only in the MRS but not in FC (Figure 4) or the mediation analysis (Figure 5)? That seems a little odd given that control analyses are needed to establish the specificity of the claim to MT+.

      We thank the reviewer for the suggestion. We plan to run the V1 FC-behavior analysis as a control. As for the mediation analysis, V1 GABA/Glu shows no correlation with BDT score, so there is insufficient basis for applying a mediation analysis to V1.

      (4) It is not clear how to interpret the similarity or difference between panels a and b in Figure 4.

      We thank the reviewer for pointing this out. We plan to interpret the difference between panels a and b further in the revised version. Panel a shows the hMT+-region FC correlated with BDT score, which clearly involves the frontal cortex, whereas panel b shows the hMT+-region FC correlated with SI, which involves relatively fewer regions. The overlapping region is what we are interested in, because it may explain how local inhibitory mechanisms operate in 3D visuo-spatial intelligence. In addition, we would like to revise Figure 4 to mark the overlapping region.

      (5) SI is not relevant to the authors' a priori hypothesis, but is included in several mediation analyses. Can the authors do model comparisons between the ones in Figure 5c, d, and Figure S6? In other words, is SI necessary in the mediation model? There seem to be discrepancies between the necessity of SI in Figures 5c/S6 vs. Figure 5d.

      We thank the reviewer for highlighting this point. The relationship between the Suppression Index (SI) and our a priori hypotheses is elaborated in the response to reviewer 3, section (1). SI plays a crucial role in explicating how local inhibitory mechanisms function within the context of the 3D visuo-spatial task. Additionally, Figure 5c illustrates the interaction between the frontal cortex and hMT+, showing how the effects from the frontal cortex (BA46) on the Block Design Task are fully mediated by SI. This further underscores the significance of SI in our model.

      (6) The sudden appearance of "efficient information" in Figure 6, referring to the neural efficiency hypothesis, raises concerns. Efficient visual information processing occurs throughout the visual cortex, starting from V1. Thus, it appears somewhat selective to apply the neural efficiency hypothesis to MT+ in this context.

      We thank the reviewer for highlighting this point. There is no doubt that V1 is involved in efficient visual information processing. However, in our results, V1 GABA shows no significant correlation with BDT score, suggesting that efficient processing in V1 might not sufficiently account for individual differences in 3D visuo-spatial intelligence. Additionally, we will clarify our use of the neural efficiency hypothesis by incorporating it into the introduction of our paper to better frame our argument.

      Transparency Issues:

      (1) Don't think it's acceptable to make the claim that "All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary information". It is the results or visualizations of data analysis, rather than the raw data themselves, that are presented in the paper/supp info.

      We thank the reviewer for pointing this out. We realize that this statement could lead to confusion, and we will delete it.

      (2) No GitHub link has been provided in the manuscript to access the source data, which limits the reproducibility and transparency of the study.

      We thank the reviewer for pointing this out. We will provide the GitHub link in the revised version.

      Minor:

      "Locates" should be replaced with "located" throughout the paper. For example: "To investigate this issue, this study selects the human MT complex (hMT+), a region located at the occipito-temporal border, which represents multiple sensory flows, as the target brain area."

      We thank the reviewer for pointing this out. We will revise it.

      Use "hMT+" instead of "MT+" to be consistent with the term in the literature.

      We thank the reviewer for pointing this out. We agree and will use hMT+ throughout, consistent with the literature.

      "Green circle" in Figure 1 should be corrected to match its actual color.

      We thank the reviewer for pointing this out. We will revise it.

      The abbreviation for the Wechsler Adult Intelligence Scale should be "WAIS," not "WASI."

      We thank the reviewer for pointing this out. We will revise it.

    2. eLife assessment

      This study employed a comprehensive approach to examining how the MT+ region integrates into a complex cognition system in mediating human visuo-spatial intelligence. While the findings are useful, the experimental evidence is incomplete and the study design, hypothesis, analyses, writing, and presentation need to be improved. The work will be of interest to researchers in psychology, cognitive science, and neuroscience.

    3. Reviewer #1 (Public Review):

      Summary:

      The study of human intelligence has been the focus of cognitive neuroscience research, and finding some objective behavioral or neural indicators of intelligence has been an ongoing problem for scientists for many years. Melnick et al, 2013 found for the first time that the phenomenon of spatial suppression in motion perception predicts an individual's IQ score. This is because IQ is likely associated with the ability to suppress irrelevant information. In this study, a high-resolution MRS approach was used to test this theory. In this paper, the phenomenon of spatial suppression in motion perception was found to be correlated with the visuo-spatial subtest of gF, while both variables were also correlated with the GABA concentration of MT+ in the human brain. In addition, there was no significant relationship with the excitatory transmitter Glu. At the same time, SI was also associated with MT+ and several frontal cortex FCs.

      Strengths:

      (1) 7T high-resolution MRS is used.

      (2) This study combines the behavioral tests, MRS, and fMRI.

      Weaknesses:

      (1) In the intro, it seems to me that the multiple-demand (MD) regions are the key in this study. However, I didn't see any results associated with the MD regions. Did I miss something??

      (2) How was the sample size determined? Is it sufficient??

      (3) In Schallmo eLife 2018, there was no correlation between GABA concentration and SI. How can the different results here be justified?

      (4) Basically, this study contains data for SI, BDT, GABA in MT+ and V1, and Glu in MT+ and V1: six measurements in all. There should be 6x5/2 = 15 pairwise correlations. However, not all of these results are included in Figure 1 and supplementary 1-3. I understand that it is not necessary to include all figures, but I suggest reporting all values in one table.

      (5) In Melnick (2013), the IQ scores were measured by the full set of WAIS-III, including all subtests. However, this study only used the visuo-spatial domain of gF. I wonder why only the visuo-spatial subtest was used rather than the full WAIS-III?

      (6) In the functional connectivity part, there is no explanation as to why only the left MT+ was set to the seed region. What is the problem with the right MT+?

      (7) In Melnick (2013), the authors also reported the correlation between IQ and absolute duration thresholds of small and large stimuli. Please include these analyses as well.
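
      Point (4) above counts 6x5/2 = 15 pairwise correlations among the six measures. A minimal Python sketch of how such a single table could be assembled follows. The measure labels are shorthand chosen here, the sample size of 30 is arbitrary, the values are seeded random placeholders rather than study data, and Pearson's r is an assumption, since the review does not name a correlation statistic.

```python
from itertools import combinations
from math import sqrt
import random

# Shorthand labels for the six measures named in the review (hypothetical names).
MEASURES = ["SI", "BDT", "GABA_MT", "GABA_V1", "Glu_MT", "Glu_V1"]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_table(data):
    """Return one (measure_a, measure_b, r) row per unordered pair of measures."""
    return [(a, b, pearson_r(data[a], data[b])) for a, b in combinations(data, 2)]


# Placeholder values only: seeded Gaussian noise, NOT data from the study.
rng = random.Random(0)
data = {name: [rng.gauss(0, 1) for _ in range(30)] for name in MEASURES}

table = pairwise_table(data)
for a, b, r in table:
    print(f"{a:8s} {b:8s} r = {r:+.2f}")
```

      With six measures, `combinations` yields exactly 15 rows, one per unordered pair, which is the count given in point (4).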

    4. Reviewer #2 (Public Review):

      Summary:

      Recent studies have identified specific regions within the occipito-temporal cortex as part of a broader fronto-parietal, domain-general, or "multiple-demand" (MD) network that mediates fluid intelligence (gF). According to the abstract, the authors aim to explore the mechanistic roles of these occipito-temporal regions by examining GABA/glutamate concentrations. However, the introduction presents a different rationale: investigating whether area MT+ specifically, could be a core component of the MD network.

      Strengths:

      The authors provide evidence that GABA concentrations in MT+ and its functional connectivity with frontal areas significantly correlate with visuo-spatial intelligence performance. Additionally, serial mediation analysis suggests that inhibitory mechanisms in MT+ contribute to individual differences in a specific subtest of the Wechsler Adult Intelligence Scale, which assesses visuo-spatial aspects of gF.

      Weaknesses:

      While the findings are compelling and the analyses robust, the study's rationale and interpretations need strengthening. For instance, Assem et al. (2020) have previously defined the core and extended MD networks, identifying the occipito-temporal regions as TE1m and TE1p, which are located more rostrally than MT+. Area MT+ might overlap with brain regions identified previously in Fedorenko et al., 2013; however, the authors attribute these activations to attentional enhancement of visual representations in the more difficult conditions of their tasks. For the aforementioned reasons, it is unclear why the authors chose MT+ as their focus. A stronger rationale for this selection is necessary, including how it fits with the core/extended MD networks.

      Moreover, although the study links MT+ inhibitory mechanisms to a visuo-spatial component of gF, this evidence alone may not suffice to position MT+ as a new core of the MD network. The MD network's definition typically encompasses a range of cognitive domains, including working memory, mathematics, language, and relational reasoning. Therefore, the claim that MT+ represents a new core of MD needs to be supported by more comprehensive evidence.

    5. Reviewer #3 (Public Review):

      Summary:

      This manuscript aims to understand the role of GABA-ergic inhibition in the human MT+ region in predicting visuo-spatial intelligence through a combination of behavioral measures, fMRI (for functional connectivity measurement), and MRS (for GABA/glutamate concentration measurement). While this is a commendable goal, it becomes apparent that the authors lack fundamental understanding of vision, intelligence, or the relevant literature. As a result, the execution of the research is less coherent, dampening the enthusiasm of the review.

      Strengths:

      (1) Comprehensive Approach: The study adopts a multi-level approach, i.e., neurochemical analysis of GABA levels, functional connectivity, and behavioral measures to provide a holistic understanding of the relationship between GABA-ergic inhibition and visuo-spatial intelligence.

      (2) Sophisticated Techniques: The use of ultra-high field magnetic resonance spectroscopy (MRS) technology for measuring GABA and glutamate concentrations in the MT+ region is a recent development.

      Weaknesses:

      Study Design and Hypothesis

      (1) The central hypothesis of the manuscript posits that "3D visuo-spatial intelligence (the performance of BDT) might be predicted by the inhibitory and/or excitation mechanisms in MT+ and the integrative functions connecting MT+ with the frontal cortex." However, several issues arise:

      1.1 The Suppression Index depicted in Figure 1a, labeled as the "behavior circle," appears irrelevant to the central hypothesis.

      1.2 The construct of 3D visuo-spatial intelligence, operationalized as the performance in the Block Design task, is inconsistently treated as another behavioral task throughout the manuscript, leading to confusion.

      1.3 The schematics in Figure 1a and Figure 6 appear too high-level to be falsifiable. It is suggested that the authors formulate specific and testable hypotheses and preregister them before data collection.

      (2) Central to the hypothesis and design of the manuscript is a misinterpretation of a prior study by Melnick et al. (2013). While the original study identified a strong correlation between WAIS (IQ) and the Suppression Index (SI), the current manuscript erroneously asserts a specific relationship between the block design test (from WAIS) and SI. It should be noted that in the original paper, WAIS comprises Similarities, Vocabulary, Block design, and Matrix reasoning tests in Study 1, while the complete WAIS is used in Study 2. Did the authors conduct other WAIS subtests other than the block design task?

      (3) Additionally, there are numerous misleading references and unsubstantiated claims throughout the manuscript. As an example of misleading reference, "the human MT ... a key region in the multiple representations of sensory flows (including optic, tactile, and auditory flows) (Bedny et al., 2010; Ricciardi et al., 2007); this ideally suits it to be a new MD core." The two references in this sentence are claims about plasticity in the congenitally blind with sensory deprivation from birth, which is not really relevant to the proposal that hMT+ is a new MD core in healthy volunteers.

      Another example of unsubstantiated claim: the rationale for selecting V1 as the control region is based on the assertion that "it mediates the 2D rather than 3D visual domain (Born & Bradley, 2005)". That's not the point made in the Born & Bradley (2005) paper on MT. It's crucial to note that V1 is where the initial binocular convergence occurs in cortex, i.e., inputs from both the right and left eyes to generate a perception of depth.

      Results & Discussion

      (1) The missing correlation between SI and BDT is crucial to the rest of the analysis. The authors should discuss whether they replicated the pattern of results from Melnick et al. (2013) despite using only one WAIS subtest.

      (2) ROIs: can the authors clarify if the results are based on bilateral MT+/V1 or just those in the left hemisphere? Can the authors plot the MRS scan area in V1? I would be surprised if it's precise to V1 and doesn't spread to V2/3 (which is fine to report as early visual cortex).

      (3) Did the authors examine V1 FC with either the frontal regions and/or whole brain, as a control analysis? If not, can the authors justify why V1 serves as the control region only in the MRS but not in FC (Figure 4) or the mediation analysis (Figure 5)? That seems a little odd given that control analyses are needed to establish the specificity of the claim to MT+.

      (4) It is not clear how to interpret the similarity or difference between panels a and b in Figure 4.

      (5) SI is not relevant to the authors' a priori hypothesis, but is included in several mediation analyses. Can the authors do model comparisons between the ones in Figure 5c, d, and Figure S6? In other words, is SI necessary in the mediation model? There seem to be discrepancies between the necessity of SI in Figures 5c/S6 vs. Figure 5d.

      (6) The sudden appearance of "efficient information" in Figure 6, referring to the neural efficiency hypothesis, raises concerns. Efficient visual information processing occurs throughout the visual cortex, starting from V1. Thus, it appears somewhat selective to apply the neural efficiency hypothesis to MT+ in this context.

      Transparency Issues:

      (1) Don't think it's acceptable to make the claim that "All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary information". It is the results or visualizations of data analysis, rather than the raw data themselves, that are presented in the paper/supp info.

      (2) No GitHub link has been provided in the manuscript to access the source data, which limits the reproducibility and transparency of the study.

      Minor:

      "Locates" should be replaced with "located" throughout the paper. For example: "To investigate this issue, this study selects the human MT complex (hMT+), a region located at the occipito-temporal border, which represents multiple sensory flows, as the target brain area."

      Use "hMT+" instead of "MT+" to be consistent with the term in the literature.

      "Green circle" in Figure 1 should be corrected to match its actual color.

      The abbreviation for the Wechsler Adult Intelligence Scale should be "WAIS," not "WASI."

    1. Reviewer #1 (Public Review):

      The study by Chikermane and colleagues investigates the functional, structural, and dopaminergic network substrates of cortical beta oscillations (13-30 Hz). The major strength of the work lies in the methodology taken by the authors, namely a multimodal lesion network mapping. First, using invasive electrophysiological recordings from healthy cortical territories of epileptic patients, they identify regions with the highest beta power. Next, they leverage open-access MRI data and PET atlases and use the identified high-beta regions as seeds to find (1) the whole-brain functional and structural maps of regions that form the putative underlying network of high-beta regions and (2) the spatial distribution of dopaminergic receptors that show correlation with nodal connectivity of the identified networks. These steps are achieved by generating aggregate functional, structural, and dopaminergic network maps using the Lead-DBS toolbox, and by contrasting the results with those obtained from high-alpha regions.

      The main findings are:

      (1) Beta power is strongest across frontal, cingulate, and insular regions in invasive electrophysiological data, and these regions map onto a shared functional and structural network.

      (2) The shared functional and structural networks show significant positive correlations with dopamine receptors across the cortex and basal ganglia (which is not the case for alpha, where correlations are found with GABA).

      Nevertheless, a few clarifications regarding the choice of high-power electrodes and distributions of functional connectivity maps (i.e., strength and sign across cortex and sub-cortex) can help with understanding the results.

    2. eLife assessment

      This study could pose an important step forward in understanding brain network embedding of beta oscillations, advancing our circuit-level understanding of the pathophysiology associated with frontal beta or dopaminergic alterations in psychiatric or neurological disorders. The study provides compelling evidence that beta oscillations across the neocortex and basal ganglia map onto shared functional and structural networks that show significant positive correlations with dopamine receptors.

    3. Reviewer #2 (Public Review):

      Summary:

      This is a very interesting paper that leveraged several publicly available datasets: invasive cortical recording in epilepsy patients, functional and structural connectomic data, and PET data related to dopaminergic and GABA-ergic synapses. These were combined to create a unified hypothesis of beta band oscillatory activity in the human brain. They show that beta frequency activity is ubiquitous, not just in sensorimotor areas, and that cortical regions where beta predominated had high connectivity to regions high in dopamine re-uptake.

      Strengths:

      The authors leverage and integrate three publicly available human brain datasets in a creative way. While these public datasets are powerful tools for human neuroscience, it is innovative to combine these three types of data into a common brain space to generate novel findings and hypotheses. Findings are nicely controlled by separately examining cortical regions where alpha predominates (which have a different connectivity pattern). GABA uptake from PET studies is used as a control for the specificity of the relationship between beta activity and dopamine uptake. There is much interest in synchronized oscillatory activity as a mechanism of brain function and dysfunction, but the field is short on unifying hypotheses of why particular rhythms predominate in particular regions. This paper contributes nicely to that gap. It is ambitious in generating hypotheses, particularly that modulation of beta activity may be used as a "proxy" for modulating phasic dopamine release.

      Weaknesses:

      As the authors point out, the use of normative data is excellent for exploring hypotheses but does not address or explore individual variations which could lead to other insights. It is also biased to resting state activity; maps of task-related activity (if they were available) might show different findings.

      The figures, results, introduction, and methods are admirably clear and succinct but the discussion could be both shorter and more convincing.

    4. Reviewer #3 (Public Review):

      Summary:

      In this paper, Chikermane et al. leverage a large open dataset of intracranial recordings (sEEG or ECoG) to analyze resting state (eyes closed) oscillatory activity from a variety of human brain areas. The authors identify a dominant proportion of channels in which beta band activity (12-30 Hz) is most prominent and subsequently seek to relate this to anatomical connectivity data by using the sEEG/ECoG electrodes as seeds in a large set of MRI data from the Human Connectome Project. This reveals separate regions and white matter tracts for alpha (primarily occipital) and beta (prefrontal cortex and basal ganglia) oscillations. Finally, using a third available dataset of PET imaging, the authors relate the parcellated signals to dopamine signaling as estimated by spatial uptake patterns of dopamine, and reveal a significant correlation between the functional connectivity maps and the dopamine reuptake maps, suggesting a functional relationship between the two.

      Strengths:

      Overall, I found the paper well justified, focused on an important topic, and interesting. The authors' use of 3 different open datasets was creative and informative, and it significantly adds to our understanding of different oscillatory networks in the human brain, and their more elusive relation with neuromodulator signaling networks by adding to our knowledge of the association between beta oscillations and dopamine signaling. Even my main comments about the lack of a theta network analysis and discussion points are relatively minor, and I believe this paper is valuable and informative.

      Weaknesses:

      The analyses were adequate, and the authors cleverly leveraged these different datasets to build an interesting story. The main aspect I found missing (in addition to some discussion items, see below) was an examination of the theta network. Theta oscillations have been implicated in a number of cognitive processes, including spatial navigation and memory, and have been proposed to have different potential originating brain regions, and it would be informative to see what their anatomical networks (e.g. as in Figure 2) look like under the authors' analyses.

      The authors devote a significant portion of the discussion to relating their findings to a popular hypothesis for the function of beta oscillations, the maintenance of the "status quo", mostly in the context of motor control. As the authors acknowledge, given the static nature of the data and lack of behavior, this interpretation remains largely speculative and I found it a bit too far-reaching given the data shown in the paper. In contrast, I missed a more detailed discussion on the growing literature indicating a role for beta in mood (e.g. in Kirkby et al. 2018), especially given the apparent lack of hippocampal and amygdala involvement in the paper, which was surprising.

      Major comment:

      • Although the proportion of electrodes with theta-dominant oscillations was lower (~15%) than alpha (~22%) or beta (~57%), it would be very valuable to also see the same analyses the authors carried out in these frequency bands extended to theta oscillations.

    1. eLife assessment

      The study presents a useful investigation of the relation between pupil size and saccade decision in human observers. Based on the premise that pupil size is a reliable proxy of "effort", the authors conclude that less costly saccade targets are preferred. The data were collected and analyzed using solid and validated methodology, but the evidence supporting the claim that effort drives saccade target selection is incomplete and alternative explanations are not ruled out.

    2. Reviewer #1 (Public Review):

      Vision is a highly active process. Humans move their eyes 3-4 times per second to sample information with high visual acuity from our environment, and where eye movements are directed is critical to our understanding of active vision. Here, the authors propose that the cost of making a saccade contributes critically to saccade selection (i.e., whether and where to move the eyes). The authors build on their own recent work that the effort (as measured by pupil size) that comes with planning and generating an eye movement varies with saccade direction. To do this, the authors first measured pupil size for different saccade directions for each participant. They then correlated the variations in pupil size obtained in the mapping task with the saccade decision in a free-choice task. The authors observed a striking correlation: pupil size in the mapping task predicted the decision of where to move the eyes in the free choice task. In this study, the authors provide a number of additional insightful analyses (e.g., based on saccade curvature, and saccade latency) and experiments that further support their claim that the decision to move the eyes is influenced by the effort to move the eyes in a particular direction. One experiment showed that the same influence of assumed saccade costs on saccade selection is observed during visual search in natural scenes. Moreover, increasing the cognitive load by adding an auditory counting task reduced the number of saccades, and in particular reduced the costly saccades. In sum, these experiments form a nice package that convincingly establishes the association between pupil size and saccade selection.

      In my opinion, the causal structure underlying the observed results is not so clear. While the relationship between pupil size and saccade selection is compelling, it is not clear that saccade-related effort (i.e., the cost of a saccade) really drives saccade selection. Given the correlational nature of this relationship, there are other alternatives that could explain the finding. For example, saccade latency and the variance in landing positions also vary across saccade directions. This could be interpreted, for instance, as reflecting variations in oculomotor noise across saccade directions, and maybe the oculomotor system seeks to minimize that noise in a free-choice task. In fact, given such a correlational result, many other alternative mechanisms are possible. While I think the authors' approach of systematically exploring what we can learn about saccade selection using pupil size is interesting, it would be important to know what exactly pupil size can add that was not previously known by simply analyzing saccade latency. For example, saccade latency anisotropies across saccade directions are well known, and the authors also show here that saccade costs are related to saccade latency. An important question would be how pupil size and saccade latency each uniquely contribute to saccade selection. That is, the authors could apply the exact same logic to their analysis by first determining how saccade latencies (or variations in saccade landing positions; see Greenwood et al., 2017 PNAS) vary across saccade directions and how this saccade latency map explains saccade selection in subsequent tasks. Is it more advantageous to use one or the other saccade metric, and how well does a saccade latency map correlate with a pupil size map?

      In addition to eye-movement-related anisotropies across the visual field, there are of course many studies reporting visual field anisotropies (see Himmelberg, Winawer & Carrasco, 2023, Trends in Neurosciences for a review). It would be interesting to understand how the authors think about visual field anisotropies in the context of their own study. Do they think that their results are (in)dependent on such visual field variations (see Greenwood et al., 2017, PNAS; Ohl, Kroell, & Rolfs, 2024, JEP:Gen for a similar discussion)?

      Finally, the authors conclude that their results "suggests that the eye-movement system and other cognitive operations consume similar resources that are flexibly allocated among each other as cognitive demand changes". The authors should speculate on what these similar resources could be. What are the specific operations of the auditory task that overlap in terms of resources with the eye movement system?

    3. Reviewer #2 (Public Review):

      The authors attempt to establish presaccadic pupil size as an index of 'saccade effort' and propose this index as one new predictor of saccade target selection. They only partially achieved their aim: When choosing between two saccade directions, the less costly direction, according to preceding pupil size, is preferred. However, the claim that with increased cognitive demand participants would especially cut costly directions is not supported by the data. I would have expected to see a negative correlation between saccade effort and saccade direction 'change' under increased load. Yet participants mostly cut upwards saccades, but not other directions that, according to pupil size, are equally or even more costly (e.g. oblique saccades).

      Strengths:

      The paper is well-written, easy to understand, and nicely illustrated.

      The sample size seems appropriate, and the data were collected and analyzed using solid and validated methodology.

      Overall, I find the topic of investigating factors that drive saccade choices highly interesting and relevant.

      Weaknesses:

      The authors obtain pupil size and saccade preference measures in two separate tasks. Relating these two measures is problematic because the computations that underlie saccade preparation differ. In Experiment 1, the saccade is cued centrally and has to be delayed until a "go-signal" is presented; in Experiment 2, an immediate saccade is executed to an exogenously cued peripheral target. The 'costs' in Experiment 1 (computing the saccade target location from a central cue; withholding the saccade) do not relate to Experiment 2. It is unfortunate that measuring presaccadic pupil size directly in the comparatively more 'natural' Experiment 2 (where saccades did not have to be artificially withheld) does not seem to be possible. This calls into question the practical application of pupil size as an index of saccade effort.

      The authors claim that the observed direction-specific 'saccade costs' obtained in Experiment 1 "were not mediated by differences in saccade properties, such as duration, amplitude, peak velocity, and landing precision (Figure 1e,f)". Saccade latency, however, was not taken into account here but is discussed for Experiment 2.

      The apparent similarity of saccade latencies and pupil size, however, is striking. Previous work shows shorter latencies for cardinal than oblique saccades, and shorter latencies for horizontal and upward saccades than downward saccades - directly reflecting the pupil sizes obtained in Experiment 1 as well as in the authors' previous study (Koevoet et al., 2023, PsychScience).

      The authors state that "from a costs-perspective, it should be efficient to not only adjust the number of saccades (non-specific), but also by cutting especially expensive directions the most (specific)". However, saccade targets should be selected based on the maximum expected information gain. If cognitive load increases (due to an additional task), an effective strategy seems to be to perform fewer - but still meaningful - saccades. How would it help natural orienting to selectively cut saccades in certain (effortful) directions? Choosing saccade targets based on comfort, over information gain, would result in more saccades being made overall - which is non-optimal, also from a cost perspective.

      Overall, I am not sure what practical relevance the relation between pupil size (measured in a separate experiment) and saccade decisions has for eye movement research/vision science. Pupil size does not seem to be a straightforward measure of saccade effort. Saccade latency, instead, can be easily extracted in any eye movement experiment (no need to conduct a separate, delayed saccade task to measure pupil dilation), and seems to be an equally good index.

    4. Reviewer #3 (Public Review):

      This manuscript extends previous research by this group by relating variation in pupil size to the endpoints of saccades produced by human participants under various conditions including trial-based choices between pairs of spots and search for small items in natural scenes. Based on the premise that pupil size is a reliable proxy of "effort", the authors conclude that less costly saccade targets are preferred. Finding that this preference was influenced by the performance of a non-visual, attention-demanding task, the authors conclude that a common source of effort animates gaze behavior and other cognitive tasks.

      Strengths:

      Strengths of the manuscript include the novelty of the approach, the clarity of the findings, and the community interest in the problem.

      Weaknesses:

      Enthusiasm for this manuscript is reduced by the following weaknesses:

      (1) A relationship between pupil size and saccade production seems clear based on the authors' previous and current work. What is at issue is the interpretation. The authors test one, preferred hypothesis, and the narrative of the manuscript treats the hypothesis that pupil size is a proxy of effort as beyond dispute or question. The stated elements of their argument seem to go like this:
      PROPOSITION 1: Pupil size varies systematically across task conditions, being larger when tasks are more demanding.
      PROPOSITION 2: Pupil size is related to the locus coeruleus.
      PROPOSITION 3: The locus coeruleus NE system modulates neural activity and interactions.
      CONCLUSION: Therefore, pupil size indexes the resource demand or "effort" associated with task conditions.
      How the conclusion follows from the propositions is not self-evident. Proposition 3, in particular, fails to establish the link that is supposed to lead to the conclusion.

      (2) The authors test one, preferred hypothesis and do not consider plausible alternatives. Is "cost" the only conceivable hypothesis? The hypothesis is framed in very narrow terms. For example, the cholinergic and dopamine systems that have been featured in other researchers' consideration of pupil size modulation are missing here. Thus, because the authors do not rule out plausible alternative hypotheses, the logical structure of this manuscript can be criticized as committing the fallacy of affirming the consequent.

      (3) The authors cite particular publications in support of the claim that saccade selection is influenced by an assessment of effort. Given the extensive work by others on this general topic, the skeptic could regard the theoretical perspective of this manuscript as too impoverished. Their work may be enhanced by consideration of other work on this general topic, e.g., (i) Shenhav A, Botvinick MM, Cohen JD. (2013) The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron. 2013 Jul 24;79(2):217-40. (ii) Müller T, Husain M, Apps MAJ. (2022) Preferences for seeking effort or reward information bias the willingness to work. Sci Rep. 2022 Nov 14;12(1):19486. (iii) Bustamante LA, Oshinowo T, Lee JR, Tong E, Burton AR, Shenhav A, Cohen JD, Daw ND. (2023) Effort Foraging Task reveals a positive correlation between individual differences in the cost of cognitive and physical effort in humans. Proc Natl Acad Sci U S A. 2023 Dec 12;120(50):e2221510120.

      (4) What is the source of cost in saccade production? What is the currency of that cost? The authors state (page 13), "... oblique saccades require more complex oculomotor programs than horizontal eye movements because more neuronal populations in the superior colliculus (SC) and frontal eye fields (FEF) [76-79], and more muscles are necessary to plan and execute the saccade [76, 80, 81]." This statement raises questions and concerns. First, the basis of the claim that more neurons in FEF and SC are needed for oblique versus cardinal saccades is not established in any of the publications cited. Second, the authors may be referring to the fact that oblique saccades require coordination between pontine and midbrain circuits. This must be clarified. Third, the cost is unlikely to originate in extraocular muscle fatigue because the muscle fibers are so different from skeletal muscles, being fundamentally less fatigable. Fourth, if net muscle contraction is the cost, then why are upward saccades, which require the eyelid, not more expensive than downward? Thus, just how some saccades are more effortful than others is not clear.

      (5) The authors do not consider observations about variation in pupil size that seem to be incompatible with the preferred hypothesis. For example, at least two studies have described systematically larger pupil dilation associated with faster relative to accurate performance in manual and saccade tasks (e.g., Naber M, Murphy P. Pupillometric investigation into the speed-accuracy trade-off in a visuo-motor aiming task. Psychophysiology. 2020 Mar;57(3):e13499; Reppert TR, Heitz RP, Schall JD. Neural mechanisms for executive control of speed-accuracy trade-off. Cell Rep. 2023 Nov 28;42(11):113422). Is the fast relative to the accurate option necessarily more costly?

      (6) The authors draw conclusions based on trends across participants, but they should be more transparent about variation that contradicts these trends. In Figures 3 and 4 we see many participants producing behavior unlike most others. Who are they? Why do they look so different? Is it just noise, or do different participants adopt different policies?

    5. Author Response:

      We appreciate the thorough comments from the reviewers. Before revising the manuscript, we would like to briefly reply to the main concerns raised:

      • Is pupil size a reliable proxy of effort? A vast amount of work demonstrates that pupil size sensitively scales with fluctuations in effort: for instance, the pupil dilates when load increases in working memory or multiple object tracking tasks, and such pupillary effects robustly explain individual differences in cognitive ability and fluctuations in performance across trials [1–4]. This extends to the planning of movements, as pupil dilations are observed prior to the execution of (eye) movements [5]. As reviewed previously [6–12] (each review drawing on a vast literature), any increase in effort is associated with an increase in pupil size. Inadvertently, we phrased this as if the link between effort and pupil size was established via shared neural correlates. However, this is not the case, as the link between effort and pupil size had been established well before the underlying neural circuitry of this relationship was investigated in detail. During the revision, we plan to rewrite this section to clarify that pupil size indexes effort and to provide a clear distinction between this link and putative neural underpinnings of such effort-linked modulations.

      • Is saccade latency an alternative explanation for the link between effort and saccade selection? Longer saccade latencies may imply more complex oculomotor programming (e.g. saccades with larger amplitudes require longer latencies for non-microsaccades [13], and latencies increase when distractors are presented [14]), and latencies are indeed known to differ across directions [15,16]. As suggested, it is possible that saccade latencies may also predict saccade preferences. However, even if this is the case, this would not constitute an alternative explanation. As saccade latency may index oculomotor programming complexity, it can potentially be considered an alternative outcome measure of effort, albeit restricted to the context of saccades. Therefore, if saccade latencies predict saccade preferences, this would not affect our conclusion; rather, it would constitute converging evidence that supports the conclusion that effort drives saccade selection.

      A related question is why one would use pupil size as a measure of effort, given the methodological care that pupillometry requires. There are a number of points that make pupil size sensible and promising in comparison with saccade latencies. In contrast to saccade latencies, pupil size makes it possible to capture the effort of different effector systems (e.g. head or hand movements), and potentially even the effort associated with covert shifts of attention. Moreover, pupil size is a temporally rich and continuous measure that makes it possible to isolate processes unfolding prior to (eye) movement onset (e.g. oculomotor programming). Together, this makes pupil size a powerful tool to study the costs of visual selection more broadly. In the revision, we will add analyses incorporating latencies and other saccade metrics. We will also discuss the differences between pupil size and saccade latencies in capturing saccade costs and effort.

      • Are the current results causal or correlational? Most of the currently reported results are indeed correlational in nature. In our first tasks, we correlated pupil size during saccade planning to saccade preferences in a subsequent task. Although the link across tasks was correlational, the observed relationship clearly followed our previously specified hypothesis [17]. Moreover, experiments 1 and 2 of the visual search data replicated and extended this relationship. We also directly manipulated cognitive demand in the second visual search experiment. In line with the hypothesis that effort affects saccade selection, participants executed fewer saccades overall when performing a (primary) auditory dual task, and even cut the costly saccades most. Although the evidence is mostly correlational, we do not know of a more fitting and parsimonious explanation for our findings than effort predicting saccade selection. We will address causality in the discussion for transparency and point more clearly to the second visual search experiment for causal evidence.

      References

      (1) Alnæs, D. et al. Pupil size signals mental effort deployed during multiple object tracking and predicts brain activity in the dorsal attention network and the locus coeruleus. J. Vis. 14, 1 (2014).

      (2) Koevoet, D., Strauch, C., Van der Stigchel, S., Mathôt, S. & Naber, M. Revealing visual working memory operations with pupillometry: Encoding, maintenance, and prioritization. WIREs Cogn. Sci. e1668 (2023) doi:10.1002/wcs.1668.

      (3) Robison, M. K. & Unsworth, N. Pupillometry tracks fluctuations in working memory performance. Atten. Percept. Psychophys. 81, 407–419 (2019).

      (4) Unsworth, N. & Miller, A. L. Individual Differences in the Intensity and Consistency of Attention. Curr. Dir. Psychol. Sci. 30, 391–400 (2021).

      (5) Richer, F. & Beatty, J. Pupillary Dilations in Movement Preparation and Execution. Psychophysiology 22, 204–207 (1985).

      (6) Bumke, O. Die Pupillenstörungen Bei Geistes-Und Nervenkrankheiten. (Fischer, 1911).

      (7) Kahneman, D. Attention and Effort. (Prentice-Hall, 1973).

      (8) van der Wel, P. & van Steenbergen, H. Pupil dilation as an index of effort in cognitive control tasks: A review. Psychon. Bull. Rev. 25, 2005–2015 (2018).

      (9) Loewenfeld, I. E. Mechanisms of reflex dilatation of the pupil. Doc. Ophthalmol. 12, 185–448 (1958).

      (10) Mathôt, S. Pupillometry: Psychology, Physiology, and Function. J. Cogn. 1, 16 (2018).

      (11) Sirois, S. & Brisson, J. Pupillometry. WIREs Cogn. Sci. 5, 679–692 (2014).

      (12) Strauch, C., Wang, C.-A., Einhäuser, W., Van der Stigchel, S. & Naber, M. Pupillometry as an integrated readout of distinct attentional networks. Trends Neurosci. 45, 635–647 (2022).

      (13) Kalesnykas, R. P. & Hallett, P. E. Retinal eccentricity and the latency of eye saccades. Vision Res. 34, 517–531 (1994).

      (14) Walker, R., Deubel, H., Schneider, W. X. & Findlay, J. M. Effect of Remote Distractors on Saccade Programming: Evidence for an Extended Fixation Zone. J. Neurophysiol. 78, 1108–1119 (1997).

      (15) Hanning, N. M., Himmelberg, M. M. & Carrasco, M. Presaccadic attention enhances contrast sensitivity, but not at the upper vertical meridian. iScience 25, 103851 (2022).

      (16) Hanning, N. M., Himmelberg, M. M. & Carrasco, M. Presaccadic Attention Depends on Eye Movement Direction and Is Related to V1 Cortical Magnification. J. Neurosci. 44, (2024).

      (17) Koevoet, D., Strauch, C., Naber, M. & Van der Stigchel, S. The Costs of Paying Overt and Covert Attention Assessed With Pupillometry. Psychol. Sci. 34, 887–898 (2023).

    1. eLife assessment

      This is a fundamental study that advances our understanding of the contribution of somatic variations in microglia that may contribute to the onset or progression of neurodegenerative disease. Specifically, during Alzheimer's disease, somatic mutations were identified in the MAPK pathway genes. The findings presented here are backed by compelling evidence drawn from a patient cohort, along with mechanistic proof-of-concept studies. Collectively, this research will be of interest to a wide audience, particularly those involved in the study of somatic mutations, neurodegeneration, immunology, and cell signalling.

    2. Reviewer #1 (Public Review):

      In the manuscript "A microglia clonal inflammatory disorder in Alzheimer's Disease", Vicario et al. provide a compelling study elucidating a potential contribution of somatic mutations within the microglia population of the CNS that accelerates microglia activation and disease-associated gene signatures in Alzheimer's disease. Here they especially identified an "enrichment" of pathological SNVs in microglia, but not the peripheral blood, that are associated with clonal proliferative disorders and neurological diseases in a subset of patients with AD. Convincingly, they identified P-SNVs in microglia of AD patients located within the ring domain of CBL, a negative regulator of MAPK signaling. They further provide mechanistic insights into how these variants result in MAPK over-activation and subsequently in a pro-inflammatory phenotype in human microglia-like cells in vitro.

      Overall, this study provides clear and detailed evidence from an AD patient cohort pointing to a potential contribution of microglia-specific somatic mutations to disease onset and/or progression in a subset of patients with Alzheimer's disease.

      Strengths:

      As outlined above, the study identified P-SNVs in microglia of AD patients associated with clonal proliferative disorders, but also gave an in-depth analysis of re-occurring P-SNVs located within the ring domain of CBL, a negative regulator of MAPK signaling. They further provide mechanistic insights into how these variants result in MAPK over-activation and subsequently in a pro-inflammatory phenotype in HEK cells, BV2 cells, MAC cells, and human microglia-like cells in vitro.

      Great care was taken here to validate their hypotheses at each step, as well as to identify the limitations of the possible conclusions. For example, they highlight that the pathway proposed to be affected may be an explanation for a subset of AD patients, and emphasize that it is as yet unclear whether this accumulation of pathological SNVs is a cause or consequence of disease progression.

      The study clearly supports an enrichment of P-SNVs in several genes associated with clonal proliferative disorders in microglia and nicely separates this from SNVs associated with clonal hematopoiesis in the peripheral blood found in AD patients and controls.

      The authors further acknowledged that several age-matched control patients were diagnosed with cancer or tumor-associated diseases and carefully established that the SNVs occurring in these patients are not associated with the P-SNVs identified in the microglial compartment of the AD cohort.

      Weaknesses:

      Even though the study is overall very convincing, several points could help to connect the seen somatic variants in microglia more with a potential role in disease progression. The connection of P-SNVs in the genes chosen from neurological disorders was not further highlighted by the authors.

      The authors show in snRNA-seq data that a disease-associated microglia state seems to be enriched in patients with somatic variants in the CBL ring domain, however, this analysis could be deepened. For example, how this knowledge may translate to patient benefits when the relevant cell populations appear concentrated in a single patient sample (Figure 5; AD52) is unclear; increasing the analyzed patient pool for Figure 5 and showcasing the presence of this microglia state of interest in a few more patients with driving mutations for CBL or other MAPK pathway associated mutations would lend their hypotheses further credibility.

      A potential connection between P-SNVs in microglia and disease pathology and symptoms was not further explored by the authors.

      A recent preprint (Huang et al., 2024) connected the occurrence of somatic variants in genes associated with clonal hematopoiesis in microglia in a large cohort of AD patients; this study is neither discussed nor compared with the data in this manuscript.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study, Vicario et al. aimed to quantify and characterize mosaic mutations in human sporadic Alzheimer's disease (AD) brain samples. They focused on three broad classes of brain cells: neurons that express the marker NeuN, microglia that express the marker PU.1, and double-negative cells that presumably comprise all other brain cell types, including astrocytes, oligodendrocytes, oligodendrocyte progenitor cells, and endothelial cells. The authors find an enrichment of potentially pathogenic somatic mutations in AD microglia compared to controls, with MAPK pathway genes being particularly enriched for somatic mutations in those cells. The authors report a striking enrichment for mutations in the gene CBL and use in vitro functional assays to show that these mutations indeed induce MAPK pathway activation.

      The current state of the AD and somatic mutation fields puts this work into context. First, AD is a devastating disease whose prevalence is only increasing as the population of the U.S. is aging, necessitating the investigation of novel features of AD to identify new therapeutic opportunities. Second, microglia have recently come into focus as important players in AD pathogenesis. Many AD risk genes are selectively expressed in microglia, and microglia from AD brain samples show a distinct transcriptional profile indicating an inflammatory phenotype. The authors' previous work shows that a genetic mouse model of mosaic BRAF activation in macrophages (including microglia) displays a neurodegenerative phenotype similar to AD (Mass et al., 2017, doi:10.1038/nature23672). Third, new technological developments have allowed for identifying mosaic mutations present in only a small fraction of or even single cells. Together, these data form a rationale for studying mosaic mutations in microglia in AD. In light of the authors' findings regarding MAPK pathway gene somatic mutations, it is also important to note that MAPK has previously been implicated in AD neuroinflammation in the literature.

      Strengths:

      The study demonstrated several strengths.

      Firstly, the authors used two methods to identify mosaic mutations:
      (1) deep (~1,100x) DNA sequencing of a targeted panel of 716 genes they hypothesized might, if mutated somatically, play a role in AD, and
      (2) deep (400x) whole-exome sequencing (WES) to identify clonal mosaics outside of those 716 genes.
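
      To illustrate why the two sequencing depths matter for mosaic detection, the following sketch computes the chance of observing a minimum number of variant-supporting reads under a simple binomial sampling model. The allele fraction and read threshold passed in are placeholders chosen by the user, not values from the manuscript, and the model ignores sequencing error, uneven coverage and variant-caller behaviour.

```python
from math import comb


def detection_probability(depth, allele_fraction, min_alt_reads):
    """Probability of seeing at least `min_alt_reads` variant reads.

    Assumes each of `depth` reads independently carries the variant with
    probability `allele_fraction` (a binomial model; an assumption made
    for illustration only).
    """
    # Sum the probability of every outcome that falls below the threshold.
    p_below_threshold = sum(
        comb(depth, k)
        * allele_fraction ** k
        * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1 - p_below_threshold


# Hypothetical comparison: the same low-fraction variant at the two depths
# mentioned in the review, with an arbitrary five-read calling threshold.
panel = detection_probability(depth=1100, allele_fraction=0.01, min_alt_reads=5)
exome = detection_probability(depth=400, allele_fraction=0.01, min_alt_reads=5)
```

      Under this model the deeper panel is more sensitive to low-fraction variants than the exome, which is consistent with using WES mainly to find larger clones outside the targeted genes.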

      A second strength is the agreement between these experiments, where WES found many variants identified in the panel experiment, and both experiments revealed somatic mutations in MAPK pathway genes.

      Third, the authors demonstrated in several in vitro systems that many mutations they identified in MAPK genes activate MAPK signaling. Finally, the authors showed that in some human brain samples, single-cell gene expression analysis revealed that cells bearing a mosaic MAPK pathway mutation displayed dysregulated inflammatory signaling and dysregulation in other pathways. This single-cell analysis was in agreement with their in vitro analyses.

      Weaknesses:

      The study also showed some weaknesses. The sample size (45 AD donors and 44 controls) is small, reflected in the relatively modest effect sizes and p-values observed. This weakness is partially ameliorated by the authors' extensive molecular and functional validation of mutation candidates. Another weakness is the lack of discussion of whether the genes found to be mutated somatically in AD show any AD-risk alleles in the population. If they did, it would further support the authors' conclusions that they are playing a role in AD. Finally, as the authors point out, this study cannot conclude whether microglial mosaic mutations cause AD or are an effect of AD. Future studies may shed more light on this important question.

      Conclusions and Impact:

      Considering the study's aims, strengths, and weaknesses, I conclude that the authors achieved their goal of characterizing the role of mosaic mutations in human AD. Their data strongly suggest that mosaic MAPK mutations in microglia are associated with AD. The impacts of this study remain to be seen, but they could include attempts to target CBL or other mutated genes in the treatment of AD. This work also suggests a similar approach to identifying potentially causative somatic mutations in other neurodegenerative diseases.

    1. eLife assessment

      This study offers a valuable description of the layer- and sublayer-specific outputs of the somatosensory cortex based on convincing evidence obtained with modern tools for the analysis of brain connectivity, together with functional validation of the connectivity using optogenetic approaches in vivo. Beyond bringing together, in one dataset, the results of disparate studies, this effort brings new insights on layer-specific outputs, and on differences between primary and secondary somatosensory areas. This study will be of interest to neuroanatomists and neurophysiologists.

    2. Reviewer #1 (Public Review):

      Summary:

      This is a fine paper that serves the purpose to show that the use of light sheet imaging may be used to provide whole brain imaging of axonal projections. The data provided suggest that at this point the technique provides lower resolution than with other techniques. Nonetheless, the technique does provide useful, if not novel, information about particular brain systems.

      Strengths:

      The manuscript is well written. In the introduction, a clear description of the functional organization of the barrel cortex provides the context for using specific Cre-driver lines to map the projections of the main cortical projection types with whole-brain neuroanatomical tracing techniques. The results are also well written, with sufficient detail describing the specifics of how techniques were used to obtain relevant data. Appropriate controls were done, including the identification of whisker fields for viral injections and determination of the laminar pattern of Cre expression. The mapping of the data provides a good way to visualize low-resolution patterns of projections.

      Weaknesses:

      (1) The results provided are, as stated in the discussion, "largely in agreement with previously reported studies of the major projection targets". However, it must be stated that the study does not "extend current knowledge through the high sensitivity for detecting sparse axons, the high specificity of labeling of genetically defined classes of neurons and the brain wide analysis for assigning axons to detailed brain regions", all of which have been published in numerous other studies (the Allen connectivity project and related papers, along with others). If anything, the labeling of axons obtained with light sheet imaging in this study does not provide mapping as detailed as that obtained with other techniques. Some detail is provided of how the raw images are processed to resolve labeled axons, but the images shown in the figures do not demonstrate how well individual axons may be resolved; of particular interest would be to see labeling in terminal areas such as other cortical areas, striatum and thalamus. As presented, the light sheet imaging appears to be rather low resolution compared to the many studies that have used viral tracing to look at cortical projections from genetically identified cortical neurons.
      (2) Amongst the limitations of this study is the inability to distinguish axons of passage from terminal fields. This has been done in other studies with viral constructs labeling synaptophysin. This should be mentioned.
      (3) There is no quantitative analysis of differences between the genetically defined neurons projecting to the striatum: what is the relative area innervated, the density of terminals, or other measures?
      (4) Figure 5 is an example of the type of large sets of data that can be generated with whole brain mapping and registration to the Allen CCF that provides information of questionable value. Ordering the 50-plus structures by the density of labeling does not provide much in terms of relative input to different types of areas. There are multiple subregions for different functional types (i.e., different visual areas and different motor subregions are scattered, not grouped together), which makes it difficult to understand any organizing principles.
      (5) The GENSAT Cre driver lines must be identified by the specific line name, not just the gene name, as the GENSAT BAC-Cre lines had multiple lines for each gene, often with very different expression patterns: Rbp4_KL100, Tlx3_PL56, Sim1_KJ18, Ntsr1_GN220.

    3. Reviewer #2 (Public Review):

      Summary:

      This study takes advantage of multiple methodological advances to perform layer-specific staining of cortical neurons and tracking of their axons to identify the pattern of their projections. This publication offers a mesoscale view of the projection patterns of neurons in the whisker primary and secondary somatosensory cortex. The authors report that, consistent with the literature, the pattern of projection is highly different across cortical layers and subtypes, with targets being located around the whole brain. This was tested across 6 different mouse types that expressed a marker in layer 2/3, layer 4, layer 5 (3 sub-types) and layer 6.

      Looking more closely at the projections from primary somatosensory cortex into the primary motor cortex, they found that there was a significant spatial clustering of projections from topographically separated neurons across the primary somatosensory cortex. This was true for neurons with cell bodies located across all tested layers/types.

      Strengths:

      This study successfully looks at the relevant scale to study projection patterns, which is the whole brain. This is achieved thanks to an ambitious combination of mouse lines, immuno-histochemistry, imaging and image processing, which results in a standardized histological pipeline that processes the whole-brain projection patterns of layer-selected neurons of the primary and secondary somatosensory cortex.

      This standardization means that comparisons between cell-types projection patterns are possible and that both the large-scale structure of the pattern and the minute details of the intra-areas pattern are available.

      This reference dataset and the corresponding analysis code are made available to the research community.

      Weaknesses:

      One major question raised by this dataset is the risk of missing axons during the post-processing step. Indeed, it appears that the control and training efforts have focused on the risk of false positives (see Figure 1 supplementary panels). And indeed, the risk of overlooking existing axons in the raw fluorescence data is discussed in the article.

      Based on the data reported in the article, this is more than a risk. In particular, Figure 2 shows an example Rbp4-L5 mouse where axonal spread seems massive in Hippocampus, while there is no mention of this area in the processed projection data for this mouse line.

      Similarly, the Ntsr1-L6CT example shows a striking level of fluorescence in Striatum that is not reflected in the amount of axons detected by the algorithms in the next figures.

      These apparent discrepancies may be due to non-axon-specific fluorescence in the samples. In any case, further analysis of such anatomical areas would be useful to consolidate the valuable dataset provided by the article.

    4. Reviewer #3 (Public Review):

      Summary:

      The paper offers a systematic and rigorous description of the layer- and sublayer-specific outputs of the somatosensory cortex using a modern toolbox for the analysis of brain connectivity which combines: 1) layer-specific genetic drivers for conditional viral tracing; 2) whole brain analyses of axon tracts using tissue clearing and imaging; 3) segmentation and quantification of axons with normalization to the number of transduced neurons; 4) registration of connectivity to a widely used anatomical reference atlas; 5) functional validation of the connectivity using optogenetic approaches in vivo.

      Strengths:

      Although the connectivity of the somatosensory cortex is already known, precise data are dispersed across different accounts (papers, online resources) using different methods. The present account therefore has the merit of condensing this information into one very precisely documented report. It also brings new insights into the connectivity, such as the precise comparison of layer-specific outputs and of the primary and secondary somatosensory areas. It also shows a topographic organization of the circuits linking the somatosensory and motor cortices. The paper also offers a clear description of the methodology and of a rigorous approach to quantitative anatomy.

      Weaknesses:

      The weakness relates to the intrinsic limitations of in toto approaches, which currently lack the precision and resolution needed to identify single axons, axon branching or synaptic connectivity. These limitations are identified and discussed by the authors.

    1. eLife assessment

      Abbasi and colleagues use Granger causality to explore the cortico-subcortical dynamics during speaking and listening. They find valuable evidence for bi-directional connectivity in distinct frequency bands as a function of behaviour, but currently offer incomplete support for the validity of their analyses and the predictive coding interpretation of their results.

    2. Reviewer #1 (Public Review):

      Abbasi et al. assess in this MEG study the directed connectivity of both cortical and subcortical regions during continuous speech production and perception. The authors observed bidirectional connectivity patterns between speech-related cortical areas as well as subcortical areas in production and perception. Interestingly, they found in speaking low-frequency connectivity from subcortical (the right cerebellum) to cortical (left superior temporal) areas, while connectivity from the cortical to subcortical areas was in the high frequencies. In listening a similar cortico-subcortical connectivity pattern was observed for the low frequencies, but the reversed connectivity in the higher frequencies was absent.

      The work by Abbasi and colleagues addresses a relevant, novel topic, namely understanding the brain dynamics between speaking and listening. This is important because traditionally production and perception of speech and language are investigated in a modality-specific manner. To have a more complete understanding of the neurobiology underlying these different speech behaviors, it is key to also understand their similarities and differences. Furthermore, to do so, the authors utilize state-of-the-art directed connectivity analyses on MEG measurements, providing a quite detailed profile of cortical and subcortical interactions for the production and perception of speech. Importantly, and perhaps most interesting in my opinion, is that the authors find evidence for frequency-specific directed connectivity, which is (partially) different between speaking and listening. This could suggest that both speech behaviors rely (to some extent) on similar cortico-cortical and cortico-subcortical networks, but different frequency-specific dynamics.

      These elements mentioned above (investigation of both production and perception, both cortico-cortical and cortico-subcortical connectivity is considered, and observing frequency-specific connectivity profiles within and between speech behaviors), make for important novel contributions to the field. Notwithstanding these strengths, I find that they are especially centered on methodology and functional anatomical description, but that precise theoretical contributions for neurobiological and cognitive models of speech are less transparent. This is in part because the study compares speech production and perception in general, but no psychophysical or psycholinguistic manipulations are considered. I also have some critical questions about the design which may pose some confounds in interpreting the data, especially with regard to comparing production and perception.

      (1) While the cortico-cortical and cortico-subcortical connectivity profiles highlighted in this study and the depth of the analyses are impressive, what these data mean for models of speech processing remains on the surface. This is in part due, I believe, to the fact that the authors have decided to explore speaking and listening in general, without targeting specific manipulations that help elucidate which aspects of speech processing are relevant for the particular connectivity profiles they have uncovered. For example, is the frequency-specific directed connectivity driven by low-level psychophysical attributes of the speech or by more cognitive linguistic properties? Does it relate to the monitoring of speech, timing information, and updating of sensory predictions? Without manipulations trying to target one or several of these components, as some of the referenced work has done (e.g., Floegel et al., 2020; Stockert et al., 2021; Todorović et al., 2023), it is difficult to draw concrete conclusions as to which representations and/or processes of speech are reflected by the connectivity profiles. An additional disadvantage of not having manipulations within each speech behavior is that it makes the comparison between listening and speaking harder. That is, speaking and listening have marked input-output differences which likely will dominate any comparison between them. These physically driven differences (or similarities for that matter; see below) can be strongly reduced by instead exploring the same manipulations/variables between speaking and listening. If possible (and if not, to consider for future work), it may be interesting to score psychophysical (e.g., acoustic properties) or psycholinguistic (e.g., lexical frequency) information of the speech and see whether and how the frequency-specific connectivity profiles are affected by it.

      (2) Recent studies comparing the production and perception of language may be relevant to the current study and add some theoretical weight since their data and interpretations for the comparisons between production and perception fit quite well with the observations in the current work. These studies highlight that language processes between production and perception, specifically lexical and phonetic processing (Fairs et al., 2021), and syntactic processing (Giglio et al., 2024), may rely on the same neural representations, but are differentiated in their (temporal) dynamics upon those shared representations. This is relevant because it dispenses with the classical notion in neurobiological models of language where production and perception rely on (partially) dissociable networks (e.g., Price, 2010). Rather those data suggest shared networks where different language behaviors are dissociated in their dynamics. The speech results in this study nicely fit and extend those studies and their theoretical implications.

      (3) The authors align the frequency-selective connectivity between the right cerebellum and left temporal speech areas with recent studies demonstrating a role for the right cerebellum for the internal modelling in speech production and monitoring (e.g., Stockert et al., 2021; Todorović et al., 2023). This link is indeed interesting, but it does seem relevant to point out that at a more specific scale, it does not concern the exact same regions between those studies and the current study. That is, in the current study the frequency-specific connectivity with temporal regions concerns lobule VI in the right cerebellum, while in the referenced work it concerns Crus I/II. The distinction seems relevant since Crus I/II has been linked to the internal modelling of more cognitive behavior, while lobule VI seems more motor-related and/or contextual-related (e.g., D'Mello et al., 2020; Runnqvist et al., 2021; Runnqvist, 2023).

      (4) On the methodological side, my main concern is that for the listening condition, the authors have chosen to play back the speech produced by the participants in the production condition. Both the fixed order and hearing one's own speech as the listening condition may produce confounds in data interpretation, especially with regard to the comparison between speech production and perception. Could order effects impact the observed connectivity profiles, and how would this impact the comparison between speaking and listening? In particular, I am thinking of repetition effects present in the listening condition as well as prediction, which will be much more elevated for the listening condition than the speaking condition. The fact that it also concerns their own voice furthermore adds to the possible predictability confound (e.g., Heinks-Maldonado et al., 2005). In addition, listening to one's own speech which has just been articulated may, potentially even strategically, enhance inner speech and "mouthing" in the participants, thereby engaging the production mechanism. Similarly, during production, the participants already hear their own voice (which serves as input in the subsequent listening condition). Taken together, both similarities and differences between speaking and listening connectivity may have been due to or influenced by these order effects, and by the fact that the different speech behaviors are to some extent present in both conditions.

      (5) The ability of the authors to analyze the spatiotemporal dynamics during continuous speech is a potentially important feat of this study, given that one of the reasons that speech production is much less investigated compared to perception concerns motor and movement artifacts due to articulation (e.g., Strijkers et al., 2010). Two questions did spring to mind when reading the authors' articulation artifact correction procedure: If I understood correctly, the approach comes from Abbasi et al. (2021) and is based on signal space projection (SSP) as used for eye movement corrections, which the authors successfully applied to speech production. However, in that study, it concerned the repeated production of three syllables, while here it concerns continuous speech of full words embedded in discourse. The articulation and muscular variance will be much higher in the current study compared to three syllables (or compared to eye movements which produce much more stable movement potentials compared to an entire discourse). Given this, I can imagine that corrections of the signal in the speaking condition were likely substantial and one may wonder (1) how much signal relevant to speech production behavior is lost?; (2) similar corrections are not necessary for perception, so how would this marked difference in signal processing affect the comparability between the modalities?

      References:
      - Abbasi, O., Steingräber, N., & Gross, J. (2021). Correcting MEG artifacts caused by overt speech. Frontiers in Neuroscience, 15, 682419.
      - D'Mello, A. M., Gabrieli, J. D., & Nee, D. E. (2020). Evidence for hierarchical cognitive control in the human cerebellum. Current Biology, 30(10), 1881-1892.
      - Fairs, A., Michelas, A., Dufour, S., & Strijkers, K. (2021). The same ultra-rapid parallel brain dynamics underpin the production and perception of speech. Cerebral Cortex Communications, 2(3), tgab040.
      - Floegel, M., Fuchs, S., & Kell, C. A. (2020). Differential contributions of the two cerebral hemispheres to temporal and spectral speech feedback control. Nature Communications, 11(1), 2839.
      - Giglio, L., Ostarek, M., Sharoh, D., & Hagoort, P. (2024). Diverging neural dynamics for syntactic structure building in naturalistic speaking and listening. Proceedings of the National Academy of Sciences, 121(11), e2310766121.
      - Heinks-Maldonado, T. H., Mathalon, D. H., Gray, M., & Ford, J. M. (2005). Fine-tuning of auditory cortex during speech production. Psychophysiology, 42(2), 180-190.
      - Price, C. J. (2010). The anatomy of language: a review of 100 fMRI studies published in 2009. Annals of the New York Academy of Sciences, 1191(1), 62-88.
      - Runnqvist, E., Chanoine, V., Strijkers, K., Pattamadilok, C., Bonnard, M., Nazarian, B., ... & Alario, F. X. (2021). Cerebellar and cortical correlates of internal and external speech error monitoring. Cerebral Cortex Communications, 2(2), tgab038.
      - Runnqvist, E. (2023). Self-monitoring: The neurocognitive basis of error monitoring in language production. In Language production (pp. 168-190). Routledge.
      - Stockert, A., Schwartze, M., Poeppel, D., Anwander, A., & Kotz, S. A. (2021). Temporo-cerebellar connectivity underlies timing constraints in audition. eLife, 10, e67303.
      - Strijkers, K., Costa, A., & Thierry, G. (2010). Tracking lexical access in speech production: electrophysiological correlates of word frequency and cognate effects. Cerebral Cortex, 20(4), 912-928.
      - Todorović, S., Anton, J. L., Sein, J., Nazarian, B., Chanoine, V., Rauchbauer, B., ... & Runnqvist, E. (2023). Cortico-cerebellar monitoring of speech sequence production. Neurobiology of Language, 1-21.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors re-analyse MEG data from a speech production and perception study and extend their previous Granger causality analysis to a larger number of cortical-cortical and in particular cortical-subcortical connections. Regions of interest were defined by means of a meta-analysis using Neurosynth.org and connectivity patterns were determined by calculating directed influence asymmetry indices from the Granger causality analysis results for each pair of brain regions. Abbasi et al. report feedforward signals communicated via fast rhythms and feedback signals via slow rhythms below 40 Hz, particularly during speaking. The authors highlight one of these connections between the right cerebellum lobule VI and auditory association area A5, where in addition the connection strength correlates negatively with the strength of speech tracking in the theta band during speaking (significant before multiple comparison correction). Results are interpreted within a framework of active inference by minimising prediction errors.

      While I find investigating the role of cortical-subcortical connections in speech production and perception interesting and relevant to the field, I am not yet convinced that the methods employed are fully suitable to this endeavour or that the results provide sufficient evidence to make the strong claim of dissociation of bottom-up and top-down information flow during speaking in distinct frequency bands.

      Strengths:

      The investigation of electrophysiological cortical-subcortical connections in speech production and perception is interesting and relevant to the field. The authors analyse a valuable dataset, where they spent a considerable amount of effort to correct for speech production-related artefacts. Overall, the manuscript is well-written and clearly structured.

      Weaknesses:

      The description of the multivariate Granger causality analysis did not allow me to fully grasp how the analysis was performed and I hence struggled to evaluate its appropriateness. Knowing that (1) filtered Granger causality is prone to false positives and (2) recent work demonstrates that significant Granger causality can simply arise from frequency-specific activity being present in the source but not the target area without functional relevance for communication (Schneider et al. 2021) raises doubts about the validity of the results, in particular with respect to their frequency specificity. These doubts are reinforced by what I perceive as an overemphasis on results that support the assumption of specific frequencies for feedforward and top-down connections, while findings not aligning with this hypothesis appear to be underreported. Furthermore, the authors report some main findings that I found difficult to reconcile with the data presented in the figures. Overall, I feel the conclusions with respect to frequency-specific bottom-up and top-down information flow need to be moderated and that some of the reported findings need to be checked and if necessary corrected.

      Major points

      (1) I think more details on the multivariate GC approach are needed. I found the reference to Schaum et al., 2021 not sufficient to understand what has been done in this paper. Some questions that remained for me are:

      (i) Does multivariate here refer to the use of the authors' three components per parcel or to the conditioning on the remaining twelve sources? I think the latter is implied when citing Schaum et al., but I'm not sure this is what was done here?

      If it was not: how can we account for spurious results based on indirect effects?

      (ii) Did the authors check whether the GC of the source-target pairs was reliably above the bias level (as Schaum et al. did for each condition separately)? If not, can they argue why they think that their results would still be valid? Does it make sense to compute DAIs on connections that were below the bias level? Should the data be re-analysed to take this concern into account?

      (iii) You may consider citing the paper that introduced the non-parametric GC analysis (which Schaum et al. then went on to apply): Dhamala M, Rangarajan G, Ding M. Analyzing Information Flow in Brain Networks with Nonparametric Granger Causality. Neuroimage. 2008; 41(2):354-362. https://doi.org/10.1016/j.neuroimage.2008.02.020
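
      To make the concern in point (ii) concrete, the following is a minimal sketch of how a directed influence asymmetry index could be restricted to frequencies where Granger causality exceeds a bias estimate. It assumes the commonly used definition DAI = (GC(a→b) − GC(b→a)) / (GC(a→b) + GC(b→a)); whether the authors used exactly this form is not confirmed here. The function name, the arguments, and the way the bias level is supplied are hypothetical and chosen only for illustration.

```python
import numpy as np

def directed_asymmetry_index(gc_ab, gc_ba, bias_ab=None, bias_ba=None):
    """Per-frequency DAI between two regions a and b (illustrative sketch).

    gc_ab, gc_ba : Granger causality spectra for a->b and b->a.
    bias_ab, bias_ba : optional bias estimates supplied by the caller
        (hypothetical inputs; how they are obtained is not specified here).
    Frequencies where neither direction exceeds its bias level, or where
    the summed GC is zero, are returned as NaN instead of a DAI value.
    """
    gc_ab = np.asarray(gc_ab, dtype=float)
    gc_ba = np.asarray(gc_ba, dtype=float)
    total = gc_ab + gc_ba

    dai = np.full(gc_ab.shape, np.nan)
    valid = total > 0  # avoid division by zero
    if bias_ab is not None and bias_ba is not None:
        # Keep a frequency only if at least one direction is above its bias
        above = (gc_ab > np.asarray(bias_ab)) | (gc_ba > np.asarray(bias_ba))
        valid &= above

    dai[valid] = (gc_ab[valid] - gc_ba[valid]) / total[valid]
    return dai
```

      Without the mask, a ratio of two near-zero GC values can produce a large DAI even though neither direction carries reliable directed influence, which is the scenario the question above asks the authors to rule out.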

      (2) GC has been discouraged for filtered data as it gives rise to false positives due to phase distortions and the ineffectiveness of filtering in the information-theoretic setting as reducing the power of a signal does not reduce the information contained in it (Florin et al., 2010; Barnett and Seth, 2011; Weber et al. 2017; Pinzuti et al., 2020 - who also suggest an approach that would circumvent those filter-related issues). With this in mind, I am wondering whether the strong frequency-specific claims in this work still hold.

      (3) I found it difficult to reconcile some statements in the manuscript with the data presented in the figures:

      (i) Most notably, the considerable number of feedforward connections from A5 and STS that project to areas further up the hierarchy at slower rhythms (e.g. L-A5 to R-PEF, R-Crus2, L-CB6, L-Tha, L-FOP and L-STS to R-PEF, L-FOP, L-TOPJ or R-A5 as well as R-STS both to R-Crus2, L-CB6, L-Th) contradict the authors' main message that 'feedback signals were communicated via slow rhythms below 40 Hz, whereas feedforward signals were communicated via faster rhythms'. I struggled to recognise a principled approach that determined which connections were highlighted and reported and which ones were not.

      (ii) "Our analysis also revealed robust connectivity between the right cerebellum and the left parietal cortex, evident in both speaking and listening conditions, with stronger connectivity observed during speaking. Notably, Figure 4 depicts a prominent frequency peak in the alpha band, illustrating the specific frequency range through which information flows from the cerebellum to the parietal areas." There are two peaks discernible in Figure 4, one notably lower than the alpha band (rather theta or even delta), the other at around 30 Hz. Nevertheless, the authors report and discuss a peak in the alpha band.

      (iii) In the abstract: "Notably, high-frequency connectivity was absent during the listening condition." and p.9 "In contrast with what we reported for the speaking condition, during listening, there is only a significant connectivity in low frequency to the left temporal area but not a reverse connection in the high frequencies." While Fig. 4 shows significant connectivity from R-CB6 to A5 in the gamma frequency range for the speaking, but not for the listening condition, interpreting comparisons between two effects without directly comparing them is a common statistical mistake (Makin and Orban de Xivry). The spectrally-resolved connectivity in the two conditions actually looks remarkably similar and I would thus refrain from highlighting this statement and indicate clearly that there were no significant differences between the two conditions.

      (iv) "This result indicates that in low frequencies, the sensory-motor area and cerebellum predominantly transmit information, while in higher frequencies, they are more involved in receiving it." I don't think that this statement holds in its generality: L-CB6 and R-3b both show strong output at high frequencies, particularly in the speaking condition. While they seem to transmit information mainly to areas outside A5 and STS, these effects are strong and should be discussed.
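
      The statistical point raised in (iii), that an effect being significant in one condition and not in the other does not establish a difference between conditions, can be illustrated with a direct paired comparison. The sketch below is a generic sign-flip permutation test on per-participant values (for example, connectivity averaged over a frequency band in each condition). It is not the authors' pipeline; the function name and inputs are hypothetical, and a full spectral analysis would additionally need correction across frequencies.

```python
import numpy as np

def paired_sign_flip_test(cond_a, cond_b, n_perm=5000, seed=0):
    """Two-sided sign-flip permutation test on paired differences.

    cond_a, cond_b : one value per participant for each condition
        (hypothetical inputs, e.g. band-averaged connectivity).
    Returns the observed mean difference and a permutation p-value.
    """
    rng = np.random.default_rng(seed)
    diff = np.asarray(cond_a, dtype=float) - np.asarray(cond_b, dtype=float)
    observed = diff.mean()

    # Under the null of no condition difference, the sign of each
    # participant's difference is exchangeable.
    signs = rng.choice([-1.0, 1.0], size=(n_perm, diff.size))
    null = (signs * diff).mean(axis=1)

    # Add-one correction so the p-value is never exactly zero
    p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p_value
```

      Only if such a direct test (or an equivalent one) rejects the null would a statement that connectivity differs between speaking and listening be supported.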

      (4) "However, definitive conclusions should be drawn with caution given recent studies raising concerns about the notion that top-down and bottom-up signals can only be transmitted via separate frequency channels (Ferro et al., 2021; Schneider et al., 2021; Vinck et al., 2023)."

      I appreciate this note of caution and think it would be useful if it were spelled out to the reader why this is the case so that they would be better able to grasp the main concerns here. For example, Schneider et al. make a strong point that we expect to find Granger-causality with a peak in a specific frequency band for areas that are anatomically connected when the sending area shows stronger activity in that band than the receiving one, simply because of the coherence of a signal with its own linear projection onto the other area. The direction of a Granger causal connection would in that case only indicate that one area shows stronger activity than the other in the given frequency band. I am wondering to what degree the reported connectivity pattern can be traced back to regional differences in frequency-specific source strength or to differences in source strength across the two conditions.

    4. Reviewer #3 (Public Review):

      In the current paper, Abbasi et al. aimed to characterize and compare the patterns of functional connectivity across frequency bands (1 Hz - 90 Hz) between regions of a speech network derived from an online meta-analysis tool (Neurosynth.org) during speech production and perception. The authors present evidence for complex neural dynamics from which they highlight directional connectivity from the right cerebellum to left superior temporal areas in lower frequency bands (up to beta) and between the same regions in the opposite direction in the (lower) high gamma range (60-90 Hz). Abbasi et al. interpret their findings within the predictive coding framework, with the cerebellum and other "higher-order" (motor) regions transmitting top-down sensory predictions to "lower-order" (sensory) regions in the lower frequencies and prediction errors flowing in the opposite direction (i.e., bottom-up) from those sensory regions in the gamma band. They also report a negative correlation between the strength of this top-down functional connectivity and the alignment of superior temporal regions to the syllable rate of one's speech.

      Strengths:

      (1) The comprehensive characterization of functional connectivity during speaking and listening to speech may be valuable as a first step toward understanding the neural dynamics involved.

      (2) The inclusion of subcortical regions and connectivity profiles up to 90Hz using MEG is interesting and relatively novel.

      (3) The analysis pipeline is generally adequate for the exploratory nature of the work.

      Weaknesses:

      (1) The work is framed as a test of the predictive coding theory as it applies to speech production and perception, but the methodological approach is not suited to this endeavor.

      (2) Because of their theoretical framework, the authors readily attribute roles or hierarchy to brain regions (e.g., higher- vs lower-order) and cognitive functions to observed connectivity patterns (e.g., feedforward vs feedback, predictions vs prediction errors) that cannot be determined from the data. Thus, many of the authors' claims are unsupported.

      (3) The authors' theoretical stance seems to influence the presentation of the results, which may inadvertently misrepresent the (otherwise perfectly valid; cf. Abbasi et al., 2023) exploratory nature of the study. Thus, results about specific regions are often highlighted in figures (e.g., Figure 2 top row) and text without clear reasons.

      (4) Some of the key findings (e.g., connectivity in opposite directions in distinct frequency bands) feature in a previous publication and are, therefore, interesting but not novel.

      (5) The quantitative comparison between speech production and perception is interesting but insufficiently motivated.

      (6) Details about the Neurosynth meta-analysis and subsequent selection of brain regions for the functional connectivity analyses are incomplete. Moreover, the use of the term 'Speech' in Neurosynth seems inappropriate (i.e., includes irrelevant works, yielding questionable results). The approach of using separate meta-analyses for 'Speech production' and 'Speech perception' taken by Abbasi et al. (2023) seems more principled. This approach would result, for example, in the inclusion of brain areas such as M1 and the BG that are relevant for speech production.

      (7) The results involving subcortical regions are central to the paper, but no steps are taken to address the challenges involved in the analysis of subcortical activity using MEG. Additional methodological detail and analyses would be required to make these results more compelling. For example, it would be important to know what the coverage of the MEG system is, what head model was used for the source localization of cerebellar activity, and if specific preprocessing or additional analyses were performed to ensure that the localized subcortical activity (in particular) is valid.

      (8) The results and methods are often detailed with important omissions (a speech-brain coupling analysis section is missing) and imprecisions (e.g., re: Figure 5; the Connectivity Analysis section is copy-pasted from their previous work), which makes it difficult to understand what is being examined and how. (It is also not good practice to refer the reader to previous publications for basic methodological details, for example, about the experimental paradigm and key analyses.) Conversely, some methodological details are given, e.g., the acquisition of EMG data, without further explanation of how those data were used in the current paper.

      (9) The examination of gamma functional connectivity in the 60 - 90 Hz range could be better motivated. Although some citations involving short-range connectivity in these frequencies are given (e.g., within the visual system), a more compelling argument for looking at this frequency range for longer-range connectivity may be required.

      (10) The choice of source localization method (linearly constrained minimum variance) could be explained, particularly given that other methods (e.g. dynamic imaging of coherent sources) were specifically designed and might potentially be a better alternative for the types of analyses performed in the study.

      (11) The mGC analysis needs to be more comprehensively detailed for the reader to be able to assess what is being reported and the strength of the evidence. Relatedly, first-level statistics (e.g., via estimation of the noise level) would make the mGC and DAI results more compelling.

      (12) Considering the exploratory nature of the study, it is essential for other researchers to continue investigating and validating the results presented in the current manuscript. Thus, it is concerning that data and scripts are not fully and openly available. Data need not be in its raw state to be shared and useful, which circumvents the stated data privacy concerns.

    1. eLife assessment

      This is a useful study that reports a genetic regulatory network that accounts for altered lipid metabolism in response to two different bacterial diets of C. elegans. The proposed mechanism, linking vitamin B12, S-adenosyl methionine (SAM), phosphatidylcholine (PC), and neutral lipid levels, is solid but has been previously demonstrated by other studies using similar assays. The evidence to support a new layer of regulation, via the production of phospho-choline by ASM-3/acid sphingomyelinase, requires further substantiation.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper reports the finding that less fat accumulates in C. elegans that are feeding on Comamonas aquatica DA1877 (DA) vs the standard lab diet of Escherichia coli OP50 (OP50). While these bacteria are likely to be different in many ways, the authors found that the fat accumulation phenotype depends on the vitamin B12 content of the bacterial diet and the involvement of B12 in the methionine cycle, affecting SAMS-1 and phosphatidylcholine (PC) synthesis. They report that low PC levels activate SREBP-1 (SBP-1 in C. elegans) and that an important target of SBP-1 is the delta 9 desaturase FAT-7. Finally, they describe a role for ASM-3, an acid sphingomyelinase, in influencing PC synthesis and fat accumulation in the worm.

      Strengths:

      This is a comprehensive story about how a dietary change affects fat accumulation in C. elegans. Their experimental evidence is convincing. The most novel aspect of this paper is that the coelomecyte expression of asm-3 contributes to PC/TAG homeostasis in C. elegans, which most likely occurs through the production of phosphocholine by the enzymatic breakdown of sphingomyelin by ASM-3. The phosphocholine will provide precursors for phosphatidylcholine (PC) synthesis, contributing to the PC synthesis pathway.

      Weaknesses:

      In the way the story is presented, the authors tend to imply that they discovered the pathways of B12, PC, SBP-1, and FAT-7, ignoring some important studies describing the relationship between PC synthesis and TAG accumulation in both the mammalian lipid metabolism field (liver) as well as in C. elegans. Many previous studies with similar results are not cited appropriately. Thus, the pathways reported in the paper are not new, and in this sense, the work is mostly confirmatory.

    3. Reviewer #2 (Public Review):

      Summary:

      Han et al. present a manuscript focusing on differences in metabolism, and the regulatory circuits controlling it, in C. elegans fed two bacterial diets. In the first three and a half figures, using a combination of methods, they investigate lipid levels, changes in gene expression and genetic assays to come to the conclusion that vitamin B12 acts through the S-adenosylmethionine synthase sams-1 to perturb phosphatidylcholine levels, which in turn stimulate the C. elegans ortholog of the SREBP transcription factors to activate fatty acid synthesis genes such as fat-7/SCD1. Thus, while connections between diet, metabolic pathways and gene regulation are of general interest, this study largely confirms the work of others without direct credit in many instances, then fails to develop a more novel cell non-autonomous link between the pathways in the last two figures. Thus, this study would be expected to have a useful impact on the field, if it can be placed in the context of previously published work.

      Strengths:

      (1) Connections between diet, metabolic pathways and gene regulation are of general interest.

      (2) Figures 1-4 confirm data/observations from previously published work from MacNeil, et al. Cell 2015; Walker, et al. Cell 2011; Svensk, et al. PLoS Genetics 2013; Smulan, et al. Cell Reports, 2016; Giese, et al. eLife 2020 and Qin, et al. Cell Reports 2022.

      (3) The data in Figures 5 and 6 show the importance of non-cell-autonomous effects on metabolism.

      Weaknesses:

      (1) In order to differentiate their study from previous work, the authors seem to argue that PC is higher in Comamonas than in E. coli, and that they are therefore looking at repression of SBP-1-dependent function. However, the pairing of the diets is arbitrary, and the comparisons could easily be reversed. They are simply comparing a higher to a lower level of PC, rather than a basal to a lower level, so the concepts are the same. In addition, they fail to cite the larger body of literature linking phospholipid balance to SREBP function. For example, multiple studies in mammalian models link phospholipid balance, not just lowered PC, to SREBP function: Lim, Genes and Dev 2011; Wang, et al. Cell Stem Cell, 2018; Rong, et al. J Clin Invest 2017; Smulan et al, Cell Reports, 2016; Dobrosotskaya, Science. 2002 and recently, Rong, et al. Cell Met 2024.

      (2) Figure 1: The data in Figure 1 show measures of lipid content and RNA-seq changes in metabolic enzymes such as fat-7/SCD-1. These lipid levels and gene expression changes have already been shown in MacNeil, et al. Cell 2013, and the lipid levels in Comamonas vs E. coli were published in Ditot, et al. Nature Communications 2022 by Dr. Marian Walhout's lab.

      (3) Figure 2/3: In Figures 2 and 3, they use a genetic screen to find regulators of fat-7/scd1 expression and, unsurprisingly, pull out genes known to regulate this pathway. The authors go on to show that changes in SAM lead to changes in PC and affect SBP-1/SREBP-1-dependent lipogenesis. This is a well-described pathway from publications by the Walhout lab, Dr. Amy Walker's lab and Dr. Marc Pilon's lab (Walker, et al. Cell 2011; Svensk, et al. PLoS Genetics 2013; Smulan, et al. Cell Reports, 2016; Giese, et al. eLife 2020), in addition to a recent publication, Qin, et al. Cell Reports 2022. While some of these studies are cited in other places in the manuscript, the authors describe their results as "discovery", then fail to cite the relevant studies at those points (selected examples below).

      (4) Selected examples of citation issues:

      a) Selected example: pg 6: "To understand the mechanism underlying the regulation of host lipid content triggered by DA, we examined the gene expression changes elicited by the two different bacterial diets in young adult animals by RNA-seq...In particular, genes related to the biosynthesis of unsaturated fatty acids showed a significant decrease in expression in DA-fed worms. For example, the delta-(9) fatty acid desaturases, fat-5 and fat-7, (which convert fatty acids 16:0 to 16:1n7 and 18:0 to 18:1n9, respectively [32]) decreased"

      MacNeil et al. Cell 2013 published a transcriptomic comparison of young adults fed DA and OP50, which demonstrated decreases in fat-5 and fat-7. While MacNeil is cited in other parts of the paper, the authors have performed a highly similar experiment and obtained similar results, so this should be described as confirming the MacNeil study rather than as new data.

      b) Selected Example: pg 10: "To determine whether PC levels have a causal effect on organismal lipid content, we supplemented worm diets with choline, the PC precursor, and uncovered a dose-dependent decrease in lipid content as measured by O.R.O staining (Figure 3B)."

      Addition of choline to supplement defects in PC synthesis was first shown by Brendza, et al. Biochem J 2007. It was confirmed in Walker, et al. 2011, and further confirmation of PC rescue was shown in Ding, et al. 2015. The Brendza study is not cited at all, and while studies from the Walker lab are cited in other places, the authors omit that changes in the DA diet are the same as changes seen when choline rescues PC loss from other perturbations.

      c) Selected Example: pg 9: "Notably, DA has been reported as a B12-rich bacterium compared to OP50 [16], hinting at the possibility that the DA diet might boost dietary B12 levels."

      Reference 16 is Watson, et al. Cell 2015 where the Walhout lab demonstrates that DA does in fact act through the diet to alter the Met/SAM cycle and other B12 dependent processes in C. elegans. This paper, along with MacNeil above broke ground in linking B12 and the Met/SAM cycle to specific phenotypes in C. elegans, which was followed up by extensive work from the Walhout lab on this cycle, thus, it seems odd that the authors describe their own data as "hinting" at this connection.

      d) Selected example: pg 17: "Indeed, this is further supported by our observation that mutants of histone methyltransferases SET-2 and SET-30 (which install H3K4me1 and H3K4me2, respectively) exhibited elevated lipid content on DA diet (data not shown). Notably, while both set-2 and set-30 mutants had this effect, only set-2 appears to control fat-7 expression (data not shown)." Extensive work from Dr. Anne Brunet's lab (Greer, et al. Nature 2010; Greer, et al. Nature 2011; Han, et al. Nature 2017) links set-2 and H3K4 methylation to lipid accumulation and fat-7. The authors fail to cite these studies.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors presented data that linked vitamin B12, S-adenosyl methionine (SAM), and phosphatidylcholine (PC) synthesis to lipid homeostasis in C. elegans. They confirmed mechanisms previously shown by other labs, including the regulation of FAT-7 expression by SBP-1, and the targeting of SEIP-1 by PC levels. The authors also attempted to link the synthesis of phospho-choline by the ASM-3 sphingomyelinase to PC synthesis and lipid homeostasis. However, the relative contribution of phospho-choline by ASM-3 versus the canonical Kennedy pathway was not elucidated. Therefore, the significance of the ASM-3-dependent mechanism to PC synthesis requires further investigation.

      Strengths:

      The authors used a wide range of biochemical and cell biological methods to measure fatty acid composition, neutral lipid levels, and lipid droplet dynamics in C. elegans. The quality of the data is generally high.

      Weaknesses:

      Data interpretation and the construction of the working model did not seem to take into account the two well-established pathways for PC synthesis. The Kennedy pathway generates PC from phospho-choline and DAG via a cytidine-based intermediate. The second PC synthesis pathway entails the methylation of PE by PEMT, with the donor methyl groups provided by the vitamin B12-dependent 1-carbon cycle. The authors' model seemed to overlook part of the Kennedy pathway that involves choline kinase (and not ASM-3) as the canonical enzyme that generates phospho-choline. The authors also did not explicitly consider DAG as a precursor of triacylglycerol (TAG), which was directly or indirectly measured as a readout of organismal fat content in the paper. Therefore, alternative models should be entertained. For example, the proposed genetic and dietary effects on lipid homeostasis could stem from the competition for a limiting pool of precursors that were shared by PC and TAG synthesis. PC itself may not have a deterministic role, as depicted by the authors' model. Finally, the claim that "coelomocytes regulate diets-induced lipid homeostasis through asm-3" was not well supported. In the absence of quantitative analysis of phospho-choline in mutants, it was unclear how much ASM-3 contributed to the overall phospho-choline, and ultimately PC level. The proposed inter-tissue regulation of PC synthesis also requires coelomocytes-specific knock-down/depletion of asm-3 for verification.

    1. Author response:

      This important manuscript uses circuit mapping, chemogenetics, and optogenetics to demonstrate a novel hippocampal lateral septal circuit that regulates social novelty behaviours and shows that downstream of the hippocampal septal circuit, septal projections to the ventral tegmental area are necessary for general novelty discrimination. The strength of the evidence supporting the claims is convincing but would be strengthened by the inclusion of additional functional assays. The work will be of interest to systems and behavioural neuroscientists who are interested in the brain mechanisms of social behaviours.

      We thank the reviewers for their thoughtful and constructive feedback. We are excited that both reviewers thought that the manuscript was of “interest to specialists in the field and to the broad readership of the journal”, that the paper was “well-written and logically organized” and that the “study opens an avenue to study these circuits further to uncover the plasticity and synaptic mechanisms regulating social novelty preference.” Additionally, the reviewers wrote that the experiments were “well-designed” “with clever controls and conditions to provide compelling evidence for their conclusion.” The reviewers additionally provided constructive feedback which we address in our responses below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study investigated the neural circuits underlying social novelty preference in mice. Using viral circuit tracing, chemogenetics, and optogenetics in the vHPC, LS, and VTA, the authors found that vHPC to LS projections may contribute to the salience of social novelty investigations. In addition, the authors identify LS projections to the VTA involved in social novelty and familiar food responses. Finally, via viral tracing, they demonstrate that vHPC-LS neurons may establish direct monosynaptic connections with VTA dopaminergic neurons. The experiments are well-designed, and the conclusions are mostly very clear. The manuscript is well-written and logically organized, and the content will be of interest to specialists in the field and to the broad readership of the journal.

      Strengths:

      (1) The vHPC has been involved in social memory for novel and familiar conspecifics. Yet, how the vHPC conveys this information to drive motivation for novel social investigations remains unclear. The authors identified a pathway from the vHPC to the LS and eventually the VTA, that may be involved in this process.

      (2) Mice became familiar with a novel conspecific by co-housing for 72h. This represents a familiarization session with a longer duration as compared to previous literature. Using this new protocol, the authors found robust social novelty preference when animals were given a choice between a novel and familiar conspecific.

      (3) The effects of vHPC-LS inhibition are specific to novel social stimuli. The authors included novel food and novel object control experiments and those were not affected by neuronal manipulations.

      (4) For optogenetic studies, the authors applied closed-loop photoinhibition only when the animals investigated either the novel or the familiar conspecific. This approach allowed the functional manipulation to be restricted to approaches toward either novel or familiar stimuli.

      Weaknesses:

      (1) The abstract and the overall manuscript pose that the authors identified a novel vHPC-LS-VTA pathway that is necessary for mice to preferentially investigate novel conspecifics. However, the authors assessed the functional manipulations of vHPC-LS and LS-VTA circuits independently and the sentence could be misleading. Therefore, a viral strategy specifically designed to target the vHPC-LS-VTA circuit combined with optogenetic/chemogenetic tools and behavior may be necessary for the statement of this conclusion.

      The reviewer raises an important point. Although Figure 3 shows that vHPC (vCA1 and vCA3) is the source of the greatest number of monosynaptic inputs onto LS-VTA neurons, we did not perform any experiments that specifically manipulated vHPC neurons that project to LS-VTA neurons. While these experiments would be extremely interesting, they are technically challenging and beyond the scope of this study. We are happy to edit our manuscript to clarify this point.

      (2) The authors combined males and females in their analysis, as neural circuit manipulation affected novelty discrimination ratios in both sexes. However, Supplementary Figure 1 suggests that chemogenetic inhibition of the vHPC-LS circuit may cause stronger effects in male mice than in females.

      The reviewer makes an interesting point. We can confirm that we found no significant differences in the effectiveness of our vHPC-LS inhibition between the males and females (2-factor ANOVA with sex (male/female) and drug condition (saline/CNO) as factors on the discrimination scores of hM4Di expressing animals: interaction p=0.2241, sex: p=0.1233, drug condition: p=0.0166). These data suggest that there are no significant sex differences in the effectiveness of inhibition of the vHPC-LS neurons. We will include these comparisons in the revised manuscript.

      (3) In most experiments, the same animals were used for social novelty preference, for food or object novelty responses but washout periods between experiments are not mentioned in the methods section. In this line, the authors did not mention the time frame between the closed-loop optogenetic experiments that silenced the vHPC-LS only during familiar and then only novel social investigations. When using the same animals tested for social experiments in the same context there may be an effect of context-dependent social behaviors that could affect future outcomes.

      We thank the reviewer for this important clarification. We apologize for not including these crucial details in our Methods section. For both the chemogenetic and optogenetic inhibition experiments, all conditions were separated by a minimum of 24 hours. In the chemogenetic inhibition experiments, saline and CNO conditions were counterbalanced between animals. Similarly, we counterbalanced the order of light ON vs light OFF conditions across animals during our optogenetic inhibition experiments. We will include these additional details in the revised manuscript.

      (4) All the experiments were performed in a non-cell-type-specific manner. The viral strategies used targeted multiple neuronal subpopulations that could have divergent effects on social novelty preference. This constraint could be added in the discussion section.

      We will expand our discussion section to address this essential point raised by the reviewer.

      (5) The authors' assumptions were all based on experiments of necessity. The authors could use an experiment of sufficiency by targeting for instance the LS-VTA circuit and assess if animals reduce novel social investigations with LS-VTA photostimulation.

      We agree with the reviewers that it would be interesting to determine if LS-VTA neurons are sufficient, in addition to being necessary, to drive social novelty. These will be interesting experiments to pursue in the future.

      Reviewer #2 (Public Review):

      Summary:

      Rashid and colleagues demonstrate a novel hippocampal lateral septal circuit that is important for social recognition and drives the exploration of novel conspecifics. Their study spans from neural tracing to closed-loop optogenetic experiments with clever controls and conditions to provide compelling evidence for their conclusion. They demonstrate that downstream of the hippocampal septal circuit, septal projections to the ventral tegmental area are necessary for general novelty discrimination. The study opens an avenue to study these circuits further to uncover the plasticity and synaptic mechanisms regulating social novelty preference.

      Strengths:

      Chemogenetic and optogenetic experiments have excellent behavioral controls. The synaptic tracing provides important information that informs the narrative of experiments presented and invites future studies to investigate the effects of septal input on dopaminergic activity.

      Weaknesses:

      Important methodological details are unclear for the circuit manipulation experiments, and analyses that account for multiple measures per animal are needed but missing. Based on the legends, the chemogenetic experiment uses a within-animal design; that is, the same mouse receives SAL and CNO. However, the data are not presented in a within-animal manner, so we cannot tell whether the behavior of the same animal changes with drug treatment. Similarly, the methods specify that the optogenetic manipulations were done in three different conditions, but the analyses neither report within-animal changes across conditions nor account for multiple measures within subjects.

      Thank you for raising this important point. We are happy to include the repeated measure ANOVAs and paired t-tests in a revised version of the manuscript.
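      As an illustration of the within-animal comparison discussed here, the following is a minimal sketch of a paired t statistic computed on per-animal differences. It is not the authors' analysis code: the function name and the discrimination scores in the usage note are hypothetical, and the p-value lookup is deliberately left to a statistics library.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(condition_a, condition_b):
    """Return (t, degrees_of_freedom) for paired samples.

    condition_a[i] and condition_b[i] must come from the same animal,
    e.g. its discrimination score under saline and under CNO.
    """
    if len(condition_a) != len(condition_b):
        raise ValueError("paired samples must have the same length")
    # Within-animal differences remove between-animal variability.
    diffs = [b - a for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two animals")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (sd / sqrt(n))
    return t, n - 1
```

      For example, with hypothetical scores `saline = [0.60, 0.70, 0.50, 0.80]` and `cno = [0.50, 0.45, 0.45, 0.50]`, `paired_t_statistic(saline, cno)` returns a negative t with 3 degrees of freedom, reflecting that each animal's score dropped under CNO.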

      Finally, it is unclear if the order of drug treatment and conditions were counterbalanced across subjects.

      As mentioned in the above response to Reviewer 1, for both the chemogenetic and optogenetic inhibition experiments, all conditions were separated by a minimum of 24 hours and we counterbalanced the order of chemogenetic (saline/CNO) and optogenetic (light ON/light OFF) experimental manipulations across animals. We will include these additional details in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      We thank the reviewers for the detailed assessment of our work as well as their praise and constructive feedback which helped us to significantly improve our manuscript.

      Reviewer #1 (Public Review):

      The inferior colliculus (IC) is the central auditory system's major hub. It integrates ascending brainstem signals to provide acoustic information to the auditory thalamus. The superficial layers of the IC ("shell" IC regions as defined in the current manuscript) also receive a massive descending projection from the auditory cortex. This auditory cortico-collicular pathway has long fascinated the hearing field, as it may provide a route to funnel "high-level" cortical signals and impart behavioral salience upon an otherwise behaviorally agnostic midbrain circuit.

      Accordingly, IC neurons can respond differently to the same sound depending on whether animals engage in a behavioral task (Ryan and Miller 1977; Ryan et al., 1984; Slee & David, 2015; Saderi et al., 2021; De Franceschi & Barkat, 2021). Many studies also report a rich variety of non-auditory responses in the IC, far beyond the simple acoustic responses one expects to find in a "low-level" region (Sakurai, 1990; Metzger et al., 2006; Porter et al., 2007). A tacit assumption is that the behaviorally relevant activity of IC neurons is inherited from the auditory cortico-collicular pathway. However, this assumption has never been tested, owing to two main limitations of past studies:

      (1) Prior studies could not confirm if data were obtained from IC neurons that receive monosynaptic input from the auditory cortex.

      (2) Many studies have tested how auditory cortical inactivation impacts IC neuron activity; the consequence of cortical silencing is sometimes quite modest. However, all prior inactivation studies were conducted in anesthetized or passively listening animals. These conditions may not fully engage the auditory cortico-collicular pathway. Moreover, the extent of cortical inactivation in prior studies was sometimes ambiguous, which complicates interpreting modest or negative results.

      Here, the authors' goal is to directly test if auditory cortex is necessary for behaviorally relevant activity in IC neurons. They conclude that, surprisingly, task-relevant activity in cortico-recipient IC neurons persists in the absence of auditory cortico-collicular transmission. To this end, a major strength of the paper is that the authors combine a sound-detection behavior with clever approaches that unambiguously overcome the limitations of past studies.

      First, the authors inject a transsynaptic virus into the auditory cortex, thereby expressing a genetically encoded calcium indicator in the auditory cortex's postsynaptic targets in the IC. This powerful approach enables 2-photon Ca2+ imaging from IC neurons that unambiguously receive monosynaptic input from auditory cortex. Thus, any effect of cortical silencing should be maximally observable in this neuronal population. Second, they abrogate auditory cortico-collicular transmission using lesions of auditory cortex. This "sledgehammer" approach is arguably the most direct test of whether cortico-recipient IC neurons will continue to encode task-relevant information in absence of descending feedback. Indeed, their method circumvents the known limitations of more modern optogenetic or chemogenetic silencing, e.g. variable efficacy.

      I also see three weaknesses which limit what we can learn from the authors' hard work, at least in the current form. I want to emphasize that these issues do not reflect any fatal flaw of the approach. Rather, I believe that their datasets likely contain the treasure-trove of knowledge required to completely support their claims.

      (1) The conclusion of this paper requires the following assumption to be true: That the difference in neural activity between Hit and Miss trials reflects "information beyond the physical attributes of sound." The data presentation complicates asserting this assumption. Specifically, they average fluorescence transients of all Hit and all Miss trials in their detection task. Yet, Figure 3B shows that mice's d' depends on sound level, and since this is a detection task the smaller d' at low SPLs presumably reflects lower Hit rates (and thus higher Miss rates). As currently written, it is not clear if fluorescence traces for Hits arise from trials where the sound cue was played at a higher sound level than on Miss trials. Thus, the difference in neural activity on Hit and Miss trials could indeed reflect mice's behavior (licking or not licking). But in principle could also be explained by higher sound-evoked spike rates on Hit compared to Miss trials, simply due to louder click sounds. Indeed, the amplitude and decay tau of their indicator GCaMP6f is non-linearly dependent on the number and rate of spikes (Chen et al., 2013), so this isn't an unreasonable concern.

      (2) The authors' central claim effectively rests upon two analyses in Figures 5 and 6. The spectral clustering algorithm of Figure 5 identifies 10 separate activity patterns in IC neurons of control and lesioned mice; most of these clusters show distinct activity on averaged Hit and Miss trials. They conclude that although the proportions of neurons from control and lesioned mice in certain clusters deviate from an expected 50/50 split, neurons from lesioned mice are still represented in all clusters. A significant issue here is that in addition to averaging all Hit and Miss trials together, the data from control and lesioned mice are lumped for the clustering. There is no direct comparison of neural activity between the two groups, so the reader must rely on interpreting a row of pie charts to assess the conclusion. It's unclear how similar task-relevant activity is between control and lesioned mice; we don't even have a ballpark estimate of how auditory cortex does or does not contribute to task-relevant activity. Although ideally the authors would have approached this by repeatedly imaging the same IC neurons before and after lesioning auditory cortex, this within-subjects design may be unfeasible if lesions interfere with task retention. Nevertheless, they have recordings from hundreds to thousands of neurons across two groups, so even a small effect should be observable in a between-groups comparison.

      (3) In Figure 6, the authors show that logistic regression models predict whether the trial is a Hit or Miss from their fluorescence data. Classification accuracy peaks rapidly following sound presentation, implying substantial information regarding mice's actions. The authors further show that classification accuracy is reduced, but still above chance, in mice with auditory cortical lesions. The authors conclude from this analysis that task-relevant activity persists in the absence of auditory cortex. In principle I do not disagree with their conclusion.

      The weakness here is in the details. First, the reduction in classification accuracy of lesioned mice suggests that auditory cortex does nevertheless transmit some task relevant information, however minor it may be. I feel that as written, their narrative does not adequately highlight this finding. Rather one could argue that their results suggest redundant sources of task-relevant activity converging in the IC. Secondly, the authors conclude that decoding accuracy is impaired more in partially compared to fully lesioned mice. They admit that this conclusion is at face value counterintuitive, and provide compelling mechanistic arguments in the Discussion. However, aside from shaded 95% CIs, we have no estimate of variance in decoding accuracy across sessions or subjects for either control or lesioned mice. Thus we don't know if the small sample sizes of partial (n = 3) and full lesion (n = 4) groups adequately sample from the underlying population. Their result of Figure 6B may reflect spurious sampling from tail ends of the distributions, rather than a true non-monotonic effect of lesion size on task relevant activity in IC.

      Our responses to the ‘recommendations for the authors’ below lay out in detail how we addressed each comment and concern. We have filled in key information, accidentally omitted from the original manuscript, about how our original analysis aimed to minimize any potential impact of differences in sound level distributions: namely, that trials used for decoding were limited to a subset of sound levels. In addition, we have now carried out several additional analyses.

      We would like to highlight one of these because it supplements both the clustering and decoding analysis that we conducted to compare hit and miss trial activity, and directly addresses what the reviewer identified as our work’s main weakness (a possible confound between animal behavior and sound level distributions) and the request for an analysis that operates at the level of single units rather than the population level. Specifically, we assessed, separately for each recorded neuron, whether there was a statistically significant difference in the magnitude of neural activity between hit and miss trials. This approach allowed us to fully balance the numbers of hit and miss trials at each sound level that were entered into the analysis. The results revealed that a large proportion (close to 50%) of units were task modulated, i.e. had significantly different response magnitudes between hit and miss trials, and that this proportion was not significantly different between lesioned and non-lesioned mice. We hope that this, together with the rest of our responses, convincingly demonstrates that the shell of the IC encodes mouse sound detection behavior even when top-down input from the auditory cortex is absent.
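      The per-neuron comparison described above, with hit and miss trial counts balanced at each sound level, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the trial tuple layout, the function names, and the choice of a within-level permutation test are all hypothetical, since the response does not specify which statistical test was used.

```python
import random
from collections import defaultdict
from statistics import mean


def balance_by_level(trials, rng):
    """Subsample so each sound level contributes equal hit and miss counts.

    trials: iterable of (sound_level, outcome, response) for one neuron,
    where outcome is "hit" or "miss" (other outcomes are ignored).
    Returns {sound_level: (hit_responses, miss_responses)}.
    """
    by_level = defaultdict(lambda: {"hit": [], "miss": []})
    for level, outcome, response in trials:
        if outcome in ("hit", "miss"):
            by_level[level][outcome].append(response)
    balanced = {}
    for level, groups in by_level.items():
        n = min(len(groups["hit"]), len(groups["miss"]))
        if n > 0:  # levels lacking either outcome cannot be balanced
            balanced[level] = (rng.sample(groups["hit"], n),
                               rng.sample(groups["miss"], n))
    return balanced


def hit_miss_difference(balanced):
    """Mean hit response minus mean miss response across all levels."""
    hits = [r for h, _ in balanced.values() for r in h]
    misses = [r for _, m in balanced.values() for r in m]
    return mean(hits) - mean(misses)


def permutation_p_value(balanced, rng, n_perm=2000):
    """Two-sided p-value; labels are shuffled only within a sound level,
    so sound level itself can never explain the observed difference."""
    observed = abs(hit_miss_difference(balanced))
    exceed = 0
    for _ in range(n_perm):
        shuffled = {}
        for level, (h, m) in balanced.items():
            pooled = h + m
            rng.shuffle(pooled)
            shuffled[level] = (pooled[:len(h)], pooled[len(h):])
        if abs(hit_miss_difference(shuffled)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

      A neuron would be labelled task modulated in this sketch if its p-value falls below a chosen threshold; the proportion of such neurons could then be compared between lesioned and non-lesioned groups. Shuffling within sound level, rather than across all trials, is the design choice that keeps the test insensitive to differences in sound-evoked response size.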

      Reviewer #2 (Public Review):

      Summary:

      This study takes a new approach to studying the role of corticofugal projections from auditory cortex to inferior colliculus. The authors performed two-photon imaging of cortico-recipient IC neurons during a click detection task in mice with and without lesions of auditory cortex. In both groups of animals, they observed similar task performance and relatively small differences in the encoding of task-response variables in the IC population. They conclude that non-cortical inputs to the IC can provide substantial task-related modulation, at least when AC is absent.

      Strengths:

      This study provides valuable new insight into big and challenging questions around top-down modulation of activity in the IC. The approach here is novel and appears to have been executed thoughtfully. Thus, it should be of interest to the community.

      Weaknesses:

      There are, however, substantial concerns about the interpretation of the findings and limitations to the current analysis. In particular, analysis of single-unit activity is absent, making the population clusters and decoding results harder to interpret. These concerns should be addressed to make sure that the results can be interpreted clearly in an active field that already contains a number of confusing and possibly contradictory findings.

      Our responses to the ‘recommendations for the authors’ below lay out in detail how we addressed each comment and concern. Several additional analyses have now been carried out including ones that operate at the level of single units rather than the population level, as requested by the reviewer. We would like to briefly highlight one here because it supplements both the clustering and decoding analysis that we conducted to compare hit and miss trial activity and directly addresses what the other reviewers identified as our work’s main weakness (a possible confound between animal behavior and sound level distributions). Specifically, we assessed, separately for each recorded neuron, whether there was a statistically significant difference in the magnitude of neural activity between hit and miss trials. This approach allowed us to fully balance the numbers of hit and miss trials at each sound level that were entered into the analysis. The results revealed that a large proportion (close to 50%) of units were task modulated, i.e. had significantly different response magnitudes between hit and miss trials, and that this proportion was not significantly different between lesioned and non-lesioned mice. We hope that this, together with the rest of our responses, convincingly demonstrates that the shell of the IC encodes mouse sound detection behavior even when top-down input from the auditory cortex is absent.

      Reviewer #3 (Public Review):

      Summary:

      This study aims to demonstrate that cortical feedback is not necessary to signal behavioral outcome to shell neurons of the inferior colliculus during a sound detection task. The demonstration is achieved by the observation of the activity of cortico-recipient neurons in animals which have received lesions of the auditory cortex. The experiment shows that neither behavior performance nor neuronal responses are significantly impacted by cortical lesions except for the case of partial lesions which seem to have a disruptive effect on behavioral outcome signaling.

      Strengths:

      The experimental procedure is based on state-of-the-art methods. There is an in-depth discussion of the different effects of auditory cortical lesions on sound detection behavior.

      Weaknesses:

      The analysis is not documented well enough to be correctly evaluated. Have the authors pooled together trials with different sound levels for the key hit vs miss decoding/clustering analysis? If so, the conclusions are not well supported, as there are more misses for low sound levels, which would completely bias the outcome of the analysis. It is possible that the classification of hits versus misses actually only reflects a decoding of sound level based on sensory responses in the colliculus, and it would not be surprising then that, in the presence or absence of cortical feedback, some neurons respond more to higher sound levels (hits) and less to lower sound levels (misses). It is important that the authors clarify this and in any case perform an analysis in which the classification of hits vs misses is done only for the same sound levels. The description of feedback signals could be more detailed, although it is difficult to achieve good temporal resolution with the calcium imaging technique necessary for targeting cortico-recipient neurons.

      Our responses to the ‘recommendations for the authors’ below lay out in detail how we addressed each comment and concern. We have filled in key information, accidentally omitted from the original manuscript, about how our original analysis aimed to minimize any potential impact of differences in sound level distributions: namely, that trials used for decoding were limited to a subset of sound levels. We have also carried out several additional analyses to directly address what the reviewer identified as our work’s main weakness (a possible confound between animal behavior and sound level distributions). This includes an analysis in which we were able to demonstrate, for one imaging session with a sufficiently large number of trials, that limiting the trials entered into the decoding analysis to those from a single sound level did not meaningfully impact decoding accuracy. We would like to highlight another new analysis here because it supplements both the clustering and decoding analyses that we conducted to compare hit and miss trial activity and addresses the other reviewers’ request for an analysis that operates at the level of single units rather than the population level. Specifically, we assessed, separately for each recorded neuron, whether there was a statistically significant difference in the magnitude of neural activity between hit and miss trials. This approach allowed us to fully balance the numbers of hit and miss trials at each sound level that were entered into the analysis. The results revealed that a large proportion (close to 50%) of units were task modulated, i.e. had significantly different response magnitudes between hit and miss trials, and that this proportion was not significantly different between lesioned and non-lesioned mice. We hope that this, together with the rest of our responses, convincingly demonstrates that the shell of the IC encodes mouse sound detection behavior even when top-down input from the auditory cortex is absent.

      Reviewer #1 (Recommendations For The Authors):

      Thank you for the opportunity to read your paper. I think the conclusion is exciting. Indeed, you indicate that perhaps contrary to many of our (untested) assumptions, task-relevant activity in the IC may persist in absence of auditory cortex.

      As mentioned in my public review: Despite my interest in the work, I also think that there are several opportunities to significantly strengthen your conclusions. I feel this point is important because your work will likely guide the efforts of future students and post-docs working on this topic. The data can serve as a beacon to move the field away from the (somewhat naïve) idea that the evolved forebrain imparts behavioral relevance upon an otherwise uncivilized midbrain. This knowledge will inspire a search for alternative explanations. Indeed, although you don't highlight it in your narrative, your results dovetail nicely with several studies showing task-relevant activity in more ventral midbrain areas that project to the IC (e.g., pedunculopontine nuclei; see work from Hikosaka in monkeys, and more recently in mice from Karel Svoboda's lab).

      Thanks for the kind words.

      These studies, in particular the work by Inagaki et al. (2022) outlining how the transformation of an auditory go signal into movement could be mediated via a circuit involving the PPN/MRN (which might rely on the NLL for auditory input) and the motor thalamus, are indeed highly relevant.

      We made the following changes to the manuscript text.

      Line 472:”...or that the auditory midbrain, thalamus and cortex are bypassed entirely if simple acousticomotor transformations, such as licking a spout in response to a sound, are handled by circuits linking the auditory brainstem and motor thalamus via pedunculopontine and midbrain reticular nuclei (Inagaki et al., 2022).”

      The beauty of the eLife experiment is that you are free to incorporate or ignore these suggestions. After all, it's your paper, not mine. Nevertheless, I hope you find my comments useful. First, a few suggestions to address my three comments in the public review.

      Suggestion for public comment #1: An easy way to address this issue is to average the neural activity separately for each trial outcome at each sound level. That way you can measure if fluorescence amplitude (or integral) varies as a function of mice's action rather than sound level. This approach to data organization would also open the door to the additional analyses for addressing comment #2, such as directly comparing auditory and putatively non-auditory activity in neurons recorded from control and lesioned mice.

      We have carried out additional analyses for distinguishing between the two alternative explanations of the data put forward by the reviewer: That the difference in neural activity between hit and miss trials reflects a) behavior or b) sound level (more precisely: differences in response magnitude arising from a higher proportion of high-sound-level trials in the hit trial group than in the miss trial group). If the data favored b), we would expect no difference in activity between hit and miss trials when plotted separately for each sound level. The new Figure 4 - figure supplement 1 indicates that this is not the case. Hit and miss trial activity are clearly distinct even when plotted separately for different sound levels, confirming that this difference in activity reflects the animals’ behavior rather than sensory information.

      Changes to manuscript.

      Line 214: “While averaging across all neurons cannot capture the diversity of responses, the averaged response profiles suggest that it is mostly trial outcome rather than the acoustic stimulus and neuronal sensitivity to sound level that shapes those responses (Figure 4 – figure supplement 1).”

      Additionally, we assessed for each neuron separately whether there was a significant difference between hit and miss trial activity and therefore whether the activity of the neuron could be considered “task-modulated”. To achieve this, we used equal numbers of hit and miss trials at each sound level to ensure balanced sound level distributions and thus rule out any potential confound between sound level distributions and trial outcome. This analysis revealed that the proportion of task-modulated neurons was very high (close to 50%) and not significantly different between lesioned and non-lesioned mice (Figure 6 - figure supplement 3).
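      The balancing step described above can be illustrated with a minimal sketch. This is not the code used in the study: the function name `balance_trials`, the tuple layout of `trials` and the fixed random seed are assumptions made purely for illustration. It subsamples, at every sound level, whichever outcome is more frequent so that hits and misses end up with identical sound level distributions.

```python
import random
from collections import defaultdict

def balance_trials(trials, seed=0):
    """Subsample so that hits and misses are equally frequent at every sound level.

    `trials` is a list of (trial_id, sound_level, outcome) tuples, where
    outcome is 'hit' or 'miss' (hypothetical layout, for illustration only).
    """
    rng = random.Random(seed)
    by_level = defaultdict(lambda: {"hit": [], "miss": []})
    for trial_id, level, outcome in trials:
        if outcome in ("hit", "miss"):
            by_level[level][outcome].append(trial_id)

    hits, misses = [], []
    for level in sorted(by_level):
        n = min(len(by_level[level]["hit"]), len(by_level[level]["miss"]))
        # rng.sample draws n items without replacement
        hits += rng.sample(by_level[level]["hit"], n)
        misses += rng.sample(by_level[level]["miss"], n)
    return hits, misses

# Toy session: many misses at 62 dB SPL, many hits at 68 dB SPL.
trials = (
    [(f"h62_{i}", 62, "hit") for i in range(2)]
    + [(f"m62_{i}", 62, "miss") for i in range(5)]
    + [(f"h68_{i}", 68, "hit") for i in range(6)]
    + [(f"m68_{i}", 68, "miss") for i in range(1)]
)
hits, misses = balance_trials(trials)
```

      In this toy session the balanced sets contain three hit and three miss trials, two from 62 dB SPL and one from 68 dB SPL in each set, so any remaining difference in activity cannot be attributed to the sound level distribution.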

      Changes to the manuscript.

      Line 217: “Indeed, close to half (1272 / 2649) of all neurons showed a statistically significant difference in response magnitude between hit and miss trials…”

      Line 307: “Although the proportion of individual neurons with distinct response magnitudes in hit and miss trials in lesioned mice did not differ from that in non-lesioned mice, it was significantly lower when separating out mice with partial lesions (Figure 6 – figure supplement 3).”

      Differences in the distributions of sound levels in the different trial types could also potentially confound the decoding into hit and miss trials. Our original analysis was actually designed to take this into account but, unfortunately, we failed to include sufficient details in the methods section.

      Changes to the manuscript.

      Line 710: “Rather than including all the trials in a given session, only trials of intermediate difficulty were used for the decoding analysis. More specifically, we only included trials across five sound levels, comprising the lowest sound level that exceeded a d’ of 1.5 plus the two sound levels below and above that level. That ensured that differences in sound level distributions would be small, while still giving us a sufficient number of trials to perform the decoding analysis.“
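      The selection rule quoted above can be sketched as follows. This is a hedged illustration rather than the analysis code itself: the function names, the 1/(2N) clipping of extreme rates and the example d’ values are assumptions and do not come from the manuscript.

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate, n_signal, n_catch):
    """d' = z(hit rate) - z(false alarm rate).

    Rates of exactly 0 or 1 are clipped by 1/(2N); this correction is a common
    convention assumed here, not one stated in the manuscript.
    """
    z = NormalDist().inv_cdf

    def clip(p, n):
        return min(max(p, 1 / (2 * n)), 1 - 1 / (2 * n))

    return z(clip(hit_rate, n_signal)) - z(clip(fa_rate, n_catch))

def select_levels(levels, dprimes, threshold=1.5, flank=2):
    """Lowest sound level whose d' exceeds `threshold`, plus `flank` levels on either side."""
    pairs = sorted(zip(levels, dprimes))
    sorted_levels = [level for level, _ in pairs]
    above = [i for i, (_, dp) in enumerate(pairs) if dp > threshold]
    if not above:
        raise ValueError("no sound level exceeds the d' threshold")
    idx = above[0]
    return sorted_levels[max(0, idx - flank): idx + flank + 1]

# Hypothetical psychometric data for one session (dB SPL and d' per level).
levels = [53, 56, 59, 62, 65, 68, 71, 74, 77]
dprimes = [0.2, 0.5, 0.8, 1.2, 1.7, 2.1, 2.5, 2.8, 3.0]
selected = select_levels(levels, dprimes)
```

      With these made-up d’ values the first level exceeding 1.5 is 65 dB SPL, so the sketch selects 59, 62, 65, 68 and 71 dB SPL.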

      In this context, it is worth bearing in mind that a) the decoding analysis was done on a frame-by-frame basis, meaning that the decoding score achieved early in the trial has no impact on the decoding score at later time points in the trial, b) sound-driven activity predominantly occurs immediately after stimulus onset and is largely over about 1 s into the trial (see cluster 3, for instance, or average miss trial activity in Figure 4 – figure supplement 1), c) decoding performance of the behavioral outcome starts to plateau 500-1000 ms into the trial and remains high until it very gradually begins to decline after about 2 s into the trial. In other words, decoding performance remains high far longer than the stimulus would be expected to have an impact on the neurons’ activity. Therefore, we would expect any residual bias due to differences in the sound level distribution that our approach did not control for to be restricted to the very beginning of the trial and not to meaningfully impact the conclusions derived from the decoding analysis.

      Finally, we carried out an additional decoding analysis for one imaging session in which we had a sufficient number of trials to perform the analysis not only over the five (59, 62, 65, 68, 71 dB SPL) original sound levels, but also over a reduced range of three (62, 65, 68 dB SPL) sound levels, as well as a single (65 dB SPL) sound level (Figure 6 - figure supplement 1). The mean sound level differences between the hit trial distributions and miss trial distributions for these three conditions were 3.08, 1.01 and 0 dB, respectively. This analysis suggests that decoding performance is not meaningfully impacted by changing the range of sound levels (and sound level distributions), other than that including fewer sound levels means fewer trials and thus noisier decoding.

      Changes to manuscript.

      Line 287: ”...and was not meaningfully affected by differences in sound level distributions between hit and miss trials (Figure 6 – figure supplement 1).”

      Suggestion for public comment #2: Perhaps a solution would be to display example neuron activity in each cluster, recorded in control and lesioned mice. The reader could then visually compare example data from the two groups, and immediately grasp the conclusion that task relevant activity remains in absence of auditory cortex. Additionally, one possibility might be to calculate the difference in neural activity between Hit and Miss trials for each task-modulated neuron. Then, you could compare these values for neurons recorded in control and lesion mice. I feel like this information would greatly add to our understanding of cortico-collicular processing.

      I would also argue that it's perhaps more informative to show one (or a few) example recordings rather than averaging across all cells in a cluster. Example cells would give the reader a better handle on the quality of the imaging, and this approach is more standard in the field. Finally, it would be useful to show the y axis calibration for each example trace (e.g. Figure 5 supp 1). That is also pretty standard so we can immediately grasp the magnitude of the recorded signal.

      We agree that while the information we provided shows that neurons from lesioned and non-lesioned groups are roughly equally represented across the clusters, it does not allow the reader to appreciate how similar the activity profiles of neurons are from each of the two groups. However, picking examples can be highly subjective and thus potentially open to bias. We therefore opted instead to display, separately for lesioned and non-lesioned mice, the peristimulus time histograms of all neurons in each cluster, as well as the cluster averages of the response profiles (Figure 5 - figure supplement 3). This, we believe, convincingly illustrates the close correspondence between neural activity in lesioned and non-lesioned mice across different clusters. All our existing and new figures indicate the response magnitude either on the figures’ y-axis or via scale/color bars.

      Changes to manuscript.

      Line 254: “Furthermore, there was a close correspondence between the cluster averages of lesioned and non-lesioned mice (Figure 5 – figure supplement 3).”

      Furthermore, we’ve now included a video of the imaging data which, we believe, gives the reader a much better handle on the data quality than further example response profiles would.

      Changes to manuscript.

      Line 197: ”...using two-photon microscopy (Figure 4B, Video 1).”

      Suggestion for public comment #3: In absence of laborious and costly follow-up experiments to boost the sample size of partial and complete lesion groups, it may be more prudent to simply tone down the claims that lesion size differentially impacts decoding accuracy. The results of this analysis are not necessary for your main claims.

      Our new results on the proportions of ‘task-modulated’ neurons (Figure 6 - figure supplement 3) across different experimental groups show that there is no difference between non-lesioned and lesioned mice as a whole, but mice with partial lesions have a smaller proportion of task-modulated neurons than the other two groups. While this corroborates the results of the decoding analysis, we certainly agree that the small sample size is a caveat that needs to be acknowledged.

      Changes to manuscript.

      Line 477: ”Some differences were observed for mice with only partial lesions of the auditory cortex. Those mice had a lower proportion of neurons with distinct response magnitudes in hit and miss trials than mice with (near-)complete lesions. Furthermore, trial outcomes could be read out with lower accuracy from these mice. While this finding is somewhat counterintuitive and is based on only three mice with partial lesions, it has been observed before that smaller lesions…”

      A few more suggestions unrelated to public review:

      Figure 1: This is somewhat of an oddball in this manuscript, and its inclusion is not necessary for the main point. Indeed, the major conclusion of Fig 1 is that acute silencing of auditory cortex impairs task performance, and thus optogenetic methods are not suitable to test your hypothesis. However, this conclusion is also easily supported from decades of prior work, and thus citations might suffice.

      We do not agree that these data can easily be substituted with citations of prior published work. While previous studies (Talwar et al., 2001, Li et al., 2017) have demonstrated the impact of acute pharmacological silencing on sound detection in rodents, pharmacological and optogenetic silencing are not equivalent. Furthermore, we are aware of only one published study (Kato et al., 2015) that investigated the impact of optogenetically perturbing auditory cortex on sound detection (others have investigated its impact on discrimination tasks). Kato et al. (2015) examined the effect of acute optogenetic silencing of auditory cortex on the ability of mice to detect the offsets of very long (5-9 seconds) sounds, which is not easily comparable to the click detection task employed by us. Furthermore, when presenting our work at a recent meeting and leaving out the optogenetics results due to time constraints, audience members immediately enquired whether we had tried an optogenetic manipulation instead of lesions. Therefore, we believe that these data represent a valuable piece of information that will be appreciated by many readers and have decided not to remove them from the manuscript.

      A worst case scenario is that Figure 1 will detract from the reader's assessment of experimental rigor. The data of 1C are pooled from multiple sessions in three mice. It is not clear if the signed-rank test compares performance across n = 3 mice or n = 13 sessions. If the latter, a stats nitpicker could argue that the significance might not hold up with a nested analysis considering that some datapoints are not independent of one another. Finally, the experiment does not include a control group, gad2-cre mice injected with a EYFP virus. So as presented, the data are equally compatible with the pessimistic conclusion that shining light into the brain impairs mice's licking. My suggestion is to simply remove Figure 1 from the paper. Starting off with Figure 3 would be stronger, as the rest of the study hinges upon the knowledge that control and lesion mice's behavior is similar.

      Instead of reporting the results session-wise and doing stats on the d’ values, we now report results per mouse and perform stats on the proportions of hits and false alarms separately for each mouse. The results are statistically significant for each mouse and suggest that the differences in d’ are primarily caused by higher false alarm rates during the optogenetic perturbation than in the control condition.

      Changes to manuscript.

      New Figure 1.

      We agree that including control mice not expressing ChR2 would be important for fully characterizing the optogenetic manipulation and that the lack of this control group should be acknowledged. However, in the context of this study, the outcome of performing this additional experiment would be inconsequential. We originally considered using an optogenetic approach to explore the contribution of cortical activity to IC responses, but found that this altered the animals’ sound detection behavior. Whether that change in behavior is due to activation of the opsin or simply due to light being shone on the brain has no bearing on the conclusion that this type of manipulation is unsuitable for determining whether auditory cortex is required for the choice-related activity that we recorded in the IC.

      Changes to manuscript.

      Line 106: ”Although a control group in which the auditory cortex was injected with an EYFP virus lacking ChR2 would be required to confirm that the altered behavior results from an opsin-dependent perturbation of cortical activity, this result shows that this manipulation is also unsuitable… ”

      Figure 2, comment #1: The micrograph of panel B shows the densest fluorescence in the central IC. You interpret this as evidence of retrograde labeling of central IC neurons that project to the shell IC. This is a nice finding, but perhaps a more relevant micrograph would be to show the actual injection site in the shell layers. The rest of Figure 2 documents the non-auditory cortical sources of forebrain feedback. Since non-auditory cortical neurons may or may not target distinct shell IC sub-circuits, it's important to know where the retrograde virus was injected. Stylistic comment: The flow of the panels is somewhat unorthodox. Panel A and B follow horizontally, then C and D follow vertically, followed by E-H in a separate column. Consider sequencing either horizontally or vertically to maximize the reader's experience.

      Figure 2, comment # 2: It would also be useful to show more rostral sections from these mice, perhaps as a figure supplement, if you have the data. I think there is a lot of value here given a recent paper (Olthof et al., 2019 Jneuro) arguing that the IC receives corticofugal input from areas more rostral to the auditory cortex. So it would be beneficial for the field to know if these other cortical sources do or do not represent likely candidates for behavioral modulation in absence of auditory cortex.

      Figure 2, comment #3: You have a striking cluster of retrogradely labeled PPC neurons, and I'm not sure PPC has been consistently reported as targeting the IC. It would be good to confirm that this is a "true" IC projection as opposed to viral leakage into the SC. Indeed, Figure 2, supplement 2 also shows some visual cortex neurons that are retrogradely labeled. This has bearing on the interpretations, because choice-related activity is rampant in PPC, and thus could be a potential source of the task relevant activity that persists in your recordings. This could be addressed as the point above, by showing the SC sections from these same mice.

      All IC injections were made under visual guidance with the surface of the IC and adjacent brain areas fully exposed after removal of the imaging window. Targeting the IC and steering clear of surrounding structures, including the SC, was therefore relatively straightforward.

      We typically observed strong retrograde labeling in the central nucleus after viral injections into the dorsal IC and, given the moderate injection volume (~50 nL at each of up to three sites), it was also typical to see spatially fairly confined labeling at the injection sites. For the mouse shown in Figure 2, we do not have further images of the IC. This was one of the earliest mice to be included in the study and we did not have access to an automatic slide scanner at the time. We had to acquire confocal images in a ‘manual’ and very time-consuming manner and therefore did not take further IC images for this mouse. We have now included, however, a set of images spanning the whole IC and the adjacent SC sections for the mouse for which we already show sections in Figure 2 - figure supplement 2. These were added as Figure 2 - figure supplement 3A to the manuscript. These images show that the injections were located in the caudal half of the IC and that there was no spillover into the SC - close inspection of those sections did not reveal any labeled cell bodies in the SC. Furthermore, we include as Figure 2 - figure supplement 3B a dozen additional rostral cortical sections of the same mouse illustrating corticocollicular neurons in regions spanning visual, parietal, somatosensory and motor cortex. Given the inclusion of the IC micrographs in the new supplementary figure, we removed panel B from Figure 2. This should also make it easier for the reader to follow the sequencing of the remaining panels.

      Changes to manuscript.

      New Figure 2 - figure supplement 3.

      Line 159: “After the experiments, we injected a retrogradely-transported viral tracer (rAAV2-retro-tdTomato) into the right IC to determine whether any corticocollicular neurons remained after the auditory cortex lesions (Figure 2, Figure 2 – figure supplement 2, Figure 2 – figure supplement 3). The presence of retrogradely-labeled corticocollicular neurons in non-temporal cortical areas (Figure 2) was not the result of viral leakage from the dorsal IC injection sites into the superior colliculus (Figure 2 – figure supplement 3).”

      Line 495: “...projections to the IC, such as those originating from somatosensory cortical areas (Lohse et al., 2021; Lesicko et al., 2016) and parietal cortex may have contributed to the response profiles that we observed.”

      Figure 5 (see also public review point #2): I am not convinced that this unsupervised method yields particularly meaningful clusters; a grain of salt should be provided to the reader. For example, Clusters 2, 5, 6, and 7 contain neurons that pretty clearly respond with either short latency excitation or inhibition following the click sound on Hits. I would argue that neurons with such diametrically opposite responses should not be "classified" together. You can see the same issue in some of Namboodiri/Stuber's clustering (their Figure 1). It might be useful to make it clear to the reader that these clusters can reflect idiosyncrasies of the algorithm, the behavior task structure, or both.

      We agree.

      Changes to manuscript.

      Line 666: “While clustering is a useful approach for organizing and visualizing the activity of large and heterogeneous populations of neurons, we need to be mindful that, given continuous distributions of response properties, the locations of cluster boundaries can be somewhat arbitrary and/or reflect idiosyncrasies of the chosen method and thus vary from one algorithm to another. We employed an approach very similar to that described in Namboodiri et al. (2019) because it is thought to produce stable results in high-dimensional neural data (Hirokawa et al. 2019).”

      Methods:

      How was a "false alarm" defined? Is it any lick happening during the entire catch trial, or only during the time period corresponding to the response window on stimulus trials?

      The response window was identical for catch and stimulus trials and a false alarm was defined as licking during the response window of a catch trial.

      Changes to manuscript.

      Line 598: “During catch trials, neither licking (‘false alarm’) during the 1.5-second response window …”
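      To make the trial labels concrete, the sketch below assigns an outcome from the lick times of a single trial. It is illustrative only: the function name is hypothetical, the response window is assumed to start at stimulus (or catch-trial) onset, and the definitions of hit, miss and correct rejection are the conventional ones, assumed here by analogy with the false alarm definition given above.

```python
def classify_trial(is_catch, lick_times, window=(0.0, 1.5)):
    """Label one trial from lick times (seconds relative to trial onset).

    The same response window is applied to stimulus and catch trials; its
    start at 0 s is an assumption, its 1.5 s length follows the quoted text.
    """
    licked = any(window[0] <= t < window[1] for t in lick_times)
    if is_catch:
        return "false alarm" if licked else "correct rejection"
    return "hit" if licked else "miss"

outcomes = [
    classify_trial(True, [0.4]),    # lick inside the window of a catch trial
    classify_trial(True, [2.0]),    # lick only after the window has closed
    classify_trial(False, [1.2]),   # lick inside the window of a stimulus trial
    classify_trial(False, []),      # no lick on a stimulus trial
]
```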

      L597 and so forth: What's the denominator in the conversion from the raw fluorescence traces into DF/F? Did you take the median or mode fluorescence across a chunk of time? Baseline subtract average fluorescence prior to click onset? Similarly, please provide some more clarification as to how neuropil subtraction was achieved. This information will help us understand how the classifier can decode trial outcome from data prior to sound onset.

      Signal processing did not involve the subtraction of a pre-stimulus period.

      Changes to manuscript.

      Line 629: ”Neuropil extraction was performed using default suite2p parameters (https://suite2p.readthedocs.io/en/latest/settings.html), neuropil correction was done using a coefficient of 0.7, and calcium ΔF/F signals were obtained by using the median over the entire fluorescence trace as F0. To remove slow fluctuations in the signal, a baseline of each neuron’s entire trace was calculated by Gaussian filtering in addition to minimum and maximum filtering using default suite2p parameters. This baseline was then subtracted from the signal.”
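      The ΔF/F step can be written out as a short sketch. It is an illustration under stated assumptions, not the processing pipeline itself: the function name is hypothetical, F0 is assumed to be the median of the neuropil-corrected trace, and the subsequent baseline removal (Gaussian, minimum and maximum filtering) is deliberately left out.

```python
from statistics import median

def delta_f_over_f(f_roi, f_neuropil, neuropil_coeff=0.7):
    """Neuropil-correct a raw fluorescence trace and convert it to dF/F.

    F_corrected = F_roi - neuropil_coeff * F_neuropil
    dF/F        = (F_corrected - F0) / F0, with F0 = median of the whole trace
    """
    corrected = [f - neuropil_coeff * n for f, n in zip(f_roi, f_neuropil)]
    f0 = median(corrected)
    return [(f - f0) / f0 for f in corrected]

# Toy trace: constant neuropil, one transient in the ROI signal.
f_roi = [170.0, 170.0, 270.0, 170.0, 170.0]
f_neuropil = [100.0] * 5
dff = delta_f_over_f(f_roi, f_neuropil)
```

      Because F0 is taken over the entire trace rather than a pre-stimulus window, pre-stimulus activity is not forced to zero, which is consistent with the classifier being able to read out information before sound onset.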

      Was the experimenter blinded to the treatment group during the behavior experiments? If not, were there issues that precluded blinding (limited staffing owing to lab capacity restrictions during the pandemic)? This is important to clarify for the sake of rigor and reproducibility.

      Changes to manuscript.

      Line 574: “The experimenters were not blinded to the treatment group, i.e. lesioned or non-lesioned, but they were blind to the lesion size both during the behavior experiments and most of the data processing.”

      Minor:

      L127-128: "In order to test...lesioned the auditory cortex bilaterally in 7 out of 16 animals". I would clarify this by changing the word animals to "mice" and 7 out of 16 by stating n = 9 and n = 7 are control and lesion groups, respectively.

      Agreed.

      Changes to manuscript.

      Line 129: “...compared the performance of mice with bilateral lesions of the auditory cortex (n = 7) with non-lesioned controls (n = 9)”

      L225-226: You rule out self-generated sounds as a likely source of behavioral modulation by citing Nate Sawtell's paper in the DCN. However, Stephen David's lab suggested that in marmosets, post sound activity in central IC may in fact reflect self-generated sounds during licking. I suggest addressing this with a nod to SVD's work (Singla et al., 2017; but see Shaheen et al., 2021).

      Agreed.

      Changes to manuscript.

      Line 243: “(Singla et al., 2017; but see Shaheen et al., 2021)”

      Line 238 - 239: You state that proportions only deviate greater than 10% for one of the four statistically significant clusters. Something must be unclear here because I don't understand: The delta between the groups in the significant clusters of Fig 5C is (from left to right) 20%, 20%, 38%, and 12%. Please clarify.

      Our wording was meant to convey that a deviation “from a 50/50 split” of 10% means that each side deviates from 50 by 10% resulting in a 40/60 (or 60/40) split. We agree that that has the potential to confuse readers and is not as clear as it could be and have therefore dropped the ambiguous wording.

      Changes to manuscript.

      Line 253: ”...the difference between the groups was greater than 20% for only one of them.”

      L445: I looked at the cited Allen experiment; I'd be cautious with the interpretation here. A monosynaptic IC->striatum projection is news to me. I think Allen Institute used an AAV1-EGFP virus for these experiments, no? As you know, AAV1 is quite transsynaptic. The labeled fibers in striatum of that experiment may reflect disynaptic labeling of MGB neurons (which do project to striatum).

      Agreed. We deleted the reference to this Allen experiment.

      L650: Please define "network activity". Is this the fluorescence value for each ROI on each frame of each trial? Averaged fluorescence of each ROI per frame? Total frame fluorescence including neuropil? Depending on who you ask, each of these measures provides some meaningful readout of network activity, so clarification would be useful.

      Changes to manuscript.

      Line 707: “Logistic regression models were trained on the network activity of each session, i.e., the ΔF/F values of all ROIs in each session, to classify hit vs miss trials. This was done on a frame-by-frame basis, meaning that each time point (frame) of each session was trained separately.”
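      A frame-by-frame decoder of this kind can be sketched as below. The sketch is a stand-in, not the study's implementation: it uses a small gradient-descent logistic regression written with the standard library only, and the interleaved five-fold cross-validation, learning rate, number of epochs and synthetic data are all assumptions. In practice an established machine learning library would normally be used.

```python
import math
import random

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit an unregularized logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict(model, x):
    w, b = model
    return 1 if b + sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0

def decode_per_frame(activity, labels, n_folds=5):
    """activity[trial][roi][frame] -> cross-validated accuracy for every frame.

    A separate model is trained for each frame, so the score at one time
    point does not depend on the score at any other time point.
    """
    n_trials, n_frames = len(activity), len(activity[0][0])
    scores = []
    for f in range(n_frames):
        X = [[roi[f] for roi in trial] for trial in activity]
        correct = 0
        for k in range(n_folds):
            test_idx = set(range(k, n_trials, n_folds))
            train = [i for i in range(n_trials) if i not in test_idx]
            model = fit_logistic([X[i] for i in train], [labels[i] for i in train])
            correct += sum(predict(model, X[i]) == labels[i] for i in test_idx)
        scores.append(correct / n_trials)
    return scores

# Synthetic session: ROI 0 carries outcome information from frame 1 onwards.
rng = random.Random(0)
n_trials, n_rois, n_frames = 40, 3, 4
labels = [i % 2 for i in range(n_trials)]  # 1 = hit, 0 = miss
activity = [
    [
        [rng.gauss(0.0, 0.1) + (1.0 if labels[t] and r == 0 and f >= 1 else 0.0)
         for f in range(n_frames)]
        for r in range(n_rois)
    ]
    for t in range(n_trials)
]
scores = decode_per_frame(activity, labels)
```

      In this synthetic example the first frame contains only noise, so its accuracy should hover around chance, whereas the later frames should be decoded almost perfectly.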

      Figure 3 narrative or legend: Listing the F values for the anova would be useful. There is pretty clearly a main effect of training session for hits, but what about for the false alarms? That information is important to solidify the result, and would help more specialized readers interpret the d-prime plot in this figure.

      Agreed. There were significant main effects of training day for both hit rates and false alarm rates (as well as d’).

      Changes to manuscript.

      Line 165: “The ability of the mice to learn and perform the click detection task was evident in increasing hit rates and decreasing false alarm rates across training days (Figure 3A, p < 0.01, mixed-design ANOVAs).”

      In summary, thank you for undertaking this work. Your conclusions are provocative, and thus will likely influence the field's direction for years to come.

      Thank you for those kind words and valuable and constructive feedback, which has certainly improved the manuscript.

      Reviewer #2 (Recommendations For The Authors):

      MAJOR CONCERNS

      (1) (Fig. 5) What fraction of individual neurons actually encode task-related information in each animal group? How many neurons respond to sound? The clustering and decoding analyses are interesting, but they obscure these simple questions, which get more directly at the main questions of the study. Suggested approach: For a direct comparison of AC-lesioned and -non-lesioned animals, why not simply compare the mean difference between PSTH response for each neuron individually? To test for trial outcome effects, compare Hit and Miss trials (same stimulus, different behavior) and for sound response effects, compare Hit and False alarm trials (same behavior, different stimulus). How do you align for time in the latter case when there's no stimulus? Align to the first lick event. The authors should include this analysis or explain why their approach of jumping right to analysis of clusters is justified.

      We have now calculated the fraction of neurons that encode trial outcome by comparing hit and miss trial activity. That fraction does not differ between non-lesioned animals and lesioned animals as a whole, but is significantly smaller in mice with partial lesions. The reviewer’s suggestion of comparing hit and false alarm trial activity to assess sound responsiveness is problematic because hit trials involve reward delivery and consumption. Consequently, they are behaviorally very different from false alarm trials (not least because hit trials tend to contain much more licking). Therefore, we calculated the fraction of neurons that respond to the acoustic stimulus by comparing activity before and after stimulus onset in miss trials. We found no significant difference between the non-lesioned and lesioned mice or between subgroups.

      We have addressed these points with the following changes to the manuscript:

      Line 217: “Indeed, close to half (1272 / 2649) of all neurons showed a statistically significant difference in response magnitude between hit and miss trials, while only a small fraction (97 / 2649) exhibited a significant response to the sound.”

      Line 307: “Although the proportion of individual neurons with distinct response magnitudes in hit and miss trials in lesioned mice did not differ from that in non-lesioned mice, it was significantly lower when separating out mice with partial lesions (Figure 6 – figure supplement 3).”

      Line 648: “Analysis of task-modulated and sound-driven neurons. To identify individual neurons that produced significantly different response magnitudes in hit and miss trials, we calculated the mean activity for each stimulus trial by taking the mean activity over the 5 seconds following stimulus presentation and subtracting the mean activity over the 2 seconds preceding the stimulus during that same trial. A Mann-Whitney U test was then performed to assess whether a neuron showed a statistically significant difference (Benjamini-Hochberg adjusted p-value of 0.05) in response magnitude between hit and miss trials. The analysis was performed using equal numbers of hit and miss trials at each sound level to ensure balanced sound level distributions. If, for a given sound level, there were more hit than miss trials, we randomly selected a sample of hit trials (without substitution) to match the sample size for the miss trials and vice versa. Sound-driven neurons were identified by comparing the mean miss trial activity before and after stimulus presentation. Specifically, we performed a Mann-Whitney U test to assess whether there was a statistically significant difference (Benjamini-Hochberg adjusted p-value of 0.05) between the mean activity over the 2 seconds preceding the stimulus and the mean activity over the 1 second period following stimulus presentation.”
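      Two ingredients of this procedure, the per-trial response magnitude and the Benjamini-Hochberg adjustment, can be sketched as follows. The sketch is illustrative and rests on assumptions: the function names, the `frame_rate` parameter and the example p-values are hypothetical, and the Mann-Whitney U test itself is not reimplemented (a statistics library such as SciPy, via `scipy.stats.mannwhitneyu`, would supply the per-neuron p-values).

```python
from statistics import mean

def response_magnitude(trace, stim_frame, frame_rate, pre_s=2.0, post_s=5.0):
    """Mean activity over `post_s` seconds after the stimulus minus the mean
    over `pre_s` seconds before it, for a single trial of a single neuron."""
    n_pre = int(round(pre_s * frame_rate))
    n_post = int(round(post_s * frame_rate))
    pre = trace[stim_frame - n_pre:stim_frame]
    post = trace[stim_frame:stim_frame + n_post]
    return mean(post) - mean(pre)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value downwards so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy trial sampled at 1 frame per second (hypothetical rate), stimulus at frame 2.
magnitude = response_magnitude([0, 0, 1, 1, 1, 1, 1], stim_frame=2, frame_rate=1)

# Hypothetical per-neuron p-values, e.g. from Mann-Whitney U tests.
p_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
task_modulated = [p < 0.05 for p in p_adjusted]
```

      With these four made-up p-values only the first neuron remains significant after adjustment, which illustrates why the correction matters when thousands of neurons are tested.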

      Some more specific concerns about focusing only on cluster-level and population decoding analysis are included below.

      (2) (L 234) "larger field of view". Do task-related or lesion-dependent effects depend on the subregion of IC imaged? Some anatomists would argue that the IC shell is not a uniform structure, and concomitantly, task-related effects may differ between fields. Did coverage of IC subregions differ between experimental groups? Is there any difference in task-related effects between subregions of IC? Or maybe all this work was carried out only in the dorsal area? The differences between lesioned and non-lesioned animals are relatively small, so this may not have a huge impact, but a more nuanced discussion that accounts for observed or potential (if not tested) differences between regions of the IC would be helpful.

      The specific subregion coverage could also impact the decoding analysis (Fig 6), and if possible it might be worth considering an interaction between field of view and lesion size on decoding.

      Each day we chose a new imaging location to avoid recording the same neurons more than once and aimed to sample widely across the optically accessible surface of the IC. We typically stopped the experiment only when there were no more new areas to record from. In terms of the depth of the imaged neurons, we were limited by the fact that corticorecipient neurons become sparser with depth and that the signal available from the GCaMP6f labeling of the Ai95 mice becomes rapidly weaker with increasing distance from the surface. This meant that we recorded no deeper than 150 µm from the surface of the IC. Consequently, while there may have been some variability in the average rostrocaudal and mediolateral positioning of imaging locations from animal to animal due to differences between mice in how much of the IC surface was visible, cranial window positioning, neuronal labeling, etc., our dataset is anatomically uniform in that all recorded neurons receive input from the auditory cortex and are located within 150 µm of the surface of the IC. Therefore, we think it highly unlikely that small sampling differences across animals could have a meaningful impact on the results.

      Given that there is no consensus as to where the border between the dorsal and external/lateral cortices of the IC is located and that it is typically difficult to find reliable anatomical reference points (the location of the borders between the IC and surrounding structures is not always obvious during imaging, i.e. a transition from a labeled area to a dark area near the edge of the cranial window could indicate a border with another structure, but also the IC surface sloping away from the window or simply an unlabeled area within the IC), we made no attempt to assign our recordings from corticorecipient neurons to specific subdivisions of the IC.

      Changes to manuscript.

      Line 195: “We then proceeded to record the activity of corticorecipient neurons within about 150 µm of the dorsal surface of the IC using two-photon microscopy (Figure 4B, Video 1).”

      Line 375: “We imaged across the optically accessible dorsal surface of the IC down to a depth of about 150 µm below the surface. Consequently, the neurons we recorded were located predominantly in the dorsal cortex. However, identifying the borders between different subdivisions of the IC is not straightforward and we cannot rule out the possibility that some were located in the lateral cortex.”

      (3) (L 482-483) "auditory cortex is not required for the task-related activity recording in IC neurons of mice performing a sound detection task". Most places in the text are clearer, but this statement is confusing. Yes, animals with lesions can have a "normal"-looking IC, but does that mean that AC does not strongly modulate IC during this behavior in normal animals? The authors have shown convincingly that subcortical areas can both shape behavior and modulate IC normally, but AC may still be required for IC modulation in non-lesioned animals. Given the complexity of this system, the authors should make sure they summarize their results consistently and clearly throughout the manuscript.

      The reviewer raises an important point. What we have shown is that corticorecipient dorsal IC neurons in mice without auditory cortex show neural activity during a sound detection task that is largely indistinguishable from the activity of mice with an intact auditory cortex. In lesioned mice, the auditory cortex is thus not required. Whether the IC activity of the non-lesioned group can be shaped by input from the auditory cortex in a meaningful way in other contexts, such as during learning, is a question that our data cannot answer.

      Changes to manuscript.

      Line 508: "While modulation of IC activity by this descending projection has been implicated in various functions, most notably in the plasticity of auditory processing, we have shown in mice performing a sound detection task that IC neurons show task-related activity in the absence of auditory cortical input."

      LESSER CONCERNS

      (L. 106-107) "Optogenetic suppression of cortical activity is thus also unsuitable..." It appears that behavior is not completely abolished by the suppression. One could also imagine using a lower dose of muscimol for partial inactivation of AC feedback. When some behavior persists, it does seem possible to measure task-related changes in the IC. This may not be necessary for the current study, but the authors should consider how these transient methods could be applied usefully in the Discussion. What about inactivation of cortical terminals in the IC? Is that feasible?

      Our argument is not that acute manipulations are unsuitable because they completely abolish the behavior, but because they significantly alter the behavior. Although it would not be trivial to precisely measure the extent of pharmacological cortical silencing in behaving mice that have been fitted with a midbrain window, it should be possible to titrate the size of a muscimol injection to achieve partial silencing of the auditory cortex that does not fully abolish the ability to detect sounds. However, such an outcome would likely render the data uninterpretable. If no effect on IC activity was observed, it would not be possible to conclude whether this was due to the fact that the auditory cortex was only partially silenced or that projections from the auditory cortex have no influence on the recorded IC activity. Similarly, if IC activity was altered, it would not be possible to say whether this was due to altered descending modulation resulting from the (partially) silenced auditory cortex or to the change in behavior, which would likely be reflected in the choice-related activity measured in the IC.

      Silencing of corticocollicular axons in the IC is potentially a more promising approach and we did devote a considerable amount of time and effort to establishing a method that would allow us to simultaneously image IC neurons while silencing corticocollicular axons, trying both eNpHR3.0 and Jaws with different viral labeling approaches and mouse lines. However, we ultimately abandoned those attempts because we were not convinced that we had achieved sufficient silencing or that we would be able to convincingly verify this. Furthermore, axonal silencing comes with its own pitfalls and the interpretation of its consequences is not straightforward. Given that our discussion already contains a section (line 421) on axonal silencing, we do not feel there would be any benefit in adding to that.

      (Figure 1). Can the authors break down the performance for FA and HR, as they do in Fig. 3? It would be helpful to know what aspect of behavior is impaired by the transient inactivation.

      Good point. Figure 1 has been updated to show the results separately for hit rates, false alarms and d’. The new figure indicates that the change in d’ is primarily a consequence of altered false alarm rates. Please also see our response to a related comment by reviewer #1.

      Changes to manuscript.

      New figure 1.

      (Figure 4 legend). Minor: Please clarify, what is time 0 in panel C? Time of click presentation?

      Yes, that is correct.

      Changes to manuscript.

      Line 209: “Vertical line at time 0 s indicates time of click presentation.”

      (L. 228-229). There has been a report of lick and other motor related activity in the IC - e.g., see Shaheen, Slee et al. (J Neurosci 2021), the timing of which suggests that some of it may be acoustically driven.

      Thanks for pointing this out. Shaheen et al., 2021 should certainly have been cited by us in this context as well as in other parts of the manuscript.

      Changes to manuscript.

      Line 243: “(Singla et al., 2017; but see Shaheen et al., 2021)”

      Also, have the authors considered measuring a peri-lick response? The difference between hit and miss trials could be perceptual or it could reflect differences in motor activity. This may be hard to tease apart, but, for example, one can test whether activity is stronger on trials with many licks vs. few licks?

      (L. 261) "Behavior can be decoded..." similar or alternative to the previous question of evoked activity, can you decode lick events from the population activity?

      The difference between hit and miss trial activity almost certainly partially reflects motor activity associated with licking. This was stated in the Discussion, but to make that point more explicitly, we now include a plot of average false alarm trial activity, i.e. trials without sound (catch trials) in which animals licked (but did not receive a reward).

      Given a sufficient number of catch trials, it should be possible to decode false alarm and correct rejection trials. However, our experiment was not designed with that in mind and contains a much smaller number of catch trials than stimulus trials (approximately one tenth the number of stimulus trials), so we have not attempted this.

      Changes to manuscript.

      New Figure 4 - figure supplement 1.

      (L. 315) "Pre-stimulus activity..." Given reports of changes in activity related to pupil-indexed arousal in the auditory system, do the authors by any chance have information about pupil size in these datasets?

      Given that all recordings were performed in the dark, fluctuations in pupil diameter were relatively small. Therefore, we have not made any attempt to relate pupil diameter to any of the variables assessed in this manuscript.

      (L. 412) "abolishes sound detection". While not exactly the same task, the authors might comment on Gimenez et al (J Neurophys 2015) which argued that temporary or permanent lesioning of AC did not impair tone discrimination. More generally, there seems to be some disagreement about what effects AC lesions have on auditory behavior.

      Thank you for this suggestion. Gimenez et al. (2015) investigated the ability of freely moving rats to discriminate sounds (and, in addition, how they adapt to changes in the discrimination boundary). Broadly consistent with later reports by Ceballo et al. (2019) (mild impairment) and O’Sullivan et al. (2019) (no impairment), Gimenez et al. (2015) reported that discrimination performance is mildly impaired after lesioning auditory cortex. Where the results of Gimenez et al. (2015) stand out is in the comparatively mild impairments that were seen in their task when they used muscimol injections, which contrast with the (much) larger impairments reported by others (e.g. Talwar et al., 2001; Li et al., 2017; Jaramillo and Zador, 2014).

      Changes to manuscript.

      Line 433: “However, transient pharmacological silencing of the auditory cortex in freely moving rats (Talwar et al., 2001), as well as head-fixed mice (Li et al., 2017), completely abolishes sound detection (but see Gimenez et al., 2015).”

      (L. 649) "... were generally separable" Is the claim here that the clusters are really distinct from each other? This is unexpected, and it might be helpful if the authors could show this result in a figure.

      The half-sentence that this comment refers to has been removed from the methods section. Please also see a related comment by reviewer #1 which prompted us to add the following to the methods section.

      Changes to manuscript.

      Line 666: “While clustering is a useful approach for organizing and visualizing the activity of large and heterogeneous populations of neurons we need to be mindful that, given continuous distributions of response properties, the locations of cluster boundaries can be somewhat arbitrary and/or reflect idiosyncrasies of the chosen method and thus vary from one algorithm to another. We employed an approach very similar to that described in Namboodiri et al. (2019) because it is thought to produce stable results in high-dimensional neural data (Hirokawa et al. 2019).”

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors must absolutely clarify if the hit versus miss decoding and clustering analysis is done for a single sound level or for multiple sound levels (what is the fraction of trials for each sound level?). If the authors did it for multiple sound levels they should redo all analyses sound level by sound level, or for a single sound level if there is one that dominates. No doubt that there is information about the trial outcome in IC, but it should not be over-estimated by a confound with stimulus information.

      This is an important point. The original clustering analysis was carried out across different sound levels. We have now carried out an additional analysis to distinguish between two alternative explanations of the data, which were also raised by reviewer #1: that the difference in neural activity between hit and miss trials could reflect a) the animals’ behavior or b) relatively more hit trials at higher sound levels, which would be expected to produce stronger responses. If the data favored b), we would expect no difference in activity between hit and miss trials when plotted separately for different sound levels. The new figure 4 - figure supplement 1 indicates that this is not the case. Hit and miss trial activity are clearly distinct even when plotted separately for different sound levels, confirming that this difference in activity reflects the animals’ behavior rather than sensory information.

      We made the following changes to manuscript.

      Line 214: “While averaging across all neurons cannot capture the diversity of responses, the averaged response profiles suggest that it is mostly trial outcome rather than the acoustic stimulus and neuronal sensitivity to sound level that shapes those responses (Figure 4 – figure supplement 1).”

      Differences in the distributions of sound levels in the different trial types could also potentially confound the decoding into hit and miss trials. Our analysis actually aimed to take this into account but, unfortunately, we failed to include sufficient details in the methods section.

      Changes to manuscript.

      Line 710: “Rather than including all the trials in a given session, only trials of intermediate difficulty were used for the decoding analysis. More specifically, we only included trials across five sound levels, comprising the lowest sound level that exceeded a d’ of 1.5 plus the two sound levels below and above that level. That ensured that differences in sound level distributions would be small, while still giving us a sufficient number of trials to perform the decoding analysis.”
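
      The level-selection rule in the quoted text can be sketched as below. This is only an illustration under stated assumptions: the function name and the d’ values in the usage example are hypothetical, and clipping the window at the edges of the tested range is a choice made here, since the text does not say how sessions with fewer than two flanking levels were handled.

```python
def decoding_levels(levels, d_primes, threshold=1.5, n_flank=2):
    """Return the sound levels used for decoding.

    Finds the lowest sound level whose d' exceeds threshold and returns it
    together with up to n_flank levels below and above it. Returns an empty
    list if no level exceeds the threshold.
    """
    pairs = sorted(zip(levels, d_primes))
    sorted_levels = [level for level, _ in pairs]
    for idx, (_, d_prime) in enumerate(pairs):
        if d_prime > threshold:
            lo = max(0, idx - n_flank)                 # clip at the lowest tested level
            hi = min(len(pairs), idx + n_flank + 1)    # clip at the highest tested level
            return sorted_levels[lo:hi]
    return []


# Hypothetical session: d' first exceeds 1.5 at 65 dB SPL.
levels = [53, 56, 59, 62, 65, 68, 71, 74]
d_primes = [0.2, 0.5, 0.9, 1.2, 1.6, 2.1, 2.6, 3.0]
selected = decoding_levels(levels, d_primes)
```

      With these made-up d’ values the selection is centered on 65 dB SPL and spans 59 to 71 dB SPL.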

      In this context, it is worth bearing in mind that a) the decoding analysis was done on a frame-by-frame basis, meaning that the decoding score achieved early in the trial has no impact on the decoding score at later time points in the trial, b) sound-driven activity predominantly occurs immediately after stimulus onset and is largely over about 1 s into the trial (see cluster 3, for instance, or average miss trial activity in figure 4 - figure supplement 1), c) decoding performance of the behavioral outcome starts to plateau 500-1000 ms into the trial and remains high until it very gradually begins to decline after about 2 s into the trial. In other words, decoding performance remains high far longer than the stimulus would be expected to have an impact on the neurons’ activity. Therefore, we would expect any residual bias due to differences in the sound level distribution that our approach did not control for to be restricted to the very beginning of the trial and not to meaningfully impact the conclusions derived from the decoding analysis.

      Furthermore, we carried out an additional decoding analysis for one imaging session in which we had a sufficient number of trials to perform the analysis not only over the five (59, 62, 65, 68, 71 dB SPL) original sound levels, but also over a reduced range of three (62, 65, 68 dB SPL) sound levels, as well as a single (65 dB SPL) sound level (Figure 6 - figure supplement 1). The mean sound level differences between the hit trial distributions and miss trial distributions for these three conditions were 3.08, 1.01 and 0 dB, respectively. This analysis suggests that decoding performance is not meaningfully impacted by changing the range of sound levels (and sound level distributions) other than that including fewer sound levels means fewer trials and thus noisier decoding.
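
      The summary statistic reported above, the mean sound level difference between hit and miss trial distributions, is simple to compute. The sketch below uses made-up trial lists purely for illustration; the function name is hypothetical and the values are not taken from the study.

```python
from statistics import mean


def mean_level_difference(hit_levels, miss_levels):
    """Difference in dB between the mean sound level of hit and of miss trials."""
    return mean(hit_levels) - mean(miss_levels)


# Made-up example: hits skew towards the louder levels.
multi_level = mean_level_difference([65, 65, 68], [62, 65, 65])

# Restricting the analysis to a single level removes the difference by construction.
single_level = mean_level_difference([65, 65, 65], [65, 65])
```

      A value near zero indicates that sound level alone cannot separate the two trial types, which is the property the reduced-range analyses were designed to approach.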

      Changes to manuscript.

      Line 287: “...and was not meaningfully affected by differences in sound level distributions between hit and miss trials (Figure 6 – figure supplement 1).”

      Finally, in order to supplement the decoding analysis, we determined for each individual neuron whether there was a significant difference between the average hit and average miss trial activity. Note that this was done using equal numbers of hit and miss trials at each sound level to ensure balanced sound level distributions and to rule out any potential confound of sound level. This revealed that the proportion of neurons containing “information about trial outcome” was generally very high, close to 50% on average, and not significantly different between lesioned and non-lesioned mice.

      Changes to manuscript.

      Line 307: “Although the proportion of individual neurons with distinct response magnitudes in hit and miss trials in lesioned mice did not differ from that in non-lesioned mice, it was significantly lower when separating out mice with partial lesions (Figure 6 – figure supplement 3).”

      Line 648: “Analysis of task-modulated and sound-driven neurons. To identify individual neurons that produced significantly different response magnitudes in hit and miss trials, we calculated the mean activity for each stimulus trial by taking the mean activity over the 5 seconds following stimulus presentation and subtracting the mean activity over the 2 seconds preceding the stimulus during that same trial. A Mann-Whitney U test was then performed to assess whether a neuron showed a statistically significant difference (Benjamini-Hochberg adjusted p-value of 0.05) in response magnitude between hit and miss trials. The analysis was performed using equal numbers of hit and miss trials at each sound level to ensure balanced sound level distributions. If, for a given sound level, there were more hit than miss trials, we randomly selected a sample of hit trials (without substitution) to match the sample size for the miss trials and vice versa.”

      (2) I have the feeling that the authors do not fully exploit the functional data recorded with two-photon imaging. They identify several clusters but do not describe their functional differences. For example, cluster 3 is obviously mainly sensory driven as it is not modulated by outcome. This could be mentioned. This could also be used to rule out that trial outcome is the result of insufficient sensory input. Could this cluster be used to predict trial outcome at the onset response? Could it be used to predict the presence of the sound, and with what accuracy? The authors discuss the different cluster types a bit, but in a very elusive manner. I recognize that one should be careful with the use of signal analysis methods in calcium imaging, but a simple linear deconvolution of the calcium dynamics would help to illustrate the conclusions that the authors propose based on peak responses. It would also be very interesting to align the clusters' (deconvolved) responses to the timing of licking and reward events to check whether some clusters do not fire when mice lick before the sound comes. It would help clarify whether the behavioral signals described here require both the presence of the sound and the behavioral action or are just the reflection of the motor command. As noted by the authors, some clusters have late peak responses (2 and 5). However, 2 and 5 are not equivalent and a deconvolution would show that much more clearly. 2 has late onset firing. 5 has early onset but prolonged firing.

      We agree with the reviewer’s statement that “cluster 3 is obviously mainly sensory driven”. In the Discussion we refer to cluster 3 as having a “largely behaviorally invariant response profile to the auditory stimulus” (line X), which is consistent with the statement of the reviewer. With regard to the reviewer’s suggestion to describe the “functional differences” between the clusters, we would like to refer to the subsequent three sentences of the same paragraph in which we speculate on the cognitive and behavioral variables that may underlie the response profiles of different clusters. Given the limitations imposed by the task structure, we do not think it is justified to expand on this.

      We have added an additional analysis in order to explicitly address the question of which neurons are sound responsive (please also see response to point 3 below and to point 1 of reviewer #2). That trial outcome could be predicted on the basis of only the sound-responsive neurons’ activity during the initial period of the trial (“predict trial outcome at the onset response”) is unlikely given their small number (only 97 of 2649 neurons show a statistically significant sound-evoked response) and given that only a minority (42/98) of those sound-driven neurons are also modulated by trial outcome within that initial trial period (i.e. 0-1s after stimulus onset; data not shown).

      Changes to manuscript.

      Line 219: “..., while only a small fraction (97 / 2649) exhibited a significant response to the sound.”

      Line 658: “Sound-driven neurons were identified by comparing the mean miss trial activity before and after stimulus presentation. Specifically, we performed a Mann-Whitney U test to assess whether there was a statistically significant difference (Benjamini-Hochberg adjusted p-value of 0.05) between the mean activity over the 2 seconds preceding the stimulus and the mean activity over the 1 second period following stimulus presentation. This analysis was performed using miss trials with click intensities from 53 dB SPL to 65 dB SPL (many sessions contained very few or no miss trials at higher sound levels).”
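
      The significance criterion quoted above combines a per-neuron Mann-Whitney U test with a Benjamini-Hochberg correction across neurons. The correction step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the criterion means that a neuron counts as significant when its adjusted p-value is at or below 0.05, and that the raw p-values come from a separate U test (for example SciPy's `scipy.stats.mannwhitneyu`), which is not reproduced here. The input p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and a
    running minimum taken from the largest rank downwards keeps the adjusted
    values monotonic.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values, one per neuron.
raw_p = [0.01, 0.04, 0.03, 0.005, 0.6]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adjusted_p]
```

      In this toy example the first four neurons remain significant after correction and the fifth does not.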

      While calcium traces represent an indirect measure of neural activity, deconvolution does not necessarily provide an accurate picture of the spiking underlying those traces and has the potential to introduce additional problems. For instance, deconvolution algorithms tend to perform poorly at inferring the spiking of inhibited neurons (Vanwalleghem et al., 2021). Given that suppression is such a prominent feature of IC activity and is evident both in our calcium data as well as in the electrophysiology data of others (De Franceschi and Barkat, 2021), we decided against using deconvolved spikes in our analyses. See also the side-by-side comparison below of the hit and miss trial activity of one example neuron based on either the calcium trace (left) or deconvolved spikes (right), extracted using the OASIS algorithm (Friedrich et al., 2017) incorporated into suite2p (Pachitariu et al., 2016).

      Author response image 1.

      (3) Along the same lines, the very small proportion of really sensory-driven neurons (cluster 3) is not discussed. Is it what one would expect in typical shell or core IC neurons?

      As requested by reviewer #2 and mentioned in response to the previous point, we have now quantified the number of neurons in the dataset that produced significant responses to sound (97 / 2649). For a given imaging area, the fraction of neurons that show a statistically significant change in neural activity following presentation of a click of between 53 dB SPL and 65 dB SPL rarely exceeded ten percent. While that number is low, it is not necessarily surprising given the moderate intensity and very short duration of the stimuli. For comparison: Using the same transgenics, labeling approach and imaging setup and presenting 200-ms long pure tones at 60 dB SPL with frequencies between 2 kHz and 64 kHz, we typically find that between a quarter and a third of neurons in a given imaging area exhibit a statistically significant response (data not shown).

      Changes to manuscript.

      Line 219: “..., while only a small fraction (97 / 2649) exhibited a significant response to the sound.”

      Line 658: “Sound-driven neurons were identified by comparing the mean miss trial activity before and after stimulus presentation. Specifically, we performed a Mann-Whitney U test to assess whether there was a statistically significant difference (Benjamini-Hochberg adjusted p-value of 0.05) between the mean activity over the 2 seconds preceding the stimulus and the mean activity over the 1 second period following stimulus presentation. This analysis was performed using miss trials with click intensities from 53 dB SPL to 65 dB SPL (many sessions contained very few or no miss trials at higher sound levels).”

      Line 220: “While the number of sound-responsive neurons is low, it is not necessarily surprising given the moderate intensity and very short duration of the stimuli. For comparison: Using the same transgenics, labeling approach and imaging setup and presenting 200-ms long pure tones at 60 dB SPL with frequencies between 2 kHz and 64 kHz, we typically find that between a quarter and a third of neurons in a given imaging area exhibit a statistically significant response (data not shown).”

      (4) In the discussion, the interpretation of the different transient and permanent cortical inactivation experiments is very interesting and well balanced given the complexity of the issue. There is nevertheless a comment that is difficult to follow. The authors state:

      If cortical lesioning results in a greater weight being placed on the activity in spared subcortical circuits for perceptual judgements, we would expect the accuracy with which trial-by-trial outcomes could be read out from IC neurons to be greater in mice without auditory cortex. However, that was not the case.

      However, there is no indication that the activity they observe in shell IC is causal to the behavioral decision and likely it is not. There is also no indication that the behavioral signals seen by the authors reflect the weight put on the subcortical pathway for behavior. I find this argument handwavy and would remove it.

      While we are happy to amend this section, we would not wish to remove it because a) we believe that the point we are trying to make here is an important and reasonable one and b) because it is consistent with the reviewer’s comment. Hopefully, the following will make this clearer: In order for the mouse to make a perceptual judgment and act upon it - in the context of our task, hearing a sound and then licking a spout - auditory information needs to be read out and converted into a motor command. If the auditory cortex normally plays a key role in such perceptual judgments, cortical lesions would require the animal to base its decisions on the information available from the remaining auditory structures, potentially including the auditory midbrain. This might result in a greater correspondence between the mouse’s behavior and the neural activity in those structures. That we did not observe this outcome for the IC could mean that the auditory cortex did not contribute to the relevant perceptual judgments (sound detection) in the first place. Therefore, no reweighting of signals from the other structures is necessary. Alternatively, greater weight might be placed exclusively on structures other than the auditory midbrain, e.g. the thalamus. The latter would imply that the contribution of the IC remains the same. This includes the possibility that the IC shell does not play a causal role in the behavioral decision – in either control mice or mice with cortical lesions – as suggested by the reviewer.

      Changes to manuscript.

      Line 471: “This could imply that, following cortical lesions, greater weight is placed on structures other than the IC, with the thalamus being the most likely candidate, ..”

      (5) In Fig. 5 the two colors used in B and C are the same although they describe different categories.

      The dark green and ‘deep orange’ we used to distinguish between non-lesioned and lesioned in Figure 5C are slightly lighter than the colors used to distinguish between these two categories in other figures and therefore might be more easily confused with the blue and red in Figure 5B. This has been changed.

    2. eLife assessment

      This study demonstrates that neurons receiving inputs from auditory cortex in the inferior colliculus widely encode the outcome of a sound detection task independent of the presence of auditory cortex. This valuable study based on imaging of transsynaptically labelled neurons provides convincing evidence that auditory cortex is necessary neither for sound detection, nor to channel information related to behavioral outcome to the subcortical auditory system. This study will be of wide interest for sensory neuroscientists.

    3. Reviewer #1 (Public Review):

      The inferior colliculus (IC) is the central auditory system's major hub. It integrates ascending brainstem signals to provide acoustic information to the auditory thalamus. The superficial layers of the IC ("shell" IC regions as defined in the current manuscript) also receive a massive descending projection from the auditory cortex. This auditory cortico-collicular pathway has long fascinated the hearing field, as it may provide a route to funnel "high-level" cortical signals and impart behavioral salience upon an otherwise behaviorally agnostic midbrain circuit.

      Accordingly, IC neurons can respond differently to the same sound depending on whether animals engage in a behavioral task (Ryan and Miller 1977; Ryan et al., 1984; Slee & David, 2015; Saderi et al., 2021; De Franceschi & Barkat, 2021). Many studies also report a rich variety of non-auditory responses in the IC, far beyond the simple acoustic responses one expects to find in a "low-level" region (Sakurai, 1990; Metzger et al., 2006; Porter et al., 2007). A tacit assumption is that the behaviorally relevant activity of IC neurons is inherited from the auditory cortico-collicular pathway. However, this assumption has never been tested, owing to two main limitations of past studies:

      (1) Prior studies could not confirm if data were obtained from IC neurons that receive monosynaptic input from the auditory cortex.

      (2) Many studies have tested how auditory cortical inactivation impacts IC neuron activity; the consequence of cortical silencing is sometimes quite modest. However, all prior inactivation studies were conducted in anesthetized or passively listening animals. These conditions may not fully engage the auditory cortico-collicular pathway. Moreover, the extent of cortical inactivation in prior studies was sometimes ambiguous, which complicates interpreting modest or negative results.

      Here, the authors' goal is to directly test if the auditory cortex is necessary for behaviorally relevant activity in IC neurons. They conclude that, surprisingly, task-relevant activity in cortico-recipient IC neurons persists in the absence of auditory cortico-collicular transmission. To this end, a major strength of the paper is that the authors combine a sound-detection behavior with clever approaches that unambiguously overcome the limitations of past studies.

      First, the authors inject a transsynaptic virus into the auditory cortex, thereby expressing a genetically encoded calcium indicator in the auditory cortex's postsynaptic targets in the IC. This powerful approach enables 2-photon Ca2+ imaging from IC neurons that unambiguously receive monosynaptic input from auditory cortex. Thus, any effect of cortical silencing should be maximally observable in this neuronal population. Second, they abrogate auditory cortico-collicular transmission using lesions of auditory cortex. This "sledgehammer" approach is arguably the most direct test of whether cortico-recipient IC neurons will continue to encode task-relevant information in the absence of descending feedback. Indeed, their method circumvents the known limitations of more modern optogenetic or chemogenetic silencing, e.g. variable efficacy.

      The authors have revised their manuscript and adequately addressed the major concerns. Although more in-depth analyses of these rich datasets are definitely possible, the current results nevertheless stand on their own. Indeed, the work serves as a beacon to move away from the idea that cortico-collicular projections function primarily to impart behavioral relevance upon auditory midbrain neurons. This knowledge inspires a search for alternative explanations as to the role of auditory cortico-collicular synapses in behavior.

    4. Reviewer #2 (Public Review):

      Summary:

      This study takes a new approach to studying the role of corticofugal projections from auditory cortex to inferior colliculus. The authors performed two-photon imaging of cortico-recipient IC neurons during a click detection task in mice with and without lesions of auditory cortex. In both groups of animals, they observed similar task performance and relatively small differences in the encoding of task-response variables in the IC population. They conclude that non-cortical inputs to the IC can provide substantial task-related modulation, at least when AC is absent.

      Strengths:

      This study provides valuable new insight into big and challenging questions around top-down modulation of activity in the IC. The approach here is novel and appears to have been executed thoughtfully. Thus, it should be of interest to the community.

      Weaknesses:

      There are, however, substantial concerns about the interpretation of the findings and limitations to the current analysis. In particular, analysis of single-unit activity is absent, making the population clusters and decoding results harder to interpret. These concerns should be addressed to make sure that the results can be interpreted clearly in an active field that already contains a number of confusing and possibly contradictory findings.

    5. Reviewer #3 (Public Review):

      Summary:

      This study aims to demonstrate that cortical feedback is not necessary to signal behavioral outcome to shell neurons of the inferior colliculus during a sound detection task. The demonstration is achieved in a very clear manner by observing the activity of cortico-recipient neurons in animals that have received lesions of the auditory cortex. The experiment shows that neither behavioral performance nor neuronal responses are significantly impacted by cortical lesions, except in the case of partial lesions, which seem to have a disruptive effect on behavioral outcome signaling.

      Strengths:

      The demonstration of the main conclusions is based on state-of-the-art, carefully controlled methods and is highly convincing. There is an in-depth discussion of the different effects of auditory cortical lesions on sound detection behavior.

      Weaknesses:

      The description of feedback signals could be more detailed although it is difficult to achieve good temporal resolution with the calcium imaging technique necessary for targeting cortico-recipient neurons.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We have made revisions accordingly. The following is a list of the changes we have made in this revised Version of Record:

      (1) We have added three more panels to Figure 1-figure supplement 1, showing that lipopolysaccharide-induced severe lung injury also generates some ectopic tuft cells expressing both Dclk1 and Gα-gustducin, a G protein α subunit expressed in taste bud cells and many tuft cells.

      (2) We have added a new supplemental figure, Figure 2-figure supplement 1, showing a reanalysis of the single-cell RNAseq dataset (GSE197163) that indicates the numbers of Trpm5-GFP+ ectopic tuft cells expressing Tas2r108, Tas2r105, Tas2r138, Tas2r137 and other Tas2rs, respectively. The original “Figure 2-figure supplement 1” in the previous version has been renumbered as “Figure 2-figure supplement 2”.

      (3) We have added another new supplemental figure, Figure 3-figure supplement 1, showing that the H1N1 infection-damaged lung tissue volumes in the Gng13-cKO mice are significantly greater than those in WT or Trpm5-/- mice, which is in agreement with the data on the injured lung surface areas from these three genotypes of mice (Figure 3 C and D). The original “Figure 3-figure supplement 1” in the previous version has been renumbered as “Figure 3-figure supplement 2”.

      (4) We have added two new panels, I and J, to the new Figure 3-figure supplement 2, showing a reanalysis of the single-cell RNAseq dataset (GSE197163) that indicates that about 57% of Trpm5-GFP+ ectopic tuft cells express Gγ13, some of which express Alox5, a key enzyme in the biosynthesis of pro-resolving mediators.

      (5) We have added one reference on Sytox and another on Alox5.

      (6) We have corrected two labeling errors in Figure 3 G and M, and some other typos in the article. Also, we have removed the “Present address” attached to some authors, since no present address was needed.

      Attached below is our point-by-point reply to the comments and suggestions made by the reviewers. We hope that you and the reviewers will find all concerns satisfactorily addressed.

      Responses to public reviews:

      Reviewer #1:

      Li et al. report here on the expression of a G-protein subunit Gng13 in ectopic tuft cells that develop after severe pulmonary injury in mice. By deleting this gene in ectopic tuft cells as they arise, the authors observed worsened lung injury and greater inflammation after influenza infection, as well as a decrease in the overall number of ectopic tuft cells. This was in stark contrast to the deletion of Trpm5, a cation channel generally thought to be required for all functional gustatory signaling in tuft cells, where no phenotype is observed. Strengths here include a thorough assessment of lung injury via a number of different techniques. Weaknesses are notable: confusingly, these findings are at odds with reports from other groups demonstrating no obvious phenotype upon influenza infection in mice lacking the transcription factor Pou2f3, which is essential for all tuft cell specification and development. The authors speculate that heterogeneity within nascent tuft cell populations, specifically the presence of pro- and anti-inflammatory tuft cells, may explain this difference, but they do not provide any data to support this idea.

      We thank the reviewer for pointing out the strengths of this work. The phenotypes of the Gng13 conditional knockout mice upon severe pulmonary injury appear more severe than those of Trpm5 knockout or Pou2f3 knockout mice, which we would attribute to functionally specific tuft cell subtypes. In the intestines, tuft cells are known to promote type II innate immune responses. The ectopic pulmonary tuft cells emerge at 12 days post infection and may not be involved in the initial immune responses to the infection; instead, some of them may contribute to inflammation resolution and functional recovery. Reanalysis of the previously published single tuft cell RNAseq dataset indeed showed that Gng13 is expressed in a subset of these ectopic pulmonary tuft cells, and anti-inflammatory genes such as Alox5 are also found in some of these tuft cells (please see the newly added Figure 3 supplement 2 I and J). Together, these data suggest that while some of these tuft cells may still play a pro-inflammatory role as in the intestines, other Gγ13-expressing tuft cells contribute to inflammation resolution, and disruption of the latter’s function results in the more severe phenotypes.

      Reviewer #2:

      The study by Li et al. aimed to demonstrate the role of the Gγ13-mediated signal transduction pathway in tuft cell-driven inflammation resolution and repairing injured lung tissue. The authors showed a reduced number of tuft cells in the parenchyma of Gγ13 null lungs following viral infection. Mice with a Gγ13 null mutation showed increased lung damage and heightened macrophage infiltration when exposed to the H1N1 virus. Their further findings suggested that lung inflammation resolution, epithelial barrier, and fibrosis were worsened in Gγ13 null mutants.

      Strengths:

      The beautiful immunostaining findings do suggest that the number of tuft cells is decreased in Gγ13 null mutants.

      Weaknesses:

      The description of phenotypes, and the approaches used to measure the phenotypes are problematic. Rigorous investigation of the mouse lung phenotypes is needed to draw meaningful conclusions.

      We thank the reviewer for pointing out the major findings and strengths of our work. Regarding the approaches used to measure the phenotypes, we first did double immunostaining and validated that the lipopolysaccharide-induced DCLK1+ cells are indeed ectopic pulmonary tuft cells with an antibody to Gα-gustducin, a G protein α subunit commonly expressed in taste buds and tuft cells. Second, in addition to the measurements of the injured lung surface areas, we determined the injured lung tissue volumes by slicing the injured lungs into a series of tissue sections, quantifying the injured areas in each section and then reconstructing the injured volumes. Third, we reanalyzed the previously published single-tuft cell RNAseq dataset and found that a subset (i.e., ~57%) of Trpm5-GFP+ tuft cells express Gng13, some of which express anti-inflammatory genes such as Alox5. These additional data further support our finding that a subset of these Gγ13-expressing ectopic tuft cells may contribute to inflammation resolution while others may play a proinflammatory role.

      Reply to the recommendations of Reviewer #1:

      (1) A major issue with this study is the fact that Chat-Cre mediated knockout of Gng13 leads to reduced tuft cells and impaired recovery, yet global TRPM5 deletion (this study) and global Pou2f3 deletion (Barr et al.) exhibit no apparent phenotype. One can imagine a Trpm5-independent role of Gng13 in tuft cells, but it is much harder to reconcile with the fact that Pou2f3 KO mice, which lack tuft cells entirely, exhibit no apparent phenotype. This was examined in some detail in Barr et al., demonstrating no apparent change in weight loss, dysplastic expansion (Krt5+ cells), or goblet cell metaplasia. The most parsimonious explanation is that Gng13 deletion in another Chat+ cell type, probably neurons of some sort, is leading to this phenotype. The authors really need to investigate this in some detail as the data does not really support a role of tuft cells in the phenotype they observe. Better yet, identification of the other Chat+ cell type in which Gng13 deletion promotes impaired lung recovery would be very interesting. While neurons seem likely, perhaps there is another Chat+ cell type expressing Gng13 in the respiratory tract that could be playing a role as well. In either case, the discrepancy between Pou2f3 KO (no phenotype) and Chat-Cre / Gng13 KO (impaired recovery) is difficult to reconcile.

      We agree with the reviewer, and it took us some time to make sense of the data as well. The differences in phenotypes between Trpm5-knockout and Gng13 conditional knockout (Gng13-cKO) mice could be explained by the fact that Gγ13 is a component of the Gβγ moiety of a heterotrimeric G protein (Gαβγ), which is known to act on many effector enzymes and ion channels, while Trpm5 largely regulates the influx of monovalent cations, depolarizing the plasma membrane potential. Thus, it is understandable that nullification of Gng13 may have a more profound effect on cell physiology and the consequent phenotypes than that of Trpm5, and similar differential effects were also found in the intestines (Frontiers in Immunology, 2023, DOI 10.3389/fimmu.2023.1259521).

      Data from several research groups have indicated that there are subtypes of tuft cells, each of which displays unique gene expression patterns as well as input and output signal profiles. It is not yet well understood how each subtype may contribute to the inflammatory responses or inflammation resolution. Comparative analyses of our data from the Gng13-cKO mice versus those from Pou2f3-KO mice suggest that Gng13-expressing tuft cells may have a role in inflammation resolution while other ectopic tuft cells may contribute to the maintenance of the inflammation at a certain level, impairing subsequent tissue repair and recovery. The exact molecular and cellular mechanisms are to be revealed in our future studies.

      The central nervous system may also play a role in the impaired lung recovery. But our detailed immunochemical studies did not identify any significant number of neurons innervating the lung tissue co-expressing ChAT and Gng13, suggesting that no immediate action from these neurons may regulate the pulmonary inflammation resolution or functional recovery.

      Together, our data suggest the importance of tuft cell subtype-specific functions, which may help us further understand the role of these rare tuft cells.

      (2) Figures showing alternative injury models inducing the generation of ectopic tuft cells are not convincing and not quantified. DCLK1 can be a bit promiscuous, so verifying tuft cell expansion in these other models with other markers (especially for LPS and HDM which have not been reported elsewhere) is important.

      We agree with the reviewer that DCLK1 is not a very specific marker for tuft cells. We have also observed that chemical inductions of these ectopic tuft cells with bleomycin, HDM or LPS are not as effective as H1N1 viruses. To verify that these rare DCLK1-positive cells are indeed tuft cells, we performed double immunostaining with antibodies to DCLK1 and to Gα-gustducin, another tuft cell marker. The results showed that some of these spindle-shaped DCLK1 positive cells indeed also express Gα-gustducin (see the newly added panels in Figure 1-figure supplement 1), indicating that they are most likely the chemically induced ectopic tuft cells. We also agree with the reviewer that it would be important to further investigate the possible roles of these cells during the stages of the chemically induced injury, inflammation resolution and functional recovery.

      (3) Calcium responses in isolated post-flu tuft cells are interesting but difficult to interpret as presented. Can higher-power images be shown? Also, no statistical analysis is presented to provide any confidence in that data.

      We thank the reviewer for the suggestions. As found in taste buds, only a subset of these ectopic tuft cells express Tas2rs, and each of these cells may express a few of the 35 murine Tas2rs. Thus, a particular bitter-tasting compound can activate only a few tuft cells, and we had to use low magnification to include more responsive cells in a field under the imaging microscope. We agree with the reviewer that it would be an interesting idea to statistically correlate the response profile to bitter substances with the cell’s Tas2r expression pattern, which we have done with sperm cells before (Molecular Human Reproduction, 2013, doi:10.1093/molehr/gas040). However, the main focus of this work is on the effect of Gng13-cKO in a subset of these ectopic tuft cells on the recovery. We plan to investigate these interesting cells in more detail in the future.

      (4) I am unaware of Sytox being a specific dye for pyroptotic cells. Can the authors please provide a reference or otherwise justify this?

      Sytox is a dye to stain dead cells, which has been used previously in the studies on gasdermin-mediated lytic cell death (Xi et al., Up-regulation of gasdermin C in mouse small intestine is associated with lytic cell death in enterocytes in worm-induced type 2 immunity. PNAS 2021 118(30) e2026307118 https://doi.org/10.1073/pnas.2026307118). In our work we used the dye for the same assay.

      (5) The authors perform qPCR for various taste receptor genes pre- and post-flu, but do not show that these genes are specifically induced in tuft cells. Since single-cell data and bulk RNA-Seq are available from Barr et al., the authors should validate the expression of these Tas2r genes specifically in post-flu tuft cells.

      We thank the reviewer for the suggestion. Yes, we have performed analysis of the single-cell RNAseq dataset (GSE197163, Barr et al. 2022) and found that among 613 Trpm5-GFP+ tuft cells, Tas2r108 was expressed in the greatest number of cells, i.e., 67 cells, followed by Tas2r105, Tas2r138, Tas2r137, Tas2r118 and Tas2r102, which were detected in 11, 10, 10, 5 and 4 cells, respectively (see the newly added Figure 2-figure supplement 1). This ranking by number of expressing cells agrees well with the ranking of relative Tas2r expression levels obtained in the qPCR experiment (Figure 2A), indicating that these Tas2rs are likely expressed in the ectopic tuft cells. We will further validate the data by analyzing the bulk RNA-Seq dataset when it is accessible to us.

      (6) Some general editing of language throughout would be helpful to increase readability.

      Thank you for pointing this out. We have carefully checked the manuscript, corrected some typos and revised several sentences to increase its readability.

      (7) For the fibrosis analysis, trichrome staining is very heterogenous, which is reflected by the large error bars in Fig. 8B. A more quantitative, "whole lung" analysis such as hydroxyproline content or western blotting for Col1a1 would be ideal.

      The approach of Masson’s trichrome staining along with qRT-PCR assays on the fibrotic gene expression has been used previously to quantitatively analyze fibrosis (e.g., Zhang et al., Neuropilin-1 mediates lung tissue-specific control of ILC2 function in type 2 immunity. Nature Immunology 23:237-250, 2022, https://doi.org/10.1038/s41590-021-01097-8). We agree with the reviewer that there are large error bars in Fig. 8B, and hydroxyproline content assay or western blotting for Col1a1 would be ideal. But our qRT-PCR was performed on the RNA samples extracted from the “whole lungs”, and its data are also able to reflect the extent of fibrosis of the lungs.

      (8) The authors claim that only a subset of tuft cells express Gng13, but this is supported only by a single IF image in Fig. 3 supplement 1G. The authors could download the single-cell dataset from Barr et al. to confirm the heterogeneity of Gng13 expression and get a better sense of the fraction of total ectopic tuft cells that express this, as it is a critical point in their model.

      We thank the reviewer for the suggestion. Yes, we have downloaded and reanalyzed the single-cell RNAseq dataset (GSE197163), and found that out of 613 Trpm5-GFP+ tuft cells, 350 (57%) expressed Gng13 (Figure 3-figure supplement 2I). This result, together with our immunohistochemical data (Figure 3-figure supplement 2G and H), indicates that Gγ13 is expressed in a subset of these ectopic tuft cells. More comprehensive studies are needed to characterize these tuft cell subtypes and elucidate subtype-selective functions.
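
      As a quick check of the arithmetic behind the reported fraction, the sketch below recomputes the percentage of Trpm5-GFP+ tuft cells expressing Gng13 from the two counts given in the response (350 of 613). The function and variable names are illustrative assumptions and are not taken from the authors' analysis code.

```python
# Counts quoted in the author response for dataset GSE197163.
TOTAL_TRPM5_GFP_CELLS = 613
GNG13_POSITIVE_CELLS = 350


def percent_expressing(n_positive: int, n_total: int) -> float:
    """Return the percentage of cells in which a gene was detected."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_positive <= n_total:
        raise ValueError("n_positive must lie between 0 and n_total")
    return 100.0 * n_positive / n_total


gng13_percent = percent_expressing(GNG13_POSITIVE_CELLS, TOTAL_TRPM5_GFP_CELLS)
print(f"Gng13-expressing tuft cells: {gng13_percent:.1f}%")
```

      The same helper could be applied to the Tas2r cell counts quoted elsewhere in the response, since they share the denominator of 613 cells.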

      Reply to the recommendations of Reviewer #2:

      The study needs more rigorous examinations of the phenotypes. For example, quantification of the injury area in Fig3C is problematic. Similarly, fibrotic phenotype and quantification in Fig 8C also have problems. This study heavily used qRT-PCR analysis to quantitate the level change of bitter/other receptors in a minor population of tuft cells which are also minor in a whole lung. Given the limited number of cells, it is difficult to appreciate that qRT-PCR can pick up the difference. In addition, how would the findings in this study reconcile with the finding by Huang (PMID: 36129169) where pou2f3 null mutants (without tuft cells) were used? Huang et al. did not observe more severe phenotypes in the mice without tuft cells than controls.

      We thank the reviewer for the recommendations. Regarding Fig 3C, please see the reply below: revisions for clarity point #2.

      Fig 8 B and C used Masson’s trichrome staining to quantitatively analyze fibrosis, which has been used by other groups as well (e.g., Zhang et al., Neuropilin-1 mediates lung tissue-specific control of ILC2 function in type 2 immunity. Nature Immunology 23:237-250, 2022, https://doi.org/10.1038/s41590-021-01097-8). Our qRT-PCR data on the fibrotic gene expression (Figure 8A) further support the Masson’s trichrome staining results.

      We recognize that tuft cells make up only a minor population in the lungs. We therefore performed qRT-PCR assays on RNA samples isolated mostly from the injured tissues, with the corresponding tissues from uninjured lungs as controls. To validate our qRT-PCR data, we reanalyzed the previously published single ectopic tuft cell RNAseq dataset (GSE197163), and found that Tas2r108, the most abundantly expressed Tas2r as determined by qRT-PCR, was also expressed in the greatest number of tuft cells, and the order of expression levels of the other Tas2rs is also in good agreement between the qRT-PCR and single-cell RNAseq data (Figure 2A, Figure 2-figure supplement 1), cross-validating the data obtained by these two very different approaches.

      We have carefully studied the finding by Huang (PMID: 36129169). Our data suggest that there are subtypes of the ectopic tuft cells, some of which contribute to inflammation resolution while others play a proinflammatory role. Indeed, the reanalysis of the aforementioned single tuft cell RNAseq dataset found that about 57% of Trpm5-GFP+ ectopic tuft cells expressed Gng13, some of which expressed Alox5, a key enzyme in the biosynthesis of pro-resolving mediators. Thus, in the Pou2f3-knockout mice, both pro- and anti-inflammatory tuft cells are ablated, and it would be hard to observe any significant phenotypes. When the function of a subset of Gγ13-expressing tuft cells is disrupted, the anti-inflammatory role of these cells is eliminated, resulting in more severe phenotypes. More studies are needed to further understand the subtype-specific functions of these fascinating tuft cells.

      Do Gγ13 null mutants show similar phenotypes in bleomycin injury model?

      Bleomycin and other chemically induced injury models indeed engender far fewer ectopic pulmonary tuft cells. Thus, it is more difficult to test the effect of the Gng13 mutation due to the small number of Gng13-expressing tuft cells in either WT or mutant lungs.

      What is the cell fate of lineage labeled tuft cells in the lungs of Chat-Cre:Ai9:Gng13flox/flox mice following viral infection at different times examined? The numbers were decreased at different time points post-injury based on the data. Did these cells undergo apoptosis?

      It is an excellent idea to look into the fate of tuft cells in ChAT-Cre:Ai9:Gng13flox/flox mice. We believe that these cells would have a similar fate to other ectopic tuft cells, probably undergoing apoptosis. But our data suggest that the Gng13 mutation suppresses the increase in ectopic tuft cells, or the increase in a particular subtype of these tuft cells. Further studies are needed to elucidate the molecular mechanisms of the Gγ13-mediated signal transduction pathways regulating the proliferation of a subset of ectopic tuft cells.

      Here are the revisions for clarity and coherence to the figures:

      (1) Fig 2: For the functional assessment, using tracheal tuft cells from the same ChAT-Cre:Ai9 mice would be a suitable positive control in the calcium response traces experiment. These specific cells could also serve as a control in Fig2a.

      We agree with the reviewer that tracheal tuft cells from the same ChAT-Cre:Ai9 mice would be an ideal positive control in the calcium response experiment as well as in the qRT-PCR assay. However, we have established reliable methods for calcium imaging of primary cells expressing taste receptors and for quantifying their RNA expression levels, which have been used in our previous publications, e.g., (1) Functional characterization of bitter taste receptors expressed in mammalian testis. Molecular Human Reproduction, 2013, doi:10.1093/molehr/gas040; (2) Infection by the parasitic helminth Trichinella spiralis activates a Tas2r-mediated signaling pathway in intestinal tuft cells. PNAS 2019, www.pnas.org/cgi/doi/10.1073/pnas.1812901116. We thank the reviewer for the excellent suggestion.

      (2) Fig 3C: It is not clear whether the depicted areas really represent the injured area. To provide a more comprehensive view, the authors should also provide histological analysis and quantification of the injured lung. A 3D representation of the injury area would offer a more accurate presentation.

      We thank the reviewer for this point. The depicted areas in Fig 3C are indeed the injured surface areas of the lungs. Following the reviewer’s suggestion, we carried out histological analysis to determine the injured tissue volumes of the lungs. We fixed the lungs and sliced them into 12 μm-thick sections, which were imaged under a microscope. The injured areas in a section were identified and quantified using the ImageJ software, and the injured volume for this section was then obtained by multiplying the area by the thickness of the section, i.e., 12 μm. Statistical analyses indicate that the injured volume of the Gng13-cKO lungs is significantly greater than that of WT or Trpm5-KO lungs, which has been included in Figure 3-figure supplement 1, and is in agreement with the data on the injured surface areas (Fig 3C).
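
      The volume reconstruction described above can be sketched as follows. This is a minimal illustration, assuming that the per-section volumes (area multiplied by the 12 μm section thickness) are simply summed over all quantified sections; the response does not spell out the summation step, and the example areas below are hypothetical placeholders, not measured data.

```python
# Section thickness stated in the author response (micrometres).
SECTION_THICKNESS_UM = 12.0


def injured_volume_um3(injured_areas_um2, thickness_um=SECTION_THICKNESS_UM):
    """Sum per-section injured volumes (area x thickness) in cubic micrometres.

    Assumes every section is quantified and all sections share one thickness.
    """
    if thickness_um <= 0:
        raise ValueError("thickness_um must be positive")
    return sum(area * thickness_um for area in injured_areas_um2)


# Hypothetical injured areas per section, in square micrometres (not real data).
example_areas_um2 = [1.5e6, 2.0e6, 1.0e6]

total_um3 = injured_volume_um3(example_areas_um2)
total_mm3 = total_um3 / 1e9  # 1 mm^3 equals 1e9 cubic micrometres
print(f"Injured volume: {total_mm3:.3f} mm^3")
```

      If only every n-th section were imaged, each area would need to be weighted by the spacing between sampled sections rather than by the thickness of a single section; that sampling scheme is not described in the response.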

      (3) Fig 3 G/I/K/M: There seems to be an inconsistency in the time points. There's no indication for 14 dpi, yet two for 25 dpi. Additionally, a color legend for each sample would be helpful.

      We thank the reviewer for pointing this out. There were two typos, which have been corrected. The time points should be 14 dpi, 20 dpi, 25 dpi and 50 dpi. A color legend has been added as well.

      (4) Fig 4A: Using CD64 co-stained with Krt5 might better highlight the immune cells in the damaged region. Additionally, could you clarify the choice of the neutrophil marker CD64 over CD45 for staining the injured lung?

      We agree with the reviewer that Krt5 antibody staining can help define the damaged region. We sectioned the lung tissues with special attention to the damaged areas, but we found that the adjacent healthy areas also had extra immune cells. Thus, we counted all the CD64+ cells in both the damaged areas and the surrounding, seemingly healthy areas. We used CD64 instead of CD45 to label these altered immune cells because we found that CD64 better labels the immune cells that differ between WT and Gng13-cKO mice following H1N1 infection. Furthermore, CD64-labeled cells could be readily related to the Gsdmd/Gsdme-expressing F4/80-labeled immune cells shown in Figure 5 and its supplemental figures.

      (5) Fig 5 and Supplemental Fig 5: It appears that the F4/80 staining exhibits notable background staining.

      Yes, there is some background staining. The antibody was the best we could find, but its quality could be further improved. In addition, we think that some cellular debris may have been stained by that antibody. At a higher magnification, however, we could still identify individual cells co-expressing IL-1β.

      (6) Fig 8C: The depicted area does not seem to adequately represent the fibrosis in the injured lung.

      Masson’s trichrome staining has been previously used to quantitatively analyze fibrosis (e.g., Zhang et al., Neuropilin-1 mediates lung tissue-specific control of ILC2 function in type 2 immunity. Nature Immunology 23:237-250, 2022, https://doi.org/10.1038/s41590-021-01097-8). Our qRT-PCR assays on the fibrotic gene expression (Figure 8A) were performed on the RNA samples extracted from the whole lungs, and the resultant data are able to reflect the extent of fibrosis of the lungs, although we also agree with the reviewer that additional data would make the conclusion more convincing.

    2. eLife assessment

      This study, useful in principle, suggests that the G-protein subunit Gng13 is required for limiting injury and inflammation following H1N1 influenza infection via anti-inflammatory effects from ectopic tuft cells. While support for Gng13 helping to limit influenza injury in the transgenic mouse models used here is solid, evidence for these effects being mediated by normal tuft cells remains incomplete, given conflicting data from mice that lack tuft cells entirely.

    3. Reviewer #1 (Public Review):

      Li et al. report here on the expression of a G-protein subunit Gng13 in ectopic tuft cells that develop after severe pulmonary injury in mice. By deleting this gene in ectopic tuft cells as they arise, the authors observed worsened lung injury and greater inflammation after influenza infection, as well as a decrease in the overall number of ectopic tuft cells. This was in stark contrast to deletion of Trpm5, a cation channel generally thought to be required for all functional gustatory signaling in tuft cells, where no phenotype is observed. Strengths here include a thorough assessment of lung injury via a number of different techniques. Weaknesses are notable: Confusingly, these findings are at odds with reports from other groups demonstrating no obvious phenotype upon influenza infection in mice lacking the transcription factor Pou2f3, which is essential for all tuft cell specification and development. The authors speculate that heterogeneity within nascent tuft cell populations, specifically the presence of pro- and anti-inflammatory tuft cells, may explain this difference, but they do not provide any data to support this idea.

      Notes on revision: The authors provided responses to some of my critiques. I think the central discrepancy between the lack of a phenotype in Pou2f3 and Trpm5 KO mice compared to the stronger phenotype in the Chat-Cre / Gng13 KO mice remains unresolved and will require future work to provide a clear model. This may or may not ultimately involve tuft cell heterogeneity.

    4. Reviewer #2 (Public Review):

      Summary:

      The study by Li et al. aimed to demonstrate the role of the Gγ13-mediated signal transduction pathway in tuft cell-driven inflammation resolution and repairing injured lung tissue. The authors showed a reduced number of tuft cells in the parenchyma of Gγ13 null lungs following viral infection. Mice with a Gγ13 null mutation showed increased lung damage and heightened macrophage infiltration when exposed to the H1N1 virus. Their further findings suggested that lung inflammation resolution, epithelial barrier and fibrosis were worsened in Gγ13 null mutants.

      Strengths:

      The revised study carefully analyzed phenotypes in mice lacking G𝛾13 in response to viral infection, providing further support that G𝛾13+ tuft cells play a role in the resolution of inflammation and injury repair.

    1. Reviewer #3 (Public Review):

      Summary:

      Prior research on SCC3, a cohesin subunit protein, in yeast and Arabidopsis has underscored its vital role in cell division. This study investigated the specific functions of SCC3 in rice mitosis and meiosis. In a weakened SCC3 mutant, sister chromatid separation was observed in anaphase I, resulting in 24 univalents and subsequent sterility. The authors meticulously documented SCC3's loading and degradation dynamics on chromosomes, noting its impact on DNA replication. Despite the loss of homologous chromosome pairing and synapsis in the mutant, chromosomes retained double-strand breaks without fragmenting. Consequently, the authors inferred that in the scc3 mutant, DNA repair more frequently relies on sister chromatids as templates compared to the wild type.

      Strengths:

      The study presents exceptionally well-executed research in the field of rice cytogenetics.

      Weaknesses:

      While the paper's conclusions are generally well-supported, further substantiation is needed for the claim that SCC3 inhibits template choice for sister chromatids. To bolster this conclusion, I recommend that the authors perform whole-genome sequencing on parental and F1 individuals from two rice variants, subsequently calculating the allele frequencies at heterozygous sites in the F1 individuals. If SCC3 indeed inhibits inter-sister chromatid repair in the wild type, we would anticipate a higher frequency of inter-homologous chromosome repair (i.e., gene conversion). This should be manifested as a bias away from the Mendelian inheritance ratio (50:50) in the offspring of the wild type compared to the offspring of the scc3+/- mutant.
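
      The reasoning behind this recommendation, that a repair bias should show up as a departure from the 50:50 ratio at heterozygous sites, can be illustrated with a minimal sketch. It assumes an exact two-sided binomial test on read counts at a single site; the choice of test, the function name, and the counts are illustrative assumptions, not the reviewer's or the authors' method and not data from the study.

```python
from math import comb


def two_sided_binomial_p(ref_count: int, total: int) -> float:
    """Exact two-sided binomial p-value against a 50:50 expectation."""
    observed = comb(total, ref_count)
    # Under p = 0.5 every outcome k has probability comb(total, k) / 2**total,
    # so outcomes "at least as unlikely" are those with a smaller or equal
    # binomial coefficient than the observed one.
    extreme = sum(
        comb(total, k) for k in range(total + 1) if comb(total, k) <= observed
    )
    return extreme / 2 ** total


# Hypothetical read counts at one heterozygous F1 site (not data from the study).
wild_type_p = two_sided_binomial_p(ref_count=70, total=100)
mutant_p = two_sided_binomial_p(ref_count=52, total=100)
print(f"wild type: p = {wild_type_p:.3g}; scc3+/-: p = {mutant_p:.3g}")
```

      In a real analysis the test would be applied across many heterozygous sites with correction for multiple testing; the single-site version only shows how a deviation from the Mendelian expectation would be quantified.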

    2. eLife assessment

      This fundamental study elucidates the function of the cohesin subunit SCC3 in maintaining homologous chromosome pairing and synapsis during meiosis. The observation of sterility in the SCC3 weak mutant prompted an investigation of abnormal chromosome behavior during anaphase I, and the discovery that SCC3's loading onto meiotic chromosomes is REC8-dependent. The convincing evidence presented in this study contributes to our understanding of meiosis in rice and will be of interest to cell biologists, reproductive biologists, and plant geneticists.

    3. Reviewer #1 (Public Review):

      Summary:

      The revised manuscript is much improved. As stated previously, it is on an interesting and important topic and provides many new potentially important findings. The manuscript contains a large amount of high-quality data. In the revised manuscript, the authors have done a nice job addressing the concerns raised in the previous review. They have refined their conclusions and the evidence provided supports the conclusions drawn. Likewise, the writing and flow of the manuscript are much improved.

      Strengths:

      The manuscript contains a large amount of high-quality data that is used to draw interesting and important conclusions.

      Weaknesses:

      There are still some issues with grammar and word usage, but these should be easily corrected with some additional minor editing. Other than some minor editing, my only real question/concern is whether the data presented shows that SCC3 is directly involved in gene regulation. It may well be that changes in chromatin structure caused by mutations in SCC3 and the axial element protein containing genes examined indirectly affect transcript levels for the genes examined.

    4. Reviewer #2 (Public Review):

      Summary:

      This manuscript shows detailed evidence about the role of a cohesin regulator in rice meiosis and mitosis.

      Strengths:

      There is a very clear mechanism for its role during replication.

      Weaknesses:

      The authors did not consider creating heterozygous mutants for the replication fork.

      April 15. Revisions read.

    5. Author response:

      The following is the authors’ response to the original reviews.

      We express our sincere appreciation for your insightful comments and constructive suggestions. It is with great pleasure that we submit the revised version of our manuscript. Over the past months, we have meticulously considered all the invaluable feedback provided by the three anonymous reviewers, and endeavored to incorporate significant revisions accordingly. Furthermore, we have meticulously rephrased the results section in accordance with your guidance, aiming to bolster the rigor of our manuscript. The specific changes implemented in the revised manuscript are outlined below:

      - Revised the title of the manuscript.

      - Revised the description of early mitotic and meiotic chromosome structure in the scc3 mutant (Lines 167-274).

      - Added the BiFC results illustrating the interaction between SCC3 and other cohesin proteins in Figure S10.

      - Enhanced the detail in the description of figure legends, particularly for Figures 2 and 4.

      - Refined and rephrased the language of the manuscript.

      We hope these positive revisions have substantially strengthened the manuscript. Once again, we extend our heartfelt gratitude for your invaluable input.

      eLife assessment

      This important study elucidates the function of the cohesin subunit SCC3 in impeding DNA repair between inter-sister chromatids in rice. The observation of sterility in the SCC3 weak mutant prompted an investigation of abnormal chromosome behavior during anaphase I through karyotype analysis. While the evidence presented is largely solid, the strength of support can be substantially improved in some aspects, leaving room for further investigation. This research contributes to our understanding of meiosis in rice and will be of interest to cell biologists, reproductive biologists, and plant geneticists.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript describes the identification and characterization of rice SCC3, including the generation and characterization of plants containing apparently lethal null mutations in SCC3 as well as mutant plants containing a C-terminal frame-shift mutation. The weak scc3 mutants showed both vegetative and reproductive defects. Specifically, mitotic chromosomes appeared to partially separate during prometaphase, while meiotic chromosomes were diffuse during early meiosis and showed alterations in sister chromatid cohesion, homologous chromosome pairing, and recombination. The authors suggest that SCC3 acts as a cohesin subunit in mitosis and meiosis, but also has functions beyond cohesion.

      Reviewer #2 (Public Review):

      This manuscript shows detailed evidence of the role of cohesin regulators in rice meiosis and mitosis.

      Reviewer #3 (Public Review):

      Prior research on SCC3, a cohesin subunit protein, in yeast and Arabidopsis has underscored its vital role in cell division. This study investigated the specific functions of SCC3 in rice mitosis and meiosis. In a weakened SCC3 mutant, sister chromatid separation was observed in anaphase I, resulting in 24 univalents and subsequent sterility. The authors meticulously documented SCC3's loading and degradation dynamics on chromosomes, noting its impact on DNA replication. Despite the loss of homologous chromosome pairing and synapsis in the mutant, chromosomes retained double-strand breaks without fragmenting. Consequently, the authors inferred that in the scc3 mutant, DNA repair more frequently relies on sister chromatids as templates compared to the wild type.

      We extend our sincere gratitude to the Editors and the Reviewers for their highly constructive and insightful suggestions. We deeply appreciate receiving both positive feedback and constructive criticism on our manuscript. In light of the reviewers’ comments, we have diligently undertaken substantial revisions to improve the manuscript. The revised version comprehensively addresses all the points raised by the reviewers.

      Below, we provide a detailed point-by-point response to the reviewers’ comments:

      Recommendations for the authors:

      Reviewer #1:

      (1) Line 170- looking at pollen formation does not specifically evaluate whether SCC3 is involved in meiosis.

      Thank you very much for this advice. We totally agree with your point of view that pollen formation defects only indicate a problem in gametogenesis. We apologize for the inaccurate wording of this sentence. It has been revised in the manuscript (Lines 167-176).

      (2) Lines 203-205- this seems more like discussion and is pure speculation. Another possibility described above is that the truncated SCC3 protein is partially functional and what they see is due to this partial functionality. Have the authors considered the possibility that a partially functional version of SCC3 is produced that alters its function or the function of the cohesin complex? How much of the protein epitope remains in the truncated protein?

      We are very grateful for the insightful suggestions provided. We concur with the proposition that a partially functional SCC3 may indeed be synthesized, contributing to the survivability of the mutant. Notably, the truncated version of the protein retains approximately 60% to 70% of the epitope, which ostensibly maintains a residual functionality within the weak scc3 mutant. In this manuscript, the deleted C-terminal region of SCC3 (aa 910-1116) contains a specific protein epitope and a defined secondary structure, so its loss may alter the protein’s folding and its subsequent roles within the cohesin complex.

      In this study, we encountered challenges in generating null alleles of the scc3 mutants in rice utilizing the CRISPR-Cas9 system. Consequently, it is plausible that the scc3-1 and scc3-2 variants represent null alleles of SCC3, resulting in embryonic lethality. We posit that the identification of weak alleles is paramount to facilitating the survival of the organism. Thus, selecting some weak mutants, particularly those exhibiting the most pronounced phenotype, is advantageous for conducting further research. Our findings indicate that the weak scc3 mutant lacks only a segment of the C-terminus, yet the remaining protein is adequate to ensure the plant's survival while meiosis is significantly impeded. We cannot dismiss the likelihood that these observed defects are attributable to the unique truncated proteins. We extend our sincerest thanks once again.

      (3) Lines 212- I question whether what the authors see in Figure 2 is chromosome fragmentation. It could just as well be alterations in chromosome structure. Likewise, the authors provide little to no evidence that the mutation affects the replication process. Rather, the presence of replicated chromosomes later in mitosis and meiosis would argue that replication is not disrupted.

      We express our gratitude to the reviewer for highlighting this critical inquiry. Contrary to the scenario of chromosome fragmentation, as you astutely observed, the preservation of normal sister chromatids during prometaphase indicates that the replication process remains uninterrupted. In alignment with your insights, our study embarked on an extensive series of full-length fluorescence in situ hybridization (FISH) experiments to elucidate the underlying mechanisms contributing to the observed increase in the distance between sister chromatids, particularly during interphase. The preponderance of our findings corroborates the hypothesis that the chromosomes exhibit alterations in structure, as depicted in Figure 2A. Intriguingly, our data suggest that cohesin, upon interaction with other chromatin-bound proteins, may facilitate loop extrusion, anchoring itself in a manner that potentially alters chromosomal architecture. These alterations in chromosome structure and the subsequent defects in genome folding and cohesion establishment, particularly rely on SCC3. In response to your valuable suggestions, we have meticulously revised the relevant sections of our manuscript. We extend our sincere thanks for your insightful comments.

      (4) Line 230- what does the sentence SCC3 may enhance the interaction with DNA mean, the interaction of the cohesin complex?

      We are sorry for the ambiguity in our initial description and wish to clarify that SCC3 indeed plays a pivotal role in augmenting the interaction between the cohesin complex and DNA. Our observations revealed an upsurge in the signal intensity of SCC3 as cells transition from interphase to prophase, as depicted in Figure 2B. This enhancement correlates with the observed defects in scc3 mutants during prophase, suggesting that SCC3’s functional significance is particularly pronounced at this stage of the cell cycle. We have revised our manuscript to reflect these insights more accurately, in accordance with your valuable suggestions. We express our sincere gratitude for your guidance.

      (5) Oddly, and unexplainably the authors present data indicating that SCC3 interacts with RAD21.1, but not SMC1, SMC3, or REC8. The fact that the authors report that SCC3 only interacts with RAD21.1 but no other cohesin proteins is quite hard to explain.

      As argued in the point above, the available data do not provide compelling evidence supporting the interaction between SCC3 and other cohesin proteins. We have repeated yeast two-hybrid (Y2H) experiments yielding consistent outcomes, which also surprised us initially. In the revised manuscript, we further added the bimolecular fluorescence complementation (BiFC) results between SCC3 and other cohesin proteins in rice protoplast (Figure S10). These supplementary data affirm that SCC3 predominantly interacts with RAD21.1, excluding interactions with other cohesin proteins. While the absence of such interactions is perplexing, our investigations have failed to detect any binding between SCC3 and other cohesin proteins.

      A weak interaction between SCC3 and REC8 has been reported in Arabidopsis (Kuttig et al. bioRxiv https://doi.org/10.1101/2022.06.20.496767). We speculate that either these proteins do not interact or the yeast two-hybrid assays may be inadequate for detecting their interaction, as several factors can impede interaction in a heterologous system. In Figure 7, we could only detect the interaction between SCC3 and RAD21.1 in both Y2H and BiFC experiments. This suggests potential alterations in protein folding or conformation, or the involvement of additional regulatory factors modulating the interaction between SCC3 and other cohesin proteins. Notably, given RAD21.1’s pivotal role as a core component in the cohesin complex, our supplementary findings demonstrate the interactions between SMC1/3 and RAD21.1 (data not shown). Consequently, our current data propose a model wherein RAD21.1 and SMC1/3 form a circular structure, with SCC3 positioned on the outer periphery of the ring complex, associating specifically with RAD21.1 (Figure 8A).

      Reviewer #2:

      The authors did not consider creating heterozygous mutants for the replication fork. Moderate English language editing may be required.

      We extend our gratitude to the reviewer for their valuable suggestions. Initially, we did not explore the potential relationship between SCC3 and the replication fork. Cohesin, as we understand, becomes associated with DNA prior to DNA replication. The phenomenon of sister chromatid co-entrapment arises as replication forks traverse through cohesin rings, a process intricately linked to DNA replication dynamics. In this study, we exclusively observed aberrant chromosome structures in the scc3 mutant during interphase (Figure 2). We conjecture that these anomalies may stem from alterations in chromosome structure, such as genome folding and loop extrusion, rather than being directly attributable to the DNA replication fork. However, the precise nature of these chromosome structural aberrations during interphase in the scc3 mutant remains elusive, necessitating further comprehensive investigation in future studies. We have refined the language of our manuscript in accordance with the reviewer’s suggestions. Once again, we express our sincere appreciation for the invaluable suggestions provided.

      Reviewer #3:

      While the paper's conclusions are generally well-supported, further substantiation is needed for the claim that SCC3 inhibits template choice for sister chromatids. To bolster this conclusion, I recommend that the authors perform whole-genome sequencing on parental and F1 individuals from two rice variants, subsequently calculating the allele frequencies at heterozygous sites in the F1 individuals. If SCC3 indeed inhibits inter-sister chromatid repair in the wild type, we would anticipate a higher frequency of inter-homologous chromosome repair (i.e., gene conversion). This should be manifested as a bias away from the Mendelian inheritance ratio (50:50) in the offspring of the wild type compared to the offspring of the scc3+/- mutant.

      We express our sincere appreciation for your insightful suggestions. It is really a good suggestion. We have arranged to do this experiment. As it takes a long time to prepare plant materials and perform sequence analysis, we hope the ongoing sequencing work will yield important information supporting those hypotheses. As we have not obtained direct evidence that SCC3 is involved in sister chromatid repair, we changed the title to “SCC3 is an axial element essential for homologous chromosome pairing and synapsis”. Once again, we extend our gratitude for your invaluable suggestions.

      A point that warrants consideration is the placement of the protein interaction experiments involving SCC3 within the paper. It is presented relatively late in the manuscript. If the authors possess information regarding the interaction between RAD21.1 and SCC3 and how it relates to the functional study of RAD21.1, it could contribute to a more comprehensive analysis. However, if this information is unrelated to the current study, it might be advisable to omit it, as it appears to diverge from the main focus of this work.

      We express our sincere gratitude for your invaluable suggestions. It has been documented in yeast that the interaction between SCC3 and SCC1 is indispensable for the efficient loading of cohesin. In our study, we endeavored to elucidate the intricate relationships among various cohesin subunits. Through our investigations, we have discerned that RAD21.1 serves as a pivotal core subunit within the cohesin complex, facilitating interactions with both SMC1/3 and SCC3 (data not shown). Additionally, our findings indicate that the interaction between RAD21.1 and SCC3 is imperative for maintaining the stability of the cohesin ring and its association with DNA (data not shown). Consequently, the interaction between these two proteins assumes paramount importance for our subsequent analyses. This study holds significant promise for future investigations.

      It's worth noting that while the title of the study claims that "SCC3 inhibits inter-sister chromatids repair during rice meiosis," the last sentence of the abstract weakens this conclusion by using the word "seems." A study's title should ideally reflect the most definitive and conclusive findings.

      We sincerely appreciate your valuable suggestions. In response, we have revised the description in our manuscript to enhance its rigor.

      In Figure 8C, it appears that cohesin is depicted between two DNA strands.

      Figure 8C illustrates the process of sister chromatid repair during meiosis in the scc3 mutant. Two gray lines and two blue lines represent the four sister chromatids of two homologous chromosomes, respectively. In the wild type, cohesin plays a crucial role in tethering together the two sister chromatids. As per your reminder, cohesin should indeed encircle the two sister chromatids, as depicted in Figure 8B. Following a thorough evaluation and to mitigate any potential confusion, we have deleted Figure 8C.

    1. eLife assessment

      This study represents a useful description of a third interaction site between melanophilin and myosin-5a which has a role in regulating the distribution of pigment granules in melanocytes. While much of the data forms a solid case for this interaction, the inclusion of controls for the cellular studies and measurement of interaction affinities would have been helpful.

    2. Reviewer #1 (Public Review):

      Interactions known to be important for melanosome transport include exon F and the globular tail domain (GTD) of MyoVa with Mlph. Motivated by a discrepancy between in vitro and cell culture results regarding necessary interactions for MyoVa to be recruited to the melanosome, the authors used a series of pull-down and pelleting assay experiments to identify an additional interaction that occurs between exon G of MyoVa and Mlph. This interaction is independent of and synergistic with the interaction of Mlph with exon F. However, the interaction of the actin-binding domain of Mlph can occur either with exon G or with the actin filament, but not both simultaneously. These data lead to a modified recruitment model where both exon F and exon G enhance binding of Mlph to auto-inhibited MyoVa, and then via an unidentified switch (PKA?) the actin-binding domain of Mlph dissociates from MyoVa and interacts with the actin filament to enhance MyoVa processivity.

      The only weakness noted is that the authors could have had a more complete story if they pursued whether PKA phosphorylation/dephosphorylation of Mlph is indeed the switch for the actin-binding domain of Mlph to interact with exon G versus the actin filament.

    3. Reviewer #2 (Public Review):

      The authors identify a third component in the interaction between myosin Va and melanophilin- an interaction between a 32-residue sequence encoded by exon-g in myosin Va and melanophilin's actin binding domain. This interaction has implications for how melanosome motility may be regulated.

      The authors have now included some necessary controls that were requested. In terms of adding new information to increase the significance and impact of the paper, they added a single affinity measurement. Unfortunately, it did not involve Exon G specifically. Moreover, they did not add any new mechanistic or functional data to provide a more conceptual advance. For example, is the Exon G interaction regulated by phosphorylation? Is this what dictates the choice between Mlph's actin binding domain (ABD) binding to actin or to exon-G? How does local actin concentration influence this decision? What changes regarding melanosome dynamics in cells between these two alternatives? Do in vitro reconstitution assays show that binding to Exon-G instead of actin affects the processivity of a Rab27a/Myosin 5a/Mlph transport complex? Finally, while the authors make clear in the abstract and text that they are just identifying a third component that mediates the Melanophilin-dependent association of myosin-5a with melanosomes, the title gives the impression that they identified all three in this manuscript. I really think the title should be changed to something like "Identification of a third component that mediates the Melanophilin-dependent association of myosin-5a with melanosomes", as this accurately reflects what is new in this work.

    4. Author response:

      The following is the authors’ response to the original reviews.

      We appreciate your comments and suggestions on our manuscript.

      In particular, we have measured the affinity between the middle tail domain of myosin-5a (Myo5a-MTD) and the actin-binding domain of melanophilin (Mlph-ABD) using microscale thermophoresis, and obtained a Kd of ~0.56 μM, which is similar to the Kd of the globular tail domain of myosin-5a (Myo5a-GTD) for the GTD-binding motif of melanophilin (Mlph-GTBM). Moreover, we have performed Western blots of the lysate of transfected cells, showing that the proteins of the dominant negative construct and the negative control were expressed at similar levels without noticeable degradation.

      We appreciate the editors’ and reviewers’ comments on how melanophilin might be regulated in binding to the exon-G of myosin-5a and to actin filaments. Phosphorylation of melanophilin by protein kinase A is one possible mechanism. We will investigate this issue in our future study.

      We also took this opportunity to correct several minor errors in the manuscript. Textual alterations can be viewed in the “tracked change” version of the manuscript. Below are the comments from the editors and the two reviewers together with our point-by-point responses.

      eLife assessment

      This study represents a useful description of a third interaction site between melanophilin and myosin-5a which is important in regulating the distribution of pigment granules in melanocytes. While much of the data forms a solid case for this interaction, the inclusion of important controls for the cellular studies and measurement of interaction affinities would have been helpful.

      Public Reviews:

      Reviewer #1 (Public Review):

      Interactions known to be important for melanosome transport include exon F and the globular tail domain (GTD) of MyoVa with Mlph. Motivated by a discrepancy between in vitro and cell culture results regarding necessary interactions for MyoVa to be recruited to the melanosome, the authors used a series of pull-down and pelleting assay experiments to identify an additional interaction that occurs between exon G of MyoVa and Mlph. This interaction is independent of and synergistic with the interaction of Mlph with exon F. However, the interaction of the actin-binding domain of Mlph can occur either with exon G or with the actin filament, but not both simultaneously. These data lead to a modified recruitment model where both exon F and exon G enhance the binding of Mlph to auto-inhibited MyoVa, and then via an unidentified switch (PKA?) the actin-binding domain of Mlph dissociates from MyoVa and interacts with the actin filament to enhance MyoVa processivity.

      The only weakness noted is that the authors could have had a more complete story if they pursued whether PKA phosphorylation/dephosphorylation of Mlph is indeed the switch for the actin-binding domain of Mlph to interact with exon G versus the actin filament.

      We thank Reviewer #1 for the careful reading of the manuscript and appreciation of the study. We agree with the Reviewer that it is important to understand how the actin-binding domain of Mlph switches its interaction between the exon-G of Myo5a and the actin filament. We would like to pursue this direction in our future research.

      Reviewer #2 (Public Review):

      The authors identify a third component in the interaction between myosin Va and melanophilin- an interaction between a 32-residue sequence encoded by exon-g in myosin Va and melanophilin's actin-binding domain. This interaction has implications for how melanosome motility may be regulated.

      While this work is largely well done and certainly publishable following needed revisions (e.g. some affinity measurements, necessary controls for the dominant negative experiments), I believe that additional work would be required to make a more compelling case. First, the study provides just one more piece to a well-developed story (the role of exon-F and the GTD in myosin Va: melanophilin (Mlph) interaction), much of which was published 20 years ago by several labs. Second, the study does not demonstrate a physiological significance for their findings other than that exon-G plays an auxiliary role in the binding of myosin Va to Mlph. For example, what dictates the choice between Mlph's actin binding domain (ABD) binding to actin or to exon-G? Is it a PTM or local actin concentration? It is unlikely to be alternative splicing as exon-G is present in all spliced isoforms of myosin Va. And what changes regarding melanosome dynamics in cells between these two alternatives? Similarly, the paper does not provide any in vitro evidence that binding to exon-G instead of actin affects the processivity of a Rab27a/Myosin Va/Mlph transport complex. For example, if the ABD sticks to exon-G instead of actin, does that block Mlph's ability to promote processivity through its interaction with the actin filament during transport? In summary, given that the authors did not directly test their model either in vitro or in cells, I do not think this story represents a significant conceptual advance.

      We thank Reviewer #2 for the careful reading of the manuscript and the suggestions for improving it. As suggested by the reviewer, we have measured the affinity between the middle tail domain of Myo5a (Myo5a-MTD) and Mlph-ABD (Kd ~0.562 μM), which is similar to that between the globular tail domain of Myo5a (Myo5a-GTD) and the GTBM of Mlph. In addition, we have performed additional experiments showing the integrity and the expression level of the dominant negative constructs in the transfected cells.

      We believe more extensive experiments are required to address the other questions raised by the reviewer. For example, what dictates the choice between Mlph's actin binding domain (ABD) binding to actin or to exon-G is an open question. As we proposed, phosphorylation by protein kinase A is only one possible mechanism. We would like to pursue these questions in our future research.

      Recommendations for the authors:

      The reviewing editor feels strongly that addressing some of the points raised by the reviewers would make this a more compelling manuscript. In particular, a measurement of the affinity of the relevant fragments from melanophilin and myosin-5a would indicate that the interaction might be physiologically relevant. Concerning the dominant negative experiments, the lack of effect of an expressed fragment could be that the expressed fragments were simply degraded or expressed at too low of a level to be competing. The reviewer gives guidelines on how to address this. Reviewer #2 made a point that it would be compelling if the effect of phosphorylation as suggested in the model was tested, but we all agree that this could well be the subject of a later study. In addition, the authors make a very interesting proposal for how protein kinase A could be involved in this regulation as has been suggested previously. Perhaps the use of phosphomimetic mutations could give some insight into this. Such experiments, if consistent with the proposed model, would certainly raise the impact of this study. Finally, a very clear periodicity in hydrophobic amino acids is apparent in the interacting sequences of both Myo5 (yrisLykrMidLmeqLekqdktVrkLkkqLkvFakkIgeLevgqmen) and Mlph (tdeeLseMedrVamtAseVqqAeseIsdIesrIaaLra). This strongly suggests a leucine-zipper-like coiled coil, rather than an interaction mediated solely by charge. Recent and easily accessible software such as AlphaFold multimer might yield important structural insight into the binding configuration and might help rationalize the effect of the mutations herein.

      We thank the editors and the reviewers for their suggestions for improving the manuscript. We have performed several essential experiments to address the concerns raised by the reviewers.

      (1) Regarding the affinity of the relevant fragments from melanophilin and myosin-5a. We have measured the affinity between Mlph-ABD and Myo5a-MTD using MST (Kd ~562 nM) (see revised Figure 3A).

      (2) Regarding the concerns on the dominant negative experiments. We have examined the molecular sizes and expression levels of  Mlph or Myo5a constructs by Western blots. First, we show that all constructs have correct molecular size in transfected cells (see revised Figure 6C and 7D), indicating that the inability of Myo5a or Mlph truncations to generate dilute-like phenotypes was not due to the intracellular degradation of the EGFP fusion protein. Second, by correcting for the percentage of transfected cells, we show that the overall expression levels of the wild-type construct and the mutants are roughly equal. Third, we categorized the expression levels into high and low, and calculated percentage of the DN phenotype in high and low expression levels. The results are consistent with the percentage of DN phenotype in total EGFP fusion protein cells.

      (3) Regarding the suggestion to investigate the effect of phosphorylation by protein kinase A on Mlph-ABD’s interaction with Myo5a and actin filaments. We understand that it is important to elucidate the mechanism by which the actin-binding domain of Mlph switches its interaction between the exon-G of Myo5a and actin filaments. However, as we proposed, phosphorylation by protein kinase A is one possible mechanism, and more extensive experiments are required to address this question. Therefore, we would like to pursue it in our future research.

      (4) Regarding the suggestion to predict the interaction between the exon-G of myosin-5a and Mlph-ABD using AlphaFold. We have used AlphaFold multimer to predict the Myo5a-MTD/Mlph-ABD interaction. Remarkably, AlphaFold predicted that the binding of Myo5a-MTD to Mlph-ABD is mediated by an antiparallel coiled-coil formed by Myo5a (1430-1467) and Mlph (450-481), just as predicted by the editors. This prediction is also consistent with our finding that the exon-G of Myo5a interacts with Mlph-ABD. However, the predicted model cannot explain our mutagenesis results. We will pursue this point in future research. Nevertheless, we are grateful to the editors for bringing this idea to our attention, because it will help us to design experiments to investigate the nature of the Myo5a-exon-G/Mlph-ABD interaction.

      Reviewer #1 (Recommendations For The Authors):

      Specific minor comments

      Q1: In figs 6-7 an overlay between DAPI and EGFP would be helpful for the reader to see perinuclear distribution.

      As suggested, we have added the merged images of DAPI and EGFP in the revised Figure 6 and 7.

      Q2: The delta symbol in the pdf text was corrupted.

      The corrupted delta symbol has been fixed in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Q1: Please explain in detail early in the text what exon-G is - length, position in the tail, and evidence that it is a coiled coil (CC). Of note, is it only long enough for about 4 heptad repeats? Has it been shown biochemically to form a CC? Is the CC irreversible? What would be the consequence of removing the exon-G CC on the ability of surrounding regions to bind Mlph (exon-F and the GTD)?

      We thank the reviewer for this suggestion. In the revision, we added a new paragraph (the first paragraph in the results section) and revised Figure 1A to introduce the middle tail domain and alternatively spliced exons of Myo5a.

      Exon-G is 32 amino acids in length and is located at the C-terminal region of the middle tail domain, immediately before the globular tail domain. The exon-G region was predicted to form a short coiled-coil using online tools (such as Paircoil), and this prediction has not been tested biochemically. Moreover, we do not know whether the exon-G coiled-coil is reversible or not.

      We have not examined the effect of removing the whole exon-G on the interaction between the GTD and Mlph-GTBM. The exon-G (residues 1436-1467) and the GTD core (residues 1498-1877) are separated by a long loop of 31 residues. We therefore expect that removing exon-G will not affect the GTD/Mlph-GTBM interaction.

      Physically, exon-F is immediately followed by exon-G, and those two regions might interfere with each other. In our preliminary study, we found that removing the whole exon-G abolished the interaction between exon-F and Mlph-EFBD. On the other hand, removing the C-terminal half (residues 1454-1467) of exon-G had little effect on the interaction between exon-F and Mlph-EFBD (see Figure 2C). In this work, we intentionally selected the latter construct for functional analysis of the exon-G/Mlph-ABD interaction, because removing the C-terminal half of exon-G abolishes the interaction with Mlph-ABD, but does not affect the exon-F/Mlph-EFBD interaction.

      Q2: Figures 1-3. While the pulldown experiments demonstrating an interaction between Mlph-ABD residues 446-571 and Myo5a-MTD are a good start, one would like to see affinity measurements to gauge the likelihood that this interaction is physiologically relevant. The same goes for the pulldown experiments demonstrating an interaction between (i) the C-terminal half of exon-G (residues 1453-1467) and the Mlph-ABD, (ii) between residues 1411-1467 (a short peptide containing exon-F and exon-G) and the Mlph-ABD, and (iii) between residues 1436-1467 (a short peptide containing exon-G) and the Mlph-ABD. This would also apply to the pulldowns in 3C-3E where versions of the proteins with charge residue changes were tested.

      We agree with the reviewer that determining the affinities between Mlph-ABD and Myo5a-MTD and their variants would be helpful in understanding the physiological relevance of the exon-G/Mlph-ABD interaction. However, the extensive experiments suggested by the reviewer require many high-quality, purified proteins, which are not trivial to obtain.

      Nevertheless, we think it is important to know the affinity between Myo5a-MTD and Mlph-ABD (both wild-type), as this parameter can be used to compare the three interactions between Myo5a and Mlph. Therefore, we have measured the affinity between Myo5a-MTD and Mlph-ABD using microscale thermophoresis (MST). The dissociation constant (Kd) of Myo5a-MTD for Mlph-ABD is 0.562 ± 0.169 µM, which is similar to that between Myo5a-GTD and Mlph-GTBM (~1 µM) (Geething & Spudich (2007) JBC 282:21518). Consistent with the GST pulldown results, MST shows that deletion of the C-terminal half of exon-G (1453-1467) greatly decreases the MST signals (see revised Figure 3A).

      Q3: While the dominant negative (DN) approach to testing functional significance is OK, rescuing dilute/myosin Va null melanocytes with full-length myosin Va containing the various deletions would have been more convincing. Also, the authors must show (i) that the DN constructs are the correct size in transfected cells (i.e. are not degraded), and (ii) that they are expressed at roughly equal levels (either by doing Westerns and correcting for the percent of transfected cells, or by measuring total cellular fluorescence in transfected cells). Without this information, it remains possible that constructs not exhibiting a DN effect are simply degraded or poorly expressed. This applies to all the DN data in Figures 6 and 7.

      We agree with the reviewer that Myo5a null melanocytes are ideal for investigating exon-G function. Unfortunately, we do not have Myo5a null melanocytes derived from dilute mice.

      To confirm the integrity of the overexpressed proteins in the transfected cells, we performed Western blots of those proteins, including EGFP-Mlph-RBD (wild-type and two mutants) and Myo5a-Tail (wild-type and G mutant), in the lysate of the transfected cells. The Western blots show that all those proteins have the correct molecular masses, indicating no degradation of the overexpressed proteins (see revised Figure 6C and 7C). Moreover, by correcting for the percentage of transfected cells, we show that the overall expression levels in each transfected cell of the wild-type construct and the mutants are roughly equal. This information is included in the revised manuscript (Line 222-225; 237-241).

      Q4: The authors scored the DN phenotype as yes/no but it most likely varies depending on the degree of over-expression. Showing that the degree of melanosome centralization scales with the degree of overexpression, and that the correlation between expression level and phenotype varies depending on the construct would strengthen the results.

      We agree with the reviewer’s prediction that the degree of DN phenotype should depend on the over-expression level. We analyzed the EGFP signals of transfected cells and found very few cells with a medium expression level. Therefore, we simply categorized the expression levels into high and low, and calculated the DN phenotype in each category as shown in the table below. These results are consistent with the expectation that the degree of DN phenotype depends on the over-expression level of the transfected constructs.

      Author response table 1.

      Percentage of the EGFP-expressing cells with perinuclear aggregation of melanosomes
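
      The stratification described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the function name `dn_percentage_by_level`, the intensity threshold, and the example measurements are all hypothetical.

```python
from collections import defaultdict


def dn_percentage_by_level(cells, threshold):
    """Percentage of cells showing the DN phenotype in each expression bin.

    cells: iterable of (egfp_intensity, has_dn_phenotype) pairs.
    threshold: hypothetical cut-off separating "low" from "high" expression.
    Only bins that contain at least one cell appear in the result.
    """
    counts = defaultdict(lambda: [0, 0])  # level -> [dn_cells, total_cells]
    for intensity, has_dn in cells:
        level = "high" if intensity >= threshold else "low"
        counts[level][0] += int(has_dn)
        counts[level][1] += 1
    return {level: 100.0 * dn / total for level, (dn, total) in counts.items()}


# Made-up measurements for illustration only; not data from the study.
example_cells = [(120, False), (150, False), (130, True),
                 (900, True), (950, True), (1000, False)]
print(dn_percentage_by_level(example_cells, threshold=500))
```

      Binning into two levels matches the authors' statement that very few cells showed intermediate expression; with a broader spread, a continuous correlation would be more informative.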

      Q5: The conclusion from the data in Figure 8A- "the presence of both exon-F and exon-G is insufficient for binding to the Mlph occupied by Myo5a, but sufficient for binding to the unoccupied Mlph"- should be verified by also doing the experiment in myosin Va knockdown cells.

      We agree. Unfortunately, our RNAi knockdown of Myo5a in melanocytes is not ideal, and we do not have Myo5a knockout melanocytes. We will pursue this point in the future.

      Q6: Line 213 "three Mlph-binding regions, i.e., exon-F, exon-F, and GTD (Figure 7A)" has a typo.

      This typo has been corrected.

      Q7: The authors should provide high mag insets for the images in Figure 8.

      As suggested, we have revised Figure 8 by including high mag insets for the images.

    1. Author response

      Reviewer #1 (Public Review):

      Summary:

      The authors aimed to modify the characteristics of the extracellular matrix (ECM) produced by immortalized mesenchymal stem cells (MSCs) by employing the CRISPR/Cas9 system to knock out specific genes. Initially, they established VEGF-KO cell lines, demonstrating that these cells retained chondrogenic and angiogenic properties. Additionally, lyophilized cartilage tissues produced by these cells exhibited retained osteogenic properties.

      Subsequently, the authors established RUNX2-KO cell lines, which exhibited reduced COLX expression during chondrogenic differentiation and notably diminished osteogenic properties in vitro. Transplantation of lyophilized cartilage tissues produced by RUNX2-KO cell lines into osteochondral defects in rat knee joints resulted in the regeneration of articular cartilage tissues as well as bone tissues, a phenomenon not observed with tissues derived from parental cells. This suggests that gene-edited MSCs represent a valuable cell source for producing ECM with enhanced quality.

      Strengths:

      The enhanced cartilage regeneration observed with ECM derived from RUNX2-KO cells supports the authors' strategy of creating gene-edited MSCs capable of producing ECM with superior quality. Immortalized cell lines offer a limitless source of off-the-shelf material for tissue regeneration.

      We thank the reviewer for the interest in our work. However, we want to clarify that the present manuscript does not report the generation of ECM of “superior quality”, but rather of ECM with modulated composition and thus function.

      Weaknesses:

      Most data align with anticipated outcomes, offering limited novelty to advance scientific understanding. Methodologically, the chondrogenic differentiation properties of immortalized MSCs appeared deficient, evidenced by Safranin-O staining of 3D tissues and histological findings lacking robust evidence for endochondral differentiation. This presents a critical limitation, particularly as authors propose the implantation of cartilage tissues for in vivo experiments. Instead, the bulk of data stemmed from type I collagen scaffold with factors produced by MSCs stimulated by TGFβ.

      The chondrogenic differentiation of our MSOD-B line and its capacity to undergo endochondral ossification have been robustly demonstrated in previous studies (Pigeot et al., Advanced Materials 2021 and Grigoryan et al., Science Translational Medicine 2022). In the present manuscript, we thus compare the chondrogenic capacity of the newly established VEGF-KO and RUNX2-KO lines to that of MSOD-B cells. We demonstrate by qualitative (Safranin-O staining, collagen type II and collagen type X immunostaining) and quantitative (glycosaminoglycan assay) methods that the generated tissues consist of cartilage grafts of quality similar to that of the MSOD-B counterpart. Of note, the Safranin-O staining was performed on lyophilized tissues, which can alter the staining quality/intensity. We will thus provide additional staining of generated tissues pre-lyophilization.

      The rationale behind establishing VEGF-KO cell lines remains unclear. What specific outcomes did the authors anticipate from this modification?

      VEGF is a known master regulator of angiogenesis and a key mediator of endochondral ossification. It has also been extensively used in bone tissue engineering studies as a supplemented factor – primarily in the form of VEGFα – to increase the vascularization and thus outcome of bone formation of engineered grafts (https://www.nature.com/articles/s42003-020-01606-9, https://www.sciencedirect.com/science/article/pii/S8756328216301752). In our study, it was thus identified as a natural candidate to demonstrate the possibility to generate VEGF-KO cartilage and subsequently assess the functional impact on both the angiogenic and osteogenic potential of resulting cartilage tissue.

      Insufficient depth was given to elucidate the disparity in osteogenic properties between those observed in ectopic bone formation and those observed in transplantation into osteochondral defects. While the regeneration of articular cartilage in RUNX2-KO ECM presents intriguing results, the study lacked an exploration into underlying mechanisms, such as histological analyses at earlier time points.

      Using RUNX2-KO ECM, we aimed at demonstrating the impact on cartilage remodeling and bone formation. This was performed ectopically but also in the rat osteochondral defect as a regenerative set-up of higher clinical relevance. We agree with the reviewer that additional experimental groups and time-points (not only earlier but also longer ones) would offer a better mechanistic understanding of the ECM contribution to the joint repair. However, as stated in our manuscript this is a proof-of-concept study that successfully demonstrated the influence of the cartilage ECM modification on the in vivo skeletal regeneration. A follow-up study would need to be performed to complement existing evidence and strengthen the relevance of our approach for cartilage repair.

      Reviewer #2 (Public Review):

      The manuscript submitted by Sujeethkumar et al. describes an alternative approach to skeletal tissue repair using extracellular matrix (ECM) deposited by genetically modified mesenchymal stromal/stem cells. Here, they generate loss-of-function mutations in VEGF or RUNX2 in a BMP2-overexpressing MSC line and define the differences in the resulting tissue-engineered constructs following seeding onto a type I collagen matrix in vitro, and following lyophilization and subcutaneous and orthotopic implantation into mice and rats. Some strengths of this manuscript are the establishment of a platform by which modifications in cell-derived ECM can be evaluated both in vitro and in vivo, the demonstration that genetic modification of cells results in complexity of in vitro cell-derived ECM that elicits quantifiable results, and the admirable goal to improve endogenous cartilage repair. However, I recommend the authors clarify their conclusions and add more information regarding reproducibility, which was one limitation of primary-cell-derived ECMs.

      We thank the reviewer for the positive evaluation of our work.

      Overcoming the limitations of native/autologous/allogeneic ECMs such as complete decellularization and reduction of batch-to-batch variability was not specifically addressed in the data provided herein. For the maintenance of ECM organization and complexity following lyophilization, evidence of complete decellularization was not addressed, but could be easily evaluated using polarized light microscopy and quantification of human DNA for example in constructs pre and post-lyophilization.

      We will clarify the experiments and characterization performed with lyophilized tissues versus those performed with decellularized ones. We will also provide evidence of DNA removal in our decellularized ECMs.

      It would be ideal to see minimization of batch-to-batch variability using this approach, as mitigation of using a sole cell line is likely not sufficient (considering that the sole cell line-derived Matrigel does exhibit batch-to-batch and manufacturer-to-manufacturer variability). I recommend adding details regarding experimental design and outcomes not initially considered. Inter- and intra-experimental reproducibility was not adequately addressed. The size of in vitro-derived cartilage pellets was not quantified, and it is not clear that more than one independent 'differentiation' was performed from each gene-edited MSC line to generate in vitro replicates and constructs that were implanted in vivo.

      We thank the Reviewer for the comment on variability and reproducibility. Using a cell line does confer higher robustness but indeed does not guarantee unlimited consistency of batch production. We will temper our claims in the discussion and mention the need to regularly re-characterize cell line properties across passages.

      In our study, grafts were generated from various batches and tested in more than one experimental repeat. This will be further described in the revised version of our manuscript. We will also include data on the size variability of generated tissues.

      The use of descriptive language in describing conclusions may mislead the reader and should be modified accordingly throughout the manuscript. For example, although this reviewer agrees with the comparative statements made by the authors regarding parental and gene-edited MSC lines, non-quantifiable terms such as 'frank' 'superior' (example, line 242) are inappropriate and should rather be discussed in terms of significance. Another example is 'rich-collagenous matrix,' which was not substantiated by uniform immunostaining for type II collagen (line 189).

      I have similar recommendations regarding conclusive statements from the rat implantation model, which was appropriately used for the purpose of evaluating the response of native skeletal cells to the different cell-derived ECMs. Interpretations of these results should be described with more accuracy. For example, increased TRAP staining does not indicate reduced active bone formation (line 237). Many would not conclude that GAGs were retained in the RUNX2-KO line graft subchondral region based on the histology. Quantification of % chondral regeneration using histology is not accurate as it is greatly influenced by the location in the defect from which the section was taken. Chondral regeneration is usually semi-quantified from gross observations of the cartilage surface immediately following excision. The statements regarding integration (example line 290) are not founded by histological evidence, which should show high magnification of the periphery of the graft adjacent to the native tissue.

      We thank the Reviewer for the constructive suggestions. We will revise language accordingly throughout the manuscript.

      Reviewer #3 (Public Review):

      Summary:

      In this study, the authors have started off using an immortalized human cell line and then gene-edited it to decrease the levels of VEGF1 (in order to influence vascularization), and the levels of Runx2 (to decrease chondro/osteogenesis). They first transplanted these cells with a collagen scaffold. The modified cells showed a decrease in vascularization when VEGF1 was decreased, and suggested an increase in cartilage formation.

      In another study, the matrix generated by these cells was subsequently remodeled into a bone marrow organ. When RUNX2 was decreased, the cells did not mineralize in vitro, and their matrices expressed types I and II collagen but not type X collagen in vitro, in comparison with unedited cells. In vivo, the author claims that remodeling of the matrices into bone was somewhat inhibited. Lastly, they utilized matrices generated by RUNX2 edited cells to regenerate chondro-osteal defects. They suggest that the edited cells regenerated cartilage in comparison with unedited cells.

      Strengths:

      -The notion of inducing changes in the ECM by genetically editing the cells is a novel one, as it has long been thought that ECM composition influences cell activity.

      -If successful, it may be possible to make off-the-shelf ECMs to carry out different types of tissue repair.

      We thank the Reviewer for the critical evaluation of our work and the highlighted novelty of it.

      Weaknesses:

      -The authors have not generated histologically identifiable cartilage or bone in their transplants of the cells with a type I scaffold.

      The chondrogenic differentiation of our MSOD-B line and its capacity to undergo endochondral ossification have been robustly demonstrated in previous studies (Pigeot et al., Advanced Materials 2021 and Grigoryan et al., Science Translational Medicine 2022). In the present manuscript, we thus compare the chondrogenic capacity of the newly established VEGF-KO and RUNX2-KO lines to that of MSOD-B. We demonstrate by qualitative (Safranin-O staining, collagen type II and collagen type X immunostaining) and quantitative (glycosaminoglycan assay) methods that the generated tissues consist of cartilage tissue of quality similar to that of MSOD-B. However, the Safranin-O staining was performed on lyophilized tissues, which can alter the staining quality/intensity. We will thus provide additional staining of generated tissues pre-lyophilization.

      Regarding the contested formation of bone in vivo by our ECM grafts, we have provided compelling qualitative evidence via Masson's trichrome staining and quantification of mineralized volume by µCT. Both cortical bone and trabecular structures were identified ectopically. These are standard evaluation methods in the field, and we would be happy to receive additional suggestions from the Reviewer.

      -In many cases, they did not generate histologically identifiable cartilage with their cell-free-edited scaffold. They did generate small amounts of bone but this is most likely due to BMPs that were synthesized by the cells and trapped in the matrix.

      We now appreciate that the Reviewer agrees on the successful formation of bone induced by our engineered grafts. We however still respectfully disagree with the “small amount of bone” statement since our MSOD-B and MSOD-B VEGF KO cartilage grafts led to the full generation of a mature ectopic bone organ (that is, also composed of extensive marrow). This has been assessed qualitatively and quantitatively.

      We agree with the Reviewer on the key role of BMP-2 in the remodeling process into bone and bone marrow, which we have extensively described in our previous publication (Pigeot et al., Advanced Materials 2021). We previously demonstrated that the low amount of BMP-2 (in the dozens of nanogram/tissue range) embedded in the matrix is not sufficient per se to induce ectopic endochondral ossification. It is the combined presence of GAGs in the matrix -thus cartilage- that allows the success of bone formation. Since we have already demonstrated in the present manuscript that the GAGs content is the same in MSOD-B and MSOD-B edited ECMs, we will provide additional data demonstrating the maintenance of BMP-2 content in all generated cartilage tissues.

      -There is a great deal of missing detail in the manuscript.

      We will provide additional information on the MSOD-B line and the overall methodology in our revised version.

      -The in vivo study is underpowered, the results are not well documented pictorially, and are not convincing.

      We will provide additional information and pictures related to our in vivo studies. We believe our group size supports our conclusions confirmed by statistical assessment.

      -Given the fact that they have genetically modified cells, they could have done analyses of ECM components to determine what was different between the lines, both at the transcriptome and the protein level. Consequently, the study is purely descriptive and does not provide any mechanistic understanding of what mixture of matrix components and growth factors works best for cartilage or bone. But this presupposes that they actually induced the formation of bona fide cartilage, at least.

      We thank the Reviewer for the suggestion. However, our study did not aim at understanding which ECM graft composition works best for cartilage or bone regeneration. Instead, we propose the exploitation of our cellular tools to interrogate the function of key ECM constituents and their impact on skeletal regeneration. We once more confirm that we generated lyophilized cartilage grafts, which will be more clearly supported by histological assessment before lyophilization.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Chen and colleagues first compared the cartilage tissues collected from OA and HA patients using histology and immunostaining. Then, a genome-wide DNA methylation analysis was performed, which informed the changes of a novel gene, TNXB. IHC confirmed that TNXB has a lower expression level in HA cartilage than OA. Next, the authors demonstrated that TNXB levels were reduced in the HA animal model, and intraarticular injection of AAV carrying TNXB siRNA induced cartilage degradation and promoted chondrocyte apoptosis. Based on KEGG enrichment, histopathological analysis, and western blot, the authors also showed the relationship between TNXB and AKT phosphorylation. Lastly, AKT agonist, specifically SC79 in this study, was shown to partially rescue the changes of in vitro-cultured chondrocytes induced by Tnxb knock-down. Overall, this is an interesting study and provided sufficient data to support their conclusion.

      Strengths:

      (1) Both human and mouse samples were examined.

      (2) The HA model was used.

      (3) Genome-wide DNA methylation analysis was performed.

      Weaknesses:

      (1) In some experiments, the selection of the control groups was not ideal.

      Thank you for your comments. The reviewer raised a concern about using human OA cartilage, instead of healthy cartilage, as the control. This is an important detail that we did not describe in the previous version. We have added our explanation to the revised Methods.

      (2) More details on analyzing methods and information on replicates need to be included.

      We greatly appreciate your careful review and helpful suggestions. We have added detailed information to our revised draft.

      (3) Discussion can be improved by comparing findings to other relevant studies.

      We thank the reviewer very much for the opportunity to improve our manuscript. We have improved the discussion as the reviewer suggested in Recommendation 13.

      (4) The use of transgenic mice with conditional Tnxb depletion can further define the physiological roles of Tnxb.

      Thank you for this valuable comment. We understand that conditional Tnxb-KO mice would be very helpful for studying the biological roles of Tnxb, and they will be generated and used in our future studies.

      Recommendations For the Authors:

      (1) Please add more information about HA such as incidence to highlight the importance of the study.

      We greatly appreciate your careful review and helpful suggestions. We have provided more information about the importance of HA study in revised Introduction. Please see lines 90-93 and 103-112.

      (2) Please justify the use of OA cartilage, instead of normal tissues, as the control.

      Thank you for your suggestion. We certainly would have liked to use healthy cartilage as the control, but it was extremely difficult to obtain enough control samples from healthy individuals. Despite the mechanistic and phenotypic differences between HA and OA, OA is often used as a “disease” control to reveal the characteristics of HA 1,2. Thus, we measured cartilage degeneration and DNA methylation differences in HA and OA patients. We have provided the statement and evidence in the revised manuscript. Please see lines 144-145.

      (3) Please provide details of how to calculate the Cartilage wear area ratio in Figure 1D, and measure the positive staining area in Figure 1F.

      We apologize for the omission you pointed out. Here, we provide detailed information on how the stained areas were quantified. Specifically, in Figure 1D, we obtained the cartilage area ratio by calculating the ratio of the blue cartilage staining area to the whole tissue area using ImageJ software. In Figure 1F, the area of positive staining was determined after secondary antibody treatment and color development using DAB chromogen (brown stain). We then obtained the positive staining area ratio by calculating the ratio of the positive staining area to the whole cartilage area using ImageJ software.
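
      The area-ratio calculation described above can be illustrated with a short sketch. It assumes the images have already been thresholded into binary masks (for example, exported from ImageJ); the function name `area_ratio` and the toy masks are hypothetical and do not come from the study.

```python
def area_ratio(stain_mask, region_mask):
    """Fraction of region pixels that are also stain-positive.

    Both arguments are 2D lists of 0/1 values with the same shape:
    stain_mask marks stain-positive pixels, region_mask marks the
    reference area (whole tissue or whole cartilage).
    """
    region_pixels = 0
    positive_pixels = 0
    for stain_row, region_row in zip(stain_mask, region_mask):
        for stained, in_region in zip(stain_row, region_row):
            if in_region:
                region_pixels += 1
                if stained:
                    positive_pixels += 1
    if region_pixels == 0:
        raise ValueError("region mask is empty")
    return positive_pixels / region_pixels


# Toy masks for illustration only.
stain = [[1, 0, 0],
         [1, 1, 0]]
region = [[1, 1, 0],
          [1, 1, 0]]
print(area_ratio(stain, region))
```

      Restricting the count to pixels inside the reference mask keeps background staining outside the tissue from inflating the ratio.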

      (4) Please label the location of hemorrhagic ferruginous deposits in Figure 1.

      Thank you for your valuable suggestion. We have used black arrows to indicate hemorrhagic ferruginous deposits in revised Figure 1A.

      (5) Please define the meaning of "n" in all figure legends, such as technical or biological replicates.

      Thanks for your suggestion. We have defined the meaning of "n" in all figure legends in revised manuscript.

      (6) In Figure 3, please increase the font size of B, D, F, H, and J. The same applies to other figures.

      Thank you for your valuable suggestion. We have increased the font size of figures in our revised manuscript.

      (7) Line 327, "(Figure 1, F and G)" should be Figure 2F, G.

      Thank you for the reminder. We have corrected it in the revision. Please see line 347.

      (8) Reduced TNXB levels in human HA cartilage are one of the major findings in this study. Currently, only semi-quantitative IHC was used to draw the conclusion. A second method, such as real-time PCR or western blot, is required.

      Thank you for your suggestion. Unfortunately, we did not have enough samples of human HA cartilage for qPCR and WB experiments, owing to severe erosion of the HA cartilage. We have pointed out this limitation in the revised draft. Please see lines 445-448.

      (9) Figure 3 shows that reduced Tnxb was accompanied by the increased Dnmt1. In addition, this study is about methylation. Have the authors tested the change of Dnmt1 levels when Tnxb was knocked down?

      Thanks for your suggestion. According to the reviewer's suggestion, we have tested the expression of Dnmt1 in Tnxb-KD chondrocytes, and no significant alteration was observed. Please see the following Figure.

      Author response image 1.

      Figure Legend: Representative IHC staining of Dnmt1 in articular cartilage from Tnxb-KD HA mice. Corresponding quantification of the proportion of Dnmt1 positive regions. Red arrows indicate positive cells. Scale bar: 100 μm. Data were presented as means ± SD; n = 5 in each group. ns = no significance by unpaired Student’s t test.

      (10) Also, is there a causal relationship between Tnxb levels and the distribution of methylation levels? Any related study was performed?

      Following the valuable suggestion of the reviewer, we used two well-known DNA methyltransferase inhibitors (RG108 or 5-Aza-dc) 3 to examine whether DNA methylation regulates transcriptional expression of TNXB. We found that both inhibitors significantly up-regulated Tnxb mRNA level. We have added this result to the revised Supplementary Figure 4 and draft (lines 292-296 and 369-374).

      (11) In Figure 6, what was the control of "AKT agnost" group?

      Thank you for your suggestion. We feel sorry for our negligence and we have added the vehicle group as a control for AKT agonists in Figure 6 in our revised manuscript.

      (12) Previous studies have reported the involvement of TNXB in TGF-β signaling. Have the authors examined the effect of TNXB on TGF-β signaling in chondrocytes?

      Thank you for your suggestion. Here, we examined the expression of TGF-β signaling in Tnxb-KD chondrocytes, and no significant changes were observed. We have discussed this result in the revised draft (lines 475-479) and added it to the revised Supplementary Figure 7.

      (13) Discussion can be improved. For example, have previous studies reported the association between TNXB and methylation in other cells/tissues? In addition to apoptosis, are there other potential mechanisms underlying the protective role of TNXB in chondrocytes?

      Thank you for your valuable comments. Previous studies have shown differential DNA methylation of TNXB in whole blood from rheumatoid arthritis patients and in retinal pigment epithelium from patients with age-related macular degeneration 4,5. Herein, we are the first to report the association between DNA methylation of TNXB and HA cartilage degeneration. Published studies on the physiological function of TNXB are limited, and most of them report the effect of TNXB on extracellular matrix organization 6,7. In our work, we found that TNXB regulated the phosphorylation of AKT. Since previous reports showed that AKT controls the expression of Mmp13 8, we reasoned that TNXB might regulate chondrocyte extracellular matrix organization, in addition to its function in apoptosis. We have discussed these points in the revised manuscript (lines 462-464 and 495-501).

      (14) The manuscript writing needs to be improved. Typos and grammar issues were noted.

      Thanks. We have revised and polished the language, and we hope the revised version is acceptable.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript mainly studied the biological effect of tenascin XB (TNXB) on hemophilic arthropathy (HA) progression. Using bioinformatic and histopathological approaches, the authors identified the novel candidate gene TNXB for HA. Next, the authors showed that TNXB knockdown leads to chondrocyte apoptosis, matrix degeneration, and subchondral bone loss in vivo/vitro. Furthermore, AKT agonists promoted extracellular matrix synthesis and prevented apoptosis in TNXB knockdown chondrocytes.

      Strengths:

      In general, this study significantly advances our understanding of HA pathogenesis. The authors utilize comprehensive experimental strategies to demonstrate the role of TNXB in cartilage degeneration associated with HA. The results are clearly presented, and the conclusions appear appropriate.

      Weaknesses:

      Additional clarification is required regarding the gender of the F8-/- mouse in the study. Is the mouse male or female?

      We feel sorry that we did not provide enough information about the gender of the F8-/- mouse in the previous draft. Here, we used male F8-/- mice as the study subjects for our experiments. Hemophilia A is predominantly seen in males because of the X chromosome linkage 9.

      Recommendations For The Authors:

      Some issues need to be addressed in the manuscript:

      (1) During the progression of HA, in addition to cartilage degeneration, synovial hypertrophy and inflammation are also significant symptoms. How is the expression of TNXB in HA synovium?

      Thank you for your valuable comments. According to the reviewer's suggestion, we tested the expression of TNXB in the synovium and found no statistically significant difference in its expression level (Supplementary Figure 2). Please see lines 347-349.

      (2) Lines 183-188. The methods of virus infection should be more detailed. What was the concentration of the AAVs injected? And how many doses were administered?

      Thank you for your suggestion. We have added an explanation of virus infection and injected doses in the revised Methods section (lines 205-206).

      (3) Line 197-198. Could the author double-check the decalcification time for human cartilage samples? Is it for 3 months? Or for 3 weeks?

      Thank you for your suggestion. We have reconfirmed that the human cartilage samples were decalcified for 3 months.

      (4) Line 343-344 "Above results suggest that TNXB might be protective against HA and its cartilage suppression is closely related to HA development." The conclusion is inappropriate, please revise it.

      Thanks for your suggestion. We have revised this conclusion into “Above results suggest that the suppression of TNXB in cartilage promotes the HA development”. Please see lines 365-366.

      (5) Line 326-327, the IHC staining for human samples is shown in Figure 2, not Figure 1. Please double check and revise it.

      Thanks for the reminder. We apologize for this oversight and have corrected it in the revision.

      (6) For Figure 1B, it shows the MRI images of knee joints. However, the method section lacks details regarding the MRI imaging scan and analysis. Could the author include this information in the method section?

      Thank you for your valuable comments. We have added the method of MRI imaging scan and analysis in revised Methods. Please see lines 154-163.

      (7) In Figure 5, The statistical result of Bcl-2 is inconsistent with its Western blot band. Please check.

      Thanks for the reminder. We have corrected it in the revision.

      (8) Please read through the text carefully to check for language problems. For example, in Line 68 "Our" not "our".

      Thanks for the reminder. We have corrected it in the revision. Please see Line 68.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Dr. Chen et al. investigates the genes that are differentially methylated and associated with cartilage degeneration in hemophilia patients. The study demonstrates the functional mechanisms of the TNXB gene in chondrocytes and F8-/- mice. The authors first showed significant DNA methylation differences between hemophilic arthritis (HA) and osteoarthritis through genome-wide DNA methylation analysis. Subsequently, they showed a decreased expression of the differentially methylated TNXB gene in cartilage from HA patients and mice. By knocking down TNXB in vivo and in vitro, the results indicated that TNXB regulates extracellular matrix homeostasis and apoptosis by modulating p-AKT. The findings are novel and interesting, and the study presents valuable information in blood-induced arthritis research.

      Strengths:

      The authors adopted a comprehensive approach by combining genome-wide DNA methylation analysis, in vivo and in vitro experiments using human and mouse samples to illustrate the molecular mechanisms involved in HA progression, which is crucial for developing targeted therapeutic strategies. The study identifies Tenascin XB (TNXB) as a central mediator in cartilage matrix degradation. It provides mechanistic insights into how TNXB influences cartilage matrix degradation by regulating the activation of AKT. It opens avenues for future research and potential therapeutic interventions using AKT agonists for cartilage protection in hemophilic arthropathy. The conclusions drawn from the study are clear and directly tied to the findings.

      Weaknesses:

      (1) The study utilizes a small sample size (N=5 for both osteoarthritis and hemophilic arthropathy). A larger sample size would enhance the generalizability and statistical power of the findings.

      Thank you for pointing out this deficiency. Indeed, our sample size is relatively small, although the overall sample size was sufficient for statistical analyses. We have added this limitation to the discussion in the revised manuscript. Please see lines 445-448. Considering the small sample size, we subsequently performed a functional validation study of TNXB, one of the most significant genes, and demonstrated that TNXB exerted critical effects on chondrocyte apoptosis in HA pathogenesis in vivo and in vitro.

      (2) The use of an animal model (F8-/- mouse) to investigate the role of TNXB may not fully capture the complexity of human hemophilic arthropathy. Differences in the biology between species may affect the translatability of the findings to human patients.

      Thank you for your valuable comments. We recognize that biological differences between species can affect the clinical translation of research findings. In our work, we sequenced human cartilage samples to identify the differentially methylated gene TNXB. Meanwhile, we demonstrated that expression of TNXB protein was significantly down-regulated in human HA cartilage and F8-/- transgenic mouse cartilage. The F8-/- transgenic mouse serves as a well-accepted model for the study of hemophilia; it is phenotypically similar to human patients with the disease and spontaneously bleeds into the joints and soft tissues. Moreover, this mouse model has been widely used in the study of hemophilia and hemophilic arthritis 9-11.

      (3) The study primarily focuses on TNXB as a central mediator, but it might overlook other potentially relevant factors contributing to cartilage degradation in hemophilic arthropathy. A more holistic exploration of genetic and molecular factors could provide a broader understanding of the condition.

      Thanks for your suggestion. Since our human sample size is relatively small, we should interpret the differentially methylated genes cautiously. Therefore, we mainly focused on the most significant gene, TNXB, for functional study. In future work, we will expand the sample size to explore the molecular mechanisms of HA more comprehensively.

      Recommendations For The Authors:

      The following are my suggestions:

      (1) Why do the authors choose to concentrate on the knee joint in the introduction when hemophilia, characterized by a deficiency in clotting factor F8, is recognized as a systemic disease?

      Thank you for your valuable comments. Although hemophilia is a systemic disease, approximately 80%-90% of bleeding episodes in patients with hemophilia occur within the musculoskeletal system, especially in the knee joint 12.

      (2) While Figure 1 illustrates distinct expressions of Dnmt1 and Dnmt3a, only Dnmt1 results are presented in HA mice models in Figure 3. To address this, it is suggested that the expression of Dnmt3a be explored in animal models.

      Thank you for your suggestion. According to the reviewer's suggestion, we examined the expression of Dnmt3a in mouse articular cartilage, and the expression level of Dnmt3a was significantly up-regulated in both the 4W and 8W model groups compared with the control group (Figure 3). Please see line 364.

      (3) In Figure 3, the sample size for Dnmt1 is smaller than the other indicators; therefore, supplementing the sample count is recommended.

      Thanks for the reminder. We have corrected it in the revision.

      (4) Regarding Figure 4G, a few apoptotic cells were observed in the AAV NC group. It is advised that this figure be reviewed for accuracy.

      Thanks for your suggestion. In Figure 5D, the AAV-NC group received a needle injection of AAV. Therefore, it is normal for apoptotic cells to appear in the cartilage layer.

      (5) The authors concluded that TNXB plays a role in apoptosis and AKT signaling. Providing expression data for Caspase9 would be valuable to strengthen this assertion, as PI3K/AKT signaling directly influences its activation during apoptosis.

      Thank you for your comments. We examined the expression of Cleaved-Caspase9 protein and found that knockdown of TNXB resulted in upregulation of Cleaved-Caspase9 protein expression, which was reversed by the addition of SC79. This result has been added to the revised Figure 6 and manuscript. Please see line 414.

      (6) Quantitative analysis of the differences between the two groups in Supplemental Figures is necessary.

      Thank you for your suggestion. We have added the quantitative analysis of the differences between the two groups in Supplemental Figures.

      (7) With three major isoforms (homologs) of AKT in mammals-AKT1, 2, and 3 - why did the authors specifically focus on AKT1?

      Thank you for your comments. Based on the results of the KEGG enrichment analysis of differentially methylated genes, we investigated the role of the PI3K/AKT pathway in apoptosis of HA chondrocytes. AKT is universally acknowledged as a core factor in the PI3K/AKT pathway that plays critical roles in various cellular activities such as cell proliferation, cell differentiation, cell apoptosis, and metabolism 13,14. More notably, several studies demonstrated that, within the AKT family, Akt1 is primarily involved in the regulation of chondrocyte survival and proteoglycan synthesis 15. Therefore, we detected phosphorylation of AKT1 in HA cartilage and TNXB-KD chondrocytes, and found that TNXB regulates chondrocyte ECM and apoptosis through AKT1.

      References:

      (1) Cooke, E.J., Zhou, J.Y., Wyseure, T., Joshi, S., Bhat, V., Durden, D.L., Mosnier, L.O., and von Drygalski, A. (2018). Vascular Permeability and Remodelling Coincide with Inflammatory and Reparative Processes after Joint Bleeding in Factor VIII-Deficient Mice. Thromb Haemost 118, 1036-1047. 10.1055/s-0038-1641755.

      (2) Kleiboer, B., Layer, M.A., Cafuir, L.A., Cuker, A., Escobar, M., Eyster, M.E., Kraut, E., Leavitt, A.D., Lentz, S.R., Quon, D., et al. (2022). Postoperative bleeding complications in patients with hemophilia undergoing major orthopedic surgery: A prospective multicenter observational study. J Thromb Haemost 20, 857-865. 10.1111/jth.15654.

      (3) Weiland, T., Weiller, M., Kunstle, G., and Wendel, A. (2009). Sensitization by 5-azacytidine toward death receptor-induced hepatic apoptosis. J Pharmacol Exp Ther 328, 107-115. 10.1124/jpet.108.143560.

      (4) Anaparti, V., Agarwal, P., Smolik, I., Mookherjee, N., and El-Gabalawy, H. (2020). Whole Blood Targeted Bisulfite Sequencing and Differential Methylation in the C6ORF10 Gene of Patients with Rheumatoid Arthritis. J Rheumatol 47, 1614-1623. 10.3899/jrheum.190376.

      (5) Porter, L.F., Saptarshi, N., Fang, Y., Rathi, S., den Hollander, A.I., de Jong, E.K., Clark, S.J., Bishop, P.N., Olsen, T.W., Liloglou, T., et al. (2019). Whole-genome methylation profiling of the retinal pigment epithelium of individuals with age-related macular degeneration reveals differential methylation of the SKI, GTF2H4, and TNXB genes. Clin Epigenetics 11, 6. 10.1186/s13148-019-0608-2.

      (6) Mao, J.R., Taylor, G., Dean, W.B., Wagner, D.R., Afzal, V., Lotz, J.C., Rubin, E.M., and Bristow, J. (2002). Tenascin-X deficiency mimics Ehlers-Danlos syndrome in mice through alteration of collagen deposition. Nat Genet 30, 421-425. 10.1038/ng850.

      (7) Zhang, K., Wang, X., Zeng, L.T., Yang, X., Cheng, X.F., Tian, H.J., Chen, C., Sun, X.J., Zhao, C.Q., Ma, H., and Zhao, J. (2023). Circular RNA PDK1 targets miR-4731-5p to enhance TNXB expression in ligamentum flavum hypertrophy. FASEB J 37, e22877. 10.1096/fj.202200022RR.

      (8) Guo, H., Yin, W., Zou, Z., Zhang, C., Sun, M., Min, L., Yang, L., and Kong, L. (2021). Quercitrin alleviates cartilage extracellular matrix degradation and delays ACLT rat osteoarthritis development: An in vivo and in vitro study. J Adv Res 28, 255-267. 10.1016/j.jare.2020.06.020.

      (9) Weitzmann, M.N., Roser-Page, S., Vikulina, T., Weiss, D., Hao, L., Baldwin, W.H., Yu, K., Del Mazo Arbona, N., McGee-Lawrence, M.E., Meeks, S.L., and Kempton, C.L. (2019). Reduced bone formation in males and increased bone resorption in females drive bone loss in hemophilia A mice. Blood Adv 3, 288-300. 10.1182/bloodadvances.2018027557.

      (10) Haxaire, C., Hakobyan, N., Pannellini, T., Carballo, C., McIlwain, D., Mak, T.W., Rodeo, S., Acharya, S., Li, D., Szymonifka, J., et al. (2018). Blood-induced bone loss in murine hemophilic arthropathy is prevented by blocking the iRhom2/ADAM17/TNF-alpha pathway. Blood 132, 1064-1074. 10.1182/blood-2017-12-820571.

      (11) Vols, K.K., Kjelgaard-Hansen, M., Ley, C.D., Hansen, A.K., and Petersen, M. (2019). Bleed volume of experimental knee haemarthrosis correlates with the subsequent degree of haemophilic arthropathy. Haemophilia 25, 324-333. 10.1111/hae.13672.

      (12) Lobet, S., Peerlinck, K., Hermans, C., Van Damme, A., Staes, F., and Deschamps, K. (2020). Acquired multi-segment foot kinematics in haemophilic children, adolescents and young adults with or without haemophilic ankle arthropathy. Haemophilia 26, 701-710. 10.1111/hae.14076.

      (13) Garcia, D., and Shaw, R.J. (2017). AMPK: Mechanisms of Cellular Energy Sensing and Restoration of Metabolic Balance. Mol Cell 66, 789-800. 10.1016/j.molcel.2017.05.032.

      (14) Johnson, J., Chow, Z., Lee, E., Weiss, H.L., Evers, B.M., and Rychahou, P. (2021). Role of AMPK and Akt in triple negative breast cancer lung colonization. Neoplasia 23, 429-438. 10.1016/j.neo.2021.03.005.

      (15) Rao, Z., Wang, S., and Wang, J. (2017). Peroxiredoxin 4 inhibits IL-1beta-induced chondrocyte apoptosis via PI3K/AKT signaling. Biomed Pharmacother 90, 414-420. 10.1016/j.biopha.2017.03.075.

    2. Reviewer #2 (Public Review):

      Summary:

      This manuscript mainly studied the biological effect of tenascin XB (TNXB) on hemophilic arthropathy (HA) progression. Using bioinformatic and histopathological approaches, the authors identified the novel candidate gene TNXB for HA. Next, the authors showed that TNXB knockdown leads to chondrocyte apoptosis, matrix degeneration, and subchondral bone loss in vivo/vitro. Furthermore, AKT agonists promoted extracellular matrix synthesis and prevented apoptosis in TNXB knockdown chondrocytes.

      Strengths:

      In general, this study significantly advances our understanding of HA pathogenesis. The authors utilize comprehensive experimental strategies to demonstrate the role of TNXB in cartilage degeneration associated with HA. The results are clearly presented, and the conclusions appear appropriate.

      Weaknesses:

      Additional clarification is required regarding the gender of the F8-/- mouse in the study. Is the mouse male or female?

    3. eLife assessment

      This important study identifies the TNXB-AKT pathway as a potential mechanism underlying hemophilia-associated cartilage degeneration. The evidence supporting the conclusions is convincing, with murine and human patient evidence as well as genome-wide DNA methylation analysis. This paper would be of interest to cell biologists and biochemists working in the field of musculoskeletal disorders.

    4. Reviewer #1 (Public Review):

      Summary:

      Chen and colleagues first compared the cartilage tissues collected from OA and HA patients using histology and immunostaining. Then, a genome-wide DNA methylation analysis was performed, which identified changes in a novel gene, TNXB. IHC confirmed that TNXB has a lower expression level in HA cartilage than in OA cartilage. Next, the authors demonstrated that TNXB levels were reduced in an HA animal model, and intraarticular injection of AAV carrying TNXB siRNA induced cartilage degradation and promoted chondrocyte apoptosis. Based on KEGG enrichment, histopathological analysis, and western blot, the authors also showed the relationship between TNXB and AKT phosphorylation. Lastly, an AKT agonist, specifically SC79 in this study, was shown to partially rescue the changes induced by Tnxb knock-down in in vitro-cultured chondrocytes. Overall, this is an interesting study that provides sufficient data to support its conclusions.

      Strengths:

      (1) Both human and mouse samples were examined.
      (2) The HA model was used.
      (3) Genome-wide DNA methylation analysis was performed.

      Weaknesses:

      (1) In some experiments, the selection of the control groups was not ideal.
      (2) More details on analyzing methods and information on replicates need to be included.
      (3) Discussion can be improved by comparing findings to other relevant studies.
      (4) The use of transgenic mice with conditional Tnxb depletion can further define the physiological roles of Tnxb.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      We thank the reviewers for their thorough review of and overall positive comments on our manuscript. We have revised the manuscript to address the one remaining concern raised by one of the reviewers. This is described below.

      Fig. 1B-C: A standard deviation calculated from 2 data points is not statistically meaningful. In this case it would be better to report the range/difference of the 2 data points.

      We have modified the legend for Figure 1 to now read, “The average of two experiments is plotted with the bars representing the range of each time point.”
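
The reviewer's point can be illustrated with a minimal sketch. The function name and the replicate values below are hypothetical and are not taken from the manuscript. For two measurements, the sample standard deviation is simply the range divided by the square root of 2, so reporting the mean and range conveys the same information more transparently.

```python
import math
import statistics


def summarize_pair(a: float, b: float) -> dict:
    """Summarize two replicate measurements by their mean and range."""
    mean = (a + b) / 2
    value_range = abs(a - b)
    # Sample SD (n - 1 denominator) of two values; computed only for comparison.
    sample_sd = statistics.stdev([a, b])
    return {"mean": mean, "range": value_range, "sample_sd": sample_sd}


# Hypothetical replicate values for a single time point.
summary = summarize_pair(0.42, 0.50)

# With n = 2 the sample SD equals range / sqrt(2), so it adds no new information.
assert math.isclose(summary["sample_sd"], summary["range"] / math.sqrt(2))
print(summary)
```

Plotting the mean with bars spanning the two observed values, as the revised legend describes, avoids implying a variance estimate that two points cannot support.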

    2. eLife assessment

      This important study contributes insights into the regulatory mechanisms of a protein governing cell migration at the membrane. The integration of approaches revealing protein structure and dynamics provides convincing data for a model of regulation and suggests a new allosteric role for a solubilized phospholipid headgroup. The work will be interesting to researchers focusing on signaling mechanisms, cell motility, and cancer metastasis.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors perform a multidisciplinary approach to describe the conformational plasticity of P-Rex1 in various states (autoinhibited, IP4 bound and PIP3 bound). Hydrogen-deuterium exchange (HDX) is used to reveal how IP4 and PIP3 binding affect intramolecular interactions. While IP4 is found to stabilize autoinhibitory interactions, PIP3 does the opposite, leading to deprotection of autoinhibitory sites. Cryo-EM of IP4 bound P-Rex1 reveals a structure in the autoinhibited conformation, very similar to the unliganded structure reported previously (Chang et al. 2022). Mutations at observed autoinhibitory interfaces result in a more open structure (as shown by SAXS), reduced thermal stability and increased GEF activity in biochemical and cellular assays. Together their work portrays a dynamic enzyme that undergoes long-range conformational changes upon activation on PIP3 membranes. The results are technically sound and the conclusions are justified. The main drawback is the limited novelty due to the recently published structure of unliganded P-Rex1, which is virtually identical to the IP4 bound structure presented here. Novel aspects suggest a regulatory role for IP4, but the exact significance and mechanism of this regulation has not been explored.

      Strengths:

      The authors use a multitude of techniques to describe the dynamic nature and conformational changes of P-Rex1 upon binding to IP4 and PIP3 membranes. The different approaches together fit well with the overall conclusion that IP4 binding negatively regulates P-Rex1, while binding to PIP3 membranes leads to conformational opening and catalytic activation. The experiments are performed very thoroughly and are technically sound. The results are clear and support the conclusions.

      Weaknesses:

      (1) The novelty of the study is compromised due to the recently published structure of unliganded P-Rex1 (Chang et al. 2022). The unliganded and IP4 bound structure of P-Rex1 appear virtually identical, however, no clear comparison is presented in the manuscript. In the same paper a very similar model of P-Rex1 activation upon binding to PIP3 membranes and Gbeta-gamma is presented.

      (2) The authors demonstrate that IP4 binding to P-Rex1 results in catalytic inhibition and increased protection of autoinhibitory interfaces, as judged by HDX. The relevance of this in a cellular setting is not clear and is not experimentally demonstrated. Further, mechanistically, it is not clear whether the biochemical inhibition by IP4 of PIP3 activated P-Rex1 is due to competition of IP4 with activating PIP3 binding to the PH domain of P-Rex1, or due to stabilizing the autoinhibited conformation, or both.

    4. Reviewer #2 (Public Review):

      Summary:

      In this new paper, the authors used biochemical, structural, and biophysical methods to elucidate the mechanisms by which IP4, the PIP3 headgroup, can induce an autoinhibited form of P-Rex1 and propose a model of how PIP3 can trigger long-range conformational changes of P-Rex1 to relieve this autoinhibition. The main findings of this study are that a new form of P-Rex1 autoinhibition is driven by IP4-induced binding of the PH domain to the DH domain active site and that this autoinhibited form is stabilized by two key interactions, one between DEP1 and DH and one between PH and the IP4P 4-helix bundle (4HB) subdomain. Moreover, they found that the binding of phospholipid PIP3 to the PH domain can disrupt these interactions to relieve P-Rex1 autoinhibition.

      Strengths:

      The study provides good evidence that binding of IP4 to the P-Rex1 PH domain can promote the two long-range interactions, between the catalytic DH domain and the first DEP domain and between the PH domain and the C-terminal IP4P 4HB subdomain, that generate a novel P-Rex1 autoinhibition mechanism. This valuable finding adds an extra layer of P-Rex1 regulation (perhaps in the cytoplasm) to the synergistic activation by phospholipid PIP3 and the heterotrimeric Gβγ subunits at the plasma membrane. Overall, the manuscript's goal is interesting, and the experiments were carried out carefully and reliably.

      Weakness:

      The set of experiments with the disulfide bond S235C/M244C is somewhat confusing to interpret; it should be moved into the supplement, with the text and Figure 4 altered accordingly.

    5. Reviewer #3 (Public Review):

      Summary:

      In this report, Ravala et al. demonstrate that IP4, the soluble head-group of phosphatidylinositol 3,4,5-trisphosphate (PIP3), is an inhibitor of pREX-1, a guanine nucleotide exchange factor (GEF) for Rac1 and related small G proteins that regulate cell migration. This finding is perhaps unexpected since pREX-1 activity is PIP3-dependent. By way of cryo-EM (revealing the structure of the pREX-1/IP4 complex at 4.2 Å resolution), hydrogen-deuterium exchange mass spectrometry and small angle X-ray scattering, they deduce a mechanism for IP4 activation, and conduct mutagenic and cell-based signaling assays that support it. The major finding is that IP4 stabilizes two interdomain interfaces that block access of the DH domain, which conveys GEF activity towards small G protein substrates. One of these is the interface between the PH domain that binds to IP4 and a 4-helix bundle extension of the IP4 Phosphatase domain and the DEP1 domain. The two interfaces are connected by a long helix that extends from PH to DEP1. Although the structure of fully activated pREX-1 has not been determined, the authors propose a "jackknife" mechanism, similar to that described earlier by Chang et al. (2022) (referenced in the authors' manuscript), in which binding of IP3 relieves a kink in a helix that links the PH/DH modules and allows the DH-PH-DEP triad to assume an extended conformation in which the DH domain is accessible. While the structure of the activated pREX-1 has not been determined, cysteine mutagenesis that enforces the proposed kink is consistent with this hypothesis. SAXS and HDX-MS experiments suggest that IP4 acts by stiffening the inhibitory interfaces, rather than by reorganizing them. Indeed, the cryo-EM structure of ligand-free pREX-1 shows that interdomain contacts are largely retained in the absence of IP4.

      Strengths:

      The manuscript describes a novel regulatory role for IP4 and is thus of considerable significance to our understanding of regulatory mechanisms that control cell migration, particularly in immune cell populations. Specifically, the authors show how the inositol polyphosphate IP4 controls the activity of pREX-1, a guanine nucleotide exchange factor that controls the activity of small G proteins Rac and CDC42. In their clearly written discussion, the authors explain how PIP3, the cell membrane and the Gbeta-gamma subunits of heterotrimeric membranes together localize pREX-1 at the membrane and induce activation. The quality of experimental data is high, and both in vitro and cell-based assays of site-directed mutants designed to test the authors' hypotheses are confirmatory. The results strongly support the conclusions. The combination of cryo-EM data, which describe the static (if heterogeneous) structures, with experiments (small angle X-ray scattering and hydrogen-deuterium exchange mass spectrometry) that report on dynamics is well employed by the authors.

      Manuscript revision:

      The reviewers noted a number of weaknesses, including error analysis of the HDX data, interpretation of the mutagenesis data, the small fraction of the total number of particles used to generate the EM reconstruction, the novelty of the findings in light of the previous report by Chang et al. (2022), various details regarding presentation of structural results and questions regarding the interpretation of the inhibition data (Figure 1D). The authors have responded adequately to these critiques. It appears that pREX-1 is a highly dynamic molecule, and considerable heterogeneity among particles might be expected.

      While, indeed, the conformation of pREX presented in this report is not novel, the finding that this inactive conformational state is stabilized by IP4 is significant and important. The evidence for this is both structural and biochemical, as indicated by micromolar competition of IP4 with PI3-enriched vesicles resulting in the inhibition of pREX-1 GEF activity.

    1. eLife assessment

      This study provides valuable insight into the role of miR-199a/b-5p in cartilage formation. The evidence supporting the significance of the identified miRNA and its target mRNA transcripts is convincing. This paper will likely primarily benefit scientists focused on diseases related to this biological process, such as osteoarthritis. Furthermore, researchers interested in miRNAs as a broader subject may find the computational model development methodology helpful.

    2. Reviewer #1 (Public Review):

      The comments below are from my review of the first submission of this article. I would now like to thank the authors for their hard work in responding to my comments. I am happy with the changes they have made, in particular the inclusion of further experimental evidence in Figures 2 and 4. I have no further comments to make.

      In 'Systems analysis of miR-199a/b-5p and multiple miR-199a/b-5p targets during chondrogenesis', Patel et al. present a variety of analyses using different methodologies to investigate the importance of two miRNAs in regulating gene expression in a cellular model of cartilage development. They first re-analysed existing data to identify these miRNAs as among the most dynamic across a chondrogenesis development timecourse. Next, they manipulated the expression of these miRNAs and showed that this affected the expression of various marker genes as expected. An RNA-seq experiment on these manipulations identified putative mRNA targets of the miRNAs, which were also supported by bioinformatics predictions. These top hits were validated experimentally and, finally, a kinetic model was developed to demonstrate the relationship between the miRNAs and mRNAs studied throughout the paper.

      I am convinced that the novel relationships reported here between miR-199a/b-5p and target genes FZD6, ITGA3 and CAV1 are likely to be genuine. It is important for researchers working on this system and related diseases to know all the miRNA/mRNA relationships but, as the authors have already published work studying the most dynamic miRNA (miR-140-5p) in this biological system, I was not convinced that this study of the second miRNA in their list provided a conceptual advance on their previous work.

      I was also concerned with the lack of reporting of details of the manipulation experiments. The authors state that they have over-expressed miR-199a-5p (Figure 2A) and knocked down miR-199b-5p (Figure 2B) but they should have reported their proof that these experiments had worked as predicted, e.g. showing the qRT-PCR change in miRNA expression. Similarly, I was concerned that one miRNA was over-expressed while the other was knocked down - why did the authors not attempt to manipulate both miRNAs in both directions? Were they unable to achieve a significant change in miRNA expression or did these experiments not confirm the results reported in the manuscript?

      I had a number of issues with the way in which some of the data is presented. Table 1 only reported whether a specific pathway was significant or not for a given differential expression analysis but this concealed the extent of this enrichment or the level of statistical significance reported. Could it be redrawn to more similarly match the format of Figure 3A? The various shades of grey in Figure 2 and Figure 4 made it impossible to discriminate between treatments and therefore identify whether these data supported the conclusions made in the text. It also appeared that the same results were reported in Figure 3B and 3C and, indeed, Figure 3B was not referred to in the main text. Perhaps this figure could be made more concise by removing one of these two sets of panels?

      Overall, while I think that this is an interesting and valuable paper, I think its findings are relatively limited to those interested in the role of miRNAs in this specific biomedical context.

    3. Reviewer #2 (Public Review):

      Summary:

      This study represents an ambitious endeavor to comprehensively analyze the role of miR-199a/b-5p and its networks in cartilage formation. By conducting experiments that go beyond in vitro MSC differentiation models, more robust conclusions can be achieved.

      Strengths:

      This research investigates the role of miR-199a/b-5p during chondrogenesis using bioinformatics and in vitro experimental systems. The significance of miRNAs in chondrogenesis and OA is crucial, warranting further research, and this study contributes novel insights.

      Weaknesses:

      While miR-140 and miR-455 are used as controls, these miRNAs have been demonstrated to be more relevant to Cartilage Homeostasis than chondrogenesis itself. Their deficiency has been genetically proven to induce Osteoarthritis in mice. Therefore, the results of this study should be considered in comparison with these existing findings.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      In 'Systems analysis of miR-199a/b-5p and multiple miR-199a/b-5p targets during chondrogenesis', Patel et al. present a variety of analyses using different methodologies to investigate the importance of two miRNAs in regulating gene expression in a cellular model of cartilage development. They first re-analysed existing data to identify these miRNAs as one of the most dynamic across a chondrogenesis development time course. Next, they manipulated the expression of these miRNAs and showed that this affected the expression of various marker genes as expected. An RNA-seq experiment on these manipulations identified putative mRNA targets of the miRNAs which were also supported by bioinformatics predictions. These top hits were validated experimentally and, finally, a kinetic model was developed to demonstrate the relationship between the miRNAs and mRNAs studied throughout the paper.

      I am convinced that the novel relationships reported here between miR-199a/b-5p and target genes FZD6, ITGA3, and CAV1 are likely to be genuine. It is important for researchers working on this system and related diseases to know all the miRNA/mRNA relationships but, as the authors have already published work studying the most dynamic miRNA (miR-140-5p) in this biological system I was not convinced that this study of the second miRNA in their list provided a conceptual advance on their previous work.

      We believe this study is an enhancement on our previous work for two reasons, which have been alluded to in new text within the introduction. Firstly, our previous work used experimental and bioinformatic analysis to identify microRNAs with significant regulatory roles during chondrogenesis. This new manuscript additionally uses a systems biology approach to identify novel miRNA-mRNA interactions and capture these within an in silico model. Secondly, this work was initiated by the analysis of our previously generated data – using a novel tool we developed for this type of data (Bioconductor - TimiRGeN).

      I was also concerned with the lack of reporting of details of the manipulation experiments. The authors state that they have over-expressed miR-199a-5p (Figure 2A) and knocked down miR-199b-5p (Figure 2B) but they should have reported their proof that these experiments had worked as predicted, e.g. showing the qRT-PCR change in miRNA expression. Similarly, I was concerned that one miRNA was over-expressed while the other was knocked down - why did the authors not attempt to manipulate both miRNAs in both directions? Were they unable to achieve a significant change in miRNA expression or did these experiments not confirm the results reported in the manuscript?

      We agree with the reviewer that some additional data were needed to demonstrate the effective regulation of miR-199-5p.  Hence, Supplementary Figure 1 is now included which provides validation of the effects of miR-199a-5p overexpression (Supplementary Figure 1A) and inhibition of miR-199a/b-5p (Supplementary Figure 1B). Within the main manuscript, Figure 2B has been amended to include the consequences of inhibition of miR-199a-5p, with 2C showing the consequences of miR-199b-5p inhibition. Further, we include new data with regards to miR-199a/b-5p inhibition on CAV1 (Figure 4A). 

      I had a number of issues with the way in which some of the data was presented. Table 1 only reported whether a specific pathway was significant or not for a given differential expression analysis but this concealed the extent of this enrichment or the level of statistical significance reported. Could it be redrawn to more similarly match the format of Figure 3A? The various shades of grey in Figure 2 and Figure 4 made it impossible to discriminate between treatments and therefore identify whether these data supported the conclusions made in the text. It also appeared that the same results were reported in Figure 3B and 3C and, indeed, Figure 3B was not referred to in the main text. Perhaps this figure could be made more concise by removing one of these two sets of panels.

      We agree with all points made here and have amended these within the manuscript. Figure 1A now shows pathway enrichment plots from the TimiRGeN R Bioconductor package, and the table which previously showed the pathways enriched at each time point is now in the supplementary materials (Supplementary Table 1). Figures 2 and 4 now use color instead of shades of grey. Figure 3C has now been moved to the supplementary materials (Supplementary Figure 2) and is referenced in the text.

      Overall, while I think that this is an interesting and valuable paper, I think its findings are relatively limited to those interested in the role of miRNAs in this specific biomedical context.

      Reviewer #2 (Public review):

      Summary:

      This study represents an ambitious endeavor to comprehensively analyze the role of miR-199a/b-5p and its networks in cartilage formation. By conducting experiments that go beyond in vitro MSC differentiation models, more robust conclusions can be achieved.

      Strengths:

      This research investigates the role of miR-199a/b-5p during chondrogenesis using bioinformatics and in vitro experimental systems. The significance of miRNAs in chondrogenesis and OA is crucial, warranting further research, and this study contributes novel insights.

      Weaknesses:

      While miR-140 and miR-455 are used as controls, these miRNAs have been demonstrated to be more relevant to Cartilage Homeostasis than chondrogenesis itself. Their deficiency has been genetically proven to induce Osteoarthritis in mice. Therefore, the results of this study should be considered in comparison with these existing findings.

      We agree with the reviewer's comments. miR-455-null mice develop normally but miR-140-null (or mutated) mice and humans do have skeletal abnormalities (e.g. Nat Med. 2019 Apr;25(4):583-590. doi: 10.1038/s41591-019-0353-2), indicating a role in chondrogenesis. We have made an addition in the description to point towards the need to assess the roles miR-199a/b-5p may play during skeletogenesis and OA. We anticipate miR-199a/b-5p to be relevant in OA and have ongoing additional work for this – but this is beyond the scope of this manuscript.

      Recommendations to Authors:

      Reviewer #1 (Recommendations to authors):

      Beyond the issues raised in the public review, I had a few minor recommendations that are largely designed to help improve the understanding of the manuscript as it is currently written.

      (1) Please provide the statistical tests used to obtain p-values in the Figure 2 and 4 legends.

      We have now added statistical test information to the figure legends of figures 2 and 4.

      (2) It is stated on p. 9 that both miRNAs may share a functional repertoire because 25 and 341 genes are intersected between their inhibition experiments. Please provide statistical support that this overlap is an enrichment over the null background in this experiment. Total DE genes – chi squared. Expected / Observed.

      A chi-squared test is now presented in the manuscript which shows that the number of significant genes found in common between miR-199a-5p knockdown and miR-199b-5p knockdown was significantly greater than expected for day 0 or day 1 of the experiments.
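
      As a minimal sketch of the kind of overlap test discussed in point (2), the following computes expected versus observed overlap for two gene sets with a Pearson chi-squared test on a 2x2 table (1 degree of freedom, no continuity correction). The function name, the gene-set sizes, and the background size are hypothetical and chosen only for illustration; the manuscript's actual totals and test set-up are not given here.

```python
import math


def overlap_chi_squared(n_a, n_b, n_both, n_background):
    """Pearson chi-squared test (1 d.f.) for the overlap of two gene sets.

    n_a, n_b      -- sizes of the two differentially expressed gene sets
    n_both        -- number of genes present in both sets
    n_background  -- total number of genes tested (the null background)
    Returns (expected_overlap, chi2, p_value).
    """
    # Observed 2x2 table: rows = in set A / not in A, columns = in set B / not in B.
    observed = [
        [n_both, n_a - n_both],
        [n_b - n_both, n_background - n_a - n_b + n_both],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]

    # Expected counts under independence of the two sets.
    expected = [
        [row_totals[i] * col_totals[j] / n_background for j in range(2)]
        for i in range(2)
    ]
    chi2 = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # For 1 d.f., the chi-squared upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected[0][0], chi2, p_value


if __name__ == "__main__":
    # Hypothetical set sizes and background; only the overlap of 341 echoes the text.
    expected_overlap, chi2, p = overlap_chi_squared(500, 800, 341, 15000)
    print(f"expected overlap: {expected_overlap:.1f}, chi2: {chi2:.1f}, p: {p:.3g}")
```

      A hypergeometric (Fisher's exact) test would be a common alternative when the expected counts are small; the chi-squared form is used here only because it is the test named in the response.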

      (3) The final sentence on p. 12 (beginning 'Size of the points reflect...') seemed out of place - is it part of a legend?

      Thank you for pointing out this mistake - it was part of figure 3C and now is in the supplementary materials.

      (4) A sentence on p. 14 reads that 'FZD6 and ITGA3 levels increased significantly' but this should read decreased, rather than increased. Quite an important typo!

      Thank you for pointing this error out. It has been corrected.

      (5) Theoretical transcripts are mentioned in the legend of Figure 5A but these were not present in the figure. Please include these or remove them from the legend.

      This error has been removed from Figure 5A.

      (6) On p 20, the references 22 and 27 should I think be moved to earlier in the sentence (after 'miR-199a-5p-FZD6 has been predicted previously'). Currently, it reads as if these references support your luciferase assays which you claim are the first evidence for this target relationship.

      We agree with this change and have corrected the manuscript.

      (7) The reference to Figure 5D on p. 20 should be a reference to Figure 5C.

      Thank you for pointing this error out – this has been corrected.

      Reviewer #2 (Recommendations to authors):

      (1) The paper is based on the importance of miR-140 and miR-455 as miRNAs in chondrogenesis, citing only Barter, M. J. et al. Stem Cells 33, (2015). Considering the scope and results of this study, this citation is insufficient.

      We agree with this reviewer's comments. For many years miR-140 and miR-455 have been experimented on and their importance in OA research has become apparent. We included additional references within the introduction to address this.

      (2) Analyzing chondrogenesis solely through differentiation experiments from MSCs is inadequate. It is essential to perform experiments involving the network within normal cartilage tissue and/or the generation of knockout mice to understand the precise role of miR-199a/b-5p in chondrogenesis.

      We have added an additional paragraph in the discussion to state this, and do believe it is highly important that miR-199a/b-5p be tested in OA samples – however this would be beyond the intended scope of this article.

      (3) In light of the above points, it is imperative to investigate the role of miR-199a/b-5p beyond the in vitro differentiation model from MSCs, encompassing mouse OA models or human disease samples.

      In tandem with the previous response, we agree with the premise and believe additional experiments should be performed to gain more insight into the mechanism by which miR-199a/b-5p regulate OA. But development of a new mouse line to investigate this is not in the scope of this manuscript.

    1. eLife assessment

      This study provides valuable evidence that differentiated cells of the zebrafish skin form membrane protrusions called cytonemes, that contact and potentially transmit Notch signals to cells of the intermediate layer below. Evidence that periderm cells send out cytoneme-like protrusions is solid, and perturbations that affect cytoneme number clearly affect periderm structure and gene expression. However, evidence that these effects are directly due to cytoneme-mediated Notch signaling is incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      In this paper, Wang et al show that differentiated peridermal cells of the zebrafish epidermis extend cytoneme-like protrusions toward the less differentiated, intermediate layer below. They present evidence that expression of a dominant-negative cdc42 inhibits cytoneme formation and leads to elevated expression of a marker of undifferentiated keratinocytes, krtt1c19e, in the periderm layer. Data is presented suggesting the involvement of Delta-Notch signaling in keratinocyte differentiation. Finally, changes in expression of the inflammatory cytokine IL-17 and its receptors are shown to affect cytoneme number and periderm structure in a manner similar to Notch and cdc42 perturbations.

      Strengths:

      Overall, the idea that differentiated cells signal to underlying undifferentiated cells via membrane protrusions in skin keratinocytes is interesting and novel, and it is clear that periderm cells send out thin membrane protrusions that contain a Notch ligand. Further, perturbations that affect cytoneme number, Notch signaling, and IL-17 expression clearly lead to changes in periderm structure and gene expression.

      Weaknesses:

      More work is needed to determine whether the effects on keratinocyte differentiation are due to a loss of cytonemes themselves, or to broader effects of inhibiting cdc42. Moreover, more evidence is needed to support the claim that periderm cytonemes deliver Delta ligands to induce Notch signaling below. Without these aspects of the study being solidified, understanding how IL-17 affects these processes seems premature.

    3. Reviewer #2 (Public Review):

      Summary:

      The aim of the study was to understand how cells of the skin communicate across dermal layers. The research group has previously demonstrated that cellular connections called airinemes contribute to this communication. The current work builds upon this knowledge by showing that differentiated keratinocytes also use cytonemes, specialized signaling filopodia, to communicate with undifferentiated keratinocytes. They show that cytonemes are the more abundant type of cellular extension used for communication between the differentiated keratinocyte layer and the undifferentiated keratinocytes. Disruption of cytoneme formation led to the expansion of the undifferentiated keratinocytes into the periderm, mimicking skin diseases like psoriasis. The authors go on to show that disruption of cytonemes results in perturbations in Notch signaling between the differentiated keratinocytes of the periderm and the underlying proliferating undifferentiated keratinocytes. Further, the authors show that Interleukin-17, also known to drive psoriasis, can restrict the formation of periderm cytonemes, possibly through the inhibition of Cdc42 expression. This work suggests that cytoneme-mediated Notch signaling plays a central role in normal epidermal regulation. The authors propose that disruption of cytoneme function may be an underlying cause of various human skin diseases.

      Strengths:

      The authors provide strong evidence that periderm keratinocyte cytonemes contain the Notch ligand DeltaC to promote Notch activation in the underlying intermediate layer to regulate accurate epidermal maintenance.

      Weaknesses:

      The impact of the study would be increased if the mechanism by which Interleukin-17 and Cdc42 collaborate to regulate cytonemes were defined. Experiments measuring Cdc42 activity, rather than just measuring expression, would strengthen the conclusions.

    4. Reviewer #3 (Public Review):

      Summary:

      Leveraging zebrafish as a research model, Wang et al identified "cytoneme-like structures" as a mechanism for mediating cell-cell communications among skin epidermal cells. The authors further demonstrated that the "cytoneme-like structures" can mediate Notch signaling, and the "cytoneme-like structures" are influenced by IL17 signaling.

      Strengths:

      Elegant zebrafish genetics, reporters, and live imaging.

      Weaknesses: (minor)

      This paper focused on characterizing the "cytoneme-like structures" between different layers and the NOTCH signaling. However, these "cytoneme-like structures" observed in undifferentiated KC (Figure 2B), although at a slightly lower frequency, were not interpreted. In addition, it is unclear if these "cytoneme-like structures" can mediate other signaling pathways than NOTCH.

      Overall, this is a solid paper with convincing data reporting the "cytoneme-like structures" in vivo, and with compelling data demonstrating the roles in NOTCH signaling and the regulation by IL17.

      These findings provide a foundation for future work exploring the "cytoneme-like structures" in the mammalian system and other epithelial tissue types. This paper also suggests a potential connection between the "cytoneme-like structures" and psoriasis, which needs to be further explored in clinical samples.

    1. eLife assessment

      In this important study, Li and others identified cell membrane receptors for juvenile hormone (JH), a terpenoid hormone in insects that regulates their development and reproduction. While intracellular receptors for JH have been well characterized, membrane receptors for JH have remained elusive. Although the authors provide convincing evidence to indicate that the receptor tyrosine kinases they identified bind to JH in vitro and induce responses in cultured cells, their loss-of-function phenotypes are not consistent with known JH functions, leaving obscure the physiological roles of these receptors in mediating in vivo JH function.

    2. Reviewer #1 (Public Review):

      Summary:

      Juvenile Hormone (JH) plays a key role in insect development and physiology. Although the intracellular receptor for JH was identified long ago, a number of studies have shown that part of JH functions should be fulfilled through binding to an unknown membrane receptor, which was proposed to belong to the RTK family. In this study, the authors screened all RTKs from the H. armigera genome for their ability to mediate responses to JH III treatment both in cultured cells and in developing animals. They also present convincing evidence that CAD96CA and FGFR1 directly bind JH III, and that their role might be conserved in other insect species.

      Strengths:

      Altogether, the experimental approach is very complete and elegant, providing evidence for the role of CAD96CA and FGFR1 in JH signalling using different techniques and in different contexts. I believe that this work will open new perspectives to study the role of JH and better understand what is the contribution of signalling through membrane receptors for JH-dependent developmental processes.

      Weaknesses:

      I don't see major weaknesses in this study. However, I think that the manuscript would benefit from further information or discussion regarding the relationship between the two newly identified receptors. Experiments (especially in HEK-293T cells) suggest that CAD96CA and FGFR1 are sufficient on their own to transduce JH signalling. However, they are also necessary since loss-of-function conditions for each of them are sufficient to trigger strong effects (while the other is supposed to be still present).

      In addition, despite showing different expression patterns, the two receptors seem to display similar developmental functions according to loss-of-function phenotypes. It is therefore unclear how to draw a model for membrane receptor-mediated JH signalling that includes both CAD96CA and FGFR1.

    3. Reviewer #2 (Public Review):

      Summary:

      Juvenile hormone (JH) is a pleiotropic terpenoid hormone in insects that mainly regulates their development and reproduction. In particular, its developmental functions are described as the "status quo" action, as its presence in the hemolymph (the insect blood) prevents metamorphosis-initiating effects of ecdysone, another important hormone in insect development, and maintains the juvenile status of insects.

      While such canonical functions of JH are known to be mediated by its intracellular receptor complex composed of Met and Tai, there have been multiple reports suggesting the presence of cell membrane receptor(s) for JH, which mediate non-genomic effects of this terpenoid hormone. In particular, the presence of receptor tyrosine kinase(s) that phosphorylate Met/Tai in response to JH and thus indirectly affect the canonical JH signaling pathway has been strongly suggested. Given the importance of JH in insect physiology and the fact that the JH signaling pathway is a major target of insect growth regulators, elucidating the identification and functions of putative JH membrane receptors is of great significance from both basic and applied perspectives.

      In the present study, the authors identified candidate receptors for such cell membrane JH receptors, CAD96CA and FGFR1, in the cotton bollworm Helicoverpa armigera.

      Strengths:

      Their in vitro analyses are conducted thoroughly using multiple methods, which overall supports their claim that these receptors can bind to JH and mediate their non-genomic effects.

      Weaknesses:

      Results of their in vivo experiments, particularly those of their loss-of-function analyses using CRISPR mutants are still preliminary, and the results rather indicate that these membrane receptors do not have any physiologically significant roles in vivo. More specifically, previous studies in lepidopteran species have clearly and repeatedly shown that precocious metamorphosis is the hallmark phenotype for all JH signaling-deficient larvae. In contrast, the present study showed that Cad96ca and Fgfr1 G0 mutants only showed a slight acceleration in their pupation timing, which is not a typical phenotype one would expect from JH signaling deficiency. This is inconsistent with their working model provided in Figure 6, which indicates that these cell membrane JH receptors promote the canonical JH signaling by phosphorylating Met/Tai.

      If the authors argue that this slight acceleration of pupation is indeed a major JH signaling-deficient phenotype in Helicoverpa, they need to provide more data to support their claim by analyzing CRISPR mutants of other genes involved in JH signaling, such as Jhamt and Met. An alternative explanation is that there is functional redundancy between CAD96CA and FGFR1 in mediating phosphorylation of Met/Tai. This possibility can be tested by analyzing double knockouts of these two receptors.

      Currently, the validity of their calcium imaging analysis in Figure 5 is also questionable. When performing calcium imaging in cultured cells, it is critically important to treat all the cells at the end of each experiment with a hormone or other chemical reagents that universally induce calcium increase in each particular cell line. Without such positive control, the validity of calcium imaging data remains unknown, and readers cannot properly evaluate their results.

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, Li et al. identified CAD96CA and FGFR1 among 20 receptor tyrosine kinases as mediators of JH signaling. By performing a screen in HaEpi cells with overactivated JH signaling, the authors pinpointed two main RTKs that contribute to the transduction of JH. Using the CRISPR/Cas9 system to generate mutants, the authors confirmed that these RTKs are required for normal JH activation, as precocious pupariation was observed in their absence. Additionally, the authors demonstrated that both CAD96CA and FGFR1 exhibit a high affinity for JH, and their activation is necessary for the proper phosphorylation of Tai and Met, transcription factors that promote the transcriptional response. Finally, the authors provided evidence suggesting that the function of CAD96CA and FGFR1 as JH receptors is conserved across insects.

      Strengths:

      The data provided by the authors are convincing and support the main conclusions of the study, providing ample evidence to demonstrate that phosphorylation of the transducers Met and Tai mainly depends on the activity of two RTKs. Additionally, the binding assays conducted by the authors support the function of CAD96CA and FGFR1 as membrane receptors of JH. The study's results validate, at least in H. armigera, the predicted existence of membrane receptors for JH.

      Weaknesses:

      The study has several weaknesses that need to be addressed. Firstly, it is not clear what criteria were used by the authors to discard several other RTKs that were identified as repressors of JH signaling. For example, while NRK and Wsck may not fulfill all the requirements to become JH receptors, other evidence, such as depletion analysis and target gene expression, suggests they are involved in proper JH signaling activation.

      Secondly, the expression of the six RTKs, which, when knocked down, were able to revert JH signaling activation, was mainly detected in the last larval stage of H. armigera. However, since JH signaling is active throughout larval development, it is unclear whether these RTKs are completely required for pathway activation or only needed for high activation levels at the last larval stage.

      Additionally, the mechanism by which different RTKs exert their functions in a specific manner is not clear. According to the expression profile of the different RTKs, one might expect some redundant role of those receptors. In fact, the lack of reversion of Tai and Met phosphorylation upon depletion of Wsck in cells with overactivated JH signalling seems to support this idea.

      Nevertheless, and despite the overlapping expression of the different receptors, all RTKs seem to be required for proper pathway activation, even in the case of FGFR1, which seems to be expressed only in the midgut. This is an intriguing point left unresolved in the study.

      Finally, the study does not explain how RTKs with known ligands could also bind JH and contribute to JH signaling activation. In Drosophila, FGFR1 is activated by pyramus and thisbe for mesoderm development, while CAD96CA is activated by collagen during wound healing. Now the authors claim that in addition to these ligands, the receptors also bind to JH. However, it is unclear whether these RTKs are activated by JH independently of their known ligands, suggesting a specific binding site for JH, or if they are only induced by JH activation when those ligands are present in a synergistic manner. Alternatively, activation of the RTK pathways by their known ligands may induce certain levels of JH transducer phosphorylation, which, in the presence of JH, contributes to full pathway activation without JH-RTK binding being necessary.

    1. eLife assessment

      Combining experimental and computational approaches, this manuscript provides solid evidence for a post-transcriptional mechanism that provides robust control over the protein expression level of RecB in E. coli. In addition to uncovering how DNA damage drives more efficient translation of RecB protein, this work also reveals important tenets for how broader mechanisms that suppress noise and underlie responsive tuning of protein levels can be achieved.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study the authors use an elegant set of single-molecule experiments to assess the transcriptional and post-transcriptional regulation of RecB. The question stems from a previous observation from the same lab, that RecB protein levels are low and not induced under DNA damage. The authors first show that recB transcript levels are low and have a short half-life. They further show that RecB levels are likely regulated via translational control. They provide evidence for low noise in RecB protein levels across cells and show that the translation of the mRNA increases under double-strand break conditions. Authors identify Hfq binding sites in the recBCD operon and show that Hfq regulates the levels of RecB protein without changing the mRNA levels. They suggest that RecB translation is directly controlled by Hfq binding to mRNA, as mutating one of the binding sites has a direct effect on RecB protein levels.

      Strengths:

      The implication of Hfq in regulation of RecB translation is important and suggests mechanisms of cellular response to DNA damage that are beyond the canonically studied mechanisms (such as transcriptional regulation by LexA). Data are clearly presented and the writing is direct and easy to follow. Overall, the study is well-designed and provides novel insights into the regulation of RecB, that is part of the complex required to process break ends.

      Weaknesses:

      Some key findings need additional support/ clarifications to strengthen the conclusions. These are suggested to the authors.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors carry out a careful and rigorous quantitative analysis of RecB transcript and protein levels at baseline and in response to DNA damage. Using single-molecule FISH and Halo-tagging in order to achieve sensitive measurements, they provide evidence that enhanced RecB protein levels in response to DNA damage are achieved through a post-transcriptional mechanism mediated by the Sm-like RNA binding protein, Hfq. In terms of biological relevance, the authors suggest that this mechanism provides a way to control the optimum level of RecB expression as both deletion and over-expression are deleterious. In addition, the proposed mechanism provides a new framework for understanding how transcriptional noise can be suppressed at the protein level.

      Strengths:

      Strengths of the manuscript include the rigorous approaches and orthogonal evidence to support the core conclusions, for example, the evidence that altering either Hfq or its recognition sequence on the RNA similarly enhances the protein to RNA ratio of RecB. The writing is clear and the experiments are well-controlled. The modeling approaches provide essential context to interpret the data, particularly given the small numbers of molecules per cell. The interpretations are careful and well supported.

      Weaknesses:

      The authors make a compelling case for the biological need to exquisitely control RecB levels, which they suggest is achieved by the pathway they have uncovered and described in this work. However, this conclusion is largely inferred as the authors only investigate the effect on cell survival in response to (high levels of) DNA damage and in response to two perturbations - genetic knock-out or over-expression, both of which are likely more dramatic than the range of expression levels observed in unstimulated and DNA damage conditions.

    4. Reviewer #3 (Public Review):

      Summary:

      The work by Kalita et al. reports regulation of RecB expression by Hfq protein in E. coli cells. RecBCD is an essential complex for DNA repair and chromosome maintenance. The expression level needs to be regulated at low level under regular growth conditions but upregulated upon DNA damage. Through quantitative imaging, the authors demonstrate that recB mRNAs and proteins are expressed at low level under regular conditions. While the mRNA copy number demonstrates high noise level due to stochastic gene expression, the protein level is maintained at a lower noise level compared to expected value. Upon DNA damage, the authors claim that the recB mRNA concentration is decreased, however RecB protein level is compensated by higher translation efficiency. Through analyzing CLASH data on Hfq, they identified two Hfq binding sites on RecB polycistronic mRNA, one of which is localized at the ribosome binding site (RBS). Through measuring RecB mRNA and protein level in the ∆hfq cell, the authors conclude that binding of Hfq to the RBS region of recB mRNA suppresses translation of recB mRNA. This conclusion is further supported by the same measurement in the presence of Hfq sequestrator, the sRNA ChiX, and the deletion of the Hfq binding region on the mRNA.

      Strengths:

      (1) The manuscript is well-written and easy to understand.

      (2) While there are reported cases of Hfq regulating translation of bound mRNAs, its effect on reducing translation noise is relatively new.

      (3) The imaging and analysis are carefully performed with necessary controls.

      Weaknesses:

      The major weaknesses include a lack of mechanistic depth, and part of the conclusions are not fully supported by the data.

      (1) Mechanistically, it is still unclear why upon DNA damage, translation level of recB mRNA increases, which makes the story less complete. The authors mention in the Discussion that a moderate (30%) decrease in Hfq protein was observed in previous study, which may explain the loss of translation repression on recB. However, given that this mRNA exists in very low copy number (a few per cell) and that Hfq copy number is on the order of a few hundred to a few thousand, it's unclear how a 30% decrease in the protein level should result in a significant change in its regulation of recB mRNA.

      (2) Based on the experiment and the model, Hfq regulates translation of recB gene through binding to the RBS of the upstream ptrA gene through translation coupling. In this case, one would expect that the behavior of ptrA gene expression and its response to Hfq regulation would be quite similar to recB. Performing the same measurement on ptrA gene expression in the presence and absence of Hfq would strengthen the conclusion and model.

      (3) The authors agree that they cannot exclude the possibility of sRNA being involved in the translation regulation. However, this can be tested by performing the imaging experiments in the presence of Hfq proximal face mutations, which largely disrupt binding of sRNAs.

      (4) The data on construct with a long region of Hfq binding site on recB mRNA deleted is less convincing. There is no control to show that removing this sequence region itself has no effect on translation, and the effect is solely due to the lack of Hfq binding. A better experiment would be using a Hfq distal face mutant that is deficient in binding to the ARN motifs.

      (5) Ln 249-251: The authors claim that the stability of recB mRNA is not changed in ∆hfq simply based on the steady-state mRNA level. To claim so, the lifetime needs to be measured in the absence of Hfq.

      (6) What's the labeling efficiency of Halo-tag? If not 100% labeled, is it considered in the protein number quantification? Is the protein copy number quantification through imaging calibrated by an independent method? Does Halo tag affect the protein translation or degradation?

      (7) Upper panel of Fig S8a is redundant as in Fig 5B. Seems that Fig S8d is not described in the text.

    5. Author response:

      Reviewer #1 (Public Review):

      Summary:

      In this study the authors use an elegant set of single-molecule experiments to assess the transcriptional and post-transcriptional regulation of RecB. The question stems from a previous observation from the same lab, that RecB protein levels are low and not induced under DNA damage. The authors first show that recB transcript levels are low and have a short half-life. They further show that RecB levels are likely regulated via translational control. They provide evidence for low noise in RecB protein levels across cells and show that the translation of the mRNA increases under double-strand break conditions. Authors identify Hfq binding sites in the recbcd [recBCD] operon and show that Hfq regulates the levels of RecB protein without changing the mRNA levels. They suggest that RecB translation is directly controlled by Hfq binding to mRNA, as mutating one of the binding sites has a direct effect on RecB protein levels.

      Strengths:

      The implication of Hfq in regulation of RecB translation is important and suggests mechanisms of cellular response to DNA damage that are beyond the canonically studied mechanisms (such as transcriptional regulation by LexA). Data are clearly presented and the writing is direct and easy to follow. Overall, the study is well-designed and provides novel insights into the regulation of RecB, that is part of the complex required to process break ends.

      Weaknesses:

      Some key findings need additional support/clarifications to strengthen the conclusions. These are suggested to the authors.

      Reviewer #2 (Public Review):

      Summary:

      The authors carry out a careful and rigorous quantitative analysis of RecB transcript and protein levels at baseline and in response to DNA damage. Using single-molecule FISH and Halo-tagging in order to achieve sensitive measurements, they provide evidence that enhanced RecB protein levels in response to DNA damage are achieved through a post-transcriptional mechanism mediated by the La-like RNA binding protein, Hhq1 [Sm-like RNA binding protein, Hfq]. In terms of biological relevance, the authors suggest that this mechanism provides a way to control the optimum level of RecB expression as both deletion and over-expression are deleterious. In addition, the proposed mechanism provides a new framework for understanding how transcriptional noise can be suppressed at the protein level.

      Strengths:

      Strengths of the manuscript include the rigorous approaches and orthogonal evidence to support the core conclusions, for example, the evidence that altering either Hhq1 [Hfq] or its recognition sequence on the RNA similarly enhance the protein to RNA ratio of RecB. The writing is clear and the experiments are well-controlled. The modeling approaches provide essential context to interpret the data, particularly given the small numbers of molecules per cell. The interpretations are careful and well supported.

      Weaknesses:

      The authors make a compelling case for the biological need to exquisitely control RecB levels, which they suggest is achieved by the pathway they have uncovered and described in this work. However, this conclusion is largely inferred as the authors only investigate the effect on cell survival in response to (high levels of) DNA damage and in response to two perturbations - genetic knock-out or over-expression, both of which are likely more dramatic than the range of expression levels observed in unstimulated and DNA damage conditions.

      In the discussion, we proposed that the post-transcriptional regulation of recB that we have uncovered could be involved in keeping RecB levels within an optimal range. We agree that testing the phenotypic impact of small changes in RecB levels would add additional strength to this suggestion. However, this is experimentally very challenging because of the low copy number of RecB molecules, which makes it difficult to slightly alter RecB levels in a controlled and homogeneous (across cells) manner. Developing the synthetic biology tools necessary for such an experiment is beyond the scope of this article. In the manuscript, we will clarify the limits of our interpretation of the role of the uncovered regulation.

      Reviewer #3 (Public Review):

      Summary:

      The work by Kalita et al. reports regulation of RecB expression by Hfq protein in E. coli cells. RecBCD is an essential complex for DNA repair and chromosome maintenance. The expression level needs to be regulated at low level under regular growth conditions but upregulated upon DNA damage. Through quantitative imaging, the authors demonstrate that recB mRNAs and proteins are expressed at low level under regular conditions. While the mRNA copy number demonstrates high noise level due to stochastic gene expression, the protein level is maintained at a lower noise level compared to expected value. Upon DNA damage, the authors claim that the recB mRNA level is not significantly affected, but RecB protein level increases due to a higher translation efficiency. [Upon DNA damage, the authors claim that the recB mRNA concentration is decreased, however RecB protein level is compensated by higher translation efficiency]. Through analyzing CLASH data on Hfq, they identified two Hfq binding sites on RecB polycistronic mRNA, one of which is localized at the ribosome binding site (RBS). Through measuring RecB mRNA and protein level in the ∆hfq cell, the authors conclude that binding of Hfq to the RBS region of recB mRNA suppresses translation of recB mRNA. This conclusion is further supported by the same measurement in the presence of Hfq sequestrator, the sRNA ChiX, and the deletion of the Hfq binding region on the mRNA.

      Strengths:

      (1) The manuscript is well-written and easy to understand.

      (2) While there are reported cases of Hfq regulating translation of bound mRNAs, its effect on reducing translation noise is relatively new.

      (3) The imaging and analysis are carefully performed with necessary controls.

      Weaknesses:

      The major weaknesses include a lack of mechanistic depth, and part of the conclusions are not fully supported by the data.

      (1) Mechanistically, it is still unclear why upon DNA damage, translation level of recB mRNA increases, which makes the story less complete. The authors mention in the Discussion that a moderate (30%) decrease in Hfq protein was observed in previous study, which may explain the loss of translation repression on recB. However, given that this mRNA exists in very low copy number (a few per cell) and that Hfq copy number is on the order of a few hundred to a few thousand, it's unclear how a 30% decrease in the protein level should result in a significant change in its regulation of recB mRNA.

      While Hfq is a highly abundant protein, it has many mRNA and sRNA targets, some of which are also present in large amounts (DOI: 10.1046/j.1365-2958.2003.03734.x). As recently shown, the competition among the targets for Hfq proteins results in unequal outcomes across targets, where the targets with higher Hfq affinity have an advantage over the ones with less efficient binding (DOI: 10.1016/j.celrep.2020.02.016). In line with these findings, we reason that upon DNA damage, a moderate decrease in Hfq protein abundance (30%) can lead to a similar competition among Hfq targets, where high-affinity targets outcompete low-affinity ones as well as low-abundance ones (such as recB mRNAs). Therefore, we hypothesise that the sensitivity of low-abundance Hfq targets to moderate perturbations of the Hfq protein level is a potential explanation for the change in RecB translation that we have observed. We will expand this part of the discussion to explain our reasoning in a more explicit and coherent way.

      (2) Based on the experiment and the model, Hfq regulates translation of recB gene through binding to the RBS of the upstream ptrA gene through translation coupling. In this case, one would expect that the behavior of ptrA gene expression and its response to Hfq regulation would be quite similar to recB. Performing the same measurement on ptrA gene expression in the presence and absence of Hfq would strengthen the conclusion and model.

      Indeed, based on our model, we expect PtrA expression to be regulated by Hfq in a similar manner to RecB. However, the product encoded by the ptrA gene, Protease III, (i) has been poorly characterised; (ii) unlike RecB, is located in the periplasm (DOI: 10.1128/jb.149.3.1027-1033.1982); and (iii) is not involved in any DNA repair pathway. Therefore, analysing PtrA expression would take us away from the key questions of our study.

      (3) The authors agree that they cannot exclude the possibility of sRNA being involved in the translation regulation. However, this can be tested by performing the imaging experiments in the presence of Hfq proximal face mutations, which largely disrupt binding of sRNAs.

      (4) The data on construct with a long region of Hfq binding site on recB mRNA deleted is less convincing. There is no control to show that removing this sequence region itself has no effect on translation, and the effect is solely due to the lack of Hfq binding. A better experiment would be using a Hfq distal face mutant that is deficient in binding to the ARN motifs.

      We thank the referee for these suggestions. We have performed the requested experiments, and the quantification of RecB abundance in the presence of Hfq proteins mutated in the proximal and distal face will be added to the revised version of the manuscript.

      (5) Ln 249-251: The authors claim that the stability of recB mRNA is not changed in ∆hfq simply based on the steady-state mRNA level. To claim so, the lifetime needs to be measured in the absence of Hfq.

      We agree that this statement is not fully supported by our data and will address this issue in the revised version.

      (6) What's the labeling efficiency of Halo-tag? If not 100% labeled, is it considered in the protein number quantification? Is the protein copy number quantification through imaging calibrated by an independent method? Does Halo tag affect the protein translation or degradation?

      Our previous study (DOI: 10.1038/s41598-019-44278-0) described a detailed characterisation of the HaloTag labelling technique for quantifying low-copy proteins in single E. coli cells.

      In that study, we used RecB-HaloTag as an example of a low-copy number protein. We showed a complete quantitative agreement of RecB detection between two fully independent methods: HaloTag-based labelling with cell fixation and RecB-sfGFP combined with a microfluidic device that lowers protein diffusion in the bacterial cytoplasm. This second method has previously been validated for protein quantification (DOI: 10.1038/ncomms11641) and provides detection of 80-90% of the labelled protein. Additionally, in our protocol, immediate chemical fixation of cells after the labelling and quick washing steps ensure that new, unlabelled RecB proteins are not produced. We, therefore, conclude that our approach to RecB detection is highly reliable and sufficient for comparing RecB production in different conditions and mutants.

      The RecB-HaloTag construct has been designed for minimal impact on RecB production and function. The HaloTag is translationally fused to RecB in a loop positioned after the serine present at position 47 where it is unlikely to interfere with (i) the formation of RecBCD complex (based on RecBCD structure, DOI: 10.1038/nature02988), (ii) the initiation of translation (as it is far away from the 5’UTR and the beginning of the open reading frame) and (iii) conventional C-terminal-associated mechanisms of protein degradation (DOI: 10.15252/msb.20199208). In our manuscript, we showed that the RecB-HaloTag degradation rate is similar to the dilution rate due to bacterial growth. This is in line with a recent study on unlabelled proteins, which shows that RecB’s lifetime is set by the cellular growth rate (https://doi.org/10.1101/2022.08.01.502339) and indicates that the HaloTag fusion is not affecting RecB stability.
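
      The comparison between degradation and dilution rates can be illustrated with a minimal sketch. It assumes simple first-order kinetics, in which a stable protein is lost only through dilution at a rate of ln(2) divided by the doubling time. The function names and the doubling time used in the usage note are hypothetical and are not taken from the manuscript.

```python
import math


def dilution_rate(doubling_time_min: float) -> float:
    """First-order loss rate (per minute) caused by growth dilution alone."""
    return math.log(2) / doubling_time_min


def effective_half_life(doubling_time_min: float, degradation_rate: float = 0.0) -> float:
    """Half-life (minutes) when dilution and active degradation act together.

    degradation_rate is a hypothetical first-order rate (per minute);
    it is zero for a protein that is lost only through dilution.
    """
    total_rate = dilution_rate(doubling_time_min) + degradation_rate
    return math.log(2) / total_rate
```

      Under these assumptions, a protein with no active degradation has a half-life equal to the doubling time, so a measured loss rate close to the dilution rate is what one would expect if the tag does not destabilise the protein.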

      Furthermore, we have demonstrated (DOI: 10.1038/s41598-019-44278-0) that (i) bacterial growth is not affected by replacing the native RecB with RecB-HaloTag, (ii) RecB-HaloTag is fully functional upon DNA damage, and (iii) no proteolytic processing of the RecB-HaloTag is detected by Western blot.

      These results suggest that RecB expression and functionality are unlikely to be affected by the translational HaloTag insertion at Ser-47 in RecB. In the revised version of the manuscript, we will add information about the construct and discuss the reliability of the quantification.

      (7) Upper panel of Fig S8a is redundant as in Fig 5B. Seems that Fig S8d is not described in the text.

      Indeed, the data in the upper panel in Fig S8a was repeated (from Fig 5B) for visual purposes to facilitate comparison with the panel below. We will modify the figure legend to indicate this repetition clearly.

      In Fig S8d, we confirmed the functionality of the Hfq protein expressed from the pQE-Hfq plasmid in our experimental conditions, which was not described in the text. We will include this clarification in the updated manuscript.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      We would like to first thank the Editor as well as the three reviewers for their enthusiasm and for conducting another careful evaluation of our manuscript. We appreciate their thoughtful and constructive comments and suggestions. Some concerns regarding experimental design, data analysis, and over-interpretation of our findings still remain unresolved after the initial revision. Here we have endeavored to address these remaining concerns through further refinement of our writing and by including these concerns in the discussion section. We hope our response better explains the rationale of our experimental design and data interpretation. In addition, we acknowledge the limitations of our present study, so that it will benefit future investigations into this topic. Our detailed responses are provided below.

      Reviewer #1 (Public Review):

      This study examines whether the human brain uses a hexagonal grid-like representation to navigate in a non-spatial space constructed by competence and trustworthiness. To test this, the authors asked human participants to learn the levels of competence and trustworthiness for six faces by associating them with specific lengths of bar graphs that indicate their levels in each trait. After learning, participants were asked to extrapolate the location from the partially observed morphing bar graphs. Using fMRI, the authors identified brain areas where activity is modulated by the angles of morphing trajectories in six-fold symmetry. The strength of this paper lies in the question it attempts to address. Specifically, the question of whether and how the human brain uses grid-like representations not only for spatial navigation but also for navigating abstract concepts, such as social space, and guiding everyday decision-making. This question is of emerging importance.

      I acknowledge the authors' efforts to address the comments received. However, my concerns persist:

      Thanks very much again for the re-evaluation and comments. Please find our revision plans to each comment below.

      (1) The authors contend that shorter reaction times correlated with increased distances between individuals in social space imply that participants construct and utilize two-dimensional representations. This method is adapted from a previous study by Park et al. Yet, there is a fundamental distinction between the two studies. In the prior work, participants learned relationships between adjacent individuals, receiving feedback on their decisions, akin to learning spatial locations during navigation. This setup leads to two different predictions: If participants rely on memory to infer relationships, recalling more pairs would be necessary for distant individuals than for closer ones. Conversely, if participants can directly gauge distances using a cognitive map, they would estimate distances between far individuals as quickly as for closer ones. Consequently, as the authors suggest, reaction times ought to decrease with increasing decision value, which, in this context, corresponds to distances. However, the current study allowed participants to compare all possible pairs without restricting learning experiences, rendering the application of the same methodology for testing two-dimensional representations inappropriate. In this study, the results could be interpreted as participants not forming and utilizing two-dimensional representations.

      We apologize for not being clear enough about our task design; we have made relevant changes in the methodology section of the manuscript to make it clearer. The reviewer’s concern is that participants learned about all the pairs in the comparison task, which would make the distance effect invalid. We would like to clarify that during all the memory test tasks (the comparison task, the collect task and the recall task outside and inside the scanner), participants never received feedback on whether their responses were correct or not. Therefore, the comparison task in our study is similar to that of the previous study by Park et al. (2021). Participants did not have access to the correct responses for all possible pairs of comparison prior to or during this task; they needed to make inferences based on memory retrieval.

      (2) The confounding of visual features with the value of social decision-making complicates the interpretation of this study's results. It remains unclear whether the observed grid-like effects are due to visual features or are genuinely indicative of value-based decision-making, as argued by the authors. Contrary to the authors' argument, this issue was not present in the previous study (Constantinescu et al.). In that study, participants associated specific stimuli with the identities of hidden items, but these stimuli were not linked to decision-making values (i.e., no image was considered superior to another). The current study's paradigm is more akin to that of Bao et al., which the authors mention in the context of RSA analysis. Indeed, Bao et al. controlled the length of the bars specifically to address the problem highlighted here. Regrettably, in the current paradigm, this conflation remains inseparable.

      We’d like to thank the reviewer for facilitating the discussion on the question of ‘social space’ vs. ‘sensory space’. The task in the scanner did not require value-based decision making. It is akin to both the Bao et al. (2019) study and the Constantinescu et al. (2016) study in the sense that all three tasks ask participants to imagine moving along a trajectory in an abstract, non-physical space, where the trajectory is grounded in sensory cues. Participants were trained to associate the sensory cues with abstract (social/nonsocial) concepts. We think that the paradigm is a relatively faithful replication of the study by Constantinescu et al. Nonetheless, we agree that a design similar to that of Bao et al. (2019), which controls for sensory confounds, or a value-based decision-making task in the scanner similar to that of Park et al. (2021), would be better suited to address this concern, and we have included this limitation in the discussion section.

      (3) While the authors have responded to comments in the public review, my concerns noted in the Recommendation section remain unaddressed. As indicated in my recommendations, there are aspects of the authors' methodology and results that I find difficult to comprehend. Resolving these issues is imperative to facilitate an appropriate review in subsequent stages.

      Considering that the issues raised in the previous comments remain unresolved, I have retained my earlier comments below for review.

      We apologize for not addressing the recommendations properly. Please find our detailed responses and plans for revision below.

      I have some comments. I hope that these can help.

      (1) While the explanation of Fig.4A-C is lacking in both the main text and figure legend, I am not sure if I understand this finding correctly. Did the authors find the effects of hexagonal modulation in the medial temporal gyrus and lingual gyrus correlate with the individual differences in the extent to which their reaction times were associated with the distances between faces when choosing a better collaborator? If so, I am not sure what argument the authors try to draw from these findings. Do the authors argue that these brain areas show hexagonal modulation, which was not supported in the previous analysis (Fig.3)? What is the level of correlation between these behavioral measures and the grid consistency effects in the vmPFC and EC, where the authors found actual grid-like activity? How do the authors interpret this finding? More importantly, how does this finding associate with other findings and the argument of the study?

      We apologize for not being clear enough in the manuscript and we will improve the clarity in our revision. The exploratory analysis reported in Figure 4 uses whole-brain analysis to examine: 1) whether there is any correlation between the strength of the grid-like representation of the social value map and behavioral indicators of map-like representation; and 2) whether there is any correlation between the strength of the grid-like representation of this social value map and participants’ social traits.

      To be more specific, for the behavioral indicator, we used the distance effect in the reaction times of the comparison task outside the scanner. We interpreted a stronger distance effect as a behavioral index of a better internal map-like representation, and a stronger grid consistency effect as a neural index of a better representation of the 2D social space. We therefore tested whether the behavioral and neural indices of map-like representation were correlated.
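
      One way such a behavioral index could be computed is sketched below. It assumes that the distance effect is summarised, per participant, as the slope of a linear fit of reaction time on the Euclidean distance between the two compared avatars. The function name, the input layout and the use of a simple linear fit are illustrative assumptions; the response does not specify how the index was calculated.

```python
import numpy as np


def distance_effect_slope(positions, pairs, reaction_times):
    """Slope of reaction time regressed on pairwise Euclidean distance.

    positions      : (n_avatars, 2) coordinates in the 2D social space
    pairs          : (n_trials, 2) indices of the avatars compared on each trial
    reaction_times : (n_trials,) reaction times for those trials
    A more negative slope means faster responses for more distant pairs.
    """
    positions = np.asarray(positions, dtype=float)
    pairs = np.asarray(pairs, dtype=int)
    rts = np.asarray(reaction_times, dtype=float)
    # Euclidean distance between the two avatars shown on each trial
    dists = np.linalg.norm(positions[pairs[:, 0]] - positions[pairs[:, 1]], axis=1)
    slope, _intercept = np.polyfit(dists, rts, deg=1)
    return slope
```

      The resulting per-participant slopes could then serve as a second-level covariate of the kind described in the response.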

      To achieve this goal, the behavioral indicators were entered as covariates in the second-level analysis of the GLM testing the grid consistency effect (GLM2). Figure 3 shows the results from GLM2 without the covariates. Figure 4 shows the clusters whose neural indices of map-like representation covaried with the behavioral indices and survived multiple-comparison correction. Indeed, in these regions, the grid consistency effect was not significant at the group level (so they are not shown in Figure 3). We tried to interpret this finding in our discussion (line 374-289 for temporal lobe correlation, line 395-404 for precuneus correlation).

      Finally, we would like to point out that including the covariates in GLM2 did not change the results in Figure 3; the clusters in Figure 3 still survive correction. Meanwhile, these clusters in Figure 3 did not show a correlation with the behavioral indicators of map-like representation.

      Author response image 1.

      (2) There are no behavioral results provided. How accurately did participants perform each of the tasks? How are the effects of grid consistency associated with the level of accuracy in the map test?

      Why did participants perform the recall task again outside the scanner?

      We will endeavor to improve the signposting of the corresponding figures in the main text. For the behavioral results, we reported the statistics in the section “Participants construct social value map after associative learning of avatars and corresponding characteristics” in the main text, and the plots are shown in Figure 1. In particular, Figure 1F shows the accuracy of the tasks in training, as well as of the recall task in the scanner. We did not find a significant correlation between behavioral accuracy and the grid consistency effect. We will make this clearer in the results section.

      (3) The methods did not explain how the grid orientation was estimated and what the regressors were in GLM2. I don't think equations 2 and 3 are quite right.

      For the grid orientation estimation method, we provided a detailed description in Supplementary Methods 2.2.2. We will add links to this section in the main text.

      Equations 2 and 3 describe how the parametric regressors entered into GLM2 were formed and provide the prerequisites for calculating the grid orientations. Equation 2 results from directly applying the angle addition and subtraction theorems, so it should be correct. We will try to make the rationale clearer in the supplementary text.
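
      For readers unfamiliar with this type of analysis, the sketch below shows the hexadirectional approach that is commonly used in the grid-code fMRI literature; it is not necessarily the exact formulation of Equations 2 and 3, which are not reproduced in this response. It relies on the angle-difference identity cos(6(θ − φ)) = cos(6θ)cos(6φ) + sin(6θ)sin(6φ): fitting the signal with cos(6θ) and sin(6θ) regressors yields two weights, from which the orientation φ can be recovered as atan2(β_sin, β_cos)/6. All function and variable names are hypothetical.

```python
import numpy as np


def hexadirectional_regressors(theta):
    """cos(6*theta) and sin(6*theta) regressors for trajectory angles in radians."""
    theta = np.asarray(theta, dtype=float)
    return np.column_stack([np.cos(6 * theta), np.sin(6 * theta)])


def estimate_grid_orientation(theta, signal):
    """Estimate the grid orientation phi, returned in the range [0, pi/3)."""
    theta = np.asarray(theta, dtype=float)
    signal = np.asarray(signal, dtype=float)
    # Design matrix: the two six-fold regressors plus a constant term
    design = np.column_stack([hexadirectional_regressors(theta), np.ones(theta.size)])
    betas, *_ = np.linalg.lstsq(design, signal, rcond=None)
    beta_cos, beta_sin = betas[0], betas[1]
    phi = np.arctan2(beta_sin, beta_cos) / 6.0
    return phi % (np.pi / 3)


def alignment_regressor(theta, phi):
    """Parametric regressor cos(6*(theta - phi)), maximal for aligned trajectories."""
    return np.cos(6 * (np.asarray(theta, dtype=float) - phi))
```

      In a typical cross-validated design, the orientation would be estimated on one part of the data and the alignment regressor tested on an independent part, so that the estimate and the test are not circular.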

      (4) With the increase in navigation distances, more grid cells would activate. Therefore, in theory, the activity in the entorhinal cortex should increase with the Euclidean distances, which has not been found here. I wonder if there was enough variability in the Euclidean distances that can be captured by neural correlates. This would require including the distributions of Euclidean distances according to their trajectory angles. Regarding how Fig.1E is generated, I don't understand what this heat map indicates. Additionally, it needs to be confirmed if the grid effects remain while controlling for the Euclidean distances of navigation trajectories.

      We did not specifically control for the trajectory length; we only controlled for the distribution of trajectory directions to be uniform. We have included a figure of the distribution of Euclidean distances in Figure S9 and the distribution of trajectory directions in Figure S8.

      Author response image 2.

      As for Figure 1E, we aimed to reproduce the findings from Figure 1F in Constantinescu et al. (2016), where they showed that participants progressively refined the locations of the outcomes through training. We divided the space into 15×15 subregions, computed the amount of time spent in each subregion, and plotted the result as Figure 1E. A brighter color in Figure 1E indicates a greater amount of time spent in the corresponding subregion. Note that all these timing indices were computed as a percentage of the total time spent in the explore task in a given session. If participants were well-acquainted with the space and avatars, they would spend more time at the avatars (brighter color at avatar locations) in the review session compared to the learning session.

      As for the effect of distance on the grid-like representation, we did not include distance as a parametric modulator in the grid consistency effect GLM (GLM2) due to insufficient trials in each bin (6-8 trials). However, there is indirect evidence that could potentially rule out this confound. In the distance representation analysis, we did not find distance representation in any of the clusters that have significant grid-like representation (regions in Figure 2).
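
      The occupancy computation described above can be sketched as follows. The sketch assumes that exploration is logged as samples of position, with coordinates normalised to the range 0 to 1 on both trait axes, together with the duration of each sample. The function name, the normalisation and the input format are assumptions made for illustration only.

```python
import numpy as np


def occupancy_map(x, y, dt, n_bins=15, extent=(0.0, 1.0)):
    """Fraction of the total exploration time spent in each subregion.

    x, y   : sampled positions along the two trait dimensions
    dt     : duration of each sample (any consistent time unit)
    n_bins : number of subregions per dimension (15 gives a 15 x 15 map)
    """
    hist, _, _ = np.histogram2d(
        x, y, bins=n_bins, range=[extent, extent], weights=dt
    )
    total = hist.sum()
    # Normalise so that the map sums to 1 within a session
    return hist / total if total > 0 else hist
```

      Multiplying the returned map by 100 gives percentages of session time, which can be displayed as a heat map for each session.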

      Reviewer #2 (Public Review):

      Summary:

      In this work, Liang et al. investigate whether an abstract social space is neurally represented by a grid-like code. They trained participants to 'navigate' around a two-dimensional space of social agents characterized by the traits warmth and competence, then measured neural activity as participants imagined navigating through this space. The primary neural analysis consisted of three procedures: 1) identifying brain regions exhibiting the hexagonal modulation characteristic of a grid-like code, 2) estimating the orientation of each region's grid, and 3) testing whether the strength of the univariate neural signal increases when a participant is navigating in a direction aligned with the grid, compared to a direction that is misaligned with the grid. From these analyses, the authors find the clearest evidence of a grid-like code in the prefrontal cortex and weaker evidence in the entorhinal cortex.

      Strengths:

      The work demonstrates the existence of a grid-like neural code for a socially-relevant task, providing evidence that such coding schemes may be relevant for a variety of two-dimensional task spaces.

      Weaknesses:

      In the revised manuscript, the authors soften their claims about finding a grid code in the entorhinal cortex and provide additional caveats about limitations in their findings. It seems that the authors and reviewers are in agreement about the following weaknesses, which were part of my original review: Claims about a grid code in the entorhinal cortex are not well-supported by the analyses presented. The whole-brain analysis does not suggest that the entorhinal cortex exhibits hexagonal modulation; the strength of the entorhinal BOLD signal does not track the putative alignment of the grid code there; multivariate analyses do not reveal any evidence of a grid-like representational geometry.

      In the authors' response to reviews, they provide additional clarification about their exploratory analyses examining whether behavior (i.e., reaction times) and individual difference measures (i.e., social anxiety and avoidance) can be predicted by the hexagonal modulation strength in some region X, conditional on region X having a similar estimated grid alignment with some other region Y. My guess is that readers would find it useful if some of this language were included in the main text, especially with regard to an explanation regarding the rationale for these exploratory studies.

      Thank you very much again for your careful re-evaluation and suggestions. We have tried to improve our writing and incorporate the suggestions in the new revision.

      Reviewer #3 (Public Review):

      Liang and colleagues set out to test whether the human brain uses distance and grid-like codes in social knowledge using a design where participants had to navigate in a two-dimensional social space based on competence and warmth during an fMRI scan. They showed that participants were able to navigate the social space and found distance-based codes as well as grid-like codes in various brain regions, and the grid-like code correlated with behavior (reaction times).

      On the whole, the experiment is designed appropriately for testing for distance-based and grid-like codes, and is relatively well powered for this type of study, with a large amount of behavioral training per participant. They revealed that a number of brain regions correlated positively or negatively with distance in the social space, and found grid-like codes in the frontal polar cortex and posterior medial entorhinal cortex, the latter in line with prior findings on grid-like activity in entorhinal cortex. The current paper seems quite similar conceptually and in design to previous work, most notably Park et al., 2021, Nature Neuroscience.

      (1) The authors claim that this study provides evidence that humans use a spatial / grid code for abstract knowledge like social knowledge.

      These data do not specifically add anything new to this argument. As with almost all studies that test for a grid code in a similar "conceptual" space (not only the current study), the problem is that, when the space is not a uniform, square/circular, two-dimensional space, there is no reason the code will be perfectly grid-like, i.e., show six-fold symmetry. In real-world scenarios, social space (as well as navigation and semantic concepts) must be higher dimensional - or at least more than two dimensional. It is unclear if this generalizes to larger spaces where not all parts of the space are relevant. Modelling work from Tim Behrens' lab (e.g., Whittington et al., 2020) and Bradley Love's lab (e.g., Mok & Love, 2019) has shown/argued this to be the case. In experimental work, like in mazes from the Mosers' labs (e.g., Derdikman et al., 2009), or trapezoid environments from the O'Keefe lab (Krupic et al., 2015), there are distortions in mEC cells, which would not pass as grid cells in terms of the six-fold symmetry criterion.

      The authors briefly discuss the limitations of this at the very end but do not really say how this speaks to the goal of their study and the claim that social space or knowledge is organized as a grid code and if it is in fact used in the brain in their study and beyond. This issue deserves to be discussed in more depth, possibly referring to prior work that addressed this, and raise the issue for future work to address the problem - or if the authors think it is a problem at all.

      Thanks very much again for your careful re-evaluation and comments. We have tried to incorporate some of the suggested papers into our discussion. In summary, we agree that there is more to six-fold symmetric code that can be utilized to represent “conceptual space”. We think that the next step for a stronger claim would be to find the representation of more spontaneous non-spatial maps.

      References

      Bao, X., Gjorgieva, E., Shanahan, L. K., Howard, J. D., Kahnt, T., & Gottfried, J. A. (2019). Grid-like Neural Representations Support Olfactory Navigation of a Two-Dimensional Odor Space. Neuron, 102(5), 1066-1075 e1065. https://doi.org/10.1016/j.neuron.2019.03.034

      Constantinescu, A. O., O'Reilly, J. X., & Behrens, T. E. J. (2016). Organizing conceptual knowledge in humans with a gridlike code. Science, 352(6292), 1464-1468. https://doi.org/10.1126/science.aaf0941

      Park, S. A., Miller, D. S., & Boorman, E. D. (2021). Inferences on a multidimensional social hierarchy use a grid-like code. Nat Neurosci, 24(9), 1292-1301. https://doi.org/10.1038/s41593-021-00916-3

    2. eLife assessment

      This study tackles a significant question: Does the brain apply spatial navigation systems to evaluate decision options in conceptual social spaces? The investigation is useful as it seeks to address this intriguing hypothesis. The findings offer partial support: a solid analysis revealed characteristic grid-like patterns associated with decision-making directions. However, it remains uncertain whether these effects are genuinely due to navigating a conceptual social space or potentially confounded by changes in visual stimuli. The experimental design may not be capable of definitively resolving this issue.

    3. Reviewer #1 (Public Review):

      The study offers intriguing insights, yet interpretations warrant caution, as the authors themselves acknowledged in their discussion of limitations.

      The observed grid-like neural activity might not signify navigating a social landscape but rather a sensory feature space. The study's design had participants associate each face with a pair of bar lengths, with the purported 'navigation' being merely a response to the morphing of bar graph images. Crucially, the task did not necessitate any social cognitive processing to estimate grid-like activity. When making social decisions in a separate task, it is unclear whether participants were actually traversing a social space mentally or simply recalling the bar graphs linked to each face to calculate decision values. Notably, during the trust game, competence and trustworthiness did not equally influence decision-making (as illustrated by Equation 1), implying the possibility that the space represented may be more perceptual than social in nature.

      The existence of a universal brain representation for faces within a social context is still debatable. Participants were not required to form a cognitive map of the six faces based on social traits; they could simply remember each face's trait values. While the study suggests that reaction times correlated with the perceived social distances between faces hint at the creation of internal representations, this phenomenon could occur without a true cognitive map of the face relationships. To convincingly argue for such internal representations in the brain, additional multivariate pattern analysis would be necessary to demonstrate that these are not merely the result of perceptual differences in the bar graphs associated with each face.

    4. Reviewer #3 (Public Review):

      Liang and colleagues set out to test whether the human brain uses distance and grid-like codes in social knowledge using a design where participants had to navigate in a two-dimensional social space based on competence and warmth during an fMRI scan. They showed that participants were able to navigate the social space and found distance-based codes as well as grid-like codes in various brain regions, and the grid-like code correlated with behavior (reaction times).

      On the whole, the experiment is designed appropriately for testing for distance-based and grid-like codes, and is relatively well powered for this type of study, with a large amount of behavioral training per participant. They revealed that a number of brain regions correlated positively or negatively with distance in the social space, and found grid-like codes in the frontal polar cortex and posterior medial entorhinal cortex, the latter in line with prior findings on grid-like activity in entorhinal cortex. The current paper seems quite similar conceptually and in design to previous work, most notably Park et al., 2021, Nature Neuroscience.

      (1) The authors claim that this study provides evidence that humans use a spatial / grid code for abstract knowledge like social knowledge.

      These data do not specifically add anything new to this argument. As with almost all studies that test for a grid code in a similar "conceptual" space (not only the current study), the problem is that, when the space is not a uniform, square/circular, two-dimensional space, there is no reason the code will be perfectly grid-like, i.e., show six-fold symmetry. In real-world scenarios, social space (as well as navigation and semantic concepts) must be higher dimensional - or at least more than two dimensional. It is unclear if this generalizes to larger spaces where not all parts of the space are relevant. Modelling work from Tim Behrens' lab (e.g., Whittington et al., 2020) and Bradley Love's lab (e.g., Mok & Love, 2019) has shown/argued this to be the case. In experimental work, like in mazes from the Mosers' labs (e.g., Derdikman et al., 2009), or trapezoid environments from the O'Keefe lab (Krupic et al., 2015), there are distortions in mEC cells, which would not pass as grid cells in terms of the six-fold symmetry criterion.

      After revision, the authors now discuss some of this and the limitations, and note that future work is required to address the problem.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Although this manuscript contains a potentially interesting piece of work that delineates a mechanism of IQCH that associates with spermatogenesis, this reviewer feels that a number of issues require clarification and re-evaluation for a better understanding of the role of IQCH in spermatogenesis. With the shortage of logics and supporting data, causal relationships are still not clear among IQCH, CaM, and HNRPAB. The most serious point in this manuscript could be that the authors try to generalize their interpretations with too simplified model from limited pieces of their data. The way the data and the logic are presented needs to be largely revised, and several interpretations should be supported by direct evidence.

      Response: Thank you for the reviewer’s comment. IQCH is a calmodulin-binding protein, and the binding of IQCH and CaM was confirmed by LC-MS/MS analysis and co-IP assay using sperm lysate. We thus speculated that the interaction of IQCH and CaM might be a prerequisite for IQCH function. To test that speculation, we took HNRPAB as an example. We knocked down IQCH in cultured cells, and a decrease in the expression of HNRPAB was observed. Similarly, when we knocked down CaM in cultured cells, a decrease in the expression of HNRPAB was also detected. However, these results cannot exclude the possibility that IQCH or CaM could regulate HNRPAB expression alone. To investigate whether IQCH or CaM could regulate HNRPAB expression alone, we overexpressed IQCH in cells in which CaM was knocked down, but the expression of HNRPAB could not be rescued, suggesting that IQCH cannot regulate HNRPAB expression when CaM is reduced. Consistently, we overexpressed CaM in cells in which IQCH was knocked down, but the expression of HNRPAB could not be rescued, suggesting that CaM cannot regulate HNRPAB expression when IQCH is reduced. Thus, neither IQCH nor CaM can regulate HNRPAB expression alone. Moreover, we deleted the IQ motif of IQCH, which is required for binding to CaM. The co-IP results showed that the interaction of IQCH and CaM was disrupted when the IQ motif of IQCH was deleted, and the expression of HNRPAB was decreased. Therefore, we suggested that the interaction of IQCH and CaM might be required for IQCH to regulate HNRPAB. In future studies, we will further investigate the relationships among IQCH, CaM, and HNRPAB.

      Reviewer #3 (Public Review):

      (1) More background details are needed regarding the proteins involved, in particular IQ proteins and calmodulin. The authors state that IQ proteins are not well-represented in the literature, but do not state how many IQ proteins are encoded in the genome. They also do not provide specifics regarding which calmodulins are involved, since there are at least 5 family members in mice and humans. This information could help provide more granular details about the mechanism to the reader and help place the findings in context.

      Response: We thank the reviewer for the suggestion. We have provided additional background information regarding IQ-containing protein family members in humans and mice, as well as other IQ-containing proteins implicated in male fertility, in the Introduction section. Furthermore, we have supplemented the Introduction with background information concerning the association between CaM and male infertility.

      (2) The mouse fertility tests could be improved with more depth and rigor. There was no data regarding copulatory plug rate; data was unclear regarding how many WT females were used for the male breeding tests and how many litters were generated; the general methodology used for the breeding tests in the Methods section was not very explicitly or clearly described; the sample size of n=3 for the male breeding tests is rather small for that type of assay; and, given that IQCH appears to be expressed in testicular interstitial cells (Fig. S10) and somewhat in other organs (Fig. S2), another important parameter of male fertility that should be addressed is reproductive hormone levels (e.g., LH, FSH, and testosterone). While normal epididymal size in Fig. S3 suggests that hormone (testosterone) levels are normal, epididymal size and/or weight were not rigorously quantified.

      Response: We thank the reviewer for the comment. We have provided the data regarding copulatory plug rate and the average number of litters for breeding tests in revised Figure 3—figure supplement 2. The methodology used for the breeding tests has been revised to be more detailed and explicit in the revised Methods section. Moreover, we have increased the sample size for male breeding tests to n=6. We measured the serum levels of FSH, LH, and testosterone in the WT (9.3±1.9 ng/ml, 0.93±0.15 ng/ml, and 0.2±0.03 ng/ml) and Iqch KO mice (12±2 ng/ml, 1.17±0.2 ng/ml, and 0.2±0.04 ng/ml). There was no significant difference observed in the serum levels of reproductive hormones between WT and Iqch KO mice; therefore, we did not include the data in the study. Furthermore, we have added quantitative data on epididymal size in the revised Figure 3—figure supplement 2.

      (3) The Western blots in Figure 6 should be rigorously quantified from multiple independent experiments so that there is stronger evidence supporting claims based on those assays.

      Response: We appreciate the reviewer's comment. As suggested, we have added quantified data in Figure 6—figure supplement 2 from the results of Western blotting in Figure 6.

      (4) Some of the mouse testis images could be improved. For example, the PNA and PLCz images in Figure S7 are difficult to interpret in that the tubules do not appear to be stage-matched, and since the authors claimed that testicular histology is unaffected in knockout testes, it should be feasible to stage-match control and knockout samples. Also, the anti-IQCH and CaM immunofluorescence in Figure S10 would benefit from some cell-type-specific co-stains to more rigorously define their expression patterns, and they should also be stage-matched.

      Response: We thank the reviewer for the suggestions. We have included immunofluorescence images of anti-PLCz, anti-PNA, anti-IQCH, and anti-CaM staining during spermatogenesis.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) There are multiple grammatical errors and statements drawn beyond the results. The entire manuscript would benefit from professional editing.

      Response: We are sorry for the grammatical errors. We have enlisted professional editing services to refine our manuscript.

      (2) Line 40, "Firstly" is not appropriate here.

      Response: We thank the reviewer for the comment. The word "Firstly" has been removed from the revised manuscript.

      (3) Line 44, "processes".

      Response: We thank the reviewer for the suggestion. We have changed “process” into “processes” on line 45.

      (4) "spermatocytogenesis (mitosis)" is incorrect.

      Response: We thank the reviewer for the comment. We have changed “spermatocytogenesis (mitosis)” into “mitosis” on line 47.

      (5) Ca and Ca2+ are both used in line 67 - 77. Be consistent.

      Response: We appreciate the reviewer's detailed checks. We have maintained consistency by revising instances of "Ca" to "Ca2+" in the revised manuscript.

      (6) Line 238 to 240, "To elucidate the molecular mechanism by which IQCH regulates male fertility, we performed liquid chromatography tandem mass spectrometry (LC-MS/MS) analysis using mouse sperm lysates and detected 288 interactors of IQCH (Data S1)." It is not clear how LC-MS/MS using mouse sperm lysates could detect "288 interactors of IQCH"? A co-IP experiment for IQCH using sperm lysates prior to LC-MS/MS is needed to detect "interactors of IQCH". However, in the Methods section, consistent with the main text, proteomic quantification was conducted for protein extract from sperm. Figure legend for Fig. 5 did not explain this, either. Thus, it is unable to evaluate Figure 5.

      Response: We sincerely apologize for the oversight. Following the reviewer’s suggestions, we have supplemented the method details of the LC-MS/MS experiment in the Methods section of the revised manuscript. Additionally, we conducted a co-IP experiment for IQCH using sperm lysates prior to LC-MS/MS, although we did not include the corresponding figure in the manuscript. The results are as follows:

      Author response image 1.

      The results of a co-IP experiment for IQCH using sperm lysates from WT mice.

      (7) Line 246, "... key proteins that might be activated by IQCH". What does "activated" here refer to? Should it be "upregulated"?

      Response: We are sorry for our inexact statement. "Upregulated" better conveys the intended meaning. Following the reviewer’s suggestions, we have changed "activated" to "upregulated".

      (8) Line 252 to 254, "the cross-analysis revealed that 76 proteins were shared between the IQCH-bound proteins and the IQCH-activated proteins (Fig. 5E), implicating this subset of genes as direct targets." This is a confusing statement. Is the author trying to say, IQCH-bound proteins have upregulated expression, suggesting that IQCH enhances their expression?

      Response: We appreciate the reviewer's comment regarding the clarity of the statement in Line 252 to 254 of the manuscript. We have modified this sentence into “Importantly, cross-analysis revealed that 76 proteins were shared between the IQCH-bound proteins and the downregulated proteins in Iqch KO mice (Figure 5E), suggesting that IQCH might regulate their expression by the interaction.”
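
      The cross-analysis described in this response amounts to a set intersection between two protein lists. A minimal sketch follows; the protein identifiers are placeholders rather than data from the study, and the function name `shared_targets` is hypothetical.

```python
def shared_targets(bound, downregulated):
    """Return proteins present in both lists, sorted for reproducible output."""
    return sorted(set(bound) & set(downregulated))

# Placeholder identifiers only; these are not the study's protein lists.
bound_example = ["PROT_A", "PROT_B", "PROT_C"]
down_example = ["PROT_B", "PROT_C", "PROT_D"]
shared_example = shared_targets(bound_example, down_example)
```

      Applied to the real lists of IQCH-bound proteins and proteins downregulated in Iqch KO mice, the length of the returned list would correspond to the size of the shared subset reported in Figure 5E.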

      (9) Line 260 to 261, "SYNCRIP, HNRNPK, FUS, EWSR1, ANXA7, SLC25A4, and HNRPAB ... the loss of which showed the greatest influence on the phenotype of the Iqch KO mice." There is no evidence suggesting that the loss of SYNCRIP, HNRNPK, FUS, EWSR1, ANXA7, SLC25A4, and HNRPAB leads to Iqch KO phenotype.

      Response: We apologize for our inaccurate statement. According to the literature, Fus KO, Ewsr1 KO, and Hnrnpk KO male mice were infertile, showing spermatogenic arrest with absence of spermatozoa (Kuroda et al. 2000; Tian et al. 2021; Xu et al. 2022). Syncrip is involved in the meiotic process in Drosophila by interacting with Doublefault (Sechi et al. 2019). HNRPAB might be associated with mouse spermatogenesis by binding to Protamine 2 and contributing to its translational regulation. Specifically, ANXA7 is a calcium-dependent phospholipid-binding protein that is a negative regulator of mitochondrial apoptosis (Du et al. 2015). Loss of SLC25A4 results in mitochondrial energy metabolism defects in mice (Graham et al. 1997). Moreover, RNA immunoprecipitation on formaldehyde cross-linked sperm followed by qPCR detected the interactions between HNRPAB and Catsper1, Catsper2, Catsper3, Ccdc40, Ccdc39, Ccdc65, Dnah8, Irrc6, and Dnhd1, which are essential for sperm development (Fukuda et al. 2013). Our Iqch KO mice showed abnormal sperm count, motility, morphology, and mitochondria, so we inferred that IQCH might play a role in spermatogenesis by regulating the expression of SYNCRIP, HNRNPK, FUS, EWSR1, ANXA7, SLC25A4, and HNRPAB to some extent. We have changed the sentence to a more appropriate statement: “We focused on SYNCRIP, HNRNPK, FUS, EWSR1, ANXA7, SLC25A4, and HNRPAB, which play important roles in spermatogenesis.”

      (10) Fig. 6C and 6D use different styles of error bars.

      Response: We are sorry for our oversight. In accordance with the reviewer's recommendations, we have modified the representation of error bars in the revised Fig. 6C.

      (11) Line 296 to 297, "As expected, CaM interacted with IQCH, as indicated by LC-MS/MS analysis". It is not clear how LC-MS/MS detects protein interaction.

      Response: Following the reviewer’s suggestions, we have supplemented the method details of the LC-MS/MS experiment in the Methods section of the revised manuscript. The results for proteins interacting with IQCH in sperm lysates from the LC-MS/MS analysis were submitted as Figure 5—source data 1.

      (12) It is still not clear how the interaction between IQCH, CaM, and HNRPAB is required for the expression of each other.

      Response: Thank you for the reviewer’s comment. IQCH is a calmodulin-binding protein, and the binding of IQCH and CaM was confirmed by LC-MS/MS analysis and co-IP assay using sperm lysate. We thus speculated that the interaction of IQCH and CaM might be a prerequisite for IQCH function. To test that speculation, we took HNRPAB as an example. We knocked down IQCH in cultured cells, and a decrease in the expression of HNRPAB was observed. Similarly, when we knocked down CaM in cultured cells, a decrease in the expression of HNRPAB was also detected. However, these results cannot exclude the possibility that IQCH or CaM could regulate HNRPAB expression alone. To investigate whether IQCH or CaM could regulate HNRPAB expression alone, we overexpressed IQCH in cells in which CaM was knocked down, but the expression of HNRPAB could not be rescued, suggesting that IQCH cannot regulate HNRPAB expression when CaM is reduced. Consistently, we overexpressed CaM in cells in which IQCH was knocked down, but the expression of HNRPAB could not be rescued, suggesting that CaM cannot regulate HNRPAB expression when IQCH is reduced. Thus, neither IQCH nor CaM can regulate HNRPAB expression alone. Moreover, we deleted the IQ motif of IQCH, which is required for binding to CaM. The co-IP results showed that the interaction of IQCH and CaM was disrupted when the IQ motif of IQCH was deleted, and the expression of HNRPAB was decreased. Therefore, we suggested that the interaction of IQCH and CaM might be required for IQCH to regulate HNRPAB. In future studies, we will further investigate the relationships among IQCH, CaM, and HNRPAB.

      Reviewer #3 (Recommendations For The Authors):

      The authors have addressed my minor concerns. However, they neglected to address any of my more significant concerns in the public review. I assume that they simply overlooked these critiques, despite the fact that eLife explicitly states that "...as a general rule, concerns about a claim not being justified by the data should be explained in the public review." Therefore, the authors should have looked more carefully at the public reviews. As a result, my major concerns about the manuscript remain.

      Response: We apologize for overlooking the public review process. We have improved our study based on the feedback received during the public review.

    2. eLife assessment

      This valuable study describes mice with a knock out of the IQ motif-containing H (IQCH) gene, to model a human loss-of-function mutation in IQCH associated with male sterility. The infertility is reproduced in the mouse, making it a compelling model, but some of the mechanistic experiments provide only indirect and thus incomplete evidence for interaction between IQCH and potential RNA binding proteins. With more rigorous approaches, the paper should be of interest to cell biologists and male reproductive biologists working on the sperm flagellar cytoskeleton and mitochondrial structure.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study advances our understanding of why diabetes is a risk factor for more severe Covid-19 disease. The authors offer solid evidence that cathepsin L is more active in diabetic individuals, that this higher activity is recapitulated at the cellular level in the presence of high glucose, and that high glucose leads to higher cathepsin L maturation. While not all aspects of the relationship between diabetes and cathepsin L (e.g., effects of metabolic acidosis) have been investigated, the work should be of interest to researchers in diabetes, virology, and immunology.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study by He et al. investigates the relationship of an increased susceptibility of diabetes patients to COVID-19. The paper raises the possibility that hyperglycemia-induced cathepsin L maturation could be one of the driving forces in this pathology, suggesting that an increased activity of CTSL leads to accelerated virus infection rates due to an elevated processing of the SARS-CoV-2 spike protein.

      In a clinical case-control study, the team found that the severity of corona infections was higher in diabetic patients, and their CTSL levels correlated well with the progression of the disease. They further showed an increase in CTSL activity in the long term as well as acute hyperglycemia. SARS-CoV-2 increasingly infected cells that were cultured in serum from diabetic patients, the same was observed using high glucose medium. No effect was observed in the medium with increased concentrations of insulin. CTSL knockout abolished the glucose-dependent increase in infection.

      Increased glucose levels did not correlate with an increase in CTSL transcription. Rather He et al. could show that high glucose levels led to CTSL translocation from the ER into the lysosome. It was the glucose-dependent processing of the protease to its active form which promoted infection.

      Strengths:

      It is a complete study starting from a clinical observation and ending on the molecular mechanism. A strength is certainly the wide selection of experiments. The clinical study to investigate the effect of glucose on CTSL concentrations in healthy individuals sets the stage for experiments in cell culture, animal models, and human tissue. The effect of CTSL knockout cell lines on glucose-induced SARS-CoV2 infection rates is convincing. Finally, the team used a combination of Western blots and confocal microscopy to identify the underlying molecular mechanisms. The authors manage to keep the diabetic condition at the center of their study and therefore extend on previous knowledge of glucose-induced CTSL activation and their consequences for COVID-19 infections. By doing so, they create a novel connection between CTSL involvement in SARS-CoV2 infections and diabetes.

      Weaknesses:

      (1) The authors suggest that hyperglycemia as a symptom of diabetes leads to an increased infection rate in those patients. Throughout their study, the team focuses on two select symptoms of a diabetic condition, hyperglycemia and hyperinsulinemia. The team acknowledges in the discussion that there could be various other reasons. Hyperglycemia can lead to metabolic acidosis and a shift in blood pH. As CTSL activity is highly dependent on pH, it would have been crucial to include this parameter in the study.

      We sincerely appreciate your valuable comment. We agree that hyperglycemia can lead to metabolic acidosis and alter blood pH. However, the normal range for blood pH in humans is relatively narrow, typically ranging from 7.35 to 7.45. In our study, we ensured that blood pH remained within this normal range for both diabetic and healthy control samples. To address your concern, we conducted experiments to investigate CTSL activity in response to pH fluctuations within this physiological range. The updated Fig. 4a now presents these findings, demonstrating consistent CTSL activity despite pH variations. Statistical analysis was performed using one-way ANOVA with Tukey’s post hoc test to ensure robustness. We have also amended the figure legend and provided corresponding descriptions in the final edition manuscript (line 15-18, page 7).

      Author response image 1.

      (2) The study rarely differentiates between cellular and extracellular CTSL activity. A more detailed explanation for the connection between the intracellular CTSL and serum CTSL in diabetic individuals, presumably via lysosomal exocytosis, could be helpful with regard to the final model to give a more complete picture.

      Thank you for your insightful comments. Previous studies have elucidated the process by which lysosomal CTSL is transported via vesicles and subsequently secreted from the cell membrane through exocytosis (references 1-5). To provide a more comprehensive understanding, we have incorporated this information on Fig. 6h, page 32 of the final edition manuscript. This addition aims to enhance clarity regarding the connection between intracellular and serum CTSL activity in diabetic individuals, particularly through lysosomal exocytosis.

      Author response image 2.

      References:

      (1) Reddy A et al. Plasma membrane repair is mediated by Ca(2+)-regulated exocytosis of lysosomes. Cell. 2001 Jul 27;106(2):157-69. doi: 10.1016/s0092-8674(01)00421-4. PMID: 11511344.

      (2) Hasanagic M et al. Different Pathways to the Lysosome: Sorting out Alternatives. Int Rev Cell Mol Biol. 2015;320:75-101. doi: 10.1016/bs.ircmb.2015.07.008. Epub 2015 Aug 19. PMID: 26614872.

      (3) Reiser J et al. Specialized roles for cysteine cathepsins in health and disease. J Clin Invest. 2010 Oct;120(10):3421-31. doi: 10.1172/JCI42918. Epub 2010 Oct 1. PMID: 20921628; PMCID: PMC2947230.

      (4) Jaiswal JK et al. Membrane proximal lysosomes are the major vesicles responsible for calcium-dependent exocytosis in nonsecretory cells. J Cell Biol. 2002 Nov 25;159(4):625-35. doi: 10.1083/jcb.200208154. Epub 2002 Nov 18. PMID: 12438417; PMCID: PMC2173094.

      (5) Coutinho MF et al. Mannose-6-phosphate pathway: a review on its role in lysosomal function and dysfunction. Mol Genet Metab. 2012 Apr;105(4):542-50. doi: 10.1016/j.ymgme.2011.12.012. Epub 2011 Dec 23. PMID: 22266136.

      (3) In the early result section, an effect of hyperglycemia on total CTSL concentrations is described, but the data is not very convincing. Over the course of the manuscript, the hypothesis shifts increasingly towards an increase in protease trans-localization and processing to the active form rather than a change in total protease amounts. The overall importance of CTSL concentrations remains questionable.

      Thank you for your insightful feedback. We have addressed your concerns regarding the impact of hyperglycemia on CTSL concentrations. Fig. 2h-j illustrate the effect of acute hyperglycemia on both CTSL concentration and activity in 15 healthy male volunteers over a 160-minute period. During this short timeframe, CTSL concentration remained stable, as evidenced by consistent RNA results from cells exposed to varying glucose levels (Supplementary Fig.1). However, there was a significant increase in CTSL activity, indicating that glucose elevation rapidly triggers CTSL maturation through propeptide cleavage. This activation process occurs more rapidly than CTSL protein synthesis. In summary, acute hyperglycemia specifically elevates CTSL activity, while chronic hyperglycemia may impact both CTSL activity and concentration (Fig. 2a-d). Additionally, Tournu C, et al. (1998) (reference 1) and Shi Q, et al. (2022) (reference 2) have reported that increased glucose metabolism promotes the maturation and secretion of CTSL and other proteases. These findings align with our evidence that hyperglycemia drives CTSL maturation, as discussed at line 10-25, page 12 in the final edition manuscript.

      References:

      (1) Tournu C et al. Glucose controls cathepsin expression in Ras-transformed fibroblasts. Arch Biochem Biophys. 1998 Dec 1;360(1):15-24. doi: 10.1006/abbi.1998.0916. PMID: 9826424.

      (2) Shi Q et al. Increased glucose metabolism in TAMs fuels O-GlcNAcylation of lysosomal Cathepsin B to promote cancer metastasis and chemoresistance. Cancer Cell. 2022 Oct 10;40(10):1207-1222.e10. doi: 10.1016/j.ccell.2022.08.012. Epub 2022 Sep 8. PMID: 36084651.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors hypothesized that individuals with diabetes have elevated blood CTSL levels, which facilitates SARS-CoV-2 infection. The authors conducted in vitro experiments, revealing that elevated glucose levels promote SARS-CoV-2 infection in wild-type cells. In contrast, CTSL knockout cells show reduced susceptibility to high glucose-promoted effects. Additionally, the authors utilized lung tissue samples obtained from both diabetic and non-diabetic patients, along with db/db diabetic and control mice. Their findings indicate that diabetic conditions lead to an elevation in CTSL activity in both humans and mice.

      Strengths:

      The authors have effectively met their research objectives, and their conclusions are supported by the data presented. Their findings suggest that high glucose levels promote CTSL maturation and translocation from the endoplasmic reticulum to the lysosome, potentially contributing to diabetic comorbidities and complications.

      Weaknesses:

      (1) In Figure 1e, the authors measured plasma levels of COVID-19 related proteins, including ACE2, CTSL, and CTSB, in both diabetic and non-diabetic COVID-19 patients. Notably, only CTSL levels exhibited a significant increase in diabetic patients compared to non-diabetic patients, and these levels varied throughout the course of COVID-19. Given that the diabetes groups encompass both male and female patients, it is essential to ascertain whether the authors considered the potential impact of gender on CTSL levels. The diabetes groups comprised a higher percentage of male patients (61.3%) compared to the non-diabetes group, where males constituted only 38.7%.

      Thank you for your insightful feedback. In response to your concerns regarding the potential impact of gender on CTSL levels in diabetic and non-diabetic COVID-19 patients, we conducted analyses to address this issue. While our initial study involved 62 COVID-19 patients, with 31 having diabetes and 31 without, matching based on gender and age, we acknowledged the challenge of obtaining balanced gender distribution in both groups due to the difficulty of collecting blood samples from COVID-19 patients. To mitigate potential gender bias resulting from small sample sizes, we conducted a supplementary clinical study involving 122 non-COVID-19 volunteers, including 61 individuals with diabetes and 61 without. The percentage of males in the diabetes group was 50.8%, while in the healthy group, males constituted 44.3% (P value = 0.468), indicating no significant gender bias. We have incorporated this information into the discussion section on line 4-13, page 11 in the final edition manuscript, to provide clarity on this aspect of our study.
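
      The gender-balance figure quoted above can be checked with a short calculation. The following is a minimal sketch, not the authors' analysis: the response does not state which test was used, so an uncorrected Pearson chi-square test is assumed, and the male counts (31 of 61 and 27 of 61) are inferred from the reported percentages of 50.8% and 44.3%. The function name `pearson_chi2_2x2` is hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Counts inferred from the reported percentages (an assumption):
# 50.8% of 61 -> 31 males with diabetes, 44.3% of 61 -> 27 healthy males.
chi2, p = pearson_chi2_2x2(31, 30, 27, 34)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

      Under these assumptions the p-value comes out close to 0.468, in line with the value reported in the response. A continuity-corrected or exact test would give a different, larger p-value, so agreement here suggests (but does not confirm) that an uncorrected chi-square test was used.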

      (2) Lines 145-149: "The results showed that WT Huh7 cell cultured in high glucose medium exhibited a much higher infective rate than those in low glucose medium. However, CTSL KO Huh7 cells maintained a low infective rate of SARS-CoV-2 regardless of glucose or insulin levels (Fig. 3f-h). Therefore, hyperglycemia enhanced SARS-CoV-2 infection dependent on CTSL." However, this evidence may be insufficient to support the claim that hyperglycemia enhances SARS-CoV-2 infection dependent on CTSL. The human hepatoma cell line Huh7 might not be an ideal model to validate the authors' hypothesis regarding high blood glucose promoting SARS-CoV-2 infection through CTSL.

      Thank you for your valuable feedback. We have addressed the concerns regarding the sufficiency of evidence supporting the claim that hyperglycemia enhances SARS-CoV-2 infection dependent on CTSL. Specifically, we have revised the expression to state, “Therefore, hyperglycemia enhanced SARS-CoV-2 infection through CTSL.” as suggested, in line 9, page 7 in the final edition manuscript. Additionally, we acknowledge the potential involvement of other bioactive factors, such as 1,5-anhydro-D-glucitol (1,5-AG), in mediating SARS-CoV-2 infection in patients with diabetes, as outlined in the discussion section from line 13-21, page 13 in the final edition manuscript.

      Regarding the choice of the human hepatoma cell line Huh7 as a model for investigating hyperglycemia-induced CTSL maturation and SARS-CoV-2 infection, we recognize the importance of tissue specificity and the liver’s significance as a target organ for COVID-19. Despite potential limitations, such as generalization of liver function abnormalities and lack of tissue specificity in SARS-CoV-2 impact, Huh7 cells offer practical advantages as a mature cell model for studying SARS-CoV-2 infection, including accessibility, susceptibility to infection, and stable proliferation (reference 1-3). We have elaborated on these considerations in the discussion section at line 19-23, page 11 in the final edition manuscript, to provide context for our choice of experimental model.

      References:

      (1) Gupta A et al. Extrapulmonary manifestations of COVID-19. Nat Med. 2020 Jul;26(7):1017-1032. doi: 10.1038/s41591-020-0968-3. Epub 2020 Jul 10. PMID: 32651579.

      (2) Nie X et al. Multi-organ proteomic landscape of COVID-19 autopsies. Cell. 2021 Feb 4;184(3):775-791.e14. doi: 10.1016/j.cell.2021.01.004. Epub 2021 Jan 9. PMID: 33503446; PMCID: PMC7794601.

      (3) Ciotti M et al. The COVID-19 pandemic. Crit Rev Clin Lab Sci. 2020 Sep;57(6):365-388. doi: 10.1080/10408363.2020.1783198. Epub 2020 Jul 9. PMID: 32645276.

      (3) The Abstract and Introduction sections lack effective organization.

      Thank you for your valuable comments. We have rewritten the Abstract and Introduction sections and incorporated the updated descriptions in the final edition manuscript.

      Reviewer #1 (Recommendations For The Authors):

      (1) When referring to diabetes, does this exclusively include diabetes type 2?

      Thank you for your inquiry. In our study, the term “diabetes” encompasses the condition of hyperglycemia in a broad sense, rather than specifically indicating type 1 diabetes (T1DM) or type 2 diabetes (T2DM). This broader definition aligns with the scope of our research objectives and findings, particularly observed in the cell experiments conducted. We have clarified this point in the revised discussion section, from line 6-9, page 12 in the final edition manuscript, to provide additional context for readers.

      (2) The titles of the individual paragraphs are not very strong and descriptive. More precise titles help to structure the paper better for the reader.

      Thank you for your valuable comments. We have rewritten the title of each section to make it more precise for readers and incorporated the updated descriptions in the manuscript.

      (3) Fig.3c, adding a 0 nM insulin control would be nice.

      Thank you for your suggestion. We have revised Fig.3c according to your advice. The revised figure was located at page 29 in the final edition manuscript. The corresponding figure legend has also been revised.

      Author response image 3.

      (4) Fig.3e non-infection control would be nice.

      Thank you for your suggestion. We have incorporated your feedback by adding a non-infection control in Fig. 3e. In this revised figure, we included a measurement of SARS-CoV-2 pseudovirus infection assessed through the fluorescence captured by a reader. Cells infected by the pseudovirus exhibited activation of the firefly luciferase, resulting in the release of fluorescence. Conversely, non-infected control cells showed no fluorescence, with the reader recording a value of zero. The updated figure can now be found on page 29 in the final edition manuscript, and we have adjusted the corresponding figure legend accordingly.

      Author response image 4.

      (5) In Figure 5, the processing of CTSL in cells (b-c) strongly differs from processing in tissue (d-e) focusing on amounts of dc-mCTSL. Do you have an explanation for this? Overall, blots are hard to judge by eye and it would be nice to include blots with shorter exposure.

      Thank you for your insightful feedback. The differences observed in the processing of CTSL between cells (Fig. 5b) and tissues (Fig. 5d-e) may be attributed to the complexities inherent in tissue samples, which can impact the clarity of the images. Furthermore, in human tissue samples, it is pertinent to consider that patients in the diabetes group had their blood glucose levels controlled within or near the normal range prior to lung surgery. As a result, the evidence supporting CTSL maturation in human lung tissue blotting images may be less compelling. We have addressed this aspect in the revised results section (lines 10-13, page 9). Additionally, we will consider including blots with shorter exposure to enhance visual clarity in future studies.

      (6) Considering Fig2B and Figure S1, the evidence of an effect of hyperglycemia or high glucose medium on total CTSL protein concentration is not very strong. In my opinion, this claim in the results section for Fig2 should be revisited.

      Thank you for your valuable suggestion. We have revisited the section in question and made appropriate revisions. The original sentence has been modified to accurately reflect the findings: "We found that plasma CTSL activity was strongly positively correlated with chronic hyperglycemia indicated by HbA1c and was significantly higher in diabetic patients than in euglycemic individuals (Fig. 2a, c). Additionally, plasma CTSL concentration showed a positive trend with chronic hyperglycemia indicated by HbA1c (Fig. 2b, d)". These changes have been incorporated into the revised results section (lines 12-16, page 5).

      (7) Overall, data hinting to increased CTSL activity is stronger than protein amount. This being said, in hyperglycemia, blood pH can be affected (metabolic acidosis). As CTSL has higher activity at low pH, could the increase in activity be caused by a drop in pH? Can you include this aspect in your manuscript? For example, is there a pH difference in serum of nondiabetic vs diabetic patients?

      Thank you for your valuable input. We have already addressed the potential impact of pH changes on CTSL activity in our response to Weakness No. 1. As indicated, although hyperglycemia can lead to metabolic acidosis and changes in blood pH, the pH levels observed in our study remained within the normal range (7.35 to 7.45). Therefore, we conducted experiments to investigate CTSL activity in response to changes in pH, which showed consistent activity levels within this range. This information has been included in our revised manuscript (line 15-18, page 7).

      Reviewer #2 (Recommendations For The Authors):

      (1) The Abstract and Introduction sections lack effective organization. The manuscript's style resembles that of Cell Journal rather than aligning with the customary format of eLife.

      Thank you for your valuable comments. The Abstract and Introduction sections have been reorganized to be more precise for readers, and these changes are included in our revised manuscript. Additionally, we have meticulously updated the manuscript's style to align with the standard format of eLife, especially the key resources table in the Materials and Methods section.

    2. eLife assessment

      This valuable study advances our understanding of why diabetes is a risk factor for more severe Covid-19 disease. The authors offer convincing evidence that cathepsin L is more active in diabetic individuals because of the presence of high glucose, where the main mechanism is increased cathepsin L maturation. This study should be of interest to researchers in diabetes, virology and immunology.

    3. Reviewer #1 (Public Review):

      Summary:

      The study by He et al. investigates the increased susceptibility of diabetes patients to COVID-19. The paper raises the possibility that hyperglycemia-induced cathepsin L maturation could be one of the driving forces in this pathology, suggesting that an increased activity of CTSL leads to accelerated virus infection rates due to an elevated processing of the SARS-CoV-2 spike protein.

      In a clinical case-control study, the team found that severity of COVID-19 infections was higher in diabetic patients, and their CTSL levels correlated well with the progression of the disease. They further showed an increase in CTSL activity in long-term as well as acute hyperglycemia. SARS-CoV-2 increasingly infected cells that were cultured in serum from diabetic patients; the same was observed using high glucose medium. No effect was observed in the medium with increased concentrations of insulin. CTSL knockout abolished the glucose-dependent increase in infection.

      Increased glucose levels did not correlate with an increase in CTSL transcription. Rather He et al. could show that high glucose levels led to CTSL translocation from the ER into the lysosome. It was the glucose-dependent processing of the protease to its active form which promoted infection.

      Overall, it is a very complete study starting from a clinical observation and ending on the molecular mechanism. A strength is certainly the wide selection of experiments. The clinical study to investigate the effect of glucose on CTSL concentrations in healthy individuals sets the stage for experiments in cell culture, animal models and human tissue. The effects of CTSL knockout cell lines on glucose-induced SARS-CoV-2 infection rates are convincing. Finally, the team used a combination of Western blots and confocal microscopy to identify the underlying molecular mechanisms.

      The authors keep the diabetic condition at the center of their study and extend previous knowledge of glucose-induced CTSL activation and its consequences for COVID-19 infections. By doing so, they create a novel connection between CTSL involvement in SARS-CoV-2 infections and diabetes. This enables novel, public awareness of the susceptibility of diabetes patients to the disease.

    4. Reviewer #2 (Public Review):

      Summary:

      In this study, the authors hypothesized that individuals with diabetes have elevated blood CTSL levels, which facilitates SARS-CoV-2 infection. The authors conducted in vitro experiments, revealing that elevated glucose levels promote SARS-CoV-2 infection in wild-type cells. In contrast, CTSL knockout cells show reduced susceptibility to high glucose-promoted effects. Additionally, the authors utilized lung tissue samples obtained from both diabetic and non-diabetic patients, along with db/db diabetic and control mice. Their findings indicate that diabetic conditions lead to an elevation in CTSL activity in both humans and mice.

      Strengths:

      The authors have effectively met their research objectives, and their conclusions are supported by the data presented. Their findings suggest that high glucose levels promote CTSL maturation and translocation from the endoplasmic reticulum to the lysosome, potentially contributing to diabetic comorbidities and complications.

      Weaknesses:

      (1) In Figure 1e, the authors measured plasma levels of COVID-19 related proteins, including ACE2, CTSL, and CTSB, in both diabetic and non-diabetic COVID-19 patients. Notably, only CTSL levels exhibited a significant increase in diabetic patients compared to non-diabetic patients, and these levels varied throughout the course of COVID-19. Given that the diabetes groups encompass both male and female patients, it is essential to ascertain whether the authors considered the potential impact of gender on CTSL levels. The diabetes groups comprised a higher percentage of male patients (61.3%) compared to the non-diabetes group, where males constituted only 38.7%.

      (2) Lines 145-149: "The results showed that WT Huh7 cell cultured in high glucose medium exhibited a much higher infective rate than those in low glucose medium. However, CTSL KO Huh7 cells maintained a low infective rate of SARS-CoV-2 regardless of glucose or insulin levels (Fig. 3f-h). Therefore, hyperglycemia enhanced SARS-CoV-2 infection dependent on CTSL." However, this evidence may be insufficient to support the claim that hyperglycemia enhances SARS-CoV-2 infection dependent on CTSL. The human hepatoma cell line Huh7 might not be an ideal model to validate the authors' hypothesis regarding high blood glucose promoting SARS-CoV-2 infection through CTSL.

      (3) The Abstract and Introduction sections lack effective organization.

      In this revised version of the study, the authors have addressed my concerns by providing additional experiments, references and discussing further the points of controversy. I think that the authors have made improvements to the manuscript.

    1. eLife assessment

      This paper reports on the transcriptional changes upon chloramphenicol-induced surface mobility of Bacillus subtilis, a phenomenon that can occur during co-incubation with Streptomyces venezuelae, a chloramphenicol producer. The work presented includes valuable and thorough transcriptomics data, which convincingly indicate that sub-lethal chloramphenicol triggers substantial changes in B. subtilis gene expression. There are, however, significant limitations and concerns whether the documented changes are causal for the phenotypes observed or simply correlated with these phenotypes; additionally, the notion that chloramphenicol triggers a 'division of labor' was incomplete and should be backed up experimentally.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, Liu et al. investigate the signaling pathway that triggers sliding motility in the bacterium B. subtilis in response to subinhibitory concentrations of the antibiotic chloramphenicol. The authors used a genetic approach to identify the master regulator CodY playing a regulatory role in this behavior. They used transcriptional and metabolomic profiling to delineate the spatiotemporal separation of the regulatory networks that define distinct metabolic states related to purine metabolism and pyruvate utilization, which are ultimately responsible for the induction of sliding in response to chloramphenicol. Many readers would be interested to read this work showing how extracellular signals modulate microbial physiology and metabolism.

      Strengths:

      This work presents numerous technical and conceptual strengths. In the opinion of this referee, the most significant conceptual strength of this work is to (once again) provide evidence that antibiotics are not merely produced by bacteria to eliminate competitors. Bacteria have evolved to respond to their presence and activate a range of physiological responses, which are poorly understood. Understanding these responses is critical to fully understand the evolutionary consequences associated with the use of antibiotics. From a technical standpoint, perhaps the most relevant aspect is the robust phenotypic assay developed by the authors to study sliding motility in the presence of chloramphenicol. This robustness enables genetic work using mutants and performing omics assays to characterize the response to chloramphenicol in detail. Additionally, two sets of results stood out and provided important value to this work. One is the comparison established between the sliding induced by chloramphenicol and the sliding generated in the ΔcodY mutant, to determine the genes and the metabolites (using transcriptomics and metabolomics) specifically associated with the response to chloramphenicol without being part of the general CodY-mediated induction of sliding. The second set of results led the authors to identify a precise gene of bacterial metabolism (pdhA) responsible for the sliding phenotype in response to chloramphenicol, and to conduct genetic experiments demonstrating that the pdhA mutant does not respond to the presence of chloramphenicol.

      Weaknesses:

      This work has three main weaknesses, all related to transcriptomic and metabolomic analyses. Firstly, there is the challenge of understanding the essence of the omics results. This section presents an overwhelming array of genes involved in different metabolic pathways, without an obvious thread to tie these hits together. It is easy to get lost in this section. For instance, one cannot be certain if the hits from one particular metabolic pathway are significant enough to determine to what degree this pathway is responsible for the sliding phenotype. This section contains a huge diversity of genes and pathways and needs to be streamlined. Related to this, the message of the omics experiments highlights a very close relationship between purine and pyruvate metabolism in sliding motility. However, it is unclear how these metabolic pathways may influence sliding or any other specific bacterial behavior. I do not mean to say that it is not possible, just that the connection/mechanism is missing. The third weakness concerns the omics results, which are sometimes in conflict. The authors proposed that this may stem from a division of labor and the coexistence of different subpopulations with different metabolisms within the microbial community. While plausible, other possibilities are equally plausible and should be tested in a revised version of the work.

    3. Reviewer #2 (Public Review):

      Summary:

      Liu and colleagues describe the transcriptional changes observed during chloramphenicol-induced surface mobility of Bacillus subtilis. Practically, they describe that numerous transcriptional regulatory pathways are influenced by the subinhibitory concentration of a translational inhibitor and some of these regulatory changes might contribute to the induction of sliding. Nevertheless, how such translational stress is translated to induction of sliding remains undetermined. The authors clearly describe their aim (line 457): "Our goal for this study was to gain insight into how B. subtilis mobilizes a colony in response to subinhibitory exposure to translation inhibitors." Unfortunately, this goal is not achieved here; the authors only characterize the differences in the transcriptional landscape.

      Strengths:

      The very thorough analysis of transcriptional changes in the wild type and codY mutant strains is appreciated, and there are definitely a plethora of changes observed related to several global transcriptional regulators in B. subtilis. I compliment the authors for this very detailed and thorough description of transcriptional changes.

      Weaknesses:

      While the transcriptional changes are well and carefully described, the discussion practically interprets the correlations as causations. I am not disputing that the authors are on the correct path with their assumptions, but their conclusions are not supported by direct experimental data, especially on (1) translational stress directly inducing mobility and (2) division of labor.

      Major 1:

      The authors conclude that their results point towards a putative mechanism, e.g. line 460 "which suggests translation stress is a trigger for colony mobilization"; however, no experiment demonstrates this aspect. The authors do not test ppGpp-related stress (mutants in ppGpp-related genes, or mutating the functional domain of CodY), nor do they directly connect the dynamics of ppGpp levels with induction of subsequent pathways. Again, I understand that the authors are on the right path to connect these pathways and identify what is causing mobility induction, but no direct data are presented, solely the transcriptional changes, so the work remains somewhat descriptive.

      The statement in the chapter title (line 474) is not demonstrated directly and should be revised. Similarly, in line 476, the authors claim that their "data supports a model", but "support" would require direct experimental data on this aspect.

      The authors even clearly indicate in lines 504-506 that they do not reveal the direct mechanism, but the rest of the discussion delivers statements that do not consider the lack of direct data.

      Major 2:

      Line 427: "The results are consistent with a division of metabolic labor among cells in the expanding population" - the data shows heterogeneity, but the direct division of labor is not demonstrated.

      Line 442: So in this case, the proposed division of labor is disrupted in the codY mutant (no inner localisation), and yet expansion still occurs, suggesting that the putative division of labor is not necessary for induced mobility. Heterogeneous gene expression may well be present, but division of labor requires demonstration of a fitness benefit from such an interaction.

      Division of labor assumes that a mixture of mutants would complement full sliding dynamics, and this could be easily demonstrated by fluorescently labeled cells that should be organized in a similar fashion to those observed with luciferase reporters (pucA mutant on the outer ring, while pdhA mutant in the interior part of the colony). Without such experimental demonstration, the authors can only conclude spatially heterogeneous gene expression without clear functional contribution to subinhibitory chloramphenicol-induced surface mobility.

      Again, the authors' statement in line 472 "reveal a regulated, spatiotemporal division of metabolism" is not demonstrated by experiment, but spatial heterogeneity is revealed here. The statement in the Discussion chapter (line 499) is also not demonstrated by experimental data: "Metabolic coordination enables surface expansion of mobilized B. subtilis"

      Line 550: while I agree with the authors' statement that these functions work cooperatively as demonstrated by van Gestel and colleagues (2015 PLoS Biol), the exploitation of these shared goods is not quantitatively equivalent, see Jautzus et al 2022 ISME J (DOI: 10.1038/s41396-022-01279-8).

      In summary: the two major conclusions of the manuscript are unfortunately not demonstrated, the presented transcriptional data delivers suggestions, supported with specific mutants displaying certain phenotypes (lack of mobility induction or constitutive mobility without inducer), but it remains unclear how translational stress induces mobility and whether the transcriptional heterogeneity detected directly contributes to metabolic division of labor.

      The authors should present direct evidence on the major concerns: how translational stress induces surface mobility (using ppGpp synthesis and turnover mutants and a specific CodY mutant lacking ppGpp sensing) and whether the metabolic division of labor contributes to induced surface mobility (mixing mutants and following their distribution).