  1. Aug 2024
    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, the authors used a stopped-flow method to investigate the kinetics of substrate translocation through the channel in hexameric ClpB, an ATP-dependent bacterial protein disaggregase. They engineered a series of polypeptides with the N-terminal RepA ClpB-targeting sequence followed by a variable number of folded titin domains. The authors detected translocation of the substrate polypeptides by observing the enhancement of fluorescence from a probe located at the substrate's C-terminus. The total time of the substrates' translocation correlated with their lengths, which allowed the authors to determine the number of residues translocated by ClpB per unit time.

      Strengths:

      This study confirms a previously proposed model of processive translocation of polypeptides through the channel in ClpB. The novelty of this work is in the clever design of a series of kinetic experiments with an engineered substrate that includes stably folded domains. This approach produced a quantitative description of the reaction rates and kinetic step sizes. Another valuable aspect is that the method can be used for other translocases from the AAA+ family to characterize their mechanism of substrate processing.

      Weaknesses:

      The main limitation of the study is in using a single non-physiological substrate of ClpB, which does not replicate the physical properties of the aggregated cellular proteins and includes a non-physiological ClpB-targeting sequence. Another limitation is in the use of ATPgammaS to stimulate the substrate processing. It is not clear how relevant the results are to the ClpB function in living cells with ATP as the source of energy, a multitude of various aggregated substrates without targeting sequences that need ClpB's assistance, and in the presence of the co-chaperones.

      Indeed, we agree that our RepA-TitinX substrates are not aggregates but soluble model substrates used to reveal information about enzyme-catalyzed protein unfolding and translocation. Our substrates are similar to RepA-GFP and GFP-SsrA, which have been used by multiple labs, including Wickner, Horwich, Sauer, Baker, Shorter, and Bukau, to name only a few. The fact that “this is what everyone does” does not make the substrates physiological or ideal. However, this is the technology we currently have until we and others develop something better. In the meantime, we contend that the results presented here do advance our knowledge of enzyme-catalyzed protein unfolding.

      Part of what this manuscript seeks to accomplish is to present the development of a single-turnover experiment that reports on processive protein unfolding by AAA+ molecular motors, in this case ClpB. Importantly, we treat translocation on an unfolded polypeptide chain and unfolding of stably folded proteins as two distinct reactions catalyzed by ClpB. Whether these functions are used to disrupt protein aggregates in vivo remains to be seen.

      We contend that processive ClpB catalyzed protein unfolding has not been rigorously demonstrated prior to our results presented here.  Avellaneda et al mechanically unfolded their substrate before loading ClpB (Avellaneda, Franke, Sunderlikova et al. 2020).  Thus, their experiment represents valuable observations reflecting polypeptide translocation on a pre-unfolded protein.  Our previous work using single-turnover stopped-flow experiments employed unstructured synthetic polypeptides and therefore reflects polypeptide translocation and not protein unfolding (Li, Weaver, Lin et al. 2015).  Weibezahn et al used unstructured substrates in their study with ClpB (BAP/ClpP), and thus their results represent translocation of a pre-unfolded polypeptide and not enzyme catalyzed protein unfolding (Weibezahn, Tessarz, Schlieker et al. 2004). 

      Many studies have used GFP with tags, or RepA-GFP, and taken the loss of GFP fluorescence as evidence of protein unfolding. However, such results do not reveal whether ClpB processively and fully translocates the substrate through its axial channel. One cannot rule out, even when trapping with the “GroEL trap”, the possibility that ClpB only needs to disrupt some of the fold in GFP before cooperative unfolding occurs, leading to loss of fluorescence. Once the cooperative collapse of the structure occurs and fluorescence is lost, it has not been shown whether ClpB continues to translocate on the newly unfolded chain or dissociates. In fact, the Bukau group showed that folded YFP remained intact after luciferase was unfolded (Haslberger, Zdanowicz, Brand et al. 2008). Our approach, reported here, yields signal upon arrival of the motor at the C-terminus, or within the PIFE distance; thus, we can be certain that the motor does arrive at the C-terminus after unfolding up to three tandem repeats of the Titin I27 domain.

      ATPγS is a non-physiological nucleotide analog. However, ClpB has been shown to exhibit curious behavior in its presence that we and others, as the reviewer acknowledges, do not fully understand (Doyle, Shorter, Zolkiewski et al. 2007). Some of the experiments reported here seek to better understand that behavior. Here we have shown that ATPγS alone will support processive protein unfolding. With this assay in hand, we are now seeking to address many of the points raised by this reviewer.

      The authors do not attempt to correlate the kinetic step sizes detected during substrate translocation and unfolding with the substrate's structure, which should be possible, given how extensively the stability and unfolding of the titin I27 domain were studied before. Also, since the substrate contains up to three I27 domains separated with unstructured linkers, it is not clear why all the translocation steps are assumed to occur with the same rate constant.

      We assume that all protein unfolding steps occur with the same rate constant, ku.  We conclude that we are not detecting the translocation rate constant, kt, as our results support a model where kt is much faster than ku.  We do think it makes sense that the same slow step occurs between each cycle of protein unfolding.

      We have added a discussion relating our observations to mechanical unfolding of tandem repeats of Titin I27 from AFM experiments (Oberhauser, Hansma, Carrion-Vazquez and Fernandez 2001). Most interestingly, they report unfolding of Titin I27 in 22 nm steps. Using 0.34 nm per amino acid, this yields ~65 amino acids per unfolding step, which is comparable to our kinetic step-size of 57–58 amino acids per step.
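
      The length-to-residue conversion used in this comparison can be sketched in a few lines. The two input values are the ones quoted in the response; the function and variable names are illustrative only and are not taken from the manuscript.

```python
# Convert an AFM unfolding extension into an approximate residue count.
# Inputs are the values quoted in the response; names are illustrative.
EXTENSION_PER_STEP_NM = 22.0    # AFM step reported by Oberhauser et al. 2001
LENGTH_PER_RESIDUE_NM = 0.34    # per-residue length used in the response


def residues_per_step(extension_nm: float, nm_per_residue: float) -> float:
    """Number of residues released per unfolding step."""
    return extension_nm / nm_per_residue


afm_step = residues_per_step(EXTENSION_PER_STEP_NM, LENGTH_PER_RESIDUE_NM)
print(f"{afm_step:.1f} residues per AFM unfolding step")
```

      The result rounds to the ~65 amino acids per step quoted above; the comparison with the kinetic step-size remains approximate because the two numbers come from different kinds of measurement.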

      Some conclusions presented in the manuscript are speculative:

      The notion that the emission from Alexa Fluor 555 is enhanced when ClpB approaches the substrate's C-terminus needs to be supported experimentally. Also, evidence that ATPgammaS without ATP can provide sufficient energy for substrate translocation and unfolding is missing in the paper.

      In our previous work we used fluorescently labeled 50 amino acid peptides as substrates to examine ClpB binding (Li, Lin and Lucius 2015, Li, Weaver, Lin et al. 2015). In that work we used fluorescein, which exhibits quenching upon ClpB binding. We have added a control experiment in which we attached Alexa Fluor 555 (AF555) to the 50 amino acid substrate so that we can be assured ClpB binds close to the fluorophore. As seen in supplemental Fig. 1 A, upon titration with ClpB in the presence of ATPγS we observe an increase in fluorescence from AF555, consistent with PIFE. Supplemental Fig. 1 B shows that the relative fluorescence enhancement at the peak maximum increases up to ~0.2, i.e. a 20% increase in fluorescence due to PIFE upon ClpB binding.
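
      As a small illustration of how a relative enhancement of ~0.2 corresponds to a 20% increase, the sketch below assumes the usual definition (F − F0)/F0. That definition and the intensity values are assumptions made for illustration; they are not data from the supplemental figure.

```python
def relative_enhancement(f_bound: float, f_free: float) -> float:
    """Relative fluorescence enhancement, (F - F0) / F0 (assumed definition)."""
    if f_free <= 0:
        raise ValueError("free-dye intensity must be positive")
    return (f_bound - f_free) / f_free


# Hypothetical intensities in arbitrary units, chosen only for illustration.
f_free = 1000.0   # labeled peptide alone
f_bound = 1200.0  # labeled peptide at saturating ClpB

rel = relative_enhancement(f_bound, f_free)
print(f"relative enhancement = {rel:.2f} ({rel:.0%} increase)")
```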

      Further, peak time is our hypothesized measure of ClpB’s arrival at the dye. Our results indicate that the peak time linearly increases as a function of an increase in the number of folded TitinI27 repeats in the substrates which also supports the PIFE hypothesis. Finally, others have shown that AF555 exhibits PIFE and we have added those references.

      The evidence that ATPγS alone can support translocation is shown in Fig. 2 and supplemental Figure 1. Fig. 2 and supplemental Figure 1 represent two different mixing strategies in which we use only ATPγS and no ATP at all. In both cases the time courses are consistent with processive protein unfolding by ClpB with only ATPγS.

      Reviewer #2 (Public Review):

      Summary:

      The current work by Banwait et al. reports a fluorescence-based single-turnover method based on protein-induced fluorescence enhancement (PIFE) to show that ClpB is a processive motor. The paper is a crucial finding as there has been ambiguity on whether ClpB is a processive or non-processive motor. Optical tweezers-based single-molecule studies have shown that ClpB is a processive motor, whereas previous studies from the same group hypothesized it to be a non-processive motor. As co-chaperones are needed for the motor activity of ClpB, to isolate the activity of ClpB they have used a 1:1 ratio of ATP and ATPγS, where the enzyme is active even in the absence of its co-chaperones, as previously observed. A sequential-mixing stopped-flow protocol was developed, and the unfolding and translocation of RepA-TitinX (X = 1, 2, 3 repeats) was monitored by measuring the fluorescence intensity over time of Alexa Fluor 555, which was attached at the C-terminal cysteine. The observations were a lag time, followed by a gradual increase in fluorescence due to PIFE, and then a decrease in fluorescence, plausibly due to dissociation from the substrate allowing it to refold. The authors observed that the peak time depends on the substrate length, indicating the processive nature of ClpB. In addition, the lag and peak times depend on the pre-incubation time with ATPγS, indicating that the enzyme translocates on the substrates even with just ATPγS without the addition of ATP, which is plausible due to the slow hydrolysis of ATPγS. From the plot of substrate length vs peak time, the authors calculated the rate of unfolding and translocation to be ~0.1 aa s⁻¹ in the presence of ~1 mM ATPγS, increasing to 1 aa s⁻¹ in the presence of 1:1 ATP and ATPγS. The authors have further performed experiments at 3:1 ATP and ATPγS concentrations and observed a ~5-fold increase in the translocation rates, as expected due to faster hydrolysis of ATP by ClpB, reconfirming that processivity is mainly ATP driven. Further, the authors fit their results to a model of multiple sequential unfolding steps, determining the rate of unfolding and the number of amino acids unfolded during each step. Overall, the study uses a novel method to reconfirm the processive nature of ClpB.

      Strengths:

      (1) Previous studies on understanding the processivity of ClpB have primarily focused on unfolded or disordered proteins; this study provides new insights into the processing of folded proteins by ClpB. They have cleverly used RepA as a recognition sequence to understand the unfolding of titin-I27 folded domains.

      (2) The method developed can be applied to many disaggregating enzymes and has broader significance.

      (3) The data from various experiments are consistent with each other, indicating the reproducibility of the data. For example, the rates of translocation in the presence of ATPγS, ~0.1 aa s⁻¹, from the single mixing experiment and the double mixing experiment are very similar.

      (4) The study convincingly shows that ClpB is a processive motor, which has long been debated, describing its activity in the presence of only ATPγS and a mixture of ATP and ATPγS.

      (5) The discussion part has been written in a way that describes many previous experiments from various groups supporting the processive nature of the enzyme and supports their current study.

      Weaknesses:

      (1) The authors model that the enzyme unfolds the protein sequentially, around 60 aa each time, through multiple steps and translocates rapidly. This contradicts our knowledge of protein unfolding, which is generally cooperative, particularly for titin I27, which is reported to unfold cooperatively or at most through one intermediate during enzymatic unfolding by ClpX and ClpA.

      We do not think this represents a contradiction. In fact, our observations are in good agreement with mechanical unfolding of tandem repeats of Titin I27 using AFM experiments (Oberhauser, Hansma, Carrion-Vazquez and Fernandez 2001). They showed that tandem repeats of Titin I27 unfolded in steps of ~22 nm. Dividing 22 nm by 0.34 nm per amino acid gives ~65 amino acids per unfolding event. This implies that, under force, ~65 amino acids of folded structure unfold in a single step. This number is in excellent agreement with our kinetic step-size of 65 AA/step.

      Importantly, the experiments cited by the reviewer on ClpA and ClpX are actually with ClpAP and ClpXP.  We assert that this is an important distinction as we have shown that ClpA employs a different mechanism than ClpAP (Rajendar and Lucius 2010, Miller, Lin, Li and Lucius 2013, Miller and Lucius 2014).  Thus, ClpA and ClpAP should be treated as different enzymes but, without question, ClpB and ClpA are different enzymes.

      (2) It is also important to note that the unfolding of titin I27 from the N-terminus (as done in this study) has been reported to be very fast and cannot be the rate-limiting step, as reported earlier (Olivares et al, PNAS, 2017). This contradicts the current model where unfolding is the rate-limiting step and translocation is assumed to be many orders of magnitude faster than unfolding.

      Most importantly, the Olivares paper examines protein unfolding and translocation catalyzed by ClpXP and ClpAP, not ClpB. These are different enzymes. Additionally, we have shown that ClpAP and ClpA translocate unfolded polypeptides with different rates, rate constants, and kinetic step-sizes, indicating that ClpP allosterically impacts the mechanism employed by ClpA to the extent that even ClpA and ClpAP should be considered different enzymes (Rajendar and Lucius 2010, Miller, Lin, Li and Lucius 2013). We would further assert that there is no reason to assume that ClpAP and ClpXP catalyze protein unfolding using the same mechanism as ClpB, just as we do not think it should be assumed that ClpA and ClpX use the same mechanism as ClpAP and ClpXP, respectively.

      The Olivares et al paper reports a dwell time preceding protein unfolding of ~0.9 and ~0.8 s for ClpXP and ClpAP, respectively. The inverse of this can be taken as the rate constant for protein unfolding and would yield a rate constant of ~1.2 s⁻¹, which is in good agreement with our observed rate constant of 0.9–4.3 s⁻¹, depending on the ATP:ATPγS mixing ratio. For ClpB, we propose that the slow unfolding is then followed by rapid translocation on the unfolded chain, where translocation by ClpB must be much faster than for ClpAP and ClpXP. We think this is a reasonable interpretation of our results and not a contradiction of the results in Olivares et al. Moreover, this is completely consistent with the mechanistic differences that we have reported using the same single-turnover stopped-flow approach on the same unfolded polypeptide chains with ClpB, ClpA, and ClpAP (Rajendar and Lucius 2010, Miller, Lin, Li and Lucius 2013, Miller and Lucius 2014, Li, Weaver, Lin et al. 2015).
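
      The dwell-time comparison above amounts to taking a reciprocal. The sketch below uses only the numbers quoted in this response; treating the inverse of a mean dwell time as a rate constant assumes that the dwell reflects a single exponentially distributed step, and the variable names are illustrative.

```python
# Dwell times (s) preceding unfolding, as quoted from Olivares et al.
dwell_times_s = {"ClpXP": 0.9, "ClpAP": 0.8}

# Range of unfolding rate constants (s^-1) quoted for ClpB in this response.
CLPB_KU_RANGE = (0.9, 4.3)

# For a single exponentially distributed step, rate constant = 1 / mean dwell.
rate_constants = {enzyme: 1.0 / tau for enzyme, tau in dwell_times_s.items()}

for enzyme, k in rate_constants.items():
    in_range = CLPB_KU_RANGE[0] <= k <= CLPB_KU_RANGE[1]
    print(f"{enzyme}: k = {k:.2f} s^-1, within ClpB range: {in_range}")
```

      Both reciprocals (about 1.1 and 1.25 s⁻¹) fall inside the 0.9–4.3 s⁻¹ window, which is the sense in which the values are described as being in good agreement.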

      (3) The model assumes the same time constant for all the unfolding steps irrespective of the secondary structural interactions.

      Yes, we contend that this is a good assumption because it represents repetition of protein unfolding catalyzed by ClpB upon encountering the same repeating structural elements, i.e., β-sheets.
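
      One way to see what a single shared rate constant implies is a deliberately simplified picture: if reaching the dye requires n identical, irreversible steps, the arrival time follows an Erlang distribution, whose peak shifts linearly with n. This is only a sketch and not the fitting model used in the manuscript (which also accounts for dissociation and signal from the last intermediate); the rate constant and the number of steps per domain below are hypothetical.

```python
import math


def arrival_density(t: float, n_steps: int, k_u: float) -> float:
    """Erlang density for completing n_steps identical, irreversible steps."""
    return (k_u ** n_steps) * t ** (n_steps - 1) * math.exp(-k_u * t) / math.factorial(n_steps - 1)


def peak_time(n_steps: int, k_u: float) -> float:
    """Time at which the arrival density is maximal (the Erlang mode)."""
    return (n_steps - 1) / k_u


K_U = 1.0             # s^-1, hypothetical rate constant shared by every step
STEPS_PER_DOMAIN = 2  # hypothetical number of kinetic steps per folded repeat

peaks = {}
for repeats in (1, 2, 3):
    n = repeats * STEPS_PER_DOMAIN
    peaks[repeats] = peak_time(n, K_U)
    print(f"{repeats} repeat(s): {n} steps, peak at {peaks[repeats]:.1f} s")
```

      In this toy picture each added repeat delays the peak by the same increment, which mirrors the linear dependence of peak time on the number of Titin I27 repeats described in the responses.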

      (4) Unlike other single-molecule optical tweezer-based assays, the study cannot distinguish the unfolding and translocation events and assumes that unfolding is the rate-limiting step.

      Although we cannot directly distinguish between protein unfolding and translocation, we have logically concluded that protein unfolding is likely rate limiting. This is because the large kinetic step-size represents the collapse of ~60 amino acids of structure between two rate-limiting steps, which we interpret to represent cooperative protein unfolding induced by ClpB. It is not an assumption; it is our current best interpretation of the observations, which we are now seeking to test further.

      Reviewer #3 (Public Review):

      Summary:

      The authors have devised an elegant stopped-flow fluorescence approach to probe the mechanism of action of the Hsp100 protein unfoldase ClpB on an unfolded substrate (RepA) coupled to 1-3 repeats of a folded titin domain. They provide useful new insight into the kinetics of ClpB action. The results support their conclusions for the model setup used.

      Strengths:

      The stopped-flow fluorescence method with a variable delay after mixing the reactants is informative, as is the use of variable numbers of folded domains to probe the unfolding steps.

      Weaknesses:

      The setup does not reflect the physiological setting for ClpB action. A mixture of ATP and ATPgammaS is used to activate ClpB without the need for its co-chaperones, Hsp70, Hsp40, and an Hsp70 nucleotide exchange factor. This nucleotide strategy was discovered by Doyle et al (2007) but the mechanism of action is not fully understood. Other authors have used different approaches. As mentioned by the authors, Weibezahn et al used a construct coupled to the ClpP protease to demonstrate translocation. Avellaneda et al used a mutant (Y503D) in the coiled-coil regulatory domain to bypass the Hsp70 system. These differences complicate comparisons of rates and step sizes with previous work. It is unclear which results, if any, reflect the in vivo action of ClpB on the disassembly of aggregates.

      We agree with the reviewer that several strategies have been employed to bypass the need for Hsp70/40 or KJE to simplify in vitro experiments. Here we have developed a first-of-its-kind transient-state kinetics approach that can be used to examine processive protein unfolding. We now seek to examine the mechanisms of hyperactive mutants, like Y503D, and to add the co-chaperones so that we can address the limitations articulated by the reviewer. In fact, we have already begun adding DnaK to the reaction and found that DnaK induced ClpB to release the polypeptide chain (Durie, Duran and Lucius 2018). However, the sequential mixing strategy developed here was needed to go forward with examining the impact of co-chaperones.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Line 1: I recommend changing the title of the paper to remove the terms that are not clearly defined in the text: "robust" and "processive". What are the Authors' criteria for describing a molecular machine as "robust" vs. "not robust"? A definition of processivity is given in equation 2, but its value for ClpB is not reported in the text, and the criteria for classifying a machine as "processive" vs. "non-processive" are not included. Besides, the Authors have previously reported that ClpB is non-processive (Biochem. J., 2015), so it is now clear that a more nuanced terminology should be applied to this protein. Also, Escherichia coli should be fully spelled out in the title.

      The title has been changed. We have removed “robust” because we agree with the reviewer that there is no way to quantify “robust”. However, we have kept “processive” and have added a calculation of processivity to the discussion, since processivity can be quantified. Importantly, the unstructured substrates used in our previous studies report on translocation and not protein unfolding. Here, on folded substrates, we detect rate-limiting protein unfolding followed by rapid translocation. Thus, we report a lower bound on protein unfolding processivity of 362 amino acids.

      Line 20: The comment about mitochondrial SKD3 should be removed. SKD3, like ClpB, belongs to the AAA+ family, and it is simply a coincidence that the original study that discovered SKD3 termed it an Hsp100 homolog. The similarity between SKD3 and ClpB is limited to the AAA+ module, so there are many other metazoan ATPases, besides SKD3, that could be called homologs of ClpB, including mitochondrial ClpX, ER-localized torsins, p97, etc.

      Removed.

      Lines 133-139. Contrary to what the authors state, it is not clear that the "lag-phase" becomes significantly shorter for subsequent mixing experiments (Figure 1E) perhaps except for the last one (2070 s). It is clear, however, that the emission enhancement becomes stronger for later mixes. This effect should be discussed and explained, as it suggests that the pre-equilibrations shorter than ~2000 sec do not produce saturation of ClpB binding to the substrate.

      We have added supplemental figure 2, which zooms in on the lag region. This better illustrates what we were seeing but had not clearly shown to the reader. In addition, we address all three changes in the time courses, i.e. the extent of the lag, the change in peak position, and the change in peak height.

      Line 175. The hydrolysis rate of ATPgammaS in the presence of ClpB should be measured and compared to the hydrolysis rate with ATP/ATPgammaS to check if the ratio of those rates agrees with the ratio of the translocation rates. These experiments should be performed with and without the RepA-titin substrate, which could reveal an important linkage between the ATPase engine and substrate translocation. These experiments are essential to support the claim of substrate translocation and unfolding with ATPgammaS as the sole energy source.

      The time courses shown in figure 2 and supplemental Figure 1 were collected with only ATPγS and no ATP. The time courses show a clear increase in lag and the appearance of a peak with an increasing number of tandem repeats of titin domains. We do not see an explanation for this observation other than that ATPγS supports ClpB-catalyzed protein unfolding and translocation. What is the reviewer’s alternative explanation for these observations?

      We agree with the reviewer that the linkage of ATP hydrolysis to protein unfolding and translocation is essential and we are seeking to acquire this knowledge.  However, a simple comparison of the ratio of rates is not adequate. We contend that a complete mechanistic study of ATP turnover by ClpB is required to properly address this linkage and such a study is too substantial to be included here but is currently underway. 

      All that said, the statement on line 175 was removed since we do not report any ATPase measurements in this paper.

      Line 199: It is an over-simplification to state that "1:1 mix of ATP to ATPgammaS replaces the need for co-chaperones". This sentence should be corrected or removed. The ClpB co-chaperones (DnaK, DnaJ, GrpE) play a major role in targeting ClpB to its aggregated substrates in cells and in regulating the ClpB activity through interactions with its middle domain. ATPgammaS does not replace the co-chaperones; it is a chemical probe that modifies the mechanism of ClpB in a way that is not entirely understood.

      We agree with the reviewer.  The sentence has been modified to point out that the mix of ATP and ATPγS activates ClpB.

      Figure 3B, Supplementary Figure 5A. The solid lines from the model fit cannot be distinguished from the data points. Please modify the figures' format to clearly show the fits and the data points.

      Done.

      Lines 326, 329. It is not clear why the authors mention a lack of covalent modification of substrates by ClpB. AAA+ ATPases do not produce covalent modifications of their substrates.

      The issue of covalent modification was presented in the introduction lines 55 – 60 pointing out that much of what we have learned about protein unfolding and translocation catalyzed by ClpA and ClpX is from the observations of proteolytic degradation catalyzed by the associated protease ClpP.  However, this approach is not possible for ClpB/Hsp104 as these motors do not associate with a protease unless they have been artificially engineered to do so. 

      Lines 396-399. I am puzzled why the authors try to correlate the size of the detected kinetic step with the length of the ClpB channel instead of the size characteristics of the substrate.

      We are attempting to discuss/rationalize the observed large kinetic step-size which, in part, is defined by the structural properties of the enzyme as well as the size characteristics of the substrate.  We have attempted to clarify this and better discuss the properties of the substrate as well as ClpB.

      As I mentioned in the Public Review, it is essential to demonstrate that the emission increase used as the only readout of the ClpB position along the substrate is indeed caused by the proximity of ClpB to the fluorophore. One way to accomplish that would be to place the fluorophore upstream from the first I27 domain and determine if the "lag phase" in the emission enhancement disappears.

      Alexa Fluor 555 is well established to exhibit PIFE.  However, as in the response to the public review, we have included an appropriate control showing this in supplemental Fig. 1.

      Finally, the authors repetitively place their results in opposition to the study of Weibezahn et al. published in 2004 which first demonstrated substrate translocation by engineering a peptidase-associated variant of ClpB. It should be noted that the field of protein disaggregases has moved since the time of that publication from the initial "from-start-to-end" translocation model to a more nuanced picture of partial translocation of polypeptide loops with possible substrate slipping through the ClpB channel and a dynamic assembly of ClpB hexamers with possible subunit exchange, all of which may affect the kinetics in a complex way. However, the present study confirmed the "start-to-end" translocation model, albeit for a non-physiological ClpB substrate, and that is the take-home message, which should be included in the text.

      It is not clear to us that the field has “moved on” since Weibezahn et al 2004.  Their engineered construct that they term “BAP” with ClpP is still used in the field despite us reporting that proteolytic degradation is observed in the absence of ATP with that system  (Li, Weaver, Lin et al. 2015) and should, therefore, not be used to conclude processive energy driven translocation. The “partial translocation” by ClpB is also grounded in observations of partial degradation catalyzed by ClpP with BAP from the same group (Haslberger, Zdanowicz, Brand et al. 2008). It is not clear to us that the idea of subunit exchange leading to the possibility of assembly around internal sequences is being considered.  We do agree that this is an important mechanistic possibility that needs further interrogation. We agree with the reviewer, all these factors are confounding and lead to a more nuanced view of the mechanism.

      All that said, we have removed some of the opposition in the discussion.

      Reviewer #2 (Recommendations For The Authors):

      (1) It is assumed that the lag phase will be much longer than the phase in which we see a gradual increase in fluorescence, as the effect of PIFE is significant only when the enzyme is very close to the fluorophore. Particularly for RepA-titin3, the enzyme has to translocate many tens of nm before it is closer to the C-terminus fluorophore. However, in all cases, the lag time is lower or similar to the gradual increase phase (for example, Figure 3B). Could the authors explain this?

      The extent of the lag, i.e. the time from zero until the signal starts to increase, is interpreted to indicate the time the motor takes to move from its initial binding site until it gets close enough to the fluorophore that PIFE starts to occur. In our analysis we assign signal change to the last intermediate and to dissociation or release of unfolded RepA-TitinX. The increase in PIFE is not “all or nothing”; rather, it increases gradually. Further, because these are ensemble measurements and each molecule will exhibit variability in rate, the peak is broadened by ensemble averaging.

      (2) Although the reason for differences in the peak position (for example, Figure 1E, 2B) is apparent, the reason for variations in the relative intensities has to be given or speculated.

      We have addressed the reason for the different peak heights in the revised manuscript. It is a consequence of the fact that each substrate has a slightly different fluorescent labeling efficiency. Thus, each sample contains a mix of labeled and unlabeled substrates, both of which will bind to ClpB; the ClpB-bound unlabeled substrates do not contribute to the fluorescence signal but act as binding competitors. Thus, for low labeling efficiency there is a lower concentration of ClpB bound to fluorescent RepA-TitinX, and for higher labeling efficiency there is a higher concentration of ClpB bound to fluorescent RepA-TitinX, leading to an increased peak height. RepA-Titin2 has the highest labeling efficiency and thus the largest peak height.
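
      The competition argument can be illustrated with a toy calculation. It assumes that labeled and unlabeled substrates bind ClpB with equal affinity, so the fluorescent fraction of the bound population equals the labeling efficiency. The concentrations and efficiencies below are hypothetical placeholders, chosen only so that RepA-Titin2 has the highest efficiency as stated in the response.

```python
def fluorescent_complex_nM(total_complex_nM: float, labeling_efficiency: float) -> float:
    """ClpB-bound substrate that carries a dye, assuming equal affinity
    for labeled and unlabeled substrate (an assumption of this sketch)."""
    if not 0.0 <= labeling_efficiency <= 1.0:
        raise ValueError("labeling efficiency must be between 0 and 1")
    return total_complex_nM * labeling_efficiency


TOTAL_COMPLEX_NM = 100.0  # hypothetical ClpB-bound substrate per sample

# Hypothetical labeling efficiencies; not values from the manuscript.
HYPOTHETICAL_EFFICIENCY = {
    "RepA-Titin1": 0.5,
    "RepA-Titin2": 0.8,
    "RepA-Titin3": 0.6,
}

signal = {
    name: fluorescent_complex_nM(TOTAL_COMPLEX_NM, eff)
    for name, eff in HYPOTHETICAL_EFFICIENCY.items()
}
brightest = max(signal, key=signal.get)
print(signal, "largest expected peak:", brightest)
```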

      Reviewer #3 (Recommendations For The Authors):

      The authors should make it clear that they and previous authors have used different constructs or conditions to bypass the physiological regulation of ClpB action by Hsp70 and its co-factors as mentioned above. In particular, the construct used by Avellaneda et al should be explained when they challenge the findings of those authors.

      Minor points:

      The lines fitting the experimental points are difficult or impossible to see in Figures 2B, 3B, and s5B.

      Fixed

      Typo bottom of p6 - "averge"

      Fixed

      Avellaneda, M. J., K. B. Franke, V. Sunderlikova, B. Bukau, A. Mogk and S. J. Tans (2020). "Processive extrusion of polypeptide loops by a Hsp100 disaggregase." Nature.

      Doyle, S. M., J. Shorter, M. Zolkiewski, J. R. Hoskins, S. Lindquist and S. Wickner (2007). "Asymmetric deceleration of ClpB or Hsp104 ATPase activity unleashes protein-remodeling activity." Nat Struct Mol Biol 14(2): 114-122.

      Durie, C. L., E. C. Duran and A. L. Lucius (2018). "Escherichia coli DnaK Allosterically Modulates ClpB between High- and Low-Peptide Affinity States." Biochemistry 57(26): 3665-3675.

      Haslberger, T., A. Zdanowicz, I. Brand, J. Kirstein, K. Turgay, A. Mogk and B. Bukau (2008). "Protein disaggregation by the AAA+ chaperone ClpB involves partial threading of looped polypeptide segments." Nat Struct Mol Biol 15(6): 641-650.

      Li, T., J. Lin and A. L. Lucius (2015). "Examination of polypeptide substrate specificity for Escherichia coli ClpB." Proteins 83(1): 117-134.

      Li, T., C. L. Weaver, J. Lin, E. C. Duran, J. M. Miller and A. L. Lucius (2015). "Escherichia coli ClpB is a non-processive polypeptide translocase." Biochem J 470(1): 39-52.

      Miller, J. M., J. Lin, T. Li and A. L. Lucius (2013). "E. coli ClpA Catalyzed Polypeptide Translocation is Allosterically Controlled by the Protease ClpP." J Mol Biol 425(15): 2795-2812.

      Miller, J. M. and A. L. Lucius (2014). "ATP-gamma-S Competes with ATP for Binding at Domain 1 but not Domain 2 during ClpA Catalyzed Polypeptide Translocation." Biophys Chem 185: 58-69.

      Oberhauser, A. F., P. K. Hansma, M. Carrion-Vazquez and J. M. Fernandez (2001). "Stepwise unfolding of titin under force-clamp atomic force microscopy." Proc Natl Acad Sci U S A 98(2): 468-472.

      Rajendar, B. and A. L. Lucius (2010). "Molecular mechanism of polypeptide translocation catalyzed by the Escherichia coli ClpA protein translocase." J Mol Biol 399(5): 665-679.

      Weibezahn, J., P. Tessarz, C. Schlieker, R. Zahn, Z. Maglica, S. Lee, H. Zentgraf, E. U. Weber-Ban, D. A. Dougan, F. T. Tsai, A. Mogk and B. Bukau (2004). "Thermotolerance requires refolding of aggregated proteins by substrate translocation through the central pore of ClpB." Cell 119(5): 653-665.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers and editor for their helpful comments. We have addressed their concerns as detailed below.

      It would have been nice to have included a bona-fide SIRT2 target as a control throughout the study.

      We agree that including a bona-fide SIRT2 target as a control is important for validating our results. Our previous work has shown that SIRT2 demyristoylates ARF6. We have therefore included a blot in Figure S15 demonstrating that SIRT2 knockdown results in increased myristoylation of ARF6. This serves as a control to confirm the activity and role of SIRT2 in our study.

      Did the authors also consider investigating SIRT1 in their assays? SIRT1 activates ACSS2 while SIRT2 leads to degradation of ACSS2. They should at least discuss these seemingly opposing roles of SIRT1 and SIRT2 in the regulation of ACSS2 and acetate metabolism in more depth particularly as it concerns situations (i.e., diseases, pathologies) where either SIRT1, SIRT2, or both sirtuins, are active. This would enhance the significance of the findings to the broader research community.

      The study by Hallows et al. showed that SIRT1 deacetylates K661 of ACSS2 and increases its catalytic activity. A follow-up investigation then revealed the role of the circadian clock in modulating intracellular acetyl-CoA levels through SIRT1-catalyzed K661 deacetylation of ACSS2. Conversely, our research elucidates a contrasting mechanism wherein SIRT2 inhibits ACSS2 by deacetylating K271 under conditions of nutrient stress. The dual regulation of ACSS2 by SIRT1 through the circadian clock and by SIRT2 under nutrient stress underscores the intricate and multifaceted nature of the regulatory mechanisms involved in lipid metabolism. These findings also highlight the versatility of lysine acetylation in modulating cellular metabolic pathways.

      Collectively, these studies contribute to a better understanding of how SIRT1 and SIRT2 regulate ACSS2 activity in various metabolic contexts, thereby enhancing our knowledge of acetate metabolism and its implications in health and disease.

      We have included this discussion in the manuscript.

      In Figure 3, the authors should consider immunoblotting for endogenous ACSS2 throughout the differentiation and lipogenesis study since the total ACSS2 levels is the crucial aspect to affecting acetate-dependent promotion of lipogenesis in adipocytes, and to confirm TM-dependent stabilization of ACSS2 in that assay.

      We have updated Figure 3 to include immunoblotting for endogenous ACSS2 levels. Additionally, we have confirmed the TM-dependent stabilization of ACSS2, which is now shown in Figure S12.

      Do the authors have any data proving the K271 mutants of ACSS2 are still functional? Or that K271 ACSS2 protein is folded correctly?

      To assess the functionality of the mutants, we isolated Flag-tagged wildtype, K271R, and K271Q ACSS2 proteins from SIRT2 knockdown HEK293T cells. Subsequently, we examined acetyl-CoA formation from acetate and CoA using high-performance liquid chromatography (HPLC). Our findings indicate that while the wildtype ACSS2 exhibits slightly higher activity than the K271R and K271Q mutants, all variants remain functional (Figure S13).

      Nearly all experiments are performed in a single cell line. Authors should test whether SIRT2 regulates ACSS2 acetylation in at least 1 or 2 more cell lines. Does SIRT2 regulate ACSS2 acetylation in 3T3-L1 preadipocytes?

      Experiments showing that endogenous ACSS2 levels change in EBSS and nutrient-deprived media were repeated in A549 cells (Figure S5). However, due to the poor transfection efficiency of A549 cells, we were unable to obtain acetylation data. Similarly, conducting acetylation experiments in 3T3-L1 preadipocytes is challenging due to poor transfection efficiency.

      The article does not explicitly address whether the absence of amino acids impacts the acetylation and subsequent degradation of ACSS2 by activating SIRT2. If so, one would expect the level of ACSS2 acetylation or ACSS2 expression under amino acid deprivation to be lower than that under normal conditions, as depicted in Fig. 1C and Fig. S3.

      The experiments shown in Fig. 1C and Fig. S3 used overexpressed Flag-tagged ACSS2, and we adjusted the amount of DNA used in order to obtain similar Flag-ACSS2 levels.

      To address the comment raised by the reviewer, we added Figure S14, which shows that endogenous ACSS2 acetylation is decreased under amino acid deprivation in SIRT2 control KD cells, indicating that the absence of amino acids impacts ACSS2 acetylation. The decreased expression of ACSS2 under amino acid deprivation is also addressed in Figure S6.

      Several reviewers noted discrepancies between what is occurring to basal levels of ACSS2 vs in SIRT2 KD conditions. Fig. 2H shows higher basal level of acetylated ACSS2 in K271R mutant compared to wildtype (input may be an issue). If Fig. 2H is a critical piece of data, authors are recommended to show this using Flag-IP & then Ac-K.

      The increased stability of the K271R mutant compared to the wildtype (WT) results in higher protein levels, which explains the different input levels. However, this does not affect the conclusion that K271 is the acetylation site, as the quantification shows that the K271R mutant has a lower acetylation level and is not regulated by SIRT2 (Figure S16).

      Regarding the basal levels of ACSS2 in control and SIRT2 KD conditions, the apparent discrepancy arose because the experiments in question used overexpressed Flag-tagged ACSS2, and we adjusted the amount of DNA used in order to obtain similar Flag-ACSS2 levels. To address the concern, we monitored endogenous ACSS2 protein and acetylation levels, and the results are shown in Figure S14.

      Also, in Fig 2I there is no difference in basal ubiquitination between WT and K271R mutant. Related, based on model you would expect that overexpression of ACSS2-K271R mutant compared to wildtype would be at higher levels. In many figures authors do not see this (Fig. 2I, 3A, 3B). This needs to be explained.

      This is related to some previous comments. In these experiments, we actually adjusted the DNA used in the transfection to obtain equal protein levels so that we can quantify other things (acetylation or ubiquitination levels). As stated in the manuscript regarding Figures 3A and 3B, "To ensure comparable expression levels at the beginning, we adjusted the amount of transfected DNA for both wild-type and the K271R mutant ACSS2." This approach allowed us to accurately compare the ubiquitination status between the wildtype and K271R mutant ACSS2 variants.

      Data showing role of ACSS2-K271 mutant in lipid accumulation requires clarification. Based on model overexpression of ACSS2-K271 mutant should by itself cause increased lipid accumulation compared to wildtype.

      This is indeed the case and we have added this in the revised manuscript “Consistent with our above observation that ACSS2 K271R mutant is more stable than the WT, expressing the K271R mutant lead to more lipid droplets than expressing the WT ACSS2 (Figure S12).”

      Loading controls are notably absent at certain instances, such as IPs in Fig. 1A, 1C, and the IP in Fig. 2H. Such controls are required to interpret potential changes in acetylation.

      For this experiment, we employed an approach where we overexpressed Flag-tagged wild-type (WT) and mutant forms of ACSS2. We conducted an immunoprecipitation (IP) targeting acetyl-lysine residues to enrich lysine-acetylated proteins, followed by immunoblotting for the Flag tag to specifically detect ACSS2 acetylation levels. To ensure the reliability of our results, we included a Flag blot to confirm equal expression levels of ectopically expressed ACSS2 across our samples before IP. Given the nature of our experimental design and the specific aim of investigating ACSS2 acetylation, we believe that additional loading controls beyond the input Flag blot are not required for the interpretation of our results. The inclusion of the input Flag blot serves as a control for protein expression levels, which is crucial for accurate assessment of ACSS2 acetylation status.

      While CHX treatment is known to inhibit protein synthesis, it appears contradictory that CHX treatment in Fig. 2C seemingly leads to ACSS2 accumulation in SIRT2 knockdown HEK293T cells. This discrepancy requires clarification.

      We conducted quantitative analysis of the immunoblot with replicates to ensure the reliability of our findings. Our analysis indicates that the protein level of ACSS2 remains relatively stable over the time course of CHX treatment. The slight increase observed at the 8-hour time point can be attributed to inherent experimental variability, as evidenced by the large error bars in the graph. We have included a graph in Figure S7 to show that there is no significant change in the level of ACSS2 in the SIRT2 knockdown HEK293T cells.

      In Fig. 2F-H, the authors argue that SIRT2 deacetylates ACSS2 to facilitate its ubiquitination and subsequent proteasomal degradation. However, these results are depicted under normal conditions, whereas findings in Fig. 1 suggest that SIRT2 deacetylates ACSS2 exclusively under nutrient stress. An explanation for this inconsistency is warranted.

      These experiments were done in amino acid-deprived (EBSS) media. We have corrected this in the manuscript.

      Line 160 authors conclude "amino acid limitation..deacetylates K271"..but this was not directly demonstrated. Authors should add this data or change conclusion.

      Addressed in response to some of the comments above.

      Figures 1A and 1B, acetylation quantification, not clear if it is relative to the Flag tag or actin.

      Acetylation quantification is relative to Flag tag. This is clarified in the figure legend.

      Methods section lacking details & not well referenced (how did authors express wildtype & mutant in 3T3-L1 cells?) 

      ACSS2 wildtype and K271R mutant Flag-tagged expression plasmids were transfected into ACSS2 knockdown 3T3-L1 cells using PEI transfection reagent following the manufacturer’s protocol. The pCMV-Tag4a empty vector was used as the negative control. Differentiation of the 3T3-L1 cell lines was performed according to the manufacturer’s protocol (DIF001-1KT, Sigma Aldrich) 24 hours after transfection. This has been included in the methods.

      In Figure 3A, is the actin blot from the same immunoblots above it? Reviewers recommend the authors upload original immunoblot.

      This experiment was repeated, and the blot has been replaced.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The manuscript reports useful findings by resolving the crystal structure of Sedoheptulose-1,7-Bisphosphatase (SBPase) from the green algae Chlamydomonas reinhardtii, which is involved in the Calvin cycle. The data presented are solid based on validated methodologies, which help in understanding the structure and function of this enzyme.

      We thank the editors for this positive assessment.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, Le Moigne and coworkers shed light on the structural details of the Sedoheptulose-1,7-Bisphosphatase (SBPase) from the green algae Chlamydomonas reinhardtii. The SBPase is part of the Calvin cycle and catalyzes the dephosphorylation of sedoheptulose-1,7-bisphosphate (SBP), which is a crucial step in the regeneration of ribulose-1,5-bisphosphate (RuBP), the substrate for Rubisco. The authors determine the crystal structure of the CrSBPase in an oxidized state. Based on this structure, potential active site residues and sites of post-translational modifications are identified. Furthermore, the authors determine the CrSBPase structure in a reduced state revealing the disruption of a disulfide bond in close proximity to the dimer interface. The authors then use molecular dynamics (MD) to gain insights into the redox-controlled dynamics of the CrSBPase and investigate the oligomerization of the protein using small-angle X-ray scattering (SAXS) and size-exclusion chromatography. Despite the difference in oligomerization, disruption of this disulfide bond did not impact the activity of CrSBPase, suggesting additional thiol-dependent regulatory mechanisms modulating the activity of the CrSBPase.

      We thank reviewer 1 for his/her careful reading of our manuscript.

      The authors provide interesting new findings on a redox-mechanism that modulates the oligomeric behavior of the SBPase, however without investigating this potential mechanism in more detail. The conclusions of this manuscript are mostly supported by the data, but they should be more carefully evaluated in respect to what is known from other systems as e.g. the moss Physcomitrella patens. This is especially of interest, as SBPase was previously reported to be dimeric, whereas for FBPase a dimer/tetramer equilibrium has been observed.

      We thank reviewer 1 for his/her comments on the novel or confirmatory character of our structure-function analysis on CrSBPase. We address the questions of oligomeric states later in this response.

      (1) Given that PpSBPase has been already characterized in detail, the authors should provide a more rigorous comparison to the existing data on SBPases. This includes a more conclusive structural comparison but also the enzymatic assays should be compared to the findings from P. patens. Do the authors observe differences between the moss and the chlorophyte systems, maybe even in regard to the oligomerization of the SBPase?

      Indeed, a previous study conducted by one of the authors of the current manuscript (Stéphane D. Lemaire) and collaborators determined the structure and regulatory properties of SBPase from the moss Physcomitrella patens (Gütle et al. 2018 https://doi.org/10.1073/pnas.1606241113). We added a clearer reference to this earlier work. The differences that we observed regarding the oligomeric states of SBPase from Chlamydomonas reinhardtii principally stem from our in vitro analytical method, size-exclusion chromatography, as opposed to the crystal packing analysis of the reference study. We detailed the PpSBPase/CrSBPase oligomeric state comparison in the paragraph 'Oligomeric states of CrSBPase'. Besides, the asymmetric unit of our CrSBPase crystal structure is also a homodimer, similarly to PpSBPase, and we suggest that PpSBPase is also likely to adopt several oligomeric states in vitro. If this were confirmed by experiments, SBPase in several organisms would behave analogously to FBPase regarding the dimer/tetramer equilibrium.

      In paragraph 'Crystal structure of CrSBPase' we added a comparison by alignment of our CrSBPase crystal structure to the previously reported PpSBPase crystal structure, stating that with RMSD = 0.478 Å the proteins are essentially identical.
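      For readers less familiar with the RMSD values quoted in this response, the quantity can be sketched as follows. This is only an illustration of the definition, not the procedure used for the alignments reported here: it assumes the two structures have already been superposed and that atoms are paired one to one, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes the two coordinate sets are already superposed and that the
    i-th point of one list corresponds to the i-th point of the other.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates in Å (not taken from any deposited structure):
# every atom of the second set is displaced by 0.3 Å along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_b = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3), (0.0, 1.0, 0.3)]
toy_rmsd = rmsd(toy_a, toy_b)
```

      In practice, structural biology software first finds the optimal superposition and atom pairing, and the reported value depends on which atoms are included, so values from different tools are only roughly comparable.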

      In paragraph 'CrSBPase enzymatic activity' we compared the value we obtained for enzyme specific activity to those previously published on other SBPase from Chlamydomonas or the land plant Spinacia oleracea, highlighting the similarity of results in three different systems and teams (Seuter et al. 2002 https://doi.org/10.1023/A:1019297521424 and Tamoi et al. 2005 DOI: 10.1271/bbb.69.848).

      (2) The authors should include the control experiments (untreated SBPase) and the assays performed with mutant versions of the SBPase, which are currently only mentioned in the text or not shown at all.

      We added supplementary figure 14 to illustrate that, since the SBPase C115S and C120S mutants are still activated by a reducing agent, the disulfide bridge between cysteines 115 and 120 is not the sole control over SBPase activity but rather a control over the oligomeric exchange of the enzyme, indirectly contributing to redox activation of the active site.

      (3) The representation of the structure in figures (especially Figures 1 and 3) should be adjusted to match the author's statements. In Figure 1, the angle from which the structure is displayed changes over the entire figure making it difficult to follow especially as a non-structural biologist. Furthermore, important aspects of the structure mentioned in the text are not labeled and should be highlighted, by e.g. a close-up. Same holds true for Figure 3 that currently mostly shows redundant information.

      We thank reviewer 1 for his/her advice on how to improve Figure 1. We drew new images for the complete figure, hopefully providing more consistent and clearer visual support to our text. For simplicity, the protein is now always represented centered around its active site in the same orientation. We represent co-crystallized water in all projections as a guide to the eye.

      Figure 3 and supplementary figure 3 were switched in order to better represent the experimental evidence provided by the resolution of SBPase structure under reducing conditions, i.e., the increase in local disorder around C115-C120 pair of cysteines in the 113-130 stretch forming a redox-conditionally dynamic loop and β-hairpin motif.

      (4) The authors state that mutation of C115 and C120 to serine destabilize the dimer formation, while more tetramer and monomer is formed. As the tetramer is essentially a dimer of dimers, the authors should elaborate how this might work mechanistically. In my opinion, dimer formation is a prerequisite for tetramer formation and the two mutations rather stabilize the tetramer instead of destabilizing the dimer.

      Time-dependent dynamic character of SBPase oligomer exchange is not resolved by the current study because we essentially combined size-exclusion chromatography (SEC) and X-ray crystallography to define quaternary structures at equilibrium. Overall, homodimer is the dominant state of wild-type SBPase by abundance in the purified recombinant form and by forming the constitutive asymmetric unit in all crystal packings. Dimer is indeed present in the tetramer state, a dimer of dimers, as pertinently stated by reviewer 1.

      This being recognized, we tried to explain the systematic co-elution of the principal dimeric form with an additional species of smaller size on SEC (supplementary figure 1, right-side shoulder of the peak), at the apparent mass of a monomer. When solving the crystal structures of SBPase we realized that residues 113-130, which form a loop and β-hairpin motif, contribute to the dimer interface. Notably, in this loop cysteine 115 (C115) maps within bonding distance (3.9 Å) of the side chain of arginine 220 (R220) from the partner subunit of the dimer. In loop 113-120, the cysteine pair C115 and C120 is subject to redox switching between disulfide (closed) and dithiol (open) conformations, as shown in our structures 7B2O and 7ZUV, respectively. Given that the reduction of the C115-C120 disulfide bridge correlates with a higher flexibility of this motif that contributes to the dimer interface (figure S3), we hypothesized that reduction of SBPase would destabilize the dimer state to the benefit of a transitory monomer state, and indeed point mutagenesis of C115S or C120S caused a large shift of the oligomer equilibrium in favour of the monomer (figure S1C).

      Mechanistically, we suggest two scenarios for the tetramer formation: either monomers first interact as in the crystallographic dimer before pairing such dimers into tetramers (as proposed by reviewer 1), or monomers start tetramerization by favoring the alternative subunit interface (figure 5B, between cyan and magenta chains) before stabilizing the crystallographic homodimer interface. In this latter case, monomerization would be necessary to efficiently re-arrange SBPase dimers into tetramers.

      In physiological conditions the re-arrangement switch would be controlled by C115-C120 reduction through the ferredoxin-thioredoxin redox cascade. Structural studies in dynamic conditions, such as native mass spectroscopy/photometry, would be necessary to settle this question unambiguously, although at this stage of our investigation there seems little doubt to us that the C115-C120 disulfide-dithiol exchange is essential to control a dimer/monomer balance in the first instance.

      Reviewer #2 (Public Review):

      The central theme of the manuscript is to report on the structure of SBPase - an enzyme central to the photosynthetic Calvin-Benson-Bassham cycle. The authors claim that the structure is first of its kind from a chlorophyte Chlamydomonas reinhardtii, a model unicellular green microalga. The authors use a number of methods like protein expression, purification, enzymatic assays, SAXS, molecular dynamics simulations and xray crystallography to resolve a 3.09 A crystal structure of the oxidized and partially reduced state. The results are supported by the claims made in the manuscript. One of the main weakness of the work is the lack of wider discussion presented in the manuscript. While the structure is the first from a chlorophyte, it is not unique. Several structures of SBPase are available. As the manuscript currently reads, the wider context of SBPase structures available and comparisons between them is missing from the manuscript. Another important point is that the reported structure of crSBPase is 0.453A away from the alphafold model. Though fleetingly mentioned in the methods section, it should be discussed to place it in the wider context.

      We thank reviewer 2 for his/her assessment of our manuscript. In response to his/her suggestion to better compare our SBPase structure from the model microalga Chlamydomonas reinhardtii to that of the ortholog from Physcomitrium patens previously reported by an author of this manuscript (Stéphane D. Lemaire) and collaborators (Gütle et al. 2018), we wish to point out that paragraph 3 of the introduction was dedicated to this reference, along with a mention of the related Thermosynechococcus elongatus dual-function fructose-1,6-bisphosphatase/sedoheptulose-1,7-bisphosphatase (F/SBPase). We nevertheless follow his/her suggestion to detail the comparison between chloroplastic SBPase structures in the first result section 'Crystal structure of CrSBPase', consistently with response 1 to reviewer 1 (see above).

      Regarding the integration of AlphaFold (AF) computational models in a general discussion about SBPase molecular structure, we wish to point out that our initial 7B2O crystallographic model of CrSBPase was deposited in PDB on 2020-11-27 before AlphaFold2 was available for the scientific community (Jumper et al. publication date is 15 July 2021).

      AF2 entry AF-P46284-F1-model_v4 from the AlphaFold Protein Structure Database aligns with our crystal structure 7B2O chain E with RMSD = 0.434 Å, showing excellent agreement between experiment and prediction at the level of the protein main chain. It must still be pointed out that it is the AF2 model which is 0.434 Å away from the experiment, and not the opposite. The alignment deviates locally in the conformations of several loops and in the length of secondary structure elements. Many amino acid side chains adopt distinct orientations between the computational model and the experimental structure.

      AF3 was recently communicated (Abramson et al. 2024) along with its online prediction server hosted at https://golgi.sandbox.google.com. The CrSBPase model from AF3 aligns to our crystal structure 7B2O chain A with RMSD = 0.489 Å, showing again their strong similarity, with a smaller discrepancy between AF2 and AF3 of RMSD = 0.216 Å. The only significant deviations between 7B2O and AF3 are in the orientation of several side chains and notably in the conformation of region 114-131, which contains the redox sensor motif.

      We added the last two paragraphs to the revised version of the manuscript, after the results section presenting our crystallographic work.

      Recommendations for the authors:

      We made all recommended modifications as detailed below.

      Reviewer #1 (Recommendations For The Authors):

      I have outlined a number of minor points below.

      We addressed all minor points listed.

      Line 220: The asymmetric unit only contains three dimers. The dimer of dimer or tetramer can only be reconstituted by displaying the symmetry mates.

      We corrected our sentence to 'The asymmetric unit is composed of six polypeptide chains packing as three dimers'.

      I also suggest that the authors separate the description of the asymmetric unit content from the modeled water molecules and rephrase e.g. "...and four water molecules could be modeled."

      We rephrased as suggested.

      I appreciate that the authors uploaded the structure in advance of this article, which allowed to evaluate the quality of the structure. Although this does not add valuable information, I have identified several unmodeled blobs, which possibly also account for waters.

      Unmodeled blobs were tentatively assigned to water but had to be removed during later refinements. We used Coot Validate tools 'Unmodelled blobs' and 'Check/Delete water' to progress towards the current optimal refinement statistics. We admit that the resolution of the crystallographic dataset (3.09 Å) is limiting to reliably model mobile or less resolved elements like water molecules. Overall, we estimate that the functional elements of the structure are modeled to the best of our knowledge and with minimal subjectivity.

      Line 222: Please write 309 instead of spelling the number.

      We now write 309 instead of spelling out the number.

      Line 223: The structure representation in Figure 1A/B has to be improved. The authors might consider labeling the two domains & color them in two colors instead of the rainbow color coding. Furthermore, the 90° rotation does not add much information. Here, turning the model in a different direction that allows to see the central β-sheet of domain 2 might be better suited. Furthermore, instead of describing β-strands first, followed by α-helices, I suggest describing which secondary structure elements form the two domains.

      We improved Figure 1A as suggested while keeping Figure 1B with a 90° rotation and a rainbow color gradient in order to display clearly the secondary structure content and connectivity. The orientation was tilted to better display the central β-sheet. This new version of Figure 1A/B should facilitate the text description of SBPase architecture, which we amended as suggested.

      Line 229: The information on A113-120 should be depicted in a closeup in Figure 1A.

      We added a close-up view of sequence 113-120 as new figures 1C-D and modified the rest of the figure and legend accordingly.

      Line 234: Please provide an r.m.s.d here.

      We now provide r.m.s.d. for all structural alignments.

      Line 242: Please introduce the domain labeling in Fig 1C to make it easier to track the exact region within SBP here. Is the residue numbering according to SBP or the human FBP?

      The modified version of figure 1 now shows SBPase in the same orientation for panels A, E, F, G, H for simplicity. Domain labeling is indicated in panel A with distinct NTD/CTD colors as suggested. We made the position of W401 explicit on all panels as a guide to the eye. We indicated in the figure legend that residue numbering is according to Chlamydomonas SBPase Uniprot entry P46284.

      Line 244: Is Figure 1D in the same orientation as C? I suggest making the surface transparent and showing the cartoon below, which will allow to easier see the solvent accessibility of the residues. Also, clearly label W401 (although it's the only water shown/modeled in this region).

      We modified figure 1 to show all equivalent panels (i.e., A, E, F, G, H) with the same orientation. In this new form we think that solvent accessibility and the relative position of significant residues are easier for the reader to interpret. W401 is consistently labeled throughout the figure 1 panels.

      Line 263: Please provide a close-up of the C222 and C231 including measured distance. It's clearly not visible from this view. It might even be helpful to provide close-ups of all cysteine residues that are mentioned in the text.

      In the modified version of figure 1 we estimate that C222 and C231 are more easily visible. We added a close-up view of the C222-C231 environment in a new supplementary figure 2. Since we do not explore further the functional relevance of this redox pair, we chose not to include the C222-C231 close-up view in main figure 1. We added legends and modified the numbering of supplementary figures accordingly.

      Line 276: As already mentioned earlier, none of the panels in Figure 1 provide a close-up of this loop. This should be added.

      This loop is now displayed as a close-up view in panels C and D of main figure 1.

      Line 284: It is difficult to follow the relative positions of the potential modification sites if the model is always depicted from a different angle in Figure 1. The authors might want to change this across Figure 1 or show the rotation angle.

      This problem was addressed in the revised figure 1, panels A-E-F-G-H are in the same orientation now. Panel B was kept at a rotation of 90° with corresponding annotation.

      Line 290: Please label W401. Also stick to one nomenclature (W or H2O).

      We labeled W401 and kept nomenclature consistent throughout the manuscript.

      For comparative reasons, a full kinetic measurement (determination of Km and kcat) of the SBPase would also be helpful here.

      We decided against a full kinetic measurement of CrSBPase because we could neither identify a reliable chemical provider for the physiological substrate sedoheptulose-1,7-bisphosphate (SBP) nor synthesize it ourselves, and we only characterized the reaction with fructose-1,6-bisphosphate (FBP). However, in the revised form of the manuscript we added in main text paragraph 'CrSBPase enzymatic activity' the kinetic constants from the previous reference study conducted on spinach SBPase (Cadet and Meunier, Biochem. J. 1988), with KM(SBP) = 0.05 mM and kcat(SBP) = 81 s⁻¹ for the fully active enzyme with SBP as a substrate. For comparison, the authors of this study report that the activity of SBPase on FBP is in the same range but lower, with KM(FBP) = 0.38 mM and kcat(FBP) = 21 s⁻¹. We also added a comparison of the specific activities of our CrSBPase and spinach SBPase in the main text, showing that our enzyme behaves like the previously reported ortholog from a land plant.
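      As a worked illustration of what these literature constants imply, the short sketch below computes the catalytic efficiency kcat/KM for each substrate and the Michaelis-Menten turnover at a given substrate concentration. It uses only the four spinach SBPase values quoted above; the function and variable names are hypothetical, and simple Michaelis-Menten behaviour is an assumption made for the example rather than a result of this study.

```python
# Kinetic constants for spinach SBPase as quoted above
# (Cadet and Meunier, Biochem. J. 1988): KM in mM, kcat in s^-1.
SPINACH_SBPASE = {
    "SBP": {"km_mM": 0.05, "kcat_per_s": 81.0},
    "FBP": {"km_mM": 0.38, "kcat_per_s": 21.0},
}


def catalytic_efficiency(kcat_per_s, km_mM):
    """Return kcat/KM in s^-1 mM^-1."""
    return kcat_per_s / km_mM


def turnover_rate(kcat_per_s, km_mM, substrate_mM):
    """Michaelis-Menten turnover per enzyme (s^-1) at a substrate concentration.

    Assumes simple Michaelis-Menten kinetics with no cooperativity or inhibition.
    """
    return kcat_per_s * substrate_mM / (km_mM + substrate_mM)


eff_sbp = catalytic_efficiency(**SPINACH_SBPASE["SBP"])  # about 1620 s^-1 mM^-1
eff_fbp = catalytic_efficiency(**SPINACH_SBPASE["FBP"])  # about 55 s^-1 mM^-1
preference = eff_sbp / eff_fbp                           # about 29-fold in favour of SBP

# At a substrate concentration equal to KM, turnover is half of kcat.
half_max_sbp = turnover_rate(substrate_mM=0.05, **SPINACH_SBPASE["SBP"])
```

      On these numbers the two kcat values differ about fourfold, while the catalytic efficiencies differ by roughly a factor of 29, because KM for FBP is also about eightfold higher.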

      Line 303: How much MgSO4 was used for the experiment shown in Figure 2A?

      10 mM of MgSO4 was used for the experiment shown in Figure 2A. We added this information in the figure legend. We also added in the legend that 10 mM DTT is present in the experiment of Figure 2B and that 10 mM of MgSO4 and 1 mM of DTT are present in the experiment of Figure 2C.

      Line 321: In my opinion it is not necessary to show the regions of all molecules here. I was rather expecting a superposition of the two structures (oxidized and reduced) with a close-up of the respective disulfide in the two states.

      We agree that the initial version of Figure 3, with panels showing all conformational variants of the redox motif side by side, appears redundant. We moved the initial Figure 3 to supplementary data and replaced it with the crystallographic B-factor mapping of the redox motif in the different conditions resolved by the crystals. We would like to stress that all these conformations were experimentally determined through X-ray crystallography, whether from the crystal of pure inactive enzyme, which proved to be oxidized on the redox motif, or from the equivalent crystals submitted to activating treatment by the chemical reductant TCEP. In an attempt at clarification, we added visual boxes to help the reader appreciate this reduction-induced conformational plasticity, which we interpreted as a local conditional disorder.

      Line 331: Could the authors provide movies of the MD simulation? Otherwise, interpretation of the MD simulation results might be difficult for non-experts.

      We added two movies of 20-µsec MD simulations as supplementary data to help non-expert readers.

      Line 343: It might be helpful to label the structure elements in Figure 4 accordingly (e.g. residues, etc.)

      We added secondary structure labeling in Figure 4.

      Line 381: Should be changed to Figure 5A.

      We changed the reference to Figure 6, which is a renumbering of Figure 5 that incorporates the changes suggested below. Figure 6 now includes chromatograms of recombinant SBPase in panel A and the chromatogram and western blot analysis of Chlamydomonas extracts in panel B.

      Line 383: See above, figure 5B. Which structure is shown in the figure? 7zuv or 7b2o? Maybe include both structures in the figure in a side-by-side view. The authors might also want to include the SEC chromatograms in the main figure. Especially the purification from Chlamydomonas is helpful to estimate whether post-translational modifications have an impact on the oligomerization. This should also be mentioned in the text.

      7b2o and 7zuv are illustrated side by side in panels A and B of Figure 5. This was indicated in the figure legend, and we have now added the information to the figure itself. As suggested above, we included the chromatograms initially presented as supplementary material in a new main Figure 6, with panel A for recombinant proteins and panel B for proteins extracted from Chlamydomonas. The initial Figures 5D-E, showing surface conservation of the dimeric SBPase, have been moved to Supplementary Figure 5.

      Line 385: I don't find the cultivation of Chlamydomonas in the method section. It should be added.

      We added a methods paragraph dedicated to "Cultivation of Chlamydomonas for native SBPase analysis".

      Line 390-392: This information is not really helpful. Concentrated purified proteins might precipitate after a week storage without physiologically relevant effects being the reason.

      We agree that the observation of a precipitate building up in vitro after a week of storage bears no particular physiological implications. We rather intended to report that an aggregated form of purified protein can be turned to droplets under the redox conditions that activate the enzyme. We reformulated these lines for clarification.

      Line 397: I would appreciate having the SEC-chromatograms of the mutants also in the main figure.

      Size-exclusion chromatograms that were initially in supplementary figures are now shown in main text figure 6 panel A, with the profiles WT and mutants aligned.

      Line 402: Where are these data shown? They should be included in Figure 5.

      We added a figure to present these data, which were not shown in the initial version of the manuscript. We preferred to place it in the supplementary material because the catalytic activity of the C115S and C120S mutants is essentially the same as that of WT and does not reveal a direct mechanistic effect of C115-C120 reduction on the catalytic pocket.

      Line 427: Did the authors look into a possible cooperativity of their SBPase?

      We did not observe direct positive cooperativity that could be ascribed to allostery in our enzymatic assays. It was previously reported for spinach SBPase that SBP saturation functions were hyperbolic, with no evidence of homotropic interactions in the enzyme oligomer (Cadet and Meunier, Biochem. J. 1988, 253, 249-254). The authors of that kinetic study do, however, present a clear sigmoidal response of SBPase to Mg2+ concentration, suggestive of an activating cross-talk between active sites in the oligomer. We consider this hypothesis of interest and would like to further investigate allosteric conformational changes once the physiological substrate SBP becomes available.
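
      To make the distinction between a hyperbolic and a sigmoidal response concrete, the sketch below uses the Hill equation, which reduces to the hyperbolic Michaelis-Menten form when the Hill coefficient n equals 1. The parameter values are arbitrary placeholders chosen for illustration; they are not measurements from this work or from Cadet and Meunier.

```python
def hill_rate(s, vmax, k_half, n):
    """Hill equation; n = 1 gives a hyperbolic curve, n > 1 a sigmoidal one."""
    return vmax * s**n / (k_half**n + s**n)


def ratio_s90_s10(n):
    """Fold change in ligand needed to go from 10% to 90% of vmax: 81**(1/n)."""
    return 81 ** (1 / n)


# Placeholder values (assumptions): vmax = 1, k_half = 1, in arbitrary units.
for n in (1, 2):
    low = hill_rate(0.1, vmax=1.0, k_half=1.0, n=n)
    print(f"n={n}: v(0.1)={low:.4f}, S90/S10={ratio_s90_s10(n):.1f}")
```

      A hyperbolic curve needs an 81-fold increase in ligand to go from 10% to 90% of maximal rate, whereas any n above 1 narrows that range, which is the signature one would look for in a Mg2+ titration.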

      Line 428-434: I don't really understand how the proteome mapping fits in here. Do the authors speculate that SBPase is recruited by some of the identified enzymes or directly interacts with them or that rather the spatial distribution optimizes the reaction kinetics?

      We do indeed want to correlate our in vitro observations of the conditions of CrSBPase activity with those recently published by the group of Dr. Martin Jonikas in a physiological, in vivo setup in Chlamydomonas reinhardtii (Wang, Patena et al. Cell 2023 186, 3499–3518). We have no experimental evidence for the first suggestion, that SBPase is recruited by or directly interacts with partner enzymes, and we favor the second suggestion, that local spatial distribution in the chloroplast stroma optimizes enzyme reaction kinetics through the proximity of Calvin-Benson-Bassham enzymes. We rephrased these lines to clarify our hypothesis and to state its speculative character.

      Reviewer #2 (Recommendations For The Authors):

      To make the manuscript stronger, the authors are recommended to do the following:

      We followed given recommendations.

      (1) include a wider discussion on the other SBPase structures that are available. A detailed comparison should be made between the oxidized and reduced structures present in the PDB with the structures that are being reported in the manuscript.

      Consistent with reviewer #1's suggestion, and as detailed in our response to the public review above, we followed the recommendation to better report previous structural studies of SBPase in the results section. We also added comparisons with computational models from AlphaFold2 and AlphaFold3.

      (2) The authors mention co-operativity between the subunits. With excellent sampling from molecular dynamics simulations, the authors should demonstrate co-operativity between the subunits.

      Our molecular dynamics (MD) simulations span 20 µsec of SBPase in the dimeric state, starting from the experimental structures determined by X-ray crystallography. Within this time window, the only significant event that we observed is the local reorganization of the LBH motif, which is a prerequisite for dimer rearrangement. We infer that local disorder contributes to a separation of the pair of subunits that later allows the active homotetramer to form, at time scales longer than those accessible with the computing capacity used in this work. Moreover, demonstrating cooperativity with MD simulations would require more than a single event to ensure that the results are significant, and performing a series of 20-µsec MD simulations of SBPase is also beyond the available capacity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides a useful strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) with serum derived from mCSCC-exposed mice. The exploration of serum-derived antibodies as a potential therapy for curing cancer is particularly promising but the study provides inadequate evidence for specific effects of mCSCC-binding serum antibodies. This study will be of interest to scientists seeking a novel immunotherapeutic strategy in cancer therapy.

      Joint Public Review:

      Summary:

      This study presents an immunotherapeutic strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) using serum from mice inoculated with mCSCC. The author hypothesizes that antibodies in the generated serum could aid the immune system in tumor volume reduction. The study results showed a reduction in tumor volume and altered expression of several cancer markers (p53, Bcl-xL, NF-κB, Bax) suggesting the potential effectiveness of this approach.

      Strengths:

      The approach shows potential effect on preventing tumor progression, from both the tumor size and the cancer biomarker expression levels bringing attention to the potential role of antibodies and B cell responses in cancer therapy.

      We greatly appreciate your positive feedback on our study.

      Weaknesses:

      These are some of the specific things that the author could consider to strengthen the evidence supporting the claims in their study.

      (1) The study fails to provide evidence of the specific effect of mCSCC-antibodies on mCSCC. The study utilized serum which also contains many immune response factors like cytokines that could contribute to tumor reduction. There is no information on serum centrifugation conditions, which makes it unclear whether immune components like antigen-specific T cells, activated NK cells, or other immune cells were removed from the serum. The study does not provide evidence of neutralizing antibodies through isolation, analysis of B cell responses, or efficacy testing against specific cancer epitopes. To affirm the specific antibodies' role in the observed immune response, isolating antibodies rather than employing whole serum could provide more conclusive evidence. Purifying the serum to isolate mCSCC-binding antibodies, such as through protein A purification, and ELISA would have been more useful to quantify the immune response. It would be interesting to investigate the types of epitopes targeted following direct tumor cell injection. A more thorough characterization of the antibodies, including B cell isolation and/or hybridoma techniques, would strengthen the claim.

      I am deeply appreciative of the reviewer's highly professional comments. Tumor development involves the coexistence of cancer cells at different developmental stages, each harboring a variety of known and unknown mutated proteins. These mutated proteins expose multiple known and unknown epitopes, each capable of stimulating the production of corresponding antibodies in healthy mice. Identifying all these antibodies presents a significant challenge. Current research methodologies, such as ELISA, WB, and ChIP, can only identify known antibodies based on existing antigens. A prerequisite for using these techniques is that both antigens and antibodies are identified. At present, there is no technology available to identify antibodies produced by an unknown mutated protein and epitope. However, I find the reviewer's comments insightful. Perhaps we can initially identify some known mCSCC-antibodies on mCSCC. However, studying the specific effect of these known mCSCC-antibodies on mCSCC is uncertain because we believe that tumor shrinkage results from the combined action of both known and unknown antibodies.

      We concur with the reviewer's observations regarding the use of serum, which is rich in immune response factors such as cytokines that could potentially contribute to tumor reduction. In our future research, we plan to systematically analyze the individual roles of these antibodies and cytokines in tumor reduction. In 1973, Nature published a report indicating that serum demonstrated promising results in tumor treatment (Immunotherapy of Cancer with Antibody in Rats. Nature 243, 492 (1973). https://doi.org/10.1038/243492b0). Since then, there have been scarcely any reports on serum therapy for tumors. The primary focus of our study is to evaluate the efficacy of serum therapy in treating tumors. We hypothesize that antibodies and cytokines form a complex interactive network, working in synergy to reduce tumors. Consequently, we believe that studying these antibodies and cytokines in isolation may not yield effective results.

      In this study, the methodology section outlines the process of serum preparation. It is important to note that serum is devoid of blood cells. I hypothesized that whole blood might have superior therapeutic effects compared to serum. This is because antibodies could potentially synergize with immune cells (including T cells, B cells, and NK cells), thereby enhancing the effectiveness of the treatment. As previously discussed, these antibodies, cytokines, and immune cells form a complex interactive network aimed at tumor reduction. Consequently, there are numerous factors that could influence the experimental outcomes, which presents a challenge for analyzing the results. Furthermore, the implementation of whole blood transfusion therapy introduces additional considerations, such as potential side effects and reactions associated with blood transfusions.

      We thank the reviewers for their suggestion to purify the serum in order to isolate mCSCC-binding antibodies. As we previously mentioned, separating a large number of both known and unknown serum antibodies presents a significant technical challenge. We are eager to discuss and consider suggestions from the reviewers regarding methods to identify a large variety and number of unknown antibodies on cells. Perhaps, as the reviewer suggested, we could begin with known antibodies and employ Protein A purification to isolate them and subsequently detect immune responses. We could also categorize the types of epitopes targeted following direct tumor cell injection and examine these epitopes in further studies. The suggestion to study the response of B cells is valuable, and we plan to conduct comprehensive research on the response and status of B cells in our future studies.

      The purification of antibodies to enhance the specificity of their effectiveness against tumors is a critical aspect of our study. However, we would like to address some concerns raised. (1) The separation of all antibodies and cytokines presents a significant technical challenge. In particular, there is a risk of overlooking antibodies that are present in low concentrations but play crucial roles. (2) Our primary concern is that studying these components in isolation could compromise the overall effectiveness and the holistic understanding of the study. This is akin to current research on traditional medicine, where the separation and individual study of compounds often result in a loss of overall therapeutic efficacy. For instance, consider a scenario where 100 antibodies collectively work to shrink a tumor. These antibodies interact with 20 cytokines, forming a complex network that enhances the cytokines' activity against tumor cells. Furthermore, many important antibodies and cytokines are currently unknown. Studying these antibodies in isolation could potentially result in the loss of this therapeutic effect. Therefore, in the discussion section, we have emphasized that our study considers a tumor mass, including tumor cells at various stages of development, as a single entity. As a practicing clinician, my primary focus is on the therapeutic outcomes of tumor treatment, even though the mechanisms of serum therapy remain largely elusive, much like a black box.

      (2) In the study design, the control group does not account for the potential immunostimulatory effects of serum injection itself. A better control would be tumor-bearing mice receiving serum from healthy non-mCSCC-exposed mice. Additionally, employing a completely random process for allocating the treatment groups would be preferable. Also, the study does not explain why intravenous injection of tumor cells would produce superior antibodies compared to those naturally generated in mCSCC-bearing mice.

      I concur with the reviewer's perspective that using serum from healthy, non-mCSCC exposed mice as a control could potentially improve our study. Initially, our primary concern was to minimize harm to the mice and avoid excessive blood reactions, which led us to exclude the use of serum from healthy, non-mCSCC exposed mice in our control group. The main objective of our study was to investigate tumor shrinkage through serum treatment, specifically serum-derived antibodies. We anticipated that tumor-bearing mice receiving serum from healthy, non-mCSCC exposed mice would exhibit a response to the injected serum, which would manifest as a blood reaction. However, we did not expect this to result in a tumor treatment effect. If it turns out that normal serum (from healthy, non-mCSCC-exposed mice) possesses tumor-reducing properties, it would indeed be a novel discovery. We appreciate the reviewer's insightful suggestion and will consider incorporating it into our future research.

      We concur with the reviewer's observations that the use of a completely random process for assigning treatment groups would be more desirable. Indeed, the complete randomization of the entire process further underscores the efficacy and universality of serum therapy. In this study, we utilized paired mice to mitigate the risk of cross-infection and adverse reactions associated with blood transfusions. We deeply value the reviewer's expert feedback.  
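
      For illustration only, a fully random allocation of the kind the reviewer describes could be sketched as follows. The animal identifiers, group names, group sizes, and seed are hypothetical and do not correspond to the paired design actually used in this study.

```python
import random


def randomize_groups(animal_ids, groups=("treatment", "control"), seed=None):
    """Shuffle the animals, then deal them round-robin into balanced groups."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    return {animal: groups[i % len(groups)] for i, animal in enumerate(shuffled)}


# Hypothetical cohort of 12 tumor-bearing mice.
mice = [f"mouse_{i:02d}" for i in range(1, 13)]
allocation = randomize_groups(mice, seed=2024)
for animal in mice:
    print(animal, allocation[animal])
```

      Recording the seed alongside the allocation lets the assignment be audited later, while the shuffle keeps group membership independent of cage order or tumor size.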

      Lastly, we explain why intravenous injection of tumor cells into healthy mice yields antibodies superior to those naturally generated in mCSCC-bearing mice. As tumor cells grow, they produce a variety of mutated proteins to adapt to the immune microenvironment and evade the immune system of mCSCC-bearing mice. However, these tumor cells with mutated proteins are readily recognized by healthy mice. This recognition triggers an immune response in healthy mice, leading to the production of specific therapeutic antibodies. This simultaneous production of diverse and abundant antibodies is only achievable by living organisms.

      (3) In Figure 2B, it would be more helpful if the author could provide raw data/figures of the tumor than just the bar graph. Similarly in Figure 3, the author should show individual data points in addition to the error bar to visualize the actual distribution.

      Raw data (numerical values) have been incorporated into Figures 2B and 3, placed in a table below each graph. Placing the values above the error bars would require a small font and might not be legible.

      (4) The author mentioned that different stages of tumor cells have different surface biomarkers. Therefore, experimenting with injecting tumor cells at various stages could reveal the most immunogenic stage. Such an approach would allow for a comparative analysis of immune responses elicited by tumor cells at different stages of development.

      Yes, throughout the course of tumor development, tumor cells at various stages will exhibit distinct markers or possess different mutated proteins. The concept of segregating tumor cells from different stages and independently comparing their immune responses is indeed commendable. Future research could involve isolating cells that express identical biomarkers at each stage for a comparative analysis of the immune responses triggered by the tumor cells. However, this approach diverges from the original intent of this study.

      Most tumor cells exist within the same developmental stage. However, this does not imply that all tumor cells within the tumor mass are at the same stage. For instance, a stage III liver cancer tumor may contain both stage I and stage IV tumor cells. Moreover, due to the complexity of tumor development, not all tumor cell surface markers are identical, even for tumors at the same stage. For instance, 20 major proteins and 100 minor proteins are implicated in tumor formation. In fact, random mutations in just 5 of these major proteins and 10 minor proteins can instigate the development of tumors. This implies that the protein pattern (tumor cell surface markers) associated with each individual's tumor is unique. While studying tumor cells at different stages separately allows for the observation of the immune response of tumor cells at each stage, it lacks a comprehensive research and treatment effect. For this reason, the design of this study treats a tumor mass as a whole, encompassing both the primary stage tumor cells and those not in that stage. These tumor cells are then injected to produce corresponding therapeutic antibodies. Furthermore, if tumor cells from only one stage are isolated and specific antibodies are produced against these cells, it could lead to immune escape of tumor cells at other stages, preventing the tumor from shrinking. Therefore, our approach aims to address this issue by considering the tumor mass as a whole.

      (5) In the abstract the author mentioned that using mCSCC is a proof-of-concept for this potential cancer treatment strategy. The discussion session should extend to how this strategy might apply to other cancer types beyond carcinoma.

      We have incorporated an additional paragraph in the discussion section where we delve into the concepts and experimental principles underpinning this study. This, we believe, addresses the reviewer's query regarding the applicability of our study's methodology to other types of tumors. The process for other tumors also involves isolating cells from the tumor, stimulating therapeutic antibody production in healthy mice using these cells, and ultimately reintroducing these antibodies into mice with tumors to facilitate tumor elimination.

      Recommendations For The Authors:

      The author is encouraged to refine the study's design in future studies considering the weaknesses highlighted above, summarize the results more effectively, and seek opportunities to expand on this promising idea and enhance the research's impact and applicability.

      We greatly appreciate the valuable suggestions provided by the editor and reviewers. These insights will certainly be addressed in our future research endeavors.

      Suggestions for title modification:

      Following the scope of the study, the term 'specific homologous neutralizing-antibodies' may be misleading as neutralizing antibodies typically refer to antibodies preventing viral cell entry. In cancer therapy, 'neutralization' is not a relevant concept, as cancer cells do not infect host cells. Using whole tumor cells as immunogens diverges from the specificity of traditional vaccination approaches that utilize well-defined proteins or antigens. Furthermore, the term "homologous" suggests a precision in targeting that is not demonstrated by reintroducing serum without isolating its specific components. Therapeutic effects should not be attributed to "neutralizing antibodies" without isolating or characterizing the antibody response or verifying their efficacy against specific cancer epitopes. Additionally, it is suggested that you indicate the biological system that your study utilised in the title. More so, this approach is not entirely novel, as seen with the use of adjuvants in some flu vaccines, or in Moderna's cancer vaccine mRNA-4157, which encodes up to 34 patient-specific tumor neoantigens. You can consider the title below or a variant of the same.

      Suggested title: Generating serum-based antibodies from tumor-exposed mice: a potential strategy in cutaneous squamous cell carcinoma treatment

      I concur with your suggestion and have modified the title to "Generating serum-based antibodies from tumor-exposed mice: a new potential strategy for cutaneous squamous cell carcinoma treatment". I believe this research remains somewhat new, hence the addition of the word "new". Furthermore, the term "novel" in the paper has been either removed or substituted.

      Moreover, I propose that this study shares similarities with Moderna's cancer vaccine mRNA-4157, albeit with certain differences. Moderna's cancer vaccine mRNA-4157 encodes up to 34 recognized neoantigens to stimulate an immune response by eliciting specific T cell responses. This is similar to the strategy of some companies developing a protein set for diagnosing lung cancer, liver cancer, among others. Without a doubt, these methods have improved the effectiveness of tumor diagnosis and treatment. However, I think that these methods currently face challenges in completely eradicating tumors because they treat tumors as static and tumor cells as expressing a fixed set of mutated proteins. I believe that small molecule antibodies, cytokines, and immune cells present in serum that are difficult to detect, have low concentrations, or are unknown are essential for maintaining the expression of important mutant proteins and the escape of tumor cells. This is also the primary reason why tumors are currently difficult to treat and prone to recurrence.

      From my perspective, different tumors, as well as different stages of the same tumor, express varying mutated proteins or surface markers. Targeting some may result in others escaping or even creating a more conducive growth environment for those that do escape. Our study adopts a comprehensive view of a tumor block, encompassing tumor cells at different stages and tumor cells at the same stage but expressing different biomarkers. This approach generates a multitude of known and unknown antibodies that work in concert with cytokines and immune cells. While our method may not be capable of generating all mutated proteins and epitope antibodies due to the weakness of some antigens (epitopes of mutated proteins), it can still be effective. As long as the number of tumor cells is reduced below a certain threshold following multiple rounds of treatment with various antibodies produced at different stages, these cancer cells can be eradicated by the body's immune system. This is a process that is real-time and dynamic. Undoubtedly, if it becomes evident that alterations in a set of proteins can bolster the immune system and eradicate tumor cells, then the implications are significant. The immunotherapy proteins, which have demonstrated positive therapeutic effects, developed by certain companies are also predicated on this very principle.

      Finally, I greatly appreciate your suggestions, which will be considered and gradually addressed in future research.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review): 

      In the manuscript "Mechanistic target of rapamycin (mTOR) pathway in Sertoli cells regulates age-dependent changes in sperm DNA methylation", the authors proposed to test if the balance of mTOR complexes in Sertoli cells may play a significant role in age-dependent changes in the sperm epigenome. The paper could be of interest and has a good scientific aim but there are too many drawbacks that hamper the initial enthusiasm. All sections need extensive revision. The paper is mostly descriptive without a mechanistic-orientated explanation for the observed results. 

      Comments on revised version: 

      I am not sure that the authors have made an attempt to clearly answer the reviewers comments that aimed to improve the quality of the manuscript. It stands as mostly descriptive and with limited interest as it is. 

      We are thankful to the reviewer for agreeing to review our revised manuscript. Unfortunately, we completely disagree with the evaluation provided by the reviewer. Research on sperm DNA methylation experienced a significant rise of interest in the current century and by now more than 2000 papers have been published. Although it was demonstrated that the sperm DNA methylome may be affected by almost every factor analyzed, no study was published to identify molecular mechanisms that may link these factors with the sperm epigenome. Our study is the FIRST to identify such a mechanism (the balance of mTOR complexes in Sertoli cells). Moreover, we demonstrated experimentally that manipulations of this mechanism allow regulation of the rates of epigenetic aging of sperm in both directions (accelerate aging or rejuvenate). Thus, our study provides a mechanistic background for the development of therapeutic interventions that may target the sperm epigenome.

      We acknowledge that our study does not provide the full cascade of events linking the balance of mTOR complexes in Sertoli cells with the sperm DNA methylome. It suggests, however, the most plausible event next in a cascade (BTB permeability changes). Our group is working on this question now and we hope to provide the answer soon in a separate study. Even after that, we will be far from understanding the complete chain of molecular events that link mTOR and sperm methylome. It may take many years and significant effort of many research groups to dissect the whole cascade. It is worth mentioning that understanding of a complete cascade involved in pathology is not needed to develop efficient therapies if the critical nodes are known. For many common drugs (e.g. metformin) we do not know the full chain of molecular mechanisms but use them successfully.

      Thus, we believe that our study is mechanistic as it identified a critical mechanism manipulation of which allows experimental aging and rejuvenation of the sperm methylome. Additionally, it generates new mechanistic questions and hypotheses to be answered in the future.

      Reviewer #3 (Public Review): 

      Summary and Strength: 

      The manuscript by Amir et al. describes that Sertoli-specific inactivation of the mTORC1 and mTORC2 complex by KO of either Raptor or Rictor, respectively, resulted in progressive changes in blood-testis-barrier (BTB) function, testis weight, and sperm parameters, including counts, morphology, mtDNA content and sperm DNA methylation. 

      The described studies are based on the hypothesis that a decline of BTB function with increasing chronological age of a male contributes to the DNA methylation changes that are known to occur in sperm DNA of old males when compared to sperm DNA from isogenic young males. In order to demonstrate the relevance of a functioning BTB for the maintenance of sperm methylation patterns, the authors generated mice with genetically disrupted mTORC2 complex or mTORC1 complex in Sertoli cells and determined sperm methylation patterns in comparison to isogenic wild-type males. In line with previously published scientific literature (e.g. Mok et al., 2013; Dong et al, 2015; and others), the manuscript corroborates that a Sertoli-cell specific deletion of mTORC2 caused a loss of BTB function and a progressive spermatogenic defect. The authors further show that sperm DNA is differentially methylated (DMRs) as a consequence of either a mTORC2 disruption (associated with a loss of BTB function) or following a mTORC1 disruption (BTB function either increased or not leaky) when compared to their isogenic age-matched wt controls. Those DMRs overlap partially with changes in sperm DNA methylation that were found when comparing sperm from 8-week males with sperm isolated from 22-week-old male mice. 

      The authors interpret the observed changes as representative of the sperm DNA methylation changes that occur during normal chronological aging of the male. For an aged control group, the authors use sperm DNA of 22-week-old wild-type mates from the mTORC1 and mTORC2 KO breeding and compare the sperm methylation patterns found in sperm from those 22-week males to 8-week young males, that are intended to represent an old and a young cohort, respectively. DNA methylation analysis indicates that a disruption of mTORC2 (and decrease of BTB function) results in increased DNA methylation of sperm DNA, while a disruption of mTORC1 (and proposed increase of BTB tightness, not shown in the manuscript, though) resulted in increased hypomethylation.

      Weaknesses: 

      While the hypothesis and experimental system are interesting and the data demonstrating the relevance of the mTORC2 complex for BTB function is convincing, several open questions limit the evidence that supports the hypothesis that the sperm DNA methylation changes seen in old males are caused by BTB failure following an imbalance of mTOR signaling complexes. The major critique points are the lack of a chronologically old group and the choice of 8 weeks and 22 weeks of age:

      - Data illustrating the degree of BTB decline and sperm DNA methylation changes from chronologically "old" male mice is missing. 22-week-old mice are not considered old but are of good and mature breeding age, equivalent to humans in their mid-late twenties. (In the manuscript, the 22-week-old wildtype mice show no evidence of BTB breakdown (Figure 3), so why are their sperm used to represent "aged" sperm?)

      - Adding a group of "old" wild-type mice of 12-14 months of age (which is closer to the end of effective reproduction in mice and more equivalent to 45-59 year-old humans) could be used to illustrate that (a) aging causes a marked decrease in BTB function at this time in mouse life, and (b) this BTB breakdown chronologically aligns with the age-associated DNA hypermethylation seen in old sperm. Age-matched "old" mTORC1 KO, with a (supposedly) tighter BTB barrier, could then be expected to have a sperm DNA methylation profile closer to that of younger wild-type animals. Such data are currently missing. While the progressive testicular decline observed in the mTORC1 KO (Fig.5) could make it difficult to obtain the appropriately aged mTORC1 KO tissues, it is completely feasible to obtain data from chronologically old wild-type males. (The progressive testicular decline further raises the question of what additional defects the KO causes, and how such additional defects would influence the sperm DNA methylation profile.) The addition of data from an old group to the currently included groups could strengthen the interpretation that the observations in the BTB-defective mTORC2 KO mice are modelling an age-related testicular decline, provided that the DMRs seen in the chronologically old group significantly overlap with the BTB-defective changes.

      - In the current form, the described differences in sperm DNA methylation are based on comparisons between pubertal mice (8 weeks) and mature but not old adult males (22 weeks), while a chronologically "old" group is missing from the data sets and comparisons. Thus, it appears that the described sperm methylation changes reflect developmental changes associated with normal maturation and not necessarily declining sperm quality due to aging. (Sperm obtained from 8-week-old mice likely were generated, at least in part, during the 1st wave of spermatogenesis, which is known to differ from the continuously proceeding spermatogenesis during the remainder of the mature life. During the 1st wave of spermatogenesis, Sertoli cells are known to undergo gene expression changes which could contribute to varying degrees of BTB function, and thus have effects on the sperm DNA methylation profiles of such 1st wave sperm.)

      - It is unclear why the aging-related DMRs between the 8 and 22-week-old wild-type mice vary so dramatically between the two wild-type groups derived from the mTORC1 and the mTORC2 breeding (Fig. S4). If the main difference was due to mTORC1 or mTORC2 activity, both wildtype groups should behave very similarly. Changes seen in a truly "old" mouse (e.g. 20 weeks to 56 weeks), changes in "young mTORC1" and in "old mTORC2" are missing.

      How do those numbers and profiles compare to the shown samples? 

      Comments on latest version: 

      The rebuttal letter and public response indicate the authors' reluctance to consider the limitations of their study, i.e. having chosen chronologically young animals to demonstrate a sperm aging effect and indicate that they are not willing to include adequate controls. 

      Since there is no evidence that mice at this young age have a deteriorating blood-testis-barrier (indeed, normal intact BTB is clearly visible in the figures included in this study from animals of the relevant age group), the whole central hypothesis that the study is built upon (i.e. that increasing age causes deteriorating BTB integrity which in turn causes age-related changes in sperm DNA methylation), appears irrelevant or invalid. 

      The authors' claim that age-related DNA methylation changes in sperm occur in a linear fashion and that the changes are somewhat proportional to chronological age is in stark contrast to the claim that a decline of the BTB in old animals is causative for age-related sperm epigenetic changes, putting the relevance of the whole study in question.

      We are thankful to the reviewer for agreeing to review our revised manuscript. We disagree with the evaluation provided by the reviewer, however.

      First, the reviewer misinterpreted the hypothesis of the study, although it is formulated in the last sentence of the Introduction:  “ … we hypothesized that the balance of mTOR complexes in Sertoli cells may also play a significant role in age-dependent changes in the sperm epigenome.” Instead, the reviewer assigned a different hypothesis to our study (that BTB integrity changes are responsible for age-dependent changes in sperm DNA methylation) and criticized us for not providing clear testing of this hypothesis.

      To clarify, we believe that our study provides high-quality testing of OUR hypothesis, as we demonstrated experimentally that manipulating the balance of mTOR complexes in Sertoli cells allows acceleration and deceleration of epigenetic aging of sperm. Additionally, our study generated a hypothesis that BTB permeability may mediate the effects of the mTOR pathway on the sperm methylome. This second hypothesis is to be tested in future research.

      We also disagree with the reviewer's interpretation of the aging process as an abrupt transition from a young, healthy, and undamaged state to an old, moribund, and damaged state. The whole body of biogerontological knowledge suggests instead a steady accumulation of damage over long periods of time. For example, this understanding of steady change at the molecular level allowed the development and successful use of epigenetic clocks and other molecular clock models, including several variants of sperm epigenetic clocks. These models clearly demonstrate linear or semi-linear accumulation of DNA methylation changes in various tissues and biological species across the whole lifespan. It is reasonable to assume that BTB integrity decreases steadily with age as well and that in younger animals this decrease may not be easily detected by the existing analytical methods. To our knowledge, experimental data showing the dynamics of BTB deterioration with age do not exist, although it has been demonstrated that older animals have a looser BTB than young animals. We agree with the reviewer that future studies testing the role of BTB deterioration in sperm methylome aging will need to provide such evidence. It was not the subject of the current study, however.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In the manuscript "Mechanistic target of rapamycin (mTOR) pathway in Sertoli cells regulates age-dependent changes in sperm DNA methylation", the authors proposed to test if the balance of mTOR complexes in Sertoli cells may play a significant role in age-dependent changes in the sperm epigenome. The paper could be of interest and has a good scientific aim but there are too many drawbacks that hamper the initial enthusiasm. All sections need extensive revision. The paper is mostly descriptive without a mechanistic-orientated explanation for the observed results.

      Specific comments:

      (1) The abstract is poorly written. There is a lot of unnecessary introduction that does not provide a rationale for the work. It is not possible to understand the experimental approach or the major data just by reading the abstract. It does not clearly represent the work.

      - We have added details of experimental design and results to the abstract and reduced the introductory part of the abstract.

      (2) The introduction is somewhat vague and does not provide a clear rationale for the hypothesis. There should be more focus on the role of mTOR in Sertoli cells that goes far beyond BTB. That will give more focus on mTOR. Then it is important to focus on BTB and mTOR: what is known? What is the gap and how can it be solved? Several relevant references are missed concerning mTOR and Sertoli cells.

      - The goal of this study was not to explore all potential roles of the mTOR pathway in Sertoli cells, but to test if shifts in the balance of mTOR complexes regulate (accelerate/decelerate) epigenetic aging of sperm. As such, we disagree with the reviewer and consider that the current Introduction provides a focused rationale for the study.

      (3) The Material and Methods section needs improvement. There is much important information missing. For instance: how many animals were used per group and how was the breeding done? At what age? Statistical analysis should be explained in detail.

      - The number of animals was clearly stated in the original manuscript. We have added details of breeding and statistical analysis. 

      (4) The results description could be improved. It is vague without highlighting how much difference was detected. The results should be numerically described when possible and the differences should be highlighted. A 10% difference may be significant but not biologically relevant. To correctly evaluate the differences it is important to describe them with some degree of detail.

      - For all DNA methylation experiments we provide numerical characteristics of methylation changes, including numbers of DMRs, % change, significance, correlation coefficients. We believe that only age- and genotype-associated changes in reproductive parameters were not characterized in our manuscript in detail. We have added Table 1 to provide these numbers.

      (5) There is no discussion of the data. The authors just summarize their findings without a comprehensive analysis of the literature and how the effects can be mediated. mTOR interacts with different pathways (mTORC1 and mTORC2 are even mediators of distinct pathways). This would be very relevant to discuss. In addition, there are many study limitations not discussed. There is no clear mechanistic explanation of the way by which the mTOR pathway in Sertoli cells regulates age-dependent changes in sperm DNA methylation. The paper seems preliminary.

      - We have added an additional paragraph to the discussion to highlight a potential molecular mechanism that links mTOR pathway with the sperm epigenome.

      (6) Figure 1 is too simple and does not provide any schematic support for the text.

      - We disagree with the reviewer and believe that the figure represents a good visualization of our hypothesis useful for the perception of the study.

      (7) Figure 2 lacks some detail. For instance, how many animals were used for each step?

      - Numbers of animals are provided in the text of the paper.

      (8) Taking into consideration the roles of mTOR on sperm, particularly mTORC1, it is not clear whether there were any differences in sperm motility.

      - We did not assess sperm motility in this study. 

      Reviewer #2 (Public Review):

      In this study, the authors hypothesized that the balance of mTOR complexes in Sertoli cells may also play a significant role in age-dependent changes in the sperm epigenome. To test this hypothesis, the authors use transgenic mice with manipulated activity of mTOR complexes in Sertoli cells. These results suggest that the mTOR pathway in Sertoli cells may be used as a novel target of therapeutic interventions to rejuvenate the sperm epigenome in advanced-age fathers.

      The authors attempt to demonstrate that the balance of mTOR complexes in Sertoli cells regulates the rate of sperm epigenetic aging. The authors have effectively met their research objectives, and their conclusions are supported by the data presented.

      - We are very thankful for the positive evaluation of our study.

      Reviewer #3 (Public Review):

      Summary and Strength:

      The manuscript by Amir et al. describes that Sertoli-specific inactivation of the mTORC1 and mTORC2 complex by KO of either Raptor or Rictor, respectively, resulted in progressive changes in blood-testis-barrier (BTB) function, testis weight, and sperm parameters, including counts, morphology, mtDNA content and sperm DNA methylation.

      The described studies are based on the hypothesis that a decline of BTB function with increasing chronological age of a male contributes to the DNA methylation changes that are known to occur in sperm DNA of old males when compared to sperm DNA from isogenic young males. In order to demonstrate the relevance of a functioning BTB for the maintenance of sperm methylation patterns, the authors generated mice with genetically disrupted mTORC2 complex or mTORC1 complex in Sertoli cells and determined sperm methylation patterns in comparison to isogenic wild-type males. In line with previously published scientific literature (e.g. Mok et al., 2013; Dong et al, 2015; and others), the manuscript corroborates that a Sertoli-cell specific deletion of mTORC2 caused a loss of BTB function and a progressive spermatogenic defect. The authors further show that sperm DNA is differentially methylated (DMRs) as a consequence of either a mTORC2 disruption (associated with a loss of BTB function) or following a mTORC1 disruption (BTB function either increased or not leaky) when compared to their isogenic age-matched wt controls. Those DMRs overlap partially with changes in sperm DNA methylation that were found when comparing sperm from 8-week males with sperm isolated from 22-week-old male mice.

      The authors interpret the observed changes as representative of the sperm DNA methylation changes that occur during normal chronological aging of the male. For an aged control group, the authors use sperm DNA of 22-week-old wild-type mates from the mTORC1 and mTORC2 KO breeding and compare the sperm methylation patterns found in sperm from those 22-week-old males to those of 8-week-old males; the two groups are intended to represent an old and a young cohort, respectively. DNA methylation analysis indicates that a disruption of mTORC2 (and decrease of BTB function) results in increased DNA methylation of sperm DNA, while a disruption of mTORC1 (and a proposed increase of BTB tightness, which is not shown in the manuscript) resulted in increased hypomethylation.

      Weaknesses:

      While the hypothesis and experimental system are interesting and the data demonstrating the relevance of the mTORC2 complex for BTB function is convincing, several open questions limit the evidence that supports the hypothesis that the sperm DNA methylation changes seen in old males are caused by BTB failure following an imbalance of mTOR signaling complexes. The major critique points are the lack of a chronologically old group and the choice of 8 weeks and 22 weeks of age:

      - Data illustrating the degree of BTB decline and sperm DNA methylation changes from chronologically "old" male mice is missing. 22-week-old mice are not considered old but are of good and mature breeding age, equivalent to humans in their mid-late twenties. In the manuscript, the 22-week-old wildtype mice show no evidence of BTB breakdown (Figure 3), so why are their sperm used to represent "aged" sperm?

      - Adding a group of "old" wild-type mice of 12-14 months of age (which is closer to the end of effective reproduction in mice and more equivalent to 45-59 year-old humans) could be used to illustrate that (a) aging causes a marked decrease in BTB function at this time in mouse life, and (b) this BTB breakdown chronologically aligns with the age-associated DNA hypermethylation seen in old sperm. Age-matched "old" mTORC1 KO, with a (supposedly) tighter BTB barrier, could then be expected to have a sperm DNA methylation profile closer to that of younger wild-type animals. Such data are currently missing. While the progressive testicular decline observed in the mTORC1 KO (Fig.5) could make it difficult to obtain the appropriately aged mTORC1 KO tissues, it is completely feasible to obtain data from chronologically old wild-type males. (The progressive testicular decline further raises the question of what additional defects the KO causes, and how such additional defects would influence the sperm DNA methylation profile.) The addition of data from an old group to the currently included groups could strengthen the interpretation that the observations in the BTB-defective mTORC2 KO mice are modelling an age-related testicular decline, provided that the DMRs seen in the chronologically old group significantly overlap with the BTB-defective changes.

      - In the current form, the described differences in sperm DNA methylation are based on comparisons between pubertal mice (8 weeks) and mature but not old adult males (22 weeks), while a chronologically "old" group is missing from the data sets and comparisons. Thus, it appears that the described sperm methylation changes reflect developmental changes associated with normal maturation and not necessarily declining sperm quality due to aging. (Sperm obtained from 8-week-old mice likely were generated, at least in part, during the 1st wave of spermatogenesis, which is known to differ from the continuously proceeding spermatogenesis during the remainder of the mature life. During the 1st wave of spermatogenesis, Sertoli cells are known to undergo gene expression changes which could contribute to varying degrees of BTB function, and thus have effects on the sperm DNA methylation profiles of such 1st wave sperm.)

      - It is unclear why the aging-related DMRs between the 8 and 22-week-old wild-type mice vary so dramatically between the two wild-type groups derived from the mTORC1 and the mTORC2 breeding (Fig. S4). If the main difference was due to mTORC1 or mTORC2 activity, both wildtype groups should behave very similarly. Changes seen in a truly "old" mouse (e.g. 20 weeks to 56 weeks), changes in "young mTORC1" and in "old mTORC2" are missing. How do those numbers and profiles compare to the shown samples?

      Some general comments regarding the chosen age of animals:

      - As mentioned, sperm from 8-week-old mice represent many sperm that were produced in the 1st wave of spermatogenesis; 22-week-old mice are not considered chronologically old mice, but mature and "relatively" young animals. 18-24 month-old mice are considered to be equivalent to 56-69 year-old humans, and might be more suitable to detect aging effects. "Old mice" for study purposes should be at least 12-14 months of age, ideally >18 months of age. 22 weeks (5 months of age) are mice at good breeding age, but still considered mature adults, not old males, and therefore are not expected to show typical aging health problems (like declining fertility).

      Even the cited reference (Flurkey et al. 2007) specifies that mice used as a reference group for "young mice" should be at least 3 months of age (~13 weeks), i.e. fully sexually mature. The authors specifically state: "The young adult group should be at least 3 months old because, although mice are sexually mature by 35 days, relatively rapid maturational growth continues for most biologic processes and structures until about 3 months. The upper age range for the young adult group is typically about 6 months. ... For the middle-aged group, 10 months is typically the lower limit.... The upper age limit for the middle-aged group is typically 14-15 months, because at this age, most biomarkers still have not changed to their full extent, and some have not yet started changing. For the old group, the lower age limit is 18 months because age-related change for almost all biomarkers of aging can be detected by then. The upper limit is 22-26 months, depending on the genotype." According to this reference, mice up to 6 months of age are generally considered "mature adults" (equivalent to humans of 20-30 yrs), mice of 10-14 months are "middle-aged adults" (equivalent to ~38-47 human years) and 18-24 month mice are "old" (equivalent to humans of 56-69 yrs).

      Going on these commonly used age ranges, it is unclear why the authors used 8-week-old mice (generally considered pubertal to late adolescent age) as young mice and 5-month-old mice as "old mice".

      Differences seen between these cohorts most likely do not reflect aging, but more likely reflect changes associated with normal developmental maturation, since testis and epididymides continue to grow until about 10-11 weeks of age.

      - The DMRs identified between 8 and 22-week-old animals could represent DMRs that are dependent on developmental maturation more than being changed in an "age-dependent" manner (in the sense of increased chronological age). This interpretation is congruent with the fact that those DMRs are enriched for developmental categories.

      - We are thankful to the reviewer for a detailed explanation of their disagreement with the ages of mice used in this study. In short, the reviewer suggests that our older group (22 weeks) is not old enough to represent aged animals and our young group (8 weeks) may still have spermatozoa from the first wave of spermatogenesis, and as such the observed differences between the 2 ages cannot be considered as aging-related but rather may represent different stages of maturation of the reproductive system. At first glance this criticism looks valid.

      However, to design our experiments we used our own data that were not initially included in this manuscript. These data demonstrated that age-dependent changes in sperm DNA methylation are linearly or semi-linearly associated with age in the age range from 56 to 334 days. Thus, within this interval any 2 ages that are distant enough to register a difference in DNA methylation can be used to assess age-dependent changes in DNA methylation and changes in the rates of epigenetic aging of sperm in response to genetic manipulations. We have now added these results; see the “Identification of age-dependent patterns in sperm DNA methylation” section in Material and Methods and “Patterns of age-dependent changes in sperm DNA methylation” in Results. We also consider the reviewer’s suggestion that sperm from 8-week-old mice represents the first wave of spermatogenesis to be unfounded. Indeed, C57BL/6 mice first have fertile sperm in the cauda epididymis at 37 days of age [1], 19 days earlier than the age of 56 days (8 weeks) at which sperm was collected in our study in the youngest group of mice. Given that young C57BL/6 mice ejaculate spontaneously around 3 times per 5 days [2], 8-week-old mice have ejaculated > 10 times since the first wave of spermatogenesis before the sperm was collected for our study, making the chances of survival of any first-wave sperm in their cauda epididymides to the age of 8 weeks negligibly small. We have added this information to the text.

      (1) Mochida, K.; Hasegawa, A.; Ogonuki, N.; Inoue, K.; Ogura, A. Early Production of Offspring by in Vitro Fertilization Using First-Wave Spermatozoa from Prepubertal Male Mice. J. Reprod. Dev. 2019, 65, 467–473, doi:10.1262/jrd.2019-042.

      (2) Huber, M.H.; Bronson, F.H.; Desjardins, C. Sexual Activity of Aged Male Mice: Correlation with Level of Arousal, Physical Endurance, Pathological Status, and Ejaculatory Capacity. Biol. Reprod. 1980, 23, 305–316, doi:10.1095/biolreprod23.2.305.

    1. Author response:

      We thank the editors and reviewers for their enthusiasm for this work and helpful suggestions. In summary, the reviewers provided suggestions for additional discussion items and clarifications for the text and figures, especially in relation to the cryo-EM structures and suppressor screen sections of the manuscript. We will consider each of these and make edits as needed. In particular, reviewers asked for further details about the structural model in addition to analysis of our new structure with respect to previously reported intron lariat spliceosome (ILS) complexes. For the latter point, we present additional evidence for the correct assignment of Yju2 in the S. cerevisiae ILS structure and note that docking of the 3’ splice site is not observed in any ILS structure from yeast, worms, or humans. This is consistent with our proposed mechanism. We will clarify these points in the text as well as highlight some caveats of prior studies of the ILS complex. We feel that these changes will add nuance to the manuscript and clarify the findings and their context and significance for the reader.

    1. Author response:

      We would like to thank all reviewers for their valuable comments that help us to improve our manuscript. We will make the following modifications in the revised manuscript:

      (1) To reduce the complexity of the experiments we carried out, we will summarize trimeric G proteins in Ciona in the first paragraph of the Results section and explain how we focused on Gas and Gaq in the initial phase of this study.

      (2) As reviewer 1 suggested, the polymodal roles of papilla neurons are interesting. We will add a discussion regarding this aspect. The sentences will read as follows:

      “The recent study (Hoyer et al., 2024) provided several lines of evidence suggesting that papilla neurons can serve as the sensors of several chemicals in addition to the mechanical stimuli. This finding and our model seem mutually related because these chemicals could modify Ca2+ and cAMP signaling. The use of G protein signaling may allow Ciona to reflect various environmental stimuli to initiate metamorphosis in the appropriate situation, both mechanically and chemically.”

      (3) As both reviewers suggested, imaging cAMP on the backgrounds of some G protein knockdowns and pharmacological treatments is important, and we will carry out some of these experiments.

      (4) According to reviewer 2's comment, we will carefully modify the text about interpreting the results so that the descriptions suitably reflect the results.

    1. Author response:

      Response to reviewers (Public review):

      We thank all the three reviewers for their opinion on our work on Candida albicans β-1,6-glucan, which highlights the importance of this cell wall component in the biology of fungi. Here are our responses to their comments for public reviews:

      (1) Indeed, the data presented for immunological studies is preliminary. The reviewers have acknowledged that our analysis, which provides insights into the biosynthetic pathways involved, is comprehensive in dealing with the organization and dynamics of the β-1,6-glucan polymer in relation to other cell wall components and environmental conditions (temperature, stress, nutrient availability, etc.). However, we anticipated that there would be immediate curiosity as to what the immunological contribution of β-1,6-glucan is, and we therefore felt we needed to initiate these studies and include them. We therefore performed immunological studies to assess whether β-1,6-glucan acts as a pathogen-associated molecular pattern (PAMP), and if so, what its immunostimulatory potential is. Our data clearly suggest that β-1,6-glucan is a PAMP, and consequently lead to several questions: (a) what are the host immune receptors involved in the recognition of this polysaccharide, and thereby the downstream signaling pathways, (b) how is β-1,6-glucan differentially recognized by the host when C. albicans switches from a commensal to an opportunistic pathogen, and (c) how does the host environment impact the exposure of this polysaccharide on the fungal surface. We believe addressing these questions is beyond the scope of the present manuscript and aim to present new data in a future manuscript. Nonetheless, in the revised manuscript, we suggest approaches that we can take to identify the receptor that could be involved in the recognition of β-1,6-glucan. Moreover, we have modified the discussion, presenting it based on the data rather than being descriptive.

      (2) It will be interesting to assess the organization of β-1,6-glucan and other cell wall components in the opaque cells. It is documented that the opaque cells are induced at acidic pH and in the presence of N-acetylglucosamine and CO2. Our data shows that pH has an impact on β-1,6-glucan, which suggests that there will be differential organization of this polysaccharide in the cell wall of opaque cells. As suggested by the reviewer, we will include analysis of opaque cells (and other C. albicans cell types) in future studies.

      With the exception of these major new avenues for this research, our revision can address each of the comments provided by the reviewers.

    1. Author response:

      Reviewer #1 (Public Review): 

      Summary: 

      In this study, Masroor Ahmad Paddar and his/her colleagues explore the noncanonical roles of ATG5 and membrane ATG8ylation in regulating retromer assembly and function. They begin by examining the interactomes of ATG5 and expand the scope of these effects to include homeostatic responses to membrane stress and damage. 

      Strengths: 

      This study provides novel insights into the noncanonical function of ATG8ylation in endosomal cargo sorting process. 

      Weaknesses: 

      The direct mechanism by which ATG8ylation regulates the retromer remains unsolved. 

      We agree with the reviewer. We do, however, show how at least one aspect of ATG8ylation contributes to proper retromer function, which occurs via lysosomal membrane maintenance and repair. Understanding the more direct effects on the retromer will require a separate study. We will emphasize this in the revised manuscript and point out the limitations of the present work.

      Reviewer #2 (Public Review): 

      Summary:

      Paddar et al. demonstrate that ATG5 mediates lysosomal repair via the recruitment of the retromer components during LLOMe-induced lysosomal damage and that mAtg8-ylation contributes to retromer-dependent cargo sorting of GLUT1. Although previous studies have suggested that during glucose withdrawal, classical autophagy contributes to retromer-dependent GLUT1 surface trafficking via interactions between LC3A and TBC1D5, the experiments here demonstrate that during basal conditions or lysosomal damage, ATGs that are not involved in mATG8ylation, such as FIP200, are not functionally required for retromer-dependent sorting of GLUT1. Overall, these studies suggest a unique role for ATG5 in the control of retromer function, and that conjugation of ATG8 to single membranes (CASM) is a partial contributor to these phenotypes.

      Strengths: 

      (1) Overall, these studies suggest a unique non-autophagic role for ATG5 in the control of retromer function. They also demonstrate that conjugation of ATG8 to single membranes (CASM) is a partial contributor to these phenotypes. Overall, these data point to a new role for ATG5 and CASM-dependent mATG8ylation in lysosomal membrane repair and trafficking. 

      (2) Although the studies are overall supportive of the proposed model that the retromer is controlled by CASM-dependent mATG8-ylation, it is noteworthy that previous studies of GLUT1 trafficking during glucose withdrawal (Roy et al. Mol Cell, PMID: 28602638) were predominantly conducted in cells lacking ATG5 or ATG7, which would not be able to discriminate between a CASM-dependent vs. canonical autophagy-dependent pathway in the control of GLUT1 sorting. Is the lack of GLUT1 mis-sorting to lysosomes observed in FIP200 and ATG13KO cells also observed during glucose withdrawal? Notably, deficiencies in glycolysis and glucose-dependent growth have been reported in FIP200 deficient fibroblasts (Wei et al. G&D, PMID: 21764854) so there may be differences in regulation dependent on the stress imposed on a cell.

      We thank the reviewer for the overall assessment of the strengths of the study.

      We have discussed in the manuscript the elegant study by Roy et al., PMID 28602683. To accommodate the reviewer’s comment, we will additionally emphasize in the text that our study is focused on basal conditions and conditions that perturb endolysosomal compartments. We agree with the reviewer that under metabolic stress conditions (such as glucose limitation) more complex pathways may be engaged and will acknowledge that in the discussion.

      Weaknesses: 

      (1) Additional controls are needed to clarify the role of CASM in the control of retromer function. Because the manuscript proposes both CASM-dependent and independent pathways in the ATG5-mediated regulation of the retromer, it is important to provide robust evidence that CASM is required for retromer-dependent GLUT1 sorting to the plasma membrane vs. lysosome. The experiments with monensin in Fig. 7C-E are consistent with but not unequivocally corroborative of a role for CASM.

      We fully agree with the reviewer. In fact, our data with bafilomycin A1 treatment causing GLUT1 mis-sorting (manuscript line 317) show that it is the perturbation of lysosomes and not CASM per se that leads to mis-sorting of GLUT1 (Fig. 7D,E). Note that it has been shown (PMIDs: 28296541, 25484071 and 37796195) that although bafilomycin A1 deacidifies lysosomes, it does not induce but instead inhibits CASM. This is because bafilomycin A1 causes dissociation of the V1 and V0 sectors of V-ATPase, unlike other CASM-inducing agents, which promote V1-V0 association. Complementing this, our data with ATG2AB DKO and ESCRT VPS37A KO (Fig. 8A-F) indicate that the repair of lysosomes is important to keep the retromer machinery functional (as illustrated in Fig. 8G). This may be one of the effector mechanisms downstream of membrane atg8ylation in general and hence also downstream of CASM. We will revise Fig. 7 title to read “Lysosomal damage causes GLUT1 mis-sorting” and will explain these relationships in the text.

      Based on the results shown with ATG16KO in Fig 4A-D, rescue experiments of these 16KO cells with WT vs. C-terminal WD40 mutant versions of ATG16 will specifically assess the requirement for CASM and potentially provide more rigorous support for the conclusions drawn. 

      We will carry out the experiment proposed by the reviewer for the planned revision.

      (2) Also, the role of TBC1D5 should be further clarified. In Fig S7, are there any changes in the interactions between TBC1D5 and VPS35 in response to LLOMe or other agents utilized to induce CASM?

      We thank the reviewer for pointing this out. We do have data with VPS35 in co-IPs shown in Fig. S7. There is no change in the amounts of VPS35 or TBC1D5 in GFP-LC3A co-IPs. We will include a graph with quantification in the revised manuscript and emphasize this point.

      Does TBC1D5 loss-of-function modulate the numbers of GLUT1 and Gal3 puncta observed in ATG5 deficient cells in response to LLOMe? 

      We agree that TBC1D5 is an interesting aspect. However, because TBC1D5 does not change its interactions in the experiments in our study, we consider this topic (i.e. whether TBC1D5 phenocopies VPS35 and ATG5 KOs in its effects on Gal3) to be beyond the scope of the present work. We underscore that LLOMe (lysosomal damage) mis-sorts GLUT1 even without any genetic intervention (e.g., in WT cells in the absence of ATG5 KO; Fig. 7). Thus, in our opinion the effects of TBC1D5 inactivation may be a moot point.

      (3) Finally, the studies here are motivated by experiments in Fig. S1 (as well as other studies from the Deretic and Stallings labs) suggesting unique autophagy-independent functions for ATG5 in myeloid cells and neutrophils in susceptibility to Mycobacterium tuberculosis infection. However, it is curious that no attempt is made to relate the mechanistic data regarding the retromer or GLUT1 receptor mis-sorting back to the infectious models. Do myeloid cells or neutrophils lacking ATG5 have deficiencies in glucose uptake or GLUT1 cell surface levels? 

      The reviewer’s point is well taken. Glucose uptake, its metabolism, and diabetes underlie the resurgence of TB in certain populations and are important factors in a range of other diseases. This was alluded to in our discussion (lines 461-469). However, these are complex topics for future studies. We will expand this section of the discussion.

      Reviewer #3 (Public Review): 

      In this manuscript, Padder et al. used APEX2 proximity labeling to find an interaction between ATG5 and the core components of the Retromer complex, VPS26, VPS29, and VPS35. Further studies revealed that ATG5 KO inhibited the trafficking of GLUT1 to the plasma membrane. They also found that other autophagy genes involved in membrane atg8ylation affected GLUT1 sorting. However, knocking out other essential autophagy genes such as ATG13 and FIP200 did not affect GLUT1 sorting. These findings suggest that ATG5 participates in the function of the Retromer in a noncanonical autophagy manner. Overall, the methods and techniques employed by the authors largely support their conclusions. These findings are intriguing and significant, enriching our understanding of the non-autophagic functions of autophagy proteins and the sorting of GLUT1. Nevertheless, there are several issues that the authors need to address to further clarify their conclusions. 

      (1) The authors confirmed the interaction between Atg5 and the Retromer complex through Co-IP experiments. Is the interaction between Atg5 and the Retromer direct? If it is direct, which Retromer complex protein regulates the interaction with Atg5? Additionally, does ATG5 K130R mutant enhance its interaction with the Retromer? 

      AlphaFold modeling in the initial submission of our study to eLife (absent from the current version) suggested the possibility of a direct interaction between ATG5 and VPS35 with ATG12—ATG5 complex facing outwards, in which case K130R would not matter. However, mutational experiments in putative contact residues did not alter association in co-IPs. So either ATG5 interacts with other retromer subunits or more likely is in a larger protein complex containing retromer. It will take a separate study to dissect associations and find direct interaction partners. We can provide our data on the currently available modeling and mutational analyses in a full point-for-point rebuttal but believe that since they are inconclusive, they should not be included in the study.

      (2) To more directly elucidate how ATG5 regulates Retromer function by interacting with the Retromer and participates in the trafficking of GLUT1 to the plasma membrane, the authors should identify which region or crucial amino acid residues of ATG5 regulate its interaction with the Retromer. Additionally, they should test whether mutations in ATG5 that disrupt its interaction with the Retromer affect Retromer function (such as participating in the trafficking of GLUT1 to the plasma membrane) and whether they affect Atg8ylation. They also need to assess whether these mutations influence canonical autophagy and lysosomal sensitivity to damage. 

      Please see the response to point 1.

      We thank the editors and reviewers for their assessment, constructive criticisms and recommendations.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review):

      We thank the reviewer for his careful reading, which enabled us to improve the quality of this manuscript. We have addressed some major criticisms, and in particular, we have now included the characterization of the impact of BMP2 on other lines as well as the study of the impact of reversion of the H3.3K27M mutation (Figure 3 - figure supplement 1C-D). This control, judiciously proposed by the reviewer, seems more relevant than using mutant H3.1K27M / ACVR1 lines, given the possibility of BMP2 action via other receptors.


      The following is the authors’ response to the original reviews.

      Reviewer #1

      Summary:

      Mutational analysis of diffuse midline glioma (DMG) found that ACVR1 mutations, which up-regulate the BMP signaling pathway, are found in most H3.1K27M, but not H3.3K27M, DMG cases. In this manuscript, Huchede et al. attempted to determine whether the BMP signaling pathway has any role in H3.3K27M DMG tumors. They found that BMP signaling is activated to a similar level in H3.3K27M DMG cells with wild-type ACVR1 compared to ACVR1 DMG cells, likely due to the expression of BMP7 or BMP2. They went on to test whether BMP7 or BMP2 treatment affected the gene expression and cell fitness of tumor cells with the H3.3K27M mutation. They concluded that BMP2/7 synergizes with H3.3K27M to induce a transcriptomic rewiring associated with a quiescent but invasive cell state. The major issue for this conclusion is that the authors did not use the right models/controls to obtain results to support this conclusion, as detailed below. Therefore, in order to strengthen the conclusion, the authors need to address the major concerns below.

      Strength:

      This paper addresses an important question in the DMG field.

      Major concerns/weakness:

      (1) All the results in Fig. 2 utilized two glioma lines SF188 and Res259. The authors should repeat all these experiments in a couple of H3.3K27M DMG lines by deleting the H3.3K27M mutation first.

      We thank the referee for his/her comments that have helped us to strengthen our conclusions. Although we were rather interested in studying how the BMP pathway can participate in installing a particular cell state at the time of expression of the K27M mutation, we have now included the characterization of the native H3.3K27M BT245 and SU-DIPGXIII cell lines, and their counterparts in which the mutation was reverted by CRISPR-Cas9 (Harutyunyan et al., 2019). As shown in Figure 3-figure supplement D, the growth arrest induced by BMP2 seems indeed to be specific to the K27M epigenetic context, which could also be required to establish a positive regulation loop to activate the BMP pathway, as mentioned in the Discussion.

      (2) Fig. 3. The experiments of BMP2 treatment should be repeated in other H3.3K27M DMG lines using H3.1K27M ACVR1 mutant tumor lines as controls.

      The use of mutant ACVR1 lines is interesting, but their control status seems questionable, as the addition of BMPs could have a cumulative effect on the effect of the mutation, notably by activating other receptors in the pathway. But we have now included 3 different cell lines (HSJD-DIPG-014, BT245 and SU-DIPGXIII), and observed a similar impact of BMP2 with growth arrest as a readout (Figure 3-figure supplement C-D).

      Minor concerns

      Fig.2A. BMP2 expression increased in H3.3K27M SF188 cells. Therefore, the statement "whereas BMP2 and BMP4 expressions are not significantly modified (Figure 2A and Figure 2-figure supplement A-B)" is not accurate.

      The referee is absolutely right, and we have corrected this statement.

      Reviewer #2 (Public Review):

      The manuscript by Huchede et al investigates the BMP pathway in H3K27M-mutant gliomas carrying or not activating mutations in ALK2 (ACVR1). Their results in cell lines and in datasets acquired from the literature on patient tumors indicate that the BMP signaling pathway is activated at similar levels between ACVR1 wild-type and mutant tumors. The group further identifies BMP2 and BMP7 as possibly the main activators of the pathway in cells. They then show that BMP2 and 7 crosstalk with the H3 mutation and synergize to induce transcriptomic rewiring leading to an invasive cell state.

      The paper is well-written and easy to follow with a robust experimental plan and datasets supporting the claims. While previous work (acknowledged by the authors) indicated activation of BMP in H3K27M tumors wild type for the ACVR1 mutation, this paper is a nice addition and provides further mechanistic cues as to the importance of the BMP pathway and specific members in these deadly brain cancers. The effect of these BMPs in quiescence and invasion is of particular interest.

      We thank the referee for his/her supportive comments.

      A few suggestions to clarify the message are provided below.

      (1) In thalamic diffuse midline gliomas, the BMP pathway should not be activated as it is in the pons. The authors should identify thalamic tumors in the datasets they explored and use available patient-derived cell lines from thalamic tumors to investigate whether this pathway is active across all H3.3K27M mutants in the brain midline or specifically in tumors from the pons.

      The inter-patient variability observed in the level of activation of the BMP pathway may indeed be due, at least in part, to different tumor locations. However, we failed to find this information in the publicly available datasets that we used. We however included this element in the Discussion part.

      (2) There are ~20% H3.3K27M tumors that carry an ACVR1 mutation and similar numbers of H3.1K27M that are wild type for this gene. Can the authors identify these outliers in their datasets and assess the activation of BMP2 and 7 or other BMP pathway members in this context?

      We have now included the outliers present in our datasets in the legends of Figure 1B and Figure 1-figure supplement B and F. From the few samples available to document these outliers in the cohorts that we used, we have not observed major differences regarding the expression levels of BMP2/7 or BMP pathway members and have discussed the fact that it may result from the establishment in all cases of a feedback loop of activation.

      In all this is an interesting paper that provides meaningful data to pursue clinical targeting of the BMP pathway, which would be a nice addition to the field.

      We thank the reviewer for his/her supportive comments.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study by Vengayil et al. presented a role for Ubp3 for mediating inorganic phosphate (Pi) compartmentalization in cytosol and mitochondria, which regulates metabolic flux between cytosolic glycolysis and mitochondrial processes. Although the exact function of increased Pi in mitochondria is not investigated, findings have valuable implications for understanding the metabolic interplay between glycolysis and respiration under glucose-rich conditions. They showed that UBP3 KO cells regulated decreased glycolytic flux by reducing the key Pi-dependent-glycolytic enzyme abundances, consequently increasing Pi compartmentalization to mitochondria. Increased mitochondria Pi increases oxygen consumption and mitochondrial membrane potential, indicative of increased oxidative phosphorylation. In conclusion, the authors reported that the Pi utilization by cytosolic glycolytic enzymes is a key process for mitochondrial repression under glucose conditions.

      Comments on revised version:

      This reviewer appreciates the authors' responses addressing some of the concerns.

      (1) However, the concern of reproducibility and experimental methods applied to the study is still valid, particularly considering that many conclusions were drawn from western blot analysis. The authors used separate gel loading controls for western blot analysis, which is not a valid method. Considering loading and other errors/discrepancies during the transfer phase of the assay, the direct control should be analyzing the membrane after transfer or using an internal control antibody on the same membrane. None of the western blots are indicated with marker sizes, and it isn't very clear how many repeats there are and whether those repeats are biological or technical repeats.

      We thank the reviewer for raising this concern. This point requires detailed clarification regarding two key points: the first one regarding the use of Coomassie-stained gels over internal ‘housekeeping gene’ antibodies, and the second one regarding the challenges in performing controls for western blots in the case of high-abundance proteins such as glycolytic enzymes.

      (1) We have used Coomassie-stained gels as the loading control for all our western blots. This is performed by cutting the gel in half, using one half for transfer followed by blotting and the other half for Coomassie staining. That is, these are not two separate gels that are loaded, but the same gel. Practically, this is no different from cutting a membrane to blot with different antibodies. This is of course a valid method for normalizing western blot data, and is used by multiple studies, for the reasons mentioned below. The historical use of a ‘house-keeping’ gene as a loading control for western blotting assumes that the levels of these proteins do not change under different conditions. However, this approach has multiple, severe limitations (since what counts as a ‘housekeeping gene’ is entirely contextual), and therefore it is correct to use total protein as a loading control. This is indeed recommended by multiple studies (Collins et al., 2015). Coomassie staining for total protein is far more reliable than using house-keeping genes as a loading control in western blots (Welinder and Ekblad, 2011). A notable example would be GAPDH itself, which is widely used as a loading control in many studies. As is clear from our data in this manuscript, GAPDH levels themselves decrease in ubp3Δ cells. Had we used GAPDH as a loading control, we wouldn’t have identified the decrease in glycolytic enzymes in ubp3Δ cells, and this story would have met with a tragic fate very early on in its inception. We have in fact been very careful with these quantitations, and even before loading samples on gels, they are first normalized using a standard protein estimation assay (Bradford), followed by normalized loading, followed by cutting the gel into two parts: one for Coomassie staining and protein normalization, and the other for the western blot for the respective proteins.
      However, in point (2) below, we clarify why we sometimes have to load a separate gel with normalized protein, which should resolve this point.
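
      The normalization argument above can be sketched numerically. The following is a minimal illustration only, not the authors' quantification pipeline: the function names and all intensity values are hypothetical, chosen to show how dividing by a reference protein that itself changes (here labelled GAPDH) can mask a real decrease that total-protein normalization reveals.

```python
def normalize(band, reference):
    """Divide each lane's band intensity by that lane's reference intensity."""
    if len(band) != len(reference):
        raise ValueError("band and reference must have one value per lane")
    return [b / r for b, r in zip(band, reference)]


def fold_change(values, control_lane=0):
    """Express every lane relative to the control lane."""
    control = values[control_lane]
    return [v / control for v in values]


# Hypothetical densitometry values; lane 0 = WT, lane 1 = mutant.
target_band = [100.0, 50.0]      # glycolytic enzyme band
total_protein = [200.0, 200.0]   # Coomassie signal, equal loading
gapdh_band = [80.0, 40.0]        # reference protein that also drops

by_total = fold_change(normalize(target_band, total_protein))
by_gapdh = fold_change(normalize(target_band, gapdh_band))

print("normalized to total protein:", by_total)
print("normalized to GAPDH:", by_gapdh)
```

      With these made-up numbers, the total-protein normalization reports a two-fold decrease in the mutant lane, whereas normalizing to the co-varying reference reports no change.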

      (2) Glycolytic enzymes are highly abundant proteins and to achieve a signal in the linear range of western blot, the protein extracts have to be diluted (up to 25 or 50 times). As discussed under point 1, an internal control ‘housekeeping gene’ antibody is not a reliable method to use as loading control. Even if we want to use an antibody for an internal protein as a control, there are not many proteins that are as abundant as metabolic enzymes and because of this simple reason, the sample dilution results in these proteins not getting detected in the western blot since the signal will be below the limit of detection. This leaves using a separate gel loading control as the only easy to perform, reliable option.

      We would like to further highlight the fact that the changes in metabolic enzymes and ETC proteins that we observe in the ubp3 mutant by western blot were also independently observed in a large-scale untargeted quantitative proteomics study by Isasa et al. (2015), which we cite extensively in this manuscript. Since an entirely independent study, using a completely different (untargeted) method, has also shown very similar changes in the proteins that we observe (mitochondrial and glycolytic enzymes), there should be no room for doubt regarding the altered glycolytic enzyme and ETC protein levels that we discover in this study.

      None of the western blots are indicated with marker sizes

      We have clearly indicated the marker sizes in all our western blots. Separately, raw images of the blots and Coomassie-stained gels have been provided with the manuscript raw data, and are therefore easily available to any interested reader.

      It isn't very clear how many repeats there are and whether those repeats are biological or technical repeats.

      We have already clearly indicated the details of each blot in the figure legends. For example “A representative blot (out of three biological replicates, n=3) and their quantifications are shown. Data represent mean ± SD.” We kindly request the reviewer to thoroughly go through the figure legends for details regarding the western blots, or any other data. We hope this addresses all the reviewer concerns regarding the credibility of our western blot results and the method of using Coomassie stained gels as loading controls in this study.

      (2) The concern regarding citing the Ouyang et al. paper is still valid. This paper has essential implications for phosphate metabolism and is directly related to some of the findings associated with mitochondrial function, along with conflicting results, which should be discussed in the discussion section. As a reviewer, I do not request citing any paper from the authors in general; however, considering some of the conflicting results here, citing and discussing the paper from Ouyang et al. will improve the interpretation/value of their findings.

      As mentioned in detail in our previous response letter, we do not believe that the study from Ouyang et al. presents ‘conflicting results’ of any kind. Nevertheless, in response to the reviewer's suggestion, we have revised the discussion section of our manuscript and added a few points that incorporate the insights from Ouyang et al. These are in the discussion section (“It is important to highlight that our experiments, whether involving Pi supplementation or Pi limitations, maintain the cellular Pi concentration within the millimolar range and are conducted within a short timeframe (~ 1 hour). This differs significantly from Pi starvation studies, where cells are subjected to prolonged and complete Pi deprivation, triggering extensive metabolic adjustments to sustain available Pi pools, such as an increase in mitochondrial membrane potential, independent of respiration”). We trust that this modification will enhance the interested readers' understanding of our study's overarching conclusions.

      Reviewer #2 (Public Review):

      Summary:

      Cells cultured in high glucose tend to repress mitochondrial biogenesis and activity, a prevailing phenotype called the Crabtree effect that is observed in different cell types and cancer. Many signaling pathways have been put forward to explain this effect. Vengayil et al. proposed a new mechanism involving Ubp3/Ubp10 and phosphate that controls the glucose repression of mitochondria. The central hypothesis is that ∆ubp3 shifts glycolysis toward trehalose synthesis, which leads to an increase in Pi availability in the cytosol; mitochondria then receive more Pi, and the glucose repression is therefore reduced.

      Strengths:

      The strength is that the authors used an array of different assays to test their hypothesis. Most assays were well designed and controlled.

      Weaknesses:

      I think the main conclusions are not strongly supported by the current dataset. Here are my comments on authors' response and model.

      (1) The authors addressed some of my concerns related to ∆ubp3. But based on the results they observed and discussed, ∆ubp3 redirects some glycolytic flux to gluconeogenesis while 0.1% glucose in WT does not. Similarly, the shift of glycolysis to trehalose synthesis is also not relevant to WT cells cultured in a low-glucose situation. This should be discussed in the manuscript to make sure readers are not misled to think that ∆ubp3 mimics low glucose. It is likely that ∆ubp3 induces proteostasis stress, which is known to activate respiration and trehalose synthesis.

      But based on the results they observed and discussed, ∆ubp3 redirects some glycolytic flux to gluconeogenesis while 0.1% glucose in WT does not. Similarly, the shift of glycolysis to trehalose synthesis is also not relevant to WT cells cultured in a low-glucose situation.

      We would like to clarify that we do not observe a redirection of glycolytic flux to gluconeogenesis in the ubp3 mutant. What we observe is a rewiring of glycolytic flux into increased trehalose synthesis and PPP, and decreased glycolysis. Also, the shift of glycolysis to trehalose synthesis is relevant to WT cells cultured in low glucose. It is a well-known fact that trehalose synthesis increases with a decrease in media glucose. In the case of 0.1% glucose, this increase in trehalose is not due to an increase in gluconeogenesis (since the pathways utilizing alternate carbon sources still remain repressed in 0.1% glucose (Yin et al., 2003)), but to the increase in glycolytic flux towards trehalose. This is also supported by the increase in Tps2 protein levels upon decreasing glucose concentration (Shen et al., 2023). We will also note that there are very few studies that actually estimate gluconeogenic flux in cells (and they only rely on steady-state measurements). Estimating gluconeogenic flux appropriately is challenging in itself (e.g., see Niphadkar et al 2024).

      In the case of glucose concentrations lower than 0.1%, the shift to trehalose synthesis might not be as relevant. We observe that the glycolysis-defective mutant tdh2tdh3 cells do not show an increase in trehalose synthesis (Figure 3-figure supplement 1E). However, in this context, the decrease in the rate of the GAPDH-catalyzed reaction alone appears to be sufficient to increase the Pi levels (Figure 3F) even without an increase in trehalose. Therefore, there might be differences in the relative contributions of these two arms towards Pi balance, based on whether it is low glucose in the environment, or a mutant such as ubp3Δ that modulates glycolytic flux. In ubp3Δ cells, the combination of a low rate of the GAPDH-catalyzed reaction and high trehalose will occur (based on how glycolytic flux is modulated), vs. only the low rate of the GAPDH-catalyzed reaction in tdh2tdh3 cells. As an end point, the increase in Pi happens in both cases, but via slightly differing routes. Also note: in terms of free Pi sources, a low-glucose condition (with low glycolytic rate) is very different from a no-glucose, respiratory condition (where cells perform very high gluconeogenesis, at a rate that is an order of magnitude higher than in low glucose). In respiration-reliant conditions such as in ethanol, cells switch to high gluconeogenesis, where there is a large increase in trehalose synthesis as a default (e.g., see Varahan et al 2019). In this condition, trehalose synthesis could become a major source for Pi (e.g., see Gupta 2021). This could also support the increased mitochondrial respiration. In an ethanol-only medium, the directionality of the GAPDH reaction is itself reversed (i.e. G-1,3-BP → G-3-P). Therefore, this reaction now becomes an added source of Pi, instead of a net consumer of Pi (see illustration in Figure 3G).
      Therefore, a very reasonable inference is that a combination of increased trehalose and increased 1,3 BPG to G3P conversion can become a Pi source, supporting increased mitochondrial respiration in a non-glucose, respiratory medium.

      We have now clarified these points in the discussion section in the updated version of our manuscript. Lines xxx. We hope that this updated discussion section satisfies the reviewer’s concern regarding how relevant the increase in trehalose synthesis is for altered Pi balance and increased mitochondrial respiration in WT cells.

      It is likely that ∆ubp3 induces proteostasis stress, which is known to activate respiration and trehalose synthesis.

      Apart from some general changes in metabolism, there are no reports whatsoever that suggest that general proteostasis stress can result in an extensive, precise metabolic rewiring, with an increase in respiration, mitochondrial de-repression, a precise decrease in two limiting glycolytic enzyme levels, and a precise reduction in glycolytic flux, as observed in the ubp3 mutant. If this were the case, deletion of any deubiquitinase should result in an increase in trehalose and respiration, which clearly does not happen (as is already clear from the large screen shown in Figure 1).

      However, in response to this query, we performed experiments to assess the extent of proteostasis stress in ubp3 mutants. For this, we have now estimated the changes in global ubiquitination in WT vs ubp3 mutant, and compared this with conditions of moderate proteostasis stress (mild heat shock at 42 °C for ~1 hr). These data are now included in the revised manuscript as Figure 1- figure supplement 1J. Notably, our analysis reveals only a very minor alteration in global ubiquitination levels in ubp3 mutants compared to WT cells. This is in very stark contrast to limited heat stress, where a clear increase in global ubiquitination can be easily observed. Given these data, we can conclude that there is no significant general proteostatic stress in ubp3 mutants that could induce substantial metabolic rewiring of such a precise nature.

      (2) Pi flux: it is known that the vacuole can compensate for the reduction of Pi in the cytosol. The paper they cited in the response, especially Van Heerden et al., 2014, showed that the pulse addition of glucose caused a transient Pi reduction, which then came back to the normal level after 10 min or so. If the authors mean the transient change of glycolysis and respiration, they should point that out clearly in the abstract and introduction. If the authors are trying to put out a general model, then the model must be reconsidered.

      In Van Heerden et al., the pulse addition of glucose causes a transient Pi reduction due to rapid Pi consumption in glycolysis. The phosphate levels came back to the normal level because of the glucose flux into trehalose synthesis releasing free Pi. This is the entire crux of the study, and this is the reason why tps2 mutants, which cannot synthesize trehalose, exhibit a growth defect and have decreased Pi levels. As explained in detail in our earlier response, the cellular Pi levels are maintained by a relative balance of reactions that consume and release Pi, and therefore a change in this balance can change Pi as well. Indeed, if this were not the case, the tps2 mutants would simply maintain Pi levels similar to WT cells by increasing Pi transport from the medium, which is clearly not the case (e.g., see Gupta 2021).

      The cytosol has ~50 mM Pi (van Eunen et al., 2010 FEBSJ), but only 1-2 mM of glycolysis metabolites; it is not clear why partial reduction of several glycolysis enzymes will cause significant changes in the cytosolic Pi level and make Pi the limiting factor for mitochondrial respiration. In response to this comment, the authors explained that, in terms of metabolic flux, rapid, continuous glycolysis will drain the Pi pool even if each glycolytic metabolite is only 1-2 mM. However, the metabolic flux both consumes and releases Pi, that's why there is such measurement of overall free Pi concentration amid the active metabolism. One possibility is that the observed cytosolic Pi level changes were caused by measurement fluctuation.

      The measurement fluctuations that we mentioned in our previous response letter were in the case of cells grown in high and low glucose, where there are multiple factors, such as mitochondrial amount, which complicate the Pi measurements. In the case of ubp3 mutants, which have a similar amount of total mitochondria as WT cells, there is minimal fluctuation in the Pi measurement. We have done extensive standardization of mitochondrial isolation and Pi measurement in the isolated mitochondria (as explained in detail in the manuscript) to minimize any such fluctuations.

      However, the metabolic flux both consumes and releases Pi, that's why there is such measurement of overall free Pi concentration amid the active metabolism

      The reviewer is correct in pointing out that metabolic flux consumes and releases Pi. However, in glucose-grown yeast cells, the rate of glycolysis, which is a Pi-consuming pathway, is higher than that of any other metabolic pathway. In fact, the glycolytic rate in glucose-grown S. cerevisiae is one of the highest ever observed in any living system. A decrease in glycolysis and an increase in trehalose therefore shift the balance in Pi utilization and result in increased free Pi in ubp3 cells. For a more detailed theoretical reasoning on the consumption and production of Pi, see Gupta 2021.

      Importantly, the authors measured Pi inside mito for ethanol and glucose, but not the cytosolic Pi, which is the key hypothesis in their model. The model here is that the glycolysis competes with mito for free cytosolic Pi, so it needs to inhibit glycolysis to free up cytosolic Pi for mitochondrial import to increase respiration. I don't see measurement of cytosolic Pi upon different conditions, only the total Pi or mito Pi. The fact is that in Fig.3C they saw WT+Pi in the medium increase total free Pi more than the ∆ubc3, while WT decrease mito Pi compared to WT control and ∆ubc3 and therefore decrease basal OCR upon Pi supplement. A simple math of Pitotal = Pi cyto + Pi mito tells us that if WT has more Pitotal (Fig.3C) but less Pi mito (fig.5 supp 1C), then it has higher Pi cyto. This is contradictory to what the authors tried to rationalize. Furthermore, as I pointed out previously, the isolated mitochondria can import more Pi when supplemented, so if there is indeed higher Picyto, then the mito in WT should import more Pi. So, to address these contradictory points, the authors must measure Pi in the cytosol, which is a critical experiment not done for their model. For example, they hypothesized that adding 2-DG, or ∆ubp3, suppress glycolysis and thus increase the supply of cytosolic Pi for mito to import, but no cytosolic Pi was measured (need absolute value, not the relative fold changes). It is also important to specific how the experiments are done, was the measurement done shortly after adding 2-DG. Given that the cells response to glucose changes/pulses differently in transient vs stable state, the authors are encouraged to specify that.

      (1) Importantly, the authors measured Pi inside mito for ethanol and glucose, but not the cytosolic Pi, which is the key hypothesis in their model. The model here is that the glycolysis competes with mito for free cytosolic Pi, so it needs to inhibit glycolysis to free up cytosolic Pi for mitochondrial import to increase respiration. I don't see measurement of cytosolic Pi upon different conditions, only the total Pi or mito Pi.

      As clearly described in the manuscript, the key hypothesis that emerges is the role of the availability/accessibility of Pi for the mitochondria, in the context of activity. As discussed in detail in the discussion section, this can come from a combination of available Pi pools in the cytosol and increased transport of this Pi to the mitochondria. While it is true that the decreased glycolysis in ubp3 mutants frees up available Pi pools in the cytosol, measurement of cytosolic Pi in these mutants growing in log phase might not necessarily show an increased cytosolic Pi if the Pi is being actively transported to the mitochondria at a rate higher than in WT, as indicated by the ~6-fold increase in mitochondrial Pi in ubp3 cells. Resolving this would require tools such as intracellular fluorescence-based Pi sensors that could accurately capture temporal changes in cytosolic and mitochondrial Pi following glycolytic inhibition. However, these tools are not available to date for use in yeast, and measuring cytosolic Pi over time following glycolytic inhibition using colorimetric Pi assays is extremely difficult.

      However, the reviewer does correctly state that we had not included measurement of cytosolic Pi. Since the mitochondrial Pi estimate was itself a very challenging (and critical) experiment, we had originally thought that data was sufficient. We have therefore now performed a series of new experiments, where we first enrich the cytosolic fraction (without mitochondrial contamination), and estimate cytosolic Pi amounts in WT and ubp3 cells. Our Pi measurements indicate a cytosolic Pi concentration in the range of ~35 mM, which is similar to the earlier reported values in yeast. We further observe that the cytosolic Pi is ~25% lower in ubp3 mutants (~25-27 mM) compared to WT cells (Figure 4B). As mentioned earlier, this would be consistent with higher transport of Pi from the cytosol to the mitochondria in these cells. Effectively, ubp3 cells have a total increase in cellular Pi, with a Pi pool distribution such that there is increased Pi availability in mitochondria (Figure 4B). This further substantiates the hypothesis of an increased Pi allocation to mitochondria in ubp3 mutants. The reason for the increased rate of Pi transport to mitochondria is not immediately clear, but could also come from changes in cytosolic pH, a possibility that we suggest in our discussion and that is discussed in a later section of this response letter as well.

      (2) The fact is that in Fig.3C they saw WT+Pi in the medium increase total free Pi more than the ∆ubc3, while WT decrease mito Pi compared to WT control and ∆ubc3 and therefore decrease basal OCR upon Pi supplement. A simple math of Pitotal = Pi cyto + Pi mito tells us that if WT has more Pitotal (Fig.3C) but less Pi mito (fig.5 supp 1C), then it has higher Pi cyto. This is contradictory to what the authors tried to rationalize. Furthermore, as I pointed out previously, the isolated mitochondria can import more Pi when supplemented, so if there is indeed higher Picyto, then the mito in WT should import more Pi.

      a) “The fact is that in Fig.3C they saw WT+Pi in the medium increase total free Pi more than the ∆ubc3, while WT decrease mito Pi compared to WT control and ∆ubc3 and therefore decrease basal OCR upon Pi supplement. A simple math of Pitotal = Pi cyto + Pi mito tells us that if WT has more Pitotal (Fig.3C) but less Pi mito (fig.5 supp 1C), then it has higher Pi cyto.”

      In WT cells supplemented with external Pi (WT+Pi), there is an increased total Pi, but a decreased mitochondrial Pi. As discussed in the discussion section in the manuscript, this could be due to the supplemented Pi not being transported to mitochondria. The reviewer is correct in pointing out that as per simple math this should mean that the cytosolic Pi in WT+Pi should be high. We have now assessed cytosolic Pi upon external Pi supplementation, and this is exactly what we observe in our cytosolic Pi measurements now included in the revised manuscript (Figure 5-figure supplement 5C). There is a higher cytosolic Pi in WT+Pi (~52 mM) compared to WT cells (~35 mM) and ubp3 cells (~27 mM). We have now pointed this out in the discussion section in the revised manuscript: “Notably, this increased respiration does not happen upon direct Pi supplementation to highly glycolytic WT cells, where the Pi accumulates in cytosol, without increasing mitochondrial Pi (Figure 5-figure supplement 1C).” We hope that these new data completely address the reviewer’s concern regarding the Pi allocations in case of WT+Pi cells.
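
      As a quick arithmetic check on the cytosolic values quoted above, the sketch below computes the relative difference of each condition from WT. The approximate concentrations (~35, ~27, and ~52 mM) are taken from this response; the function and variable names are illustrative only and are not part of the manuscript.

```python
# Approximate cytosolic Pi concentrations (mM) quoted in this response.
cytosolic_pi_mm = {"WT": 35.0, "ubp3": 27.0, "WT+Pi": 52.0}


def percent_change(value: float, reference: float) -> float:
    """Relative difference of `value` from `reference`, in percent."""
    return 100.0 * (value - reference) / reference


# Relative difference of every condition from WT, rounded to one decimal.
changes_vs_wt = {
    condition: round(percent_change(conc, cytosolic_pi_mm["WT"]), 1)
    for condition, conc in cytosolic_pi_mm.items()
}
print(changes_vs_wt)
```

      With these rounded inputs, cytosolic Pi in ubp3 cells comes out roughly 23% below WT and in WT+Pi roughly 49% above it. Because the inputs are themselves approximate, the percentages are indicative only.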

      b) This is contradictory to what the authors tried to rationalize. Furthermore, as I pointed out previously, the isolated mitochondria can import more Pi when supplemented, so if there is indeed higher Picyto, then the mito in WT should import more Pi.

      We would like to clarify that the Pi measurements in WT+Pi absolutely do not contradict our hypothesis. Furthermore, nowhere do we claim that an increase in cytosolic Pi will increase mitochondrial Pi. On the contrary, we explain in detail that supplementing Pi to WT cells (which increases cytosolic Pi) will not increase respiration if the increased Pi is not being transported to mitochondria. This is exactly what happens in WT+Pi, where Pi accumulates in the cytosol but does not result in increased mitochondrial Pi. The reviewer argues that if there is higher cytosolic Pi, mitochondria should import more Pi. This is true for transport via diffusion, where the external concentration dictates the direction of metabolite transport, but it is fundamentally wrong for transport of metabolites where active transporters and additional regulators are involved. This is the entire basis of the idea of metabolic compartmentalisation, where cells maintain pools of metabolites in different organelles which regulate the cellular metabolic state. A well-studied example is pyruvate, whose cytosolic concentration is high in glycolytic cells, but whose transport to mitochondria is reduced during glycolysis to maintain cytosolic fermentation. As discussed in the manuscript, a logical explanation for Pi supplementation not increasing respiration and mitochondrial Pi is that there might be mechanisms in highly glycolytic cells that restrict the transport of Pi to mitochondria, thereby compartmentalizing Pi in the cytosol. One such possible mechanism is pH (discussed in a later section), and it is possible that there are other mechanisms involved.

      In the case of isolated mitochondria, Pi supplementation results in increased respiration simply because it is an in vitro setup where we supplement metabolites such as pyruvate, malate and ADP along with phosphate to ensure that the mitochondria are actively respiring, and in this case Pi will be consumed since it is being used for ATP synthesis. This is entirely different from an in vivo scenario where cells are glycolytic, and mechanisms to prevent mitochondrial transport of metabolites such as pyruvate and phosphate are active.

      c) It is also important to specific how the experiments are done, was the measurement done shortly after adding 2-DG?

      Cells were treated with 2-DG for one hour and respiration was measured. We have mentioned these details clearly in the figure legends and methods.  

      d) The most likely model to me is that, which is also the consensus in the field, is that no matter 2-DG or ∆ubp3, the cells re-wiring metabolism in both cytosol and mitochondria, and it is the total network shift that cause the mitochondrial respiration increase, which requires the increase of mito import of Pi, ADP, O2, and substrates, but not caused/controlled by the Pi that singled out by the authors in their model.

      The aim of our study is only to highlight the importance of mitochondrial Pi availability as a critical factor in controlling mitochondrial respiration. Of course this would require sufficient levels of other factors such as ADP, substrates and oxygen. It cannot be otherwise. However, as we point out in the discussion, a major limiting factor might be Pi availability. While the altered glycolysis in ubp3 mutants might control availability of other factors such as pyruvate and ADP, this is not the focus of our study. We would also like to point out that prior studies show that even though cytosolic ADP decreases in the presence of glucose, this does not limit mitochondrial ADP uptake, or decrease respiration, due to the very high affinity of the mitochondrial ADP transporter. This is discussed in our discussion section as well. Further, we show that the levels of ETC proteins can be altered by changing Pi levels, which places Pi as a major regulator of respiration. We would like to point out once again that studies in other systems have also highlighted a major role of mitochondrial Pi availability in controlling respiration. These references are included in our manuscript (Scheibye-Knudsen et al., 2009, Seifer et al., 2015). This includes a recent study in T cells that clearly shows increased mitochondrial respiration upon overexpressing the mitochondrial Pi transporter SLC25A3 alone (Wu et al., 2023). Our manuscript now in fact provides a contextual explanation of these diverse observations from other cellular systems where mitochondrial Pi transport appears to regulate respiration.

      (3) The explanation that cytosolic pH reduction upon glucose depletion/2DG is a mistake. There are a lot of data in the literature showing the opposite. If the authors do think this is true, then need to show the data. Again, it is important to distinguish transient vs stable state for pH changes.

      We observe that directly supplementing Pi to WT cells growing in high glucose does not result in higher mitochondrial Pi or increased respiration. However, supplementing Pi to WT cells increases mitochondrial respiration in the presence of the glycolytic inhibitor 2-DG. We therefore merely suggest that cytosolic pH could be an additional regulator of mitochondrial Pi transport, since this would be consistent with the differences in mitochondrial Pi transport between highly glycolytic cells and cells with decreased glycolysis (such as 2-DG addition and the ubp3 mutant). This is because in mitochondria, Pi is co-transported along with protons. Therefore, changes in cytosolic pH (which change the proton gradient) will control mitochondrial Pi transport (Hamel et al., 2004). The glycolytic rate is itself a major factor that controls cytosolic pH. The cytosolic pH in highly glycolytic cells is maintained at ~7, and decreasing glycolysis results in cytosolic acidification (Orij et al., 2011). Therefore, under conditions of decreased glycolysis (such as loss of Ubp3), cytosolic pH becomes acidic. Since mitochondrial Pi transport depends on the proton gradient, a low cytosolic pH would favour mitochondrial Pi transport. Therefore, under conditions of decreased glycolysis (2-DG treatment, or loss of Ubp3), where cytosolic pH would be acidic, increasing cytosolic Pi might indirectly increase mitochondrial Pi transport, thereby leading to increased respiration. But we certainly do leave alternate interpretations to the imagination of any reader, and are indeed open to them. These are all exciting future directions, and this study provides the context in which to interpret them.

      The explanation that cytosolic pH reduction upon glucose depletion/2DG is a mistake.

      We have cited two independent studies which suggest that cytosolic pH decreases upon a decrease in glycolysis (Orij et al., 2011; Dechant et al., 2010). This control of cytosolic pH by the glycolytic rate has been extensively shown using glycolytic mutants, cells in low glucose and cells grown in the presence of glycolytic inhibitors. According to the reviewer, this is a mistake and

      there are a lot of data in the literature showing the opposite.

      In our literature review we did not come across any relevant studies that actually show the opposite. If the reviewer still thinks this is a mistake, the reviewer is welcome to include in the comments some of the relevant literature that clearly shows the opposite, with actual measurements of cytosolic pH. Additionally, the possible role of cytosolic pH in this context does not affect the conclusions of our study, and we only include this as a possibility in the discussion. Therefore, this is well beyond the scope of experiments in our current study, and considering the extensive data from multiple studies showing that cytosolic pH decreases under low glycolysis, we see no need to include experiments addressing this point in this study. We leave this as a point for an interested reader to think about, and it certainly can nucleate new directions of future study.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Summary of the changes

      Changes in the manuscript were made to clarify some ambiguities raised by the reviewers and to improve the report following their recommendations. A summary of the main changes is listed below:

      - The title was changed to better reflect the results of this study.

      - Re-training the model on log-transformed FACS scores.

      - Testing the specificity of the FEPS to facial expression of pain within this experimental setup by comparing it to the activation maps obtained from the Warm stimulation condition.

      - Testing for sensitization/habituation of the behavioral measures (FACS scores and pain ratings).

      - Adding a section in the discussion to better address the limitations of this study and provide potential directions for future studies.

      Other changes target areas where the original manuscript may have been ambiguous or lacked precision. To address these concerns, additional details have been incorporated, and certain terms have been revised to ensure a more precise and transparent presentation of the information.

      Public Reviews:

      Reviewer #1 (Public Review):

      Picard et al. report a novel neural signature of facial expressions of pain. In other words, they provide evidence that a specific set of brain activations, as measured by means of functional magnetic resonance imaging (fMRI), can tell us when someone is expressing pain via a concerted activation of distinctive facial muscles. They demonstrate that this signature provides a better characterization of this pain behaviour when compared with other signatures of pain reported by past research. The Facial Expression of Pain Signature (FEPS) thus enriches this collection and, if further validated, may allow scientists to identify the neural structures subserving important non-verbal pain behaviour. I have, however, some reservations about the strength of the evidence, relating to insufficient characterization of the underlying processes involved.

      We are thankful for the summary of our work. We are hopeful that the modifications made in the latest version effectively address these concerns. The changes are outlined in the summary above, and detailed in the following point-by-point response.

      Strengths:

      The study relies on a robust machine-learning approach, able to capitalise on the multivariate nature of the fMRI data, an approach pioneered in the field of pain by one of the authors (Dr. Tor Wager). This paper extends Wager's and other colleagues' work attempting to identify specific combinations of brain structures subserving different aspects of the pain experience while examining the extent of similarity/dissimilarity with the other signatures. In doing so, the study provides further methodological insight into fine-grained network characterization that may inspire future work beyond this specific field.

      We are thankful for the positive comments.

      Weaknesses:

      The main weakness concerns the lack of a targeted experimental design aimed to dissect the shared variance explained by activations both specific to facial expressions and to pain reports. In particular, I believe that two elements would have significantly increased the robustness of the findings:

      (1) Control conditions for both the facial expressions and the sensory input. An efficient signature should not be predictive of neutral and emotional facial expressions (e.g., disgust) other than pain expressions, as well as it should not be predictive of sensations originating from innocuous warm stimulation or other unpleasant but non-painful stimulation.

      We do recognize the lack of specificity testing for the FEPS, especially towards negative emotional facial expressions. This would be relevant to test given the behavioural overlap between the facial expressions of pain and disgust, fear, anger, and sadness (Kunz et al., 2013; Williams, 2003). The experimental design used in this study did not include other negative states. However, we fully support the necessity of collecting data across those conditions, and we believe that the present study highlights the importance of such a demonstration. Future research should involve recording facial expressions while exposing participants to stimuli that elicit a range of negative emotions but, to our knowledge, such a combination of fMRI and behavioural data is currently unavailable. As raised by the reviewer, this approach would allow us to assess the specificity of the FEPS to the facial expression evoked by pain compared to different affective states. We would like to emphasise that specificity and generalizability testing is a massive amount of work, requiring multiple studies to address comprehensively. A Limitations paragraph addressing this research direction has been added to the Discussion. A conclusion was added to the abstract as follows: “Future studies should explore other pain-relevant manifestations and assess the specificity of the FEPS against other types of aversive or emotional states.”

      (2) Graded intensity of the sensory stimulation: different intensities of the thermal stimulation would have caused a graded facial expression (from neutral to pain) and graded verbal reports (from no pain to strong pain), thus offering a sensitive characterisation of the signal associated with this condition (and the warm control condition).

      However, these conditions are missing from the current design, and therefore we cannot make a strong conclusion about the generalisability of the signature (regardless of whether it can predict better than other signatures - which may/may not suffer from similar or other methodological issues - another potential interesting scientific question!). The authors seem to work on the assumption that the trials where warm stimulation was delivered are of no use. I beg to disagree. As per my previous comment, warm trials (and associated neutral expressions) could be incorporated into the statistical model to increase the classification sensitivity and precision of the FEPS decoding.

      The experience of pain can fluctuate for a fixed intensity or after controlling statistically for the intensity of the stimulation (Woo et al., 2017). Consistent with this, the current study focused on spontaneous facial expression in response to noxious thermal stimuli delivered at a constant intensity that produced moderate to strong pain in every participant. As the reviewer points out, this does not allow us to characterise and compare the stimulus-response function of facial expression and pain ratings. The advantage of the approach adopted is to maximise the number of trials where facial expression is more likely to occur, while ensuring that changes in facial expression and pain ratings are not confounded with changes in stimulus intensity. The manuscript has been revised to clarify that point. However, we do agree that it would be interesting to conduct more studies focusing on facial expression in response to a range of stimulus intensities. This discussion has been added to the Limitations paragraph.

      Furthermore, following the reviewer’s suggestion, we performed complementary analyses on the warm trials in the proposed revisions. The dot product (FEPS scores) between the FEPS and the activation maps associated with the warm condition was computed. A linear mixed model was conducted to investigate the association between FEPS scores and the experimental condition (warm vs pain). The trials in the pain condition were divided into two conditions: null FACS scores (painful trials with no facial response; FACS scores = 0) and non-null FACS scores (painful trials with a facial response; FACS > 0). The details of this analysis have been added to the manuscript (see Response of the FEPS to pain and warm section in the Methods; lines 427 to 439) as well as the corresponding results (see Results and Discussion; lines 138 to 158). The FEPS scores were larger in the pain condition where a facial response was expressed, compared to both the pain condition without facial expression and the warm condition. These results confirmed the sensitivity of the FEPS to facial expression of pain.
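
      The pattern-expression step described above can be sketched as follows: a FEPS score is the dot product between the signature weights and a trial's activation map, and pain trials are split by whether the FACS score is zero. All numbers, names, and the three-voxel maps below are made up for illustration; this is not the authors' pipeline, and the linear mixed model itself is not shown.

```python
from collections import defaultdict
from statistics import mean


def feps_score(weights, activation_map):
    """Dot product between signature weights and one trial's activation map."""
    if len(weights) != len(activation_map):
        raise ValueError("weights and activation map must have the same length")
    return sum(w * a for w, a in zip(weights, activation_map))


def trial_condition(stimulus, facs):
    """Label a trial as warm, pain without expression, or pain with expression."""
    if stimulus == "warm":
        return "warm"
    return "pain_facs_null" if facs == 0 else "pain_facs_nonnull"


# Hypothetical toy data: 3-voxel weights and (stimulus, FACS score, map) trials.
weights = [0.5, -0.2, 0.1]
trials = [
    ("warm", 0, [0.1, 0.3, 0.0]),
    ("pain", 0, [0.4, 0.2, 0.1]),
    ("pain", 3, [1.0, 0.1, 0.5]),
    ("pain", 5, [1.2, 0.0, 0.4]),
]

# Group the FEPS scores by condition, then average within each condition.
scores_by_condition = defaultdict(list)
for stimulus, facs, activation_map in trials:
    condition = trial_condition(stimulus, facs)
    scores_by_condition[condition].append(feps_score(weights, activation_map))

mean_scores = {cond: mean(vals) for cond, vals in scores_by_condition.items()}
print(mean_scores)
```

      The toy values were chosen arbitrarily and demonstrate nothing about the real data; the point is only the structure of the comparison, in which per-trial scores are grouped into the three conditions before the conditions are contrasted statistically.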

      Reviewer #2 (Public Review):

      Summary:

      The objective of this study was to further our understanding of the brain mechanisms associated with facial expressions of pain. To achieve this, participants' facial expressions and brain activity were recorded while they received noxious heat stimulation. The authors then used a decoding approach to predict facial expressions from functional magnetic resonance imaging (fMRI) data. They found a distinctive brain signature for pain facial expressions. This signature had minimal overlap with brain signatures reflecting other components of pain phenomenology, such as signatures reflecting subjective pain intensity or negative effects.

      We appreciate this concise and accurate summary of our study.

      Strength:

      The manuscript is clearly written. The authors used a rigorous approach involving multivariate brain decoding to predict the occurrence and intensity of pain facial expressions during noxious heat stimulation. The analyses seem solid and well-conducted. I think that this is an important study of fundamental and clinical relevance.

      Weaknesses:

      Despite those major strengths, I felt that the authors did not sufficiently explain their own interpretation of the significance of the findings. What does it mean, according to them, that the brain signature associated with facial expressions of pain shows a minimal overlap with other pain-related brain signatures?

      We express our sincere gratitude for the valuable insights and constructive comments on the strengths and weaknesses of the current study. We thank reviewer 2 for the encouragement to reinforce our interpretation of the significance of the findings, while acknowledging the limitations raised by the three reviewers.

      A few questions also arose during my reading.

      Question 1: Is the FEPS really specific to pain expressions? Is it possible that the signature includes a facial expression signal that would be shared with facial expressions of other emotions, especially since it involves socio-affective regulation processes? Perhaps this question should be discussed as a limit of the study?

      We acknowledge this limitation as outlined in response to Reviewer #1. We have incorporated a Limitations paragraph to provide a more in-depth discussion of this limitation and to explore potential future avenues (lines 225 to 268). Again, please note that the demonstration of specificity is an incremental process that requires a systematic comparison with other conditions where facial expressions are produced without pain. A concluding sentence was added to the abstract to encourage specificity testing in future studies, as indicated above.

      Question 2: All AUs are combined together in a composite score for the regression. Given that the authors have other work showing that different AUs may be associated with different components of pain (affective vs. sensory), is it possible that combining all AUs together has decreased the correlation with other pain signatures? Or that the FEPS actually reflects multiple independent signatures?

      The question raised is consistent with the work of Kunz, Lautenbacher, LeBlanc and Rainville (2012), and Kunz, Chen and Rainville (2020). In the current study, the pain-relevant action units were combined in order to increase the number of trials where a facial response to pain was expressed, thus enhancing the robustness of our analyses. Given the limited sample size, our current dataset is unfortunately insufficient to perform such analysis as there would not be enough trials to look at the action units separately or in subgroups. While the approach of combining the different AUs has proven to be valid and useful, we recognize the value of investigating potential independent signatures associated with the different AUs within the FEPS, and examining whether those signatures can lead to more similar patterns compared to previously developed pain signatures. This discussion has been included in the Limitations paragraph in the Discussion (lines 225 to 268).

      Question 3: Is facial expressivity constant throughout the experiment? Is it possible that the expressivity changes between the beginning and the end of the experiment? For instance, if there is a habituation, or if the participant is less surprised by the pain, or in contrast if they get tired by the end of the experiment and do not inhibit their expression as much as they did at the beginning. If facial expressivity changes, this could perhaps affect the correlation with the pain ratings and/or with the brain signatures; perhaps time (trial number) could be added as one of the variables in the model to address this question.

      The concern raised by the reviewer is legitimate. We conducted a mixed-effects model to assess the impact of successive trials and runs on facial expressivity. Results indicate that the FACS scores did not change significantly throughout the experiment, suggesting no notable effect of habituation or sensitization on the facial expressivity in our study. Details about the analysis and the results have been added to the Facial Expression section in the Methods (lines 335 to 346).
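
      One simple way to look for habituation or sensitization is to regress FACS scores on trial number. The sketch below uses a per-participant least-squares slope, which is a simplified two-stage stand-in for the mixed-effects model mentioned above, not the model itself. The data and names are invented for illustration.

```python
from statistics import mean


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical FACS scores over eight successive trials for three participants.
facs_by_participant = {
    "p01": [4, 3, 5, 4, 4, 5, 3, 4],
    "p02": [0, 2, 1, 0, 2, 1, 1, 1],
    "p03": [6, 6, 7, 5, 6, 7, 6, 5],
}

# Stage 1: one slope per participant (trial numbers start at 1).
slopes = {
    pid: ols_slope(list(range(1, len(scores) + 1)), scores)
    for pid, scores in facs_by_participant.items()
}

# Stage 2: average the slopes across participants.
mean_slope = mean(slopes.values())
print(slopes, mean_slope)
```

      A slope near zero in each participant is what the absence of habituation or sensitization would look like in this simplified view; a mixed-effects model additionally pools participants and can account for run effects.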

      Reviewer #3 (Public Review):

      In this manuscript, Picard et al. propose a Facial Expression Pain Signature (FEPS) as a distinctive marker of pain processing in the brain. Specifically, they attempt to use functional magnetic resonance imaging (fMRI) data to predict facial expressions associated with painful heat stimulation. The main strengths of the manuscript are that it is built on an extensive foundation of work from the research group, and that experience can be observed in the analysis of fMRI data and the development of the machine learning model. Additionally, it provides a comparative account of the similarities of the FEPS with other proposed pain signatures. The main weaknesses of the manuscript are the absence of a proper control condition to assess the specificity of the facial pain expressions, a few relevant omissions in the methodology regarding the original analysis of the data and its purpose, and a biased interpretation of the results.

      I believe that the authors partially succeed in their aims, as described in the introduction, which are to assess the association between pain facial expression and existing pain-relevant brain signatures, and to develop a predictive brain activation model of the facial responses to painful thermal stimulation. However, I believe that there is a clear difference between those aims and the claim of the title, and that the interpretation of the results needs to be more rigorous.

      We wish to express our appreciation for the insightful and constructive critique provided. The limitation pertaining to the absence of specificity testing has been addressed in response to Reviewer #1, and it has been incorporated into the manuscript (lines 251 to 258).

      The commentary made by Reviewer #3 has drawn our attention to a critical concern, namely the potential misalignment between the study findings and our original title. Consequently, we have changed the title to “A distributed brain response predicting the facial expression of acute nociceptive pain”. We also revised the interpretation of the results in the discussion section and we have added a section on limitations.

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      I hope the following comments will be useful to improve the manuscript.

      Abstract

      I felt the abstract could be more clear in terms of experimental or scientific questions, hypotheses/expectations, and findings. I also feel the abstract should briefly support the conclusive claim ("is better than...": how better? Or according to what criterion? This may be more relevant than the final conclusive general sentence that does not specifically address the significance of the findings).

      The abstract was revised to reinforce the functional perspective adopted to interpret brain activity produced by noxious stimuli and predicting various pain-relevant manifestations. We also mention explicitly the other pain-relevant signatures against which the FEPS is compared in this report, and we added a concluding sentence highlighting the importance of assessing the specificity of the FEPS in future studies.

      Introduction - background and rationale

      I would postpone the discussion around pain signature and anticipate the one about the brain mechanisms of facial expressions of pain. This will allow you to reinforce the logical flow of rationale, literature gap/question, why the problem is important, and study aims. Only then go for a review of relevant literature on signatures before providing a more specific final paragraph about the study-specific questions, expectations, and implementation. At the moment this is limited to a single very descriptive short paragraph at the end of the intro.

      The introduction was structured to guide the readers through a comprehensive understanding of different pain neurosignatures. The introduction aimed to establish a robust rationale for the subsequent analyses detailed in the results section. Indeed, the presentation of that literature ensured that the discussion around pain signatures is contextualised within a broader continuous framework. We acknowledge the reviewer’s comment on the limited description of the brain mechanisms of facial expression of pain. However, this was addressed in several previous reports of our laboratory (Kunz et al. 2011; Vachon-Presseau et al. 2016; Kunz, Chen, and Rainville 2020). We have added some more details about the brain mechanisms of facial expression, and highlighted those references in the first paragraph of the introduction.

      Methods and Results

      (1) Was there any indication of power based on the previous work or the other signature papers? If yes, how that would inform the present analysis?

      The Neurologic Pain Signature (NPS) was trained on 20 participants who experienced 12 trials at each of four different intensities. Effect sizes for the NPS were assessed in Han et al. (2022). That study revealed a moderate effect size for predicting between-subject pain reports, and a large one for predicting within-subject pain reports. We trained our model on 34 participants who underwent 16 trials. We expected our results to show a smaller effect size as the current experimental design only allowed us to examine spontaneous changes in the facial expression, as noted in the comments made by Reviewer #1. However, the best way to calculate the unbiased effect size of the results presented in the current study would be to test the unchanged model on new independent datasets (see Reddan, Lindquist, and Wager, 2017). Unfortunately, such datasets do not currently exist.

      (2) I would clarify to the reader what is meant by normal range of thermal pain and why is this relevant. Also, I did not find data about this assessment nor about the assessment of facial expressiveness (or reference to where it can be found).

      We changed this formulation to “All participants included in this study had normal thermal pain sensitivity” and we added a few references. By targeting a healthy population with normal thermal pain sensitivity, our study sought to identify a predictive brain pattern related to facial expression evoked by typical responses to pain that could eventually be generalised to other individuals from the same population. Details about the assessment of facial expressiveness have been added in the appropriate section in the Methods.

      (3) That pain ratings are only weakly associated with facial responses is, in its own right, an interesting finding, as a naïve reader would expect the two to be highly positively correlated. I'd suggest discussing this aspect (in reference to previous research) as it is interesting on both theoretical and empirical grounds.

      The likelihood and the strength of pain facial expression generally increase with pain ratings in response to acute noxious stimuli of increasing physical intensities, thereby leading to a positive association between the two responses that is driven by the stimulus. However, the poor correlation or the dissociation between facial pain expression and pain rating is a well-known phenomenon that can be demonstrated easily using experimental methods where the stimulus intensity is held constant and spontaneous fluctuations are observed in both facial expression and pain ratings. This result was not discussed in the current manuscript as it was already addressed in the work of Kunz et al. (2011) and Kunz, Karos and Vervoort (2018). We added the references to these studies in the revised manuscript (lines 330 to 334).

      (4) It may be worth having CIs throughout the whole set of analyses.

      Thank you for the suggestion; this was an oversight. The confidence intervals have been added in the manuscript where applicable.

      (5) I would clarify if there are two measures of the brain signature: dot-product and activation map. Relatedly, I cannot find where the authors explained what "FEPS pattern expression scores" are. Can the authors please clarify?

      The clarification has been added in the manuscript (lines 413 to 414).

      (6) There seems to be the assumption that the relationship between pain-relevant brain signatures and facial expressions of pain would be parametric and linear. However, this might not hold true. Did the authors test these assumptions?

      We indeed decided to use a linear regression technique (i.e. LASSO regression) to model the association between the brain activity and the facial expression of pain. The algorithm choice was mainly based on the simplicity and the interpretability of that approach, and our limited number of observations. The choice was also consistent with previous studies in the domain (e.g. Wager et al., 2011; Wager et al., 2013; Krishnan et al. 2016; Woo et al., 2017). Using a linear model, we were able to predict the facial expression evoked by pain from the fMRI activation at above-chance level. However, it is legitimate to think that more complex nonlinear models could better capture the brain patterns predictive of that behavioural manifestation of pain.
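
      To make the shrinkage behaviour of such a linear model concrete, the following minimal sketch fits a LASSO regression by coordinate descent on toy data. It is an illustration only, not the pipeline used for the FEPS: the principal-component step of LASSO-PCR, the intercept, and cross-validation are omitted, and the function names, the penalty value `alpha`, and the data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1 / 2n) * ||y - Xw||^2 + alpha * ||w||_1 without an intercept.

    X is a list of rows (observations), y a list of targets.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


# Toy data (made up): the target depends only on the first feature.
X = [[1.0, 0.1], [2.0, -0.1], [3.0, 0.1], [4.0, -0.1]]
y = [2.0, 4.0, 6.0, 8.0]
weights = lasso_coordinate_descent(X, y, alpha=0.5)
print(weights)  # the weight of the uninformative second feature is exactly 0.0
```

      The L1 penalty sets weak predictors exactly to zero, which is what makes the resulting weight map comparatively easy to interpret.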

      (7) Did the authors assess whether the FACS were better to be transformed/normalised? More generally, I would report any data assessment/transformation that has not been reported.

      Thank you for this highly relevant suggestion. FACS scores were indeed not normally distributed, and the analyses were conducted again to predict the log-transformed FACS scores. This transformation was effective in normalizing the distribution (skewness = 0.75, kurtosis = -0.84). The predictive model was confirmed on the transformed data.
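
      A check of this kind can be sketched as follows. The sketch assumes a log(1 + x) transform (the offset is an assumption made here so that zero scores stay defined; the exact transform is not specified above) and moment-based definitions of skewness and excess kurtosis. The example scores are made up.

```python
import math


def log_transform(scores):
    """Apply log(1 + x); the +1 offset is an illustrative assumption."""
    return [math.log1p(s) for s in scores]


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness (0 for a symmetric distribution)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical right-skewed scores: many low values and one high value.
raw_scores = [0.0, 0.0, 1.0, 2.0, 40.0]
transformed = log_transform(raw_scores)
print(skewness(raw_scores), skewness(transformed))
print(excess_kurtosis(raw_scores), excess_kurtosis(transformed))
```

      On these toy values the transform reduces the skewness, which is the behaviour the reported statistics are meant to document.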

      (8) Page 12: I am not clear on whether all the signatures are included in the same model (like a multiple regression) or if separate regressions are calculated per signature. The authors seem to imply that several regressions have been computed (possibly one per comparison with each signature?).

      The correlation between the FACS scores and the pain-related signatures was computed separately for each signature. This information has been clarified.

      (9) MVPA: See my main comment about warm trials and experimental/statistical design. For example, the LASSO regression model for the pain trials could be compared with a model using warm trials besides (or instead of) the unfitted model. Otherwise, add the warm trials as another predictor or within the subject level in a dummy fixed factor comprising pain and warm trials.

      The inclusion of warm trials in the model training would be inconsistent with the goal of the main analysis, which is to predict the facial expression of pain when a noxious stimulus is presented. Secondary analyses were conducted to compare the response of the FEPS to the warm trials with its response to the noxious pain trials. The dot product between the FEPS and the activation maps (FEPS scores) associated with the warm condition was computed. A linear mixed model was conducted to investigate the association between FEPS scores and the experimental condition (warm vs pain). Additional contrasts compared the warm trials with the pain trials with and without pain facial expression. The details of this analysis have been added to the manuscript (see Response of the FEPS to pain and warm in the Methods) as well as the corresponding results (see Results and Discussion).

      (10) I would clarify for the reader why the separate M1 analysis has been run. Although obvious, I feel the reader would benefit from the specific hypothesis about this control analysis being spelled out together with the other statistical hypotheses within the statistical design in a more streamlined manner.

      We extended the discussion on the rationale of that analysis and its interpretation, taking into account the most recent results using the log-transformed FACS scores (lines 125 to 133).

      (11) The mixed model aimed to assess the relationship between pain ratings, FEPS scores, and facial scores is a crucial finding. I believe it speaks to the importance of a more complete design, which I already highlighted. I have a couple of technical questions: did the authors assess random slopes too? And, what was the strategy used to determine the random effects structure?

      The linear mixed model treated the participants as a random effect, with random intercepts, to account for the grouping structure in our data (i.e., each participant completed multiple trials). The results reported in the original manuscript assumed fixed slopes. However, following the reviewer’s comment, we re-computed the linear mixed models allowing the slopes to vary according to the intensity ratings. The results were changed in the manuscript to represent the output of those models.
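
      For readers less familiar with this terminology, a random-intercept, random-slope model of this kind can be written as below. This is a generic sketch rather than the exact specification fitted in the manuscript: it assumes that $y_{ij}$ is the facial response on trial $i$ of participant $j$ and that $x_{ij}$ is the corresponding pain intensity rating; the symbols are introduced here for illustration.

```latex
y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\, x_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim \mathcal{N}\!\left(\mathbf{0}, \boldsymbol{\Sigma}\right),
\qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2})
```

      Setting $u_{1j} = 0$ for all participants recovers the fixed-slope, random-intercept model.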

      (12) The text from lines 63 to 67 could go in the methods.

      We decided to keep those lines within the Results and Discussion section to give the reader more detail about the FACS scores, as this term is referenced in the subsequent parts of that section. We are concerned that putting this information only in the Methods section would disrupt the reading.

      Reviewer #2 (Recommendations For The Authors):

      p. 4-5. When you report the positive weight clusters, you follow up with a sentence specifying which cognitive processes those brain regions are typically associated with. However, when you report the negative weight clusters, you do not specify the cognitive processes typically associated with those brain areas. I think that providing that information would be helpful to the readers.

      Thanks for noticing this omission. The information has been added in the most recent version of the manuscript (lines 119 to 121).

      p. 9. You specify that the degree of expressiveness of participants was evaluated. How did you evaluate expressiveness? Did you use this variable in your analyses? Were participants excluded based on their degree of expressiveness?

      Details about the assessment of facial expressiveness have been added in the appropriate section in the Methods (lines 285 to 289).

      p. 10. You explain that two certified FACS-coders evaluated the video recordings to rate the frequency of AUs. Could you please provide more details about the frequency measure? I think that there are different ways in which this could have been done. For instance, were the videos decomposed into frames, and then the frequency measured by summing the number of frames in which the AU occurred? Or was it "expression-based", so one occurrence of an AU (frequency of 1) would correspond to the whole period between its activation onset and offset? Both ways have pros and cons. For example, if the frequency represents the number of frames, then it controls for the total duration of the AU activation within a trial (pro); but if there were multiple activations/deactivations of the AU within one trial, this will not be controlled for (con). And vice-versa with the second way of calculating frequency.

      Details about the frequency scores have been added to the manuscript (lines 315 to 319).

      p. 11. When you explained how you calculated the association between the facial expression of pain and pain-related brain signatures, I felt that there was some information missing. Did you use the thresholded maps (available in the published articles), or did you somehow have access to the complete, voxel-by-voxel, raw regression coefficient maps?

      The unthresholded maps were used. The information has been clarified in the latest version of the manuscript, as well as the details about the availability of the maps (see Data Availability section at the end of the manuscript).

      Reviewer #3 (Recommendations For The Authors):

      Format

      The authors will notice that many observations about the manuscript are related to missing information and a lack of graphical representations. I believe the topic and the content of the manuscript are too complex to condense into a short report.

      Title

      The claim of the title is simply not substantiated by the content of the manuscript. Demonstrating that the FEPS is a distinctive (i.e., specific) marker of pain processing requires a substantially different experimental design, with more rigorous controls and a broader set of painful stimulations. The manuscript would benefit from a more accurate title.

      We agree that the title could better align with our findings. We modified the title accordingly: “A distributed brain response predicting the facial expression of acute nociceptive pain”.

      Abstract

      I find it puzzling that the authors claim that there is limited knowledge of the neural correlates of facial expression of pain given what they describe in the first paragraph of the introduction. Besides, they propose to reanalyze a dataset that has been extensively described in Kunz et al. (2011), which is unlikely to provide any new significant information.

      We respectfully disagree with that comment. We consider that three articles on the topic (i.e., Kunz et al., 2011; Vachon-Presseau et al., 2016; Kunz, Chen and Rainville, 2020) do constitute limited knowledge, especially compared with the very large body of literature on the neural correlates associated with pain ratings. Except for these three studies, all the other citations pertain to behavioral studies on facial expression of pain, and do not examine the brain activity related to it. Furthermore, we believe that the complementary nature of the analyses performed in Kunz et al. (2011) and in this manuscript offers new insights into our understanding of facial expression in the context of pain. Indeed, the multivariate approach used in this study addresses some limitations of the univariate analyses in Kunz et al. (2011), mainly in that it provides a quantifiable way to compare the similarity between different predictive patterns (Reddan and Wager, 2017). We submit that the assessment of the FEPS against several other pain-relevant signatures provides new and important information.

      Furthermore, the abstract does not clearly state the aim, and the first line of the results does not match what the authors claim in the preceding line. The take-home message (last sentence) introduces the concept of a biomarker, which, as stated before, cannot be validated with the current data/experimental design. To put it in plain words, a given facial expression (or a composite score derived from a combination of expressions) cannot be a specific biomarker for pain, because a person can always mimic the same expression without feeling pain. Whether a given facial expression can be predicted from brain activity is a different issue, and whether that prediction can differentiate between painful and non-painful origins of the facial expression is another different issue. Unfortunately, neither of those issues can be tested with the current data/experimental design. The abstract would improve if the authors would circumscribe to what they actually tested, which is accurately described in the last sentence of the Introduction.

      The abstract was revised accordingly. The term ‘biomarker’ was used in accordance with preceding studies in the field (see Reddan and Wager, 2017; Lee et al., 2021). Please note that we applied the same reasoning to fluctuations in pain expression as previous studies have applied to pain ratings. Of course, we cannot dismiss the possibility of someone mimicking facial expressions. Similar reasoning applies to subjective reports, as individuals can intentionally overestimate their pain experience conveyed through verbal reports. This is another case of specificity testing that cannot be addressed in the present study (see new conclusion of the abstract and discussion of limitations). The challenge of pain assessment is a classical problem within both the scientific and the clinical literature. Here, we suggest that the consideration of multiple manifestations of pain is necessary to address this challenge and will provide a more comprehensive portrait of pain-related brain function.

      Introduction

      I believe that the Introduction would benefit from a strict definition of what is a marker/biomarker/neuromarkers (all those terms are used in the manuscript) and what are its desirable features (validity, reliability, specificity, etc.). I also believe that the Introduction (and the rest of the text) would benefit from a critical assessment of the term "signature". The Introduction describes four existing "signatures", all of them differing in the experimental condition in which acute nociceptive pain is studied, and proposes a fifth one. Keeping with the analogy, I'm wondering whether they should be called (pain) "signatures" if there is a different one for each experimental acute pain condition, and they are so dissimilar between them when they are tested on the same condition (this dataset).

      The last part of that comment raises potential fundamental methodological limitations that deserve to be addressed in more depth in a separate article, as they go beyond the scope of the present research article. Regarding the stability of the signatures, most of them have not been studied extensively, so it is currently difficult to assess their reliability. However, Han et al. (2022) showed high within-individual test-retest reliability for the NPS across eight different studies. Given that pain is a multidimensional experience, it is not surprising to find different patterns of activation predictive of different aspects or dimensions of the pain experience (see Čeko et al., 2022 for a similar discussion applied to negative affect).

      The authors state that "As an automatic behavioral manifestation, pain facial expression might be an indicator of activity in nociceptive systems, perceptual and evaluative processes, or general negative affect." Doesn't it reflect all three of them? (and instead of or?) Why "might"?

      The original sentence has been modified as follows: “As an automatic behavioral manifestation, pain facial expression is considered to be an indicator of activity in nociceptive systems, and to reflect perceptual and affective-evaluative processes” (lines 65 to 67).

      Methods

      The pain scale should be described. Kunz et al. used a 0-100 scale, where 50 was the pain threshold. This is crucial to interpret the 75-80/100 score for the painful thermal intensity.

      The description of the pain scale has been added to the manuscript (lines 299 to 300).

      Ratings for warm and painful temperatures should be reported (ideally plotted with individual-trial/subject data). In the same line of reasoning, FACS scores should be reported as well (ideally plotted with individual-trial/subject data). It would be interesting to explore the across-trial variability of pain ratings and FACS scores. That is, do people keep giving the same ratings and making the same facial expression after 16 trials? How much variability is between trials and between subjects?

      The point raised in that comment was already addressed in response to a comment made by Reviewer #1 (also see the new Figures S2 and S4; see also lines 335 to 346).

      How come only painful trials are analyzed? What if the FEPS signature was the same for warm and painful stimulation, thus reflecting the settings (fMRI experiment, stimulation, etc.) rather than the brain response to the stimuli?

      The point raised in that comment was already addressed in response to a comment made by Reviewer #1. There was no pain expression in the warm trials and the FEPS shows no response to warm trials. This is now illustrated in the new Figure S4B (see also lines 138 to 158).

      The authors propose to predict the trial-by-trial FACS composite score from the pain ratings using a LMM. However, it is interesting that they aim for an almost constant within- and between-subject pain score (75-80/100) as stated in the Methods. This should theoretically render the linear model invalid since its first (and main) assumption would be that FACS should vary linearly with the pain score. Even if patients were not aware that the temperatures were constant across trials, the variation in pain scores should be explained by random noise for a constant stimulation intensity.

      Reviewer #3 raises an important point that we need to clarify. Contrary to the expectation that FACS responses should be strongly correlated to pain ratings, we posited that these response channels depend at least in part on separate brain networks that may be differentially sensitive to a variety of modulatory mechanisms (attention, emotion, expectancy, motor priming, social context, etc.). This implies that part of the variance in FACS is independent from pain ratings. We, therefore, consider what Reviewer #3 refers to as random noise to be relevant and meaningful fluctuations reflecting endogenous processes influencing one’s experience of pain and differentially affecting various output responses.

      I noticed that fMRI data was analyzed with SPM5 in the original paper (Kunz et al., 2011) and with SPM8 in this manuscript. Was fMRI data re-processed for this manuscript? Were there any differences between the original analysis and this one that might induce changes in the interpretation of results?

      The data were indeed re-processed using SPM8, which was the most recent version available when we started the analyses reported here. We used trial-by-trial activation maps for MVPA, which differs from what was used in the previous study (contrast maps at the level of the conditions, not the trials). We have no reason to believe that the different versions will change the message of this manuscript since those versions do not differ significantly in terms of the fMRI preprocessing pipeline (see SPM8 release notes; https://www.fil.ion.ucl.ac.uk/spm/software/spm8/). Furthermore, the aim of this present study is not to compare the different analysis parameters implemented in SPM5 vs SPM8.

      What is the rationale for including PVP in the comparison among signatures? The experimental settings in which it was devised are distant from those described here.

      The inclusion of the PVP was aimed at enhancing our comparative analysis with the FEPS, as we sought to investigate the potential functional meaning of the FEPS. The PVP was developed to capture the aversive value of pain, a dimension that is conceptually proximal to the interpretation of the facial expression as a manifestation of the affective response to nociceptive pain.

      The LASSO-PCR approach is, in my opinion, not a procedure for (brain) decoding in this context. It is accurately described in the section title as a method for multivariate pattern analysis, or as a variable selection and regularization method for a prediction model. Here, brain activity in specific areas related to pain processing can hardly be described as "encoded", and the method just helps select those activations relevant for explaining a certain outcome (in this case, facial expressions).

      We understand the point made by Reviewer #3. The term “brain decoding” was replaced with “multivariate pattern analysis” in the latest version of the manuscript.

      Details are missing with regards to the dataset split into training, validation, and testing.

      Details about the training and testing procedure were added in the manuscript (lines 383 to 385).

      This might just be ignorance from me, so I apologize in advance, but what are "contrast" fMRI images? They are mentioned three times in the text but not really described. Are they the "Pain > Warm" contrasts from the original paper?

      We apologize for any confusion caused by the use of the term “contrast images” which suggests a direct comparison between two experimental conditions. We have replaced “contrast images” with “activation maps” to provide a more accurate description of the nature of the data used in the multivariate pattern analysis (lines 388 to 389).

      In the "Facial expression" section, the authors run an LMM to test the association between pain ratings (response variable) and facial responses (explanatory variable). If I understand correctly, in the "Multivariate pattern analysis" section they test the association between facial composite scores (response variable) and pain ratings (explanatory variable), but they obtain different results.

      The analyses were recomputed on the log-transformed data, as mentioned previously in the response to Reviewers #1 and #2. The first model (in the “Facial expression” section) used the log-transformed FACS scores as the dependent variable, the pain ratings as the fixed effect, and the participants as the random effect. The results of that analysis suggested that the transformed facial expression scores were not significantly associated with the pain ratings (p = .07). The second model used both the FEPS pattern expression scores and pain ratings as fixed effects to predict facial responses. This analysis showed a significant contribution of the FEPS to the prediction of FACS scores (p < .001) and no significant effect of the pain ratings. However, a significant interaction was found (p = .03), suggesting that the prediction of the pain facial expression by the FEPS may vary with pain ratings (i.e. a moderator effect). Those results have been clarified in the “Multivariate pattern analysis” section in the Methods (lines 416 to 426).
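
      The moderator interpretation can be read off the fixed-effects part of such a model. The sketch below is a generic formulation under stated assumptions, not the exact model in the manuscript: $y_{ij}$ denotes the log-transformed FACS score on trial $i$ of participant $j$, a participant-level random intercept $u_{0j}$ is assumed, and any random slopes are left out for brevity.

```latex
y_{ij} = \beta_0
       + \beta_1\,\mathrm{FEPS}_{ij}
       + \beta_2\,\mathrm{Rating}_{ij}
       + \beta_3\,\bigl(\mathrm{FEPS}_{ij} \times \mathrm{Rating}_{ij}\bigr)
       + u_{0j} + \varepsilon_{ij},
\qquad
\frac{\partial y_{ij}}{\partial\,\mathrm{FEPS}_{ij}} = \beta_1 + \beta_3\,\mathrm{Rating}_{ij}
```

      A non-zero $\beta_3$ means that the slope relating the FEPS score to the facial response changes with the pain rating, which is what a moderator effect refers to.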

      In this same section, what are "FEPS pattern expression scores"? They are used three times in the text, but I could not find their description.

      The FEPS pattern expression scores correspond to the dot product between the trial-by-trial activation maps and the unthresholded FEPS signature. This information has been added to the manuscript (lines 413 to 414).
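
      As a minimal illustration of this definition, the sketch below computes one pattern expression score per trial as the dot product between a flattened activation map and a flattened weight map. The function names and the three-voxel toy vectors are hypothetical; real maps contain many thousands of voxels restricted to a common brain mask.

```python
def pattern_expression(activation_map, signature_weights):
    """Dot product between one trial's activation map and the signature weights.

    Both inputs are flat sequences with one value per voxel, in the same order.
    """
    if len(activation_map) != len(signature_weights):
        raise ValueError("activation map and signature must cover the same voxels")
    return sum(a * w for a, w in zip(activation_map, signature_weights))


def pattern_expression_scores(trial_maps, signature_weights):
    """One score per trial."""
    return [pattern_expression(m, signature_weights) for m in trial_maps]


# Toy example with three voxels and two trials (values are made up).
signature = [0.5, -1.0, 0.0]
trials = [[1.0, 2.0, 3.0], [2.0, 0.0, 1.0]]
print(pattern_expression_scores(trials, signature))  # [-1.5, 1.0]
```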

      It would not be far-fetched to hypothesize that FACS scores could be predicted using solely activity from the motor cortex. The authors attempted to do this, but only with information from M1. Why did they not use the entire motor cortex, or better, regions of the motor cortex directly linked with the AUs described in the manuscript?

      The selection of the primary motor area (M1) was based on the results of Kunz et al. (2011). In that study, M1 showed the strongest correlation with facial expression of pain. There are numerous possible combinations of multiple brain regions, considering a variety of criteria based on distributed networks involved in motor, affective, or pain-related processes. We limited our exploration to the region with the strongest hypothesis due to practical feasibility concerns.

      Results and Discussion

      As a general recommendation, results should present individual data whenever possible. For example, the association between signatures and facial expression should be plotted using scatterplots.

      We have added figures showing individual data when it was applicable (Figure S2; Figure S4).

      The authors state that the LASSO-PCR model accounts for the facial responses to pain. I believe this is an overstatement, considering:

      - A Pearson's r of 0.49 is usually considered low/weak correlation (moderate at best). In the same line, an R2 of 0.17 means that only 17% of the variance is explained by the model.

      More nuanced interpretation of the results has been added to the discussion. A section has been added to highlight the limitations of the study.

      - Figure 1 needs to display individual subject data and the ideal regression line.

      The model was trained using a k-fold cross-validation procedure. The regression lines thus represent the model’s prediction for each one of the 10 folds (i.e. each fold is trained and tested on a different subset of the data). A scatter plot including the ideal regression line computed across all trials and subjects was added in supplementary material to illustrate the relation between the FACS scores and the FEPS pattern expression scores (Figure S4).

      - Looking at Figure 1, it is clear that the model has an intercept different from zero. This means that when the FACS score was zero (i.e., volunteers did not make any distinguishable facial expression), the model predicted a score larger than zero. This is not discussed in the manuscript, and in simple terms, it means that there are brain activation patterns when no discernible facial expression is being made by the volunteers. In the original paper by Kunz et al., two groups of subjects were categorized, and one of them was a facially low- or non-expressive group (n=13). This fact is not even mentioned in the manuscript.

      The categorization in the previous report (Kunz et al., 2012) was based on a pre-experimental session. All subjects were included in the current analysis. This is now indicated in the Methods (lines 287 to 289).

      - On the other end of the range in Figure 1, differences between the FACS scores near the maximum range (40) are underestimated by 23 to 33 points! I guess that the RMSE is smaller (6-7 points), because many FACS scores are concentrated on the low end of the scale.

      This is a very interesting comment. A section discussing the limits of the model to predict the lower and higher FACS scores has been added in the manuscript (lines 232 to 250).
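
      The three performance metrics discussed in the points above (Pearson's r, R2, and RMSE) capture different aspects of prediction quality. The following sketch, on made-up numbers, assumes that R2 is computed as the coefficient of determination (1 minus the ratio of residual to total sum of squares); under that definition R2 need not equal the square of r, because r ignores systematic over- or under-estimation whereas R2 and RMSE do not. The function names are hypothetical.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    mo = sum(observed) / len(observed)
    mp = sum(predicted) / len(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mo = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mo) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared prediction error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Made-up predictions that track the observed scores perfectly in rank and
# shape but compress the range (low scores overestimated, high ones underestimated).
observed = [0.0, 10.0, 20.0, 30.0, 40.0]
predicted = [5.0, 10.0, 15.0, 20.0, 25.0]
print(pearson_r(observed, predicted))  # 1.0
print(r_squared(observed, predicted))  # 0.625
print(rmse(observed, predicted))       # about 8.66
```

      In this toy case the correlation is perfect while R2 is well below 1, which shows why range compression at the extremes of the scale is better diagnosed with R2, RMSE, and scatter plots than with r alone.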

      It is of course acceptable to interpret the low similarity between signatures as a sign that each signature describes a different mechanism related to pain processing. However, I believe that a complete discussion should contemplate other competing hypotheses. Considering that all signatures were developed using a similar painful thermal stimulation protocol, it is reasonable to expect larger similarities between signatures. The fact that they are so dissimilar could be a reflection of model overfit, i.e., all these signatures are just fitted to these particular experimental protocols and data, and do not generalize to brain mechanisms of pain processing.

      We appreciate the pertinent observation. We have included a limitations section in which we discussed, among other considerations, the possible overfitting of models and the necessity of pursuing generalizability studies (lines 225 to 268).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This is an important study on the regulation of chlorophyll biosynthesis in rice embryos. It provides insights into the genetic and molecular interactions that underlie chlorophyll accumulation, highlighting the inhibition of OsGLK1 by OsNF-YB7 and the broader implications for understanding chloroplast development and seed maturation in angiosperms. The results presented, including mutation analysis, gene expression profiles, and protein interaction studies, provide convincing evidence for the function of OsNF-YB7 as a repressor in the chlorophyll biosynthesis pathway.

      Thank you very much for your positive assessment of our manuscript. We have carefully revised the manuscript according to the reviewers’ valuable suggestions and comments. For more details, please see the point-by-point response to the reviewers below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript investigates the regulation of chlorophyll biosynthesis in rice embryos, focusing on the role of OsNF-YB7. The rigorous experimental approach, combining genetic, biochemical, and molecular analyses, provides a robust foundation for these findings. The research achieves its objectives, offering new insights into chlorophyll biosynthesis regulation, with the results convincingly supporting the authors' conclusions.

      Strengths:

      The major strengths include the detailed experimental design and the findings regarding OsNF-YB7's inhibitory role.

      Weaknesses:

      However, the manuscript's discussion on the practical implications for agriculture and the evolutionary analysis of regulatory mechanisms could be expanded.

      Thank you for your insightful comments and suggestions. In the revised manuscript, we discussed the potential application of the chlorophyllous embryo (please see lines 270-274). The presence of chlorophyll in the embryo facilitates photosynthesis at early developmental stages, potentially leading to improved seedling growth and vigor (Smolikova and Medvedev, 2016). In crops such as soybean and canola, the green embryo is considered a valuable trait because of its association with enhanced photosynthetic capacity, which in turn promotes fatty acid biosynthesis (Ruuska et al., 2004). However, chlorophyll degradation must be carefully managed during seed maturation to avoid negative effects on seed viability and meal quality (Chung et al., 2006). Interestingly, the green embryo of lotus (Nelumbo nucifera) is widely used as a food ingredient in Asia, Australia, and North America. It is employed in herbal medicine to treat nervous disorders, insomnia, and other conditions (Zhu et al., 2017; Ha et al., 2022), highlighting the significant potential value of the green embryo.

      In many chloroembryophytes, such as Arabidopsis, the embryo occupies a large proportion of the seed. From an evolutionary perspective, the presence of chlorophyll in the embryo may promote adaptation in such chloroembryophytes because more reserves can be accumulated in the seed through active photosynthesis, better supporting embryo development and subsequent seedling growth (Sela et al., 2020). On the other hand, some leucoembryophytes, such as rice, have a persistent endosperm rich in storage reserves to nourish embryo development (Liu et al., 2022). Gaining the ability to accumulate chlorophyll in the embryo is unnecessary for such species. In agreement with this hypothesis, chlorophyllous embryos are more prevalent in non-endospermous seeds (Dahlgren, 1980). However, we would like to emphasize that the evolutionary force driving the divergence of chloroembryophytes and leucoembryophytes is currently almost completely unknown and deserves in-depth investigation in the future. We discussed the possible evolution of the ability to accumulate chlorophyll in the embryo; please find the details in lines 276-295.

      Reviewer #2 (Public Review):

      Summary:

      The authors set out to establish the role of the rice LEC1 homolog OsNF-YB7 in embryo development, especially as it pertains to the development of photosynthetic capacity, with chlorophyll production as a primary focus.

      Strengths:

      The results are well-supported and each approach used complements each other. There are no major questions left unanswered and the central hypothesis is addressed in every figure.

      Weaknesses:

      There are a handful of sections that could use clarifying for readers, but overall this is a solidly composed manuscript.

      The authors clearly achieved their aims; the results compellingly establish a disparity between how this system operates in rice and Arabidopsis. Conclusions are thoroughly supported by the provided data and interpretations. This work will force a reconsideration of the value of Arabidopsis as a model organism for embryo chlorophyll biosynthesis and possibly photosynthesis during embryo maturation more broadly, as rice is a major crop organism and it very clearly does not follow the Arabidopsis model. It will thus be useful to carry out similar tests in other organisms rather than relying on Arabidopsis and attempting to more fully establish the regulatory mechanism in rice.

      Thank you very much for your positive comments. We have carefully revised the manuscript according to your and the other reviewers’ comments and suggestions. In particular, we emphasized the necessity of carrying out similar tests in other organisms, rather than relying on Arabidopsis, to better understand the regulatory mechanism in rice.

      Reviewer #3 (Public Review):

      Summary:

      In this study, the authors set out to understand the mechanisms behind chlorophyll biosynthesis in rice, focusing in particular on the role of OsNF-YB7, an ortholog of Arabidopsis LEC1, which is a positive regulator of chlorophyll (Chl) biosynthesis in Arabidopsis. They showed that OsNF-YB7 loss-of-function mutants in rice have chlorophyll-rich embryos, in contrast to Arabidopsis LEC1 loss-of-function mutants. This contrasting phenotype led the authors to carry out extensive molecular studies on OsNF-YB7, including in vitro and in vivo protein interaction studies, gene expression profiling, and protein-DNA interaction assays. The evidence provided well supported the core arguments of the authors, emphasising that OsNF-YB7 is a negative regulator of Chl biosynthesis in rice embryos by mediating the expression of OsGLK1, a transcription factor that regulates downstream Chl biosynthesis genes. In addition, they showed that OsNF-YB7 interacts with OsGLK1 to negatively regulate the expression of OsGLK1, demonstrating the broad involvement of OsNF-YB7 in rice Chl biosynthetic pathways.

      Strengths:

      This study clearly demonstrated how OsNF-YB7 regulates its downstream pathways using several in vitro and in vivo approaches. For example, gene expression analysis of OsNF-YB7 loss-of-function and gain-of-function mutants revealed the expression of selected downstream chl biosynthetic genes. This was further validated by EMSA on the gel. The authors also confirmed this using luciferase assays in rice protoplasts. These approaches were used again to show how the interaction of OsNF-YB7 and OsGLK1 regulates downstream genes. The main idea of this study is very well supported by the results and data.

      Weaknesses:

      From an evolutionary perspective, it is interesting to see how two similar genes have come to play opposite roles in Arabidopsis and rice. It would have been more interesting if the authors had carried out a cross-species analysis of AtLEC1 and OsNF-YB7. For example, overexpressing AtLEC1 in an osnf-yb7 mutant to see if the phenotype is restored or enhanced. Such an approach would help us understand how two similar proteins can play opposite roles in the same mechanism within their respective plant species.

      We appreciate your insightful comments and suggestions. Whether AtLEC1 can fully restore osnf-yb7 is a very interesting question, given the possible functional divergence between the two genes in the regulation of chlorophyll biosynthesis in the embryo. We have previously expressed OsNF-YB7 in the lec1-1 background in Arabidopsis, driven by the native promoter of LEC1 (Niu et al., 2021). We found that OsNF-YB7 could almost completely rescue the embryo defects in Arabidopsis, indicating that OsNF-YB7 plays a similar role in rice to that of LEC1 in Arabidopsis (Niu et al., 2021). We sought to determine whether AtLEC1 can complement the chlorophyll defect in osnf-yb7. However, osnf-yb7 shows a severe callus induction defect, which is not surprising because many studies have shown that LEC1 is indispensable for somatic embryo development in various plant species, so we are struggling to obtain the genetic materials for analysis. We have to transform OsNF-YB7pro::AtLEC1 into the WT background first, and then cross the transformant with the osnf-yb7 mutant. This is a time-consuming process in rice, but hopefully we will be able to isolate a line expressing OsNF-YB7pro::AtLEC1 in the osnf-yb7 background from the resulting segregating population.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A minor comment regarding the chlorophyll contents quantification in the study. Line 87: "The results showed that WT had an achlorophyllous embryo throughout embryonic development,...." In the TEM result, chloroplast was not observed in the WT embryo sections, indicating a lack of chlorophyll-containing structures, contrary to what was found in the osnf-yb7 embryos where chloroplasts were observed.

      The authors stated that the embryo morphologies and Chl autofluorescence data showed that WT had an achlorophyllous embryo throughout embryonic development. However, the quantification of Chl levels in Figure 1D and Figure 4C showed that WT does produce some chlorophylls, albeit at lower levels than osnf-yb7 or OSGLK-OX embryos (WT values in the two figures are slightly different). This discrepancy warrants clarification to ensure consistency and accuracy in the manuscript's findings.

      We re-evaluated the Chl content in the embryos of WT and OsGLK1-OX mature seeds. The result confirmed our previous finding that WT embryos produce a small amount of chlorophyll (please see the updated Fig. 4C). Notably, we observed that the dark-grown etiolated plants still have measurable chlorophyll content as reported in many studies (for example, Wang et al., 2017; Yoo et al., 2019), suggesting that there is potential bias in measuring chlorophyll content using an absorbance-based approach. We assume this possibly explains the concern you have raised.

      Reviewer #2 (Recommendations For The Authors):

      Mild editing for grammar is needed throughout, e.g. line 73, "It is still a mysterious why plant species".

      We have carefully edited the grammar.

      As a minor point, the placement of figure panels, such as in Figure 1, is not always intuitive.

      Thank you for your suggestion. This figure has been revised as suggested. Please see the updated Fig. 1.

      What is the significance of the two GFP mutants in Figures 2C and 2D? Is one of those the mislabeled Flag mutant?

      The lines shown in Fig. 2C and D were not mislabeled. They were two independent transgenic events, both of which showed that OsNF-YB7 inhibited the expression of OsPORA and OsLHCB4 in rice. Transgenic lines overexpressing OsNF-YB7 tagged with 3× Flag (NF-YB7-Flag) were also used for this experiment. In agreement, OsPORA and OsLHCB4 were significantly downregulated in the three independent NF-YB7-Flag lines (Fig. S4C), confirming the results shown in Fig. 2C and D.

      In Figures 2G and 2H, what is that enormous band at the bottom of the gel?

      The bands at the bottom of the gel were free probes. We indicated this in the revised figure.

      Not until the Materials and Methods section did I realize that any of this study was being done in tobacco; the Introduction implies it's rice vs. Arabidopsis and it might be a good idea to mention the organism of study somewhere before Figure 6.

      We apologize for any confusion caused by our previous writing. While the majority of this study was performed with rice plants or protoplasts, the split complementary LUC assays and BiFC assays were performed with tobacco. We have specified these in the revised manuscript as suggested.

      Reviewer #3 (Recommendations For The Authors):

      It would be nice if the author could show what the phenotype is in AtLEC1 OX in osnf-yb7 and also OsNF-YB7 OX in atlec1 mutants.

      Thank you for your suggestion. We have previously expressed OsNF-YB7 in the lec1-1 background of Arabidopsis, driven by the native promoter of Arabidopsis LEC1 (Niu et al., 2021). Since OsNF-YB7 could rescue the embryo morphogenesis defects in Arabidopsis (Niu et al., 2021), we assumed that OsNF-YB7 plays a similar role in rice to that of LEC1 in Arabidopsis. However, it remains unknown whether expression of LEC1 in osnf-yb7 can restore the chlorophyllous embryo phenotype in rice. Because generating the genetic material is time-consuming, and especially because osnf-yb7 has a severe callus induction defect, we are struggling to obtain the complementation line for analysis. We have to transform OsNF-YB7pro::AtLEC1 into a WT background first, and then cross the transformant with the osnf-yb7 mutant. Hopefully, we will be able to isolate a line expressing OsNF-YB7pro::AtLEC1 in the osnf-yb7 background from the derived segregating population. We discussed the reviewer’s concern in the revised manuscript; please see lines 369-376.

      Line 46, I think it is vague to mention that 'Like most plant species'. Some species might have different copy numbers, for example, a single GLK in liverwort M. polymorpha.

      The statement has been revised. Please see Line 46.

      Figures 2F and 5B, why was only one promoter region used for OsLHCB4? It would be better to have more regions like OsPORA.

      Thank you for your comments. We have examined more promoter regions (P1, P2 and P3) in the revised manuscript as suggested. Among these, the previously selected promoter region (P3) contains both the G-box and CCAATC motifs that can potentially be recognized by GLK1. Consistent with our previous report, the results showed that OsNF-YB7 (left) and OsGLK1 (right) were associated with the P3 region, but showed no significant differences for the other probes. Please see the results in Fig. 2F and Fig. 5B of the revised manuscript.

      Legend of Figures 2G, H, OsPORA (I), and OsLHCB (J) should be (G) and (H) respectively.

      Corrected.

      References

      Chung, D.W., Pruzinska, A., Hortensteiner, S., and Ort, D.R. (2006). The role of pheophorbide a oxygenase expression and activity in the canola green seed problem. Plant Physiol 142, 88-97.

      Ha, T., Kim, M.S., Kang, B., Kim, K., Hong, S.S., Kang, T., Woo, J., Han, K., Oh, U., Choi, C.W., and Hong, G.S. (2022). Lotus Seed Green Embryo Extract and a Purified Glycosyloxyflavone Constituent, Narcissoside, Activate TRPV1 Channels in Dorsal Root Ganglion Sensory Neurons. J Agric Food Chem 70, 3969-3978.

      Liu, J., Wu, M.W., and Liu, C.M. (2022). Cereal Endosperms: Development and Storage Product Accumulation. Annu Rev Plant Biol 73, 255-291.

      Niu, B., Zhang, Z., Zhang, J., Zhou, Y., and Chen, C. (2021). The rice LEC1-like transcription factor OsNF-YB9 interacts with SPK, an endosperm-specific sucrose synthase protein kinase, and functions in seed development. Plant J 106, 1233-1246.

      Ruuska, S.A., Schwender, J., and Ohlrogge, J.B. (2004). The capacity of green oilseeds to utilize photosynthesis to drive biosynthetic processes. Plant Physiol 136, 2700-2709.

      Sela, A., Piskurewicz, U., Megies, C., Mene-Saffrane, L., Finazzi, G., and Lopez-Molina, L. (2020). Embryonic Photosynthesis Affects Post-Germination Plant Growth. Plant Physiol 182, 2166-2181.

      Smolikova, G.N., and Medvedev, S.S. (2016). Photosynthesis in the seeds of chloroembryophytes. Russ J Plant Physiol 63, 1-12.

      Wang, Z., Hong, X., Hu, K., Wang, Y., Wang, X., Du, S., Li, Y., Hu, D., Cheng, K., An, B., and Li, Y. (2017). Impaired Magnesium Protoporphyrin IX Methyltransferase (ChlM) Impedes Chlorophyll Synthesis and Plant Growth in Rice. Front Plant Sci 8, 1694.

      Yoo, C.Y., Pasoreck, E.K., Wang, H., Cao, J., Blaha, G.M., Weigel, D., and Chen, M. (2019). Phytochrome activates the plastid-encoded RNA polymerase for chloroplast biogenesis via nucleus-to-plastid signaling. Nat Commun 10, 2629.

      Zhu, M., Liu, T., Zhang, C., and Guo, M. (2017). Flavonoids of Lotus (Nelumbo nucifera) Seed Embryos and Their Antioxidant Potential. J Food Sci 82, 1834-1841.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      As a reviewer for this manuscript, I recognize its significant contribution to understanding the immune response to saprophytic Leptospira exposure and its implications for leptospirosis prevention strategies. The study is well-conceived, addressing an innovative hypothesis with potentially high impact. However, to fully realize its contribution to the field, the manuscript would benefit greatly from a more detailed elucidation of immune mechanisms at play, including specific cytokine profiles, antigen specificity of the antibody responses, and long-term immunity. Additionally, expanding on the methodological details, such as immunophenotyping panels, qPCR normalization methods, and the rationale behind animal model choice, would enhance the manuscript's clarity and reproducibility. Implementing functional assays to characterize effector T-cell responses and possibly investigating the microbiota's role could offer novel insights into the protective immunity mechanisms. These revisions would not only bolster the current findings but also provide a more comprehensive understanding of the potential for saprophytic Leptospira exposure in leptospirosis vaccine development. Given these considerations, I believe that after substantial revisions, this manuscript could represent a valuable addition to the literature and potentially inform future research and vaccine strategy development in the field of infectious diseases.

      Reviewer #2 (Public Review):

      Summary:

      The authors try to achieve a method of protection against pathogenic strains using saprophytic species. It is undeniable that the saprophytic species, despite not causing the disease, activates an immune response. However, based on these results, using the saprophytic species does not significantly impact the animal's infection by a virulent species.

      Strengths:

      Exposure to the saprophytic strain before the virulent strain reduces animal weight loss, reduces tissue kidney damage, and increases cellular response in mice.

      Weaknesses:

      Even after the challenge with the saprophyte strain, kidney colonization and the release of bacteria through urine continue. Moreover, the authors need to determine the impact on survival if the experiment ends on the 15th.

      Reviewer #3 (Public Review):

      Summary:

      Kundu et al. investigated the effects of pre-exposure to a non-pathogenic Leptospira strain in the prevention of severe disease following subsequent infection by a pathogenic strain. They utilized a single or double exposure method to the non-pathogen prior to challenge with a pathogenic strain. They found that prior exposure to a non-pathogen prevented many of the disease manifestations of the pathogen. Bacteria, however, were able to disseminate, colonize the kidneys, and be shed in the urine. This is an important foundational work to describe a novel method of vaccination against leptospirosis. Numerous studies have attempted to use recombinant proteins to vaccinate against leptospirosis, with limited success. The authors provide a new approach that takes advantage of the homology between a non-pathogen and a pathogen to provide heterologous protection. This will provide a new direction in which we can approach creating vaccines against this re-emerging disease.

      Strengths:

      The major strength of this paper is that it is one of the first studies utilizing a live non-pathogenic strain of Leptospira to immunize against severe disease associated with leptospirosis. They utilize two independent experiments (a single and double vaccination) to define this strategy. This represents a very interesting and novel approach to vaccine development. This is of clear importance to the field.

      The authors use a variety of experiments to show the protection imparted by pre-exposure to the non-pathogen. They look at disease manifestations such as death and weight loss. They define the ability of Leptospira to disseminate and colonize the kidney. They show the effects infection has on kidney architecture and a marker of fibrosis. They also begin to define the immune response in both of these exposure methods. This provides evidence of the numerous advantages this vaccination strategy may have. Thus, this study provides an important foundation for future studies utilizing this method to protect against leptospirosis.

      Weaknesses:

      Although they provide some evidence of the utility of pretreatment with a non-pathogen, there are some areas in which the paper needs to be clarified and expanded.

      The authors draw their conclusions based on the data presented. However, they state the graphs only represent one of two independent experiments. Each experiment utilized 3-4 mice per group. In order to be confident in the conclusions, a power analysis needs to be done to show that there is sufficient power with 3-4 mice per group. In addition, it would be important to show both experiments in one graph which would inherently increase the power by doubling the group size, while also providing evidence that this is a reproducible phenotype between experiments. Overall, this weakens the strength of the conclusions drawn and would require additional statistical analysis or additional replicates to provide confidence in these conclusions.
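
      The power analysis requested above can be sketched with a normal-approximation formula for a two-sided comparison of two group means. This is only an illustrative sketch: the effect sizes are hypothetical placeholders (standardized differences, Cohen's d), not values taken from the study, and the helper names `n_per_group` and `approx_power` are introduced here for illustration. With only 3-4 animals per group the normal approximation is optimistic, so an exact t-based calculation would give somewhat larger group sizes.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals needed per group for a two-sided, two-group test.

    Normal-approximation formula: n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
    effect_size is the standardized mean difference (Cohen's d).
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


def approx_power(effect_size, n, alpha=0.05):
    """Approximate power of the same test with n animals per group."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(effect_size * math.sqrt(n / 2) - z_alpha)


if __name__ == "__main__":
    # Hypothetical effect sizes, chosen only to show how strongly the
    # required group size depends on the expected difference.
    for d in (1.0, 2.0):
        print(f"d={d}: n per group = {n_per_group(d)}")
    # Pooling two experiments of 4 mice each doubles n per group.
    print(f"power, n=4: {approx_power(2.0, 4):.2f}")
    print(f"power, n=8: {approx_power(2.0, 8):.2f}")
```

      Under these assumptions, groups of 3-4 animals reach conventional power only when the expected difference is very large, which is why pooling the two independent experiments would matter.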

      A direct comparison between single and double exposure to the non-pathogen is not able to be determined. The ages of mice infected were different between the single (8 weeks) and double (10 weeks) exposure methods, thus the phenotypes associated with LIC infection are different at these two ages. The authors state that this is expected, but do not provide a reasoning for this drastic difference in phenotypes. It is therefore difficult to compare the two exposure methods, and thus determine if one approach provides advantages over the other. An experiment directly comparing the two exposure methods while infecting mice at the same age would be of great relevance to and strengthen this work.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major Comments

      (1) Elucidation of Immune Mechanisms: The manuscript intriguingly suggests that exposure to saprophytic Leptospira primes the host for a Th1-biased immune response, contributing to survival and mitigation of disease severity upon subsequent pathogenic challenge. However, the underlying mechanisms remain broadly defined. A more detailed investigation into the cytokine profiles, particularly the levels of IFN-γ, IL-12, and other Th1-associated cytokines, could clarify the mechanism of Th1 bias. Moreover, exploring the role of antigen-presenting cells (APCs) in priming T cells towards a Th1 phenotype would add valuable insights.

      In this study we continue to elucidate the immune mechanisms engaged by pathogenic and non-pathogenic Leptospira as a follow up to our previous work (Shetty et al, 2021 PMID: 34249775, and Kundu et al 2022 PMID 35392072). We, and others, have shown that saprophytic L. biflexa and pathogenic L. interrogans induce major chemo-cytokines associated with Th1 biased immune responses (Shetty et al. 2021; Cagliero et al. 2022; Krangvichian et al. 2023) and engage myeloid immune cells such as macrophages and dendritic cells. The role of antigen presenting cells such as dendritic cells in priming T cells and activating adaptive response is a separate question and can be addressed in the future. To further address this question, a recent mechanistic study (Krangvichian et al. 2023) showed that non-pathogenic leptospires (L. biflexa) promote MoDC maturation and stimulate the proliferation of IFN-γ-producing CD4+ T cells and potentially elicit a Th1-type response in mice, which also supports our current claim and it is referenced in our manuscript.

      (2) Quantitative Analysis of Kidney Colonization: The manuscript reports that pre-exposure to L. biflexa did not prevent the colonization of kidneys by L. interrogans but led to a more regulated immune response and reduced fibrosis. A more nuanced quantification of bacterial loads in the kidneys, using techniques such as CFU counting or more sensitive qPCR methods, could provide a clearer picture of how saprophytic exposure affects the ability of pathogenic Leptospira to establish infection. Additionally, a time-course study showing the kinetics of bacterial colonization and clearance post-infection would be informative.

      We are currently validating digital PCR to use in the future and plan to do time course studies.

      (3) Characterization of B Cell and T Cell Responses: While the manuscript mentions increased B cell frequencies and effector T helper cell responses, specifics regarding the nature of these responses are lacking. For instance, detailing the isotype and specificity of antibodies produced, the proliferation rates of specific B and T cell subsets, and their functional capabilities (e.g., cytotoxicity, help for B cells) would significantly enrich the understanding of the immune response elicited by pre-exposure to saprophytic Leptospira.

      Indeed, additional experiments need to be conducted to flesh out the immune responses engaged after pre-exposure to saprophytic Leptospira followed by LIC challenge.

      (4) Comparative Analysis with Other Models of Pre-exposure: The study primarily focuses on pre-exposure to a live saprophytic Leptospira. Including a comparison with pre-exposure to killed saprophytic bacteria, or even to other non-pathogenic microbes, could help discern whether the observed protective effect is unique to live saprophytic Leptospira exposure or if it represents a more general phenomenon of trained immunity.

      Regarding the use of other non-pathogenic microbes, our lab has shown in the past that oral use of probiotic strain Lactobacillus plantarum (Potula et al 2017) also reduces the severity of Leptospirosis by recruiting myeloid cells. Thus, there may be a general phenomenon of trained immunity involved. We added this to the discussion.

      (5) Assessment of Long-term Immunity: The study provides valuable insights into the short-term outcomes following saprophytic Leptospira exposure and subsequent pathogenic challenge. Extending these observations to assess long-term immunity, including memory B and T cell responses several months post-infection, would be crucial for understanding the potential of saprophytic Leptospira exposure in providing lasting protection against leptospirosis.

      Long term immunity is a complex and separate question that we plan to address later.

      Minor Comments

      (1) Technical Specifics of Flow Cytometry Analysis: The manuscript could benefit from including more details on the flow cytometry gating strategy and the specific markers used to identify different immune cell subsets. This addition would aid in the reproducibility of the results and allow for a clearer interpretation of the immune profiling data.

      We included the technical specifics of the flow cytometry analysis in the Materials and Methods section. The gating strategy (Fig S1) and the specific markers (Table S1) used to identify different immune cell subsets were incorporated in the supplementary datasheet. The cell-specific markers were added to the figures (Fig 5 and 6) under each representative cell subset, which facilitates clarity and reproducibility of the immune profiling.

      (2) Statistical Methodology for IgG Subtyping: The analysis of IgG subtypes in response to Leptospira exposure is intriguing but would be strengthened by specifying the statistical tests used to compare IgG1, IgG2a, and IgG3 levels between groups. Additionally, discussing the biological significance of the observed differences in IgG subtype levels would provide a more comprehensive understanding of the immune response.

      We applied an ordinary one-way ANOVA to compare the IgG subtypes between groups, followed by Tukey’s multiple comparison test (included in the figure legend of Fig 4). We addressed the biological relevance of the observed differences in IgG subtype levels in the discussion section.
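
      For readers unfamiliar with the test named above, the F statistic of a one-way ANOVA can be sketched as follows. The numbers are hypothetical placeholders, not data from the study, and the function name `one_way_anova_f` is introduced here for illustration. The sketch stops at the F statistic and its degrees of freedom; the p-value and the Tukey post hoc comparisons would normally come from a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups is a list of lists, one list of measurements per group.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each measurement around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical measurements for three groups (placeholder values only).
    example = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    f_stat, df_b, df_w = one_way_anova_f(example)
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

      A large F means the group means differ more than the within-group scatter would predict; the post hoc test then identifies which pairs of groups differ.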

      (3) Details on Animal Welfare and Ethical Approval: While the manuscript mentions compliance with institutional animal care and use committee protocols, providing the specific ethical guidelines followed, such as the 3Rs (Replacement, Reduction, Refinement), would reinforce the commitment to ethical research practices.

      This is addressed in our institutional IACUC protocol, which is approved and listed in Methods.

      (4) Clarification of Figure Legends: Some figure legends are brief and could be expanded to more thoroughly describe what the figures show, including details on what specific data points, error bars, and statistical symbols represent.

      We updated and expanded the figure legends (Fig 1-4).

      (5) Revision of Introduction and Background: The introduction provides a good overview of leptospirosis and the rationale behind the study. However, it could be further improved by briefly summarizing current challenges in vaccine development against leptospirosis and how understanding the immune response to saprophytic Leptospira could address these challenges.

      We revised the introduction keeping this comment in mind.

      Reviewer #2 (Recommendations For The Authors):

      - Perform the same challenge experiment with a hamster.

      We clarified throughout the manuscript that all the work was done using the C3H-HeJ mouse model which was developed in our lab for the purpose of measuring differences in sublethal and lethal LIC infections. We leave the experiments using hamster to the investigators that have thoroughly validated the hamster model of lethal Leptospira infection.

      - Review the written part where it is understood that the challenge with saprophyte strain before virulence prevents the disease.

      We revised the manuscript to make clear that inoculating mice with a saprophytic Leptospira before pathogenic challenge prevents severe leptospirosis, promotes kidney homeostasis, and increases shedding of Leptospira in urine, which is an interesting finding. The last 2 sentences of the abstract read: “Thus, mice exposed to live saprophytic Leptospira before facing a pathogenic serovar may withstand infection with far better outcomes. Furthermore, a status of homeostasis may have been reached after kidney colonization that helps LIC complete its enzootic cycle.”

      Reviewer #3 (Recommendations For The Authors):

      (1) Line 83: The authors refer to the classification of Leptospira by old nomenclature. The bacteria are now categorized into clades P1, P2, S1 and S2. See Vincent et al. Revisiting the taxonomy and evolution of pathogenicity of the genus Leptospira through the prism of genomics. PLoS Negl Trop Dis. 2019 May 23;13(5):e0007270. doi: 10.1371/journal.pntd.0007270. PMID: 31120895; PMCID: PMC6532842.

      We have included the categories (S1 for L. biflexa and P1+ for L. interrogans) in introduction and methods but we did not update the figures because we want to be specific about the species used in these experiments. We also include a few sentences on evolution of Leptospira species in discussion and reference Thibeaux 2018, Vincent 2019 and Giraud-Gatineau, 2024.

      (2) Line 133: Please remove the extra line to be consistent with the rest of the method section format.

      We addressed all formatting issues.

      (3) Line 137: Are these primers specific to pathogenic L. interrogans? Or do they cross react with L. biflexa? If not specific, how long does L. biflexa stick around after infection?

      The primers are specific to the genus Leptospira. Surdel et al. (2022) used a 16S rRNA target sequence to amplify L. biflexa Patoc in mice at 6 hours post-infection. We did not detect any sample positive for L. biflexa with the 16S rRNA primer set because we performed our analysis 30 and 45 days post-inoculation with L. biflexa. We clarified this issue in methods and results.

      (4) Statistical analysis:

      (a) Some of your graphs have more than 4 points on them (such as Figure 4), while the legend still reads "represents one of two independent experiments". Are these actually combined replicates in the same graph? Combining them would provide strength to your conclusions throughout your manuscript and may provide stronger power for comparisons. If they are not included, why are they not included together? Please clarify what is included in each graph, and why the two experiments were not included together.

      We updated the legends with the total number of mice used in the experiment represented in the figure. Figures 1, 2, 4 and S2 contain the combined results from two independent experiments. Figures 3, 5 and 6 represent data from one of two independent experiments. For Fig 3 it would be redundant to show HE images of two experiments. Regarding Figs 5 and 6, the flow cytometer acquires data at a different voltage in every run, and biological samples vary between experiments even when all the markers and procedures are the same. We therefore reproduce the experiment and show results from one experiment after confirming that the trends in the individual experiments are the same.

      (b) If ANOVA was used, were all columns compared to each other? Why in some graphs are "ns" labeled only for certain comparisons? I would suggest removing the "ns" comparisons and only highlighting the significant differences.

      Although we tested significance between all groups, in both studies we now show the comparisons between control (PBS) and PBS-LIC, between LB and LB-LIC, and between PBS-LIC and LB-LIC.

      (5) Line 165: Bacteria were not plated, extract was plated. Perhaps you mean "extract corresponding to 10^7-10^8 bacteria"?

      We addressed it as follows: “Nunc MaxiSorp flat-bottom 96 well plates (eBioscience, San Diego, CA) were coated with extracts prepared from 10^7-10^8 bacteria per well and incubated at 4℃ overnight” …

      (6) Line 260: The authors claim that "Exposure to non-pathogenic L. biflexa before pathogenic L. interrogans challenge provided a significant immune cell boost with an increase in overall B and helper T cell frequencies..." However, in Figure 5A, the number of B cells in both the PBS2LIC2 and the LB2LIC2 are not significantly different. Thus, the claim is not supported by the evidence provided. It appears that infection with LIC led to similar increases in B cells regardless of pretreatment.

      We rephrased that title to reflect the finding that increased differences were measured in effector Helper T cells between PBS2LIC2 and LB2LIC2 (Figs 5D and 6B, 6C) and we re-wrote this section for clarity.

      (7) Lines 314-315: The authors claim that it protected against kidney fibrosis, however, the data only supports that only a single exposure to LB reduced levels of a marker associated with kidney fibrosis. Fibrosis was never directly measured.

      Indeed, we did not perform Masson’s trichrome staining to obtain supporting data for kidney fibrosis and only measured a fibrosis marker, ColA1. We toned down this section: “ …. it may confer protection against kidney fibrosis.”

      (8) Line 317: Authors state that pre-exposure induced higher antibodies in serum, however, this was never shown. Only an increase in IgG2a was shown. Please word this statement to make it clear total antibodies were never measured.

      We did measure total anti-Leptospira interrogans IgM and IgG antibodies. We added the following sentence to description of these results: “In both experiments, total IgM and IgG were significantly increased in PBS-LIC and LB-LIC when compared to the respective controls, but not between PBS-LIC and LB-LIC.  Regarding IgG isotypes, IgG1…”

      (9) Line 323: The authors state that the exposure "induced antibody responses that provided heterologous protection." There is no evidence that the protection is due to the antibody response in these experiments. In fact, they also showed that it induced increased T cell responses.

      We toned down this statement: “In our study, exposure to a saprophytic Leptospira induced antibody responses that may provide heterologous protection against the pathogenic strain of Leptospira.”

      (10) Line 328: The authors use the term "stark difference", however, only slight differences are seen.

      We toned down that statement as follows:  “Differences in antibody titer among the L. interrogans infected….”

      (11) Line 490: Reword this sentence for clarity and readability: "inoculated once with 10^8 L. biflexa at 6 weeks and they were challenged with 10^8 L. interrogans SEROVAR Copenhageni FioCruz (LIC) at 8 weeks."

      We revised the sentence.

      (12) Figure 1 and 2: Quantifying bacteria in culture after infection is not meaningful, as there are numerous factors that can affect the replication in culture after infection, such as how the organ perhaps was cut before placing it in culture. The comparisons in Figure 2E and F therefore are not interpretable. I would suggest presenting this data as Culture Positive or Culture Negative.

      We added these data to the figure under DFM (dark field microscopy).

      (13) Figure 3A: H&E staining often leads to different qualities of stains. But is there a better image that can be chosen for the PBS1LIC1 that provides a better comparison with the other images chosen? This is not worth repeating the experiment to get one, just make the figure look better if you have one available.

      We screened the images again, but the one used in Figure 3A for PBS1LIC1 is the best available.

      (14) Figure 3D: I agree that the PBS-LIC treatment is significant, but please include P value, as it looks very similar to the LB-LIC group. The two LIC groups are not significantly different, so the conclusion would be pre-exposure does not mitigate renal fibrosis marker ColA in the double-exposure study.

      We included the p-values in this figure. The two LIC groups are significantly different (ColA1) in the single exposure experiment. In the double exposure experiment we do not expect to be able to measure ColA1 differences because the mice are older (10 wk) when we do the LIC challenge.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews.

      Reviewer #1 (Public Reviews):

      Summary:

      The "optorepressilator", an optically controllable genetic oscillator based on the famous E. coli 3-repressor (LacI, TetR, CI) oscillator "repressilator", was developed. An individual repressilator shows a stable oscillation of the protein levels with a relatively long period that spans a few doubling times of E. coli, but when many cells oscillate, their phases tend to desynchronize. The authors introduced an additional optically controllable promoter through a conformational change of the CcaS protein and let it control how much additional CI is produced. By tightly controlling the leak from the added promoter, the authors successfully kept the original repressilator oscillation when the added promoter was not activated. In contrast, the oscillation was stopped by expressing the additional CI. Using this system, the authors showed that it is possible to synchronise the phase of the oscillation, especially when the activation happens as a short pulse at the right phase of the repressilator oscillation. The authors further show that, by changing the frequency of the short pulses, the repressilator was entrained to various ratios to the pulse period, and the authors could reconstruct the so-called "Arnold tongues", the signature of entrainment of the nonlinear oscillator to externally added periodic perturbation. The behaviour is consistent with the simplified mathematical model that simulates the protein concentration using ordinary differential equations.

      Strengths:

      Optical control of the oscillation of the protein clock is a powerful and clean tool for studying the synthetic oscillator's response to perturbation in a well-controlled and tunable manner. The article utilizes the plate reader setup for population average measurements and the mother machine setup for single-cell measurements, and the two complement each other nicely to provide the necessary information.

      Weakness:

      The current paper added the optogenetically controlled perturbation to control the phase of oscillation and entrainment, but there are a few other works that add external perturbation to a collection of cells that individually oscillate to study phase shift and/or entrainment. The current paper lacks discussion about the pros and cons of the current system compared to previously analyzed systems.

      Recommendations

      Even if the main purpose of the current paper is to develop a toolbox, it is beneficial to emphasize the pros and cons of the current system compared to the existing work. In addition to the ref [36] that authors cite but do not discuss concretely, example literature about entrainment includes:

      - Sanchez, P.G.L., Mochulska, V., Denis,  C.M., Mönke, G., Tomita, T., Tsuchida-Straeten, N., Petersen, Y., Sonnen, K., François, P. and Aulehla, A., 2022. Arnold tongue entrainment reveals dynamical principles of the embryonic segmentation clock. Elife, 11, p.e79575.

      - Heltberg M, Kellogg RA, Krishna S, Tay S, Jensen MH. Noise induces hopping between NF-κB entrainment modes. Cell systems. 2016 Dec 21;3(6):532-9.

      There is surely more literature. It is recommended that a solid discussion be added on the relation between existing works and current work.

      We thank the Reviewer for their positive comments on our manuscript. Their main recommendation is to expand literature and discuss how our method compares to previously reported entrainment of genetic oscillators. In summary, we believe that the main advantages of the optorepressilator are the simplicity of the transcriptional network combined with the flexibility of optical control. In the “Discussion” section of the revised manuscript, we now try to highlight this also in connection to the suggested literature.

      Reviewer #2 (Public Reviews):

      Summary

      In this manuscript by Cannarsa et al., the authors describe the engineering of a light-entrainable synthetic biological oscillator in bacteria. It is based on an upgraded version of one of the first synthetic circuits to be constructed, the repressilator. The authors sought to make this oscillator entrainable by an external forcing signal, analogous to the way natural biological oscillators (like the circadian clock) are synchronized. They reasoned that an optogenetic system would provide a convenient and flexible means of manipulation. To this end, the authors exploited the CcaS-CcaA light-switchable system, which allows activation and deactivation of transcription by green and red light, respectively. They used this system to make the expression of one of the repressilator's transcription factors (lacI) light-controlled, from a construct separated from the main repressilator plasmid. This way, under red light the oscillator runs freely, but exposure to green light causes overexpression of the lacI, pushing the system into a specific state. Consequently, returning to red light will restore the oscillations from the same phase in all cells, effectively synchronizing the cell population.

      After demonstrating the functionality of the basic concept, the authors combined modeling and experiments to show how periodic exposure to green light enables efficient entrainment, and how the frequency of the forcing signal affects the oscillatory behavior (detuning).

      This work provides an important demonstration of engineering tunability into a foundational genetic circuit, expands the synthetic biology toolbox, and provides a platform to address critical questions about synchronization in biological oscillators. Due to the flexibility of the experimental system, it is also expected to provide a fertile ground for future testing of theoretical predictions regarding non-linear oscillators.

      Strengths:

      The study provides a simple and elegant mechanism for the entrainment of a synthetic oscillator. The design relies on optogenetic proteins, which enable efficient experimentation compared to alternative approaches (like using chemical inducers). This way, a static culture (without microfluidics or change of growth media) can be easily exposed to flexible temporal sequences of the zeitgeber, and continuously measured through time.

      The study makes use of both plate-reader-based population-level readout and mother-machine single-cell measurements. Synchronization through entrainment is a single cell level phenomenon, but with a clear population-level manifestation. Thus, this experimental approach combination provides a strong validation to their system. At the same time, differences between the readout from the two systems have emerged, and provided a further opportunity for model refinement and testing.

      The authors correctly identified the main optimization goal, namely the effective leakiness of their construct even under red light. Then, they successfully overcame this issue using synthetic biology approaches.

      The work is supported by a simplified model of the repressilator, which provides a convenient analytical and numerical means to draw testable predictions. The model predictions are well aligned with the experimental evidence.

      Weaknesses:

      Even after optimizing the expression level of the light-sensitive gene, the system is very sensitive, i.e., a very short exposure is sufficient to elicit the strongest entrainment. This limited dynamic range might hamper some model testing and future usage.

      As a result of the previous point, the system is entrained by transiently "breaking" the oscillator: each pulse of green light represents a Hopf bifurcation into a single attractor. It means that the system cannot oscillate in constant green light. In comparison, this is generally not the case for natural zeitgebers like light and temperature for circadian rhythms. Extreme values might prevent oscillations (not necessarily due to breaking the core oscillator), but usually, free running is possible in a wide range of constant conditions. In some cases, the free-running period length will vary as a function of the constant value. While the approach presented in this manuscript is valid, a comprehensive analysis of more subtle modes of repressilator entrainment could also be of value.

      The entire work makes use of a single intensity and single duration of the green pulse to force entrainment. While the model has clear predictions for how those modalities should affect entrainment, none of the experiments attempted to validate those predictions.

      While we agree with the Reviewer that all reported experiments were performed with pulses of constant amplitude and duration, we do not see this as a necessary limitation for future studies on the optorepressilator. Using pulse-width modulation, green light intensity could be easily and continuously modulated from zero to a maximum value (as in Fig. 4), exploring a wide range of intermediate intensity levels and therefore of mean LacI production rates from the optogenetic promoter. We do not include additional experiments in the revised manuscript but we have greatly expanded the theoretical discussion on the low amplitude regime, both for a constant illumination (new Supplementary Materials Section 5) and the pulsed case (new Supplementary Fig. 8).
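
      The pulse-width modulation idea mentioned above can be illustrated with a minimal sketch. It assumes that the PWM period is much shorter than the response time of the optogenetic promoter, so that the cells effectively see the time-averaged intensity; the function names and the maximum intensity used here are hypothetical and are not taken from the manuscript.

```python
def mean_intensity(duty_cycle, max_intensity):
    """Time-averaged intensity of an LED driven by pulse-width modulation.

    duty_cycle is the fraction of each PWM period during which the LED is on.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must lie in [0, 1]")
    return duty_cycle * max_intensity


def pwm_waveform(duty_cycle, max_intensity, steps_per_period=100, periods=3):
    """Sampled on/off waveform: LED fully on, then fully off, in each period."""
    on_steps = round(duty_cycle * steps_per_period)
    one_period = [max_intensity] * on_steps + [0.0] * (steps_per_period - on_steps)
    return one_period * periods


# Hypothetical maximum green intensity in W/m^2 (illustrative value only).
I_MAX = 5.64

wave = pwm_waveform(0.25, I_MAX)
print(sum(wave) / len(wave), mean_intensity(0.25, I_MAX))
```

      Under the stated assumption, sweeping the duty cycle from 0 to 1 scans the mean intensity continuously from zero to its maximum without changing the LED hardware.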

      Recommendations for the Authors:

      (1) The introduction emphasized the utility of entrainment as a means to achieve population-wide synchrony. It is worth mentioning also that it enables synchronization of the internal oscillator with an external zeitgeber, to achieve a specific phase-locking between them. Often, this is the main utility attributed to entrainment, e.g., in circadian clocks.

      Following the Reviewer’s suggestion, we now say in the introduction:

      These oscillations maintain a constant phase relation to the external light cue that can act as a zeitgeber.

      (2) It is sometimes unclear at first glance which of the figure panels show simulation data and which show experimental data (e.g., Figure 5a,b; Figure 6a,b). More explicitly labeling the panels could help.

      We thank the reviewer for pointing this out; we now explicitly label all the panels.

      (3) Figure 3b - please add a color bar to indicate the meaning of the red-green scale, and enlarge the markers so their color is more visible. Also, could you add additional controls of (i) sfGFP expression without the ccaR, and (ii) the autofluorescent signal from wild type? Please also provide the raw data (not the time derivatives) in a supplementary figure.

      A colorbar has been added and markers enlarged.

      (i) Unfortunately we do not have a control for GFP expression without ccaR.

      (ii) autofluorescence signal from “a negative control consisting of DHL708 with plasmids pNO286-3 and pSR58-0 (optogenetic plasmids without sfGFP cassette)” has been added for comparison to Fig.3b. This modification was actually very helpful in understanding that the sensitivity threshold in our experiments is mainly determined by autofluorescence. OD600 and fluorescence raw data are now provided in Supplementary Fig. 6.

      (4) Figure 3d - the claim in the text is that the purple optorepressilator and the wildtype repressilator have identical periods and amplitude. However, it seems from the figure that there is a small difference in the period length. This deviation is not problematic in any way, but I wondered whether it might actually be explained by the model, assuming that there is still a very weak leak from the new construct. In other words, would the model predict a bifurcation diagram in which an increasing x' concentration causes a gradual decrease in amplitude and increase in period, before the loss of rhythmicity? If so, Figure 3d can serve not only as a technical optimization demonstration but also as a nice validation of the model.

      We thank the reviewer for raising this interesting point. We now report, in Supplementary Materials Section 5, a theoretical prediction of the period with respect to a constant concentration of x'. For our choice of parameters (adjusted to reproduce the main experimental quantitative features) we find a period that decreases with x'. Leakage would therefore lead to a shorter period, contrary to what is observed experimentally. To explain the longer period observed in the optorepressilator we went back to extract the average growth rates of bacteria in the purple optorepressilator and repressilator curves in Fig.3d. As we now discuss in the main text:

      “The slight difference in period can be explained by the presence of additional plasmids in the optorepressilator strain, which results in a lower growth rate (Supplementary Figures 4 and 5). As found in the digital approximation, the repressilator period is mainly controlled by the inverse growth rate (see Figure 1a and Supplementary Figure 9) meaning a lower growth rate results in a longer oscillation period. When we normalize the time with the growth rate the two oscillations overlap nicely (Supplementary Figure 4).”
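
      The normalization described in the quoted passage can be sketched as follows. If the period scales with the inverse growth rate, then the product of period and growth rate (the period expressed in units of 1/growth rate) should coincide for the two strains. The growth rates and periods below are hypothetical placeholders, not measured values.

```python
def dimensionless_period(period_h, growth_rate_per_h):
    """Oscillation period expressed in units of the inverse growth rate."""
    return period_h * growth_rate_per_h


def rescale_time(times_h, growth_rate_per_h):
    """Rescale a time axis (hours) by the growth rate of the strain."""
    return [t * growth_rate_per_h for t in times_h]


# Hypothetical values chosen only to illustrate the scaling argument.
strains = {
    "repressilator": {"growth_rate": 0.50, "period_h": 14.0},
    "optorepressilator": {"growth_rate": 0.40, "period_h": 17.5},
}

for name, s in strains.items():
    print(name, dimensionless_period(s["period_h"], s["growth_rate"]))
```

      In this illustration the slower-growing strain has the longer period in hours, yet both strains share the same period once time is multiplied by the growth rate, which is the sense in which the rescaled curves overlap.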

      (5) Supplementary Figure 10 has no reference from the main text. It is unclear what the difference from Figure 3 is. In general, many items in the supplementary materials are not referenced from the text. In addition, on many occasions, there is a reference to "supplementary information" without a specific address, which is not so useful to the reader. Wherever possible, please be more specific. Also, note that there's inconsistency in referring to the supplemental section as "supplementary materials" vs "supplemental information".

      We now explicitly reference all Supplementary Figures in the main text and use consistent reference to Supplementary Materials.

      (6) The discussion at the bottom of p.7 ("Optogenetic entrainment") is missing a reference to the duration and intensity of the zeitgeber: the example from human circadian rhythms does not indicate light intensity, and in the modeling of the PRC both modalities are absent. It is important at least to indicate the parameters used for the simulation and experiments. It would be even better to explore in the model how these modalities affect the PRC and entrainment. And it would be incredible if the authors could show this also experimentally.

      We now report the light intensity values for:

      - our experiments:

      “We first demonstrate this by monitoring the population signal from CFP (reporting TetR or 𝑦 in the model) in multiwell cultures under constant red illumination (9.82 W/m^2) interrupted by green light pulses (5.64 W/m^2) with a duration of 2 h and period 𝑇 = 18 h.”

      For mother machine experiments “Green and red light stimuli were provided by the two LEDs (Thorlabs M530L4, Thorlabs M660L4) with respective intensities 6 W/m^2 and 26 W/m^2 for the synchronization experiments, and 1.1 W/m^2 and 4.5 W/m^2 for the entrainment experiments.”

      - and simulations:

      “In Fig. 5a we report the phase shift produced by a single pulse (with duration tau = 2 h and intensity beta’ = 80 h^-1 fixed for all the simulations) as a function of the pulse arrival phase ϕ.”

      We also added an additional supplementary figure (Supplementary Fig. 7) that explores how the duration and intensity of the light pulses affect the PRC in the model. An approximate analytic result is also derived for the PRC in the digital approximation that compares very well with simulation, providing physical insight into PRC shape (Supplementary Materials Section 7).

      (7) The experimental validation of the PRC can be much more thorough. Notably, an entrainment experiment with repeated pulses does not provide the same level of validation as a proper PRC experiment. This is because many differently shaped PRCs can give rise to the same entrainment pattern, as long as their fixed-point phases are the same.

      Luckily, there might already be a decent amount of data from the mother machine experiments to fit with the PRC prediction, given the authors have pulsed a non-synchronized population that spans the entire x-axis of the PRC. It is possible that a proper PRC experiment wouldn't be too difficult with the plate reader either, given the throughput of the author's system.

      This is a very interesting suggestion but unfortunately, in our mother machine data, the first pulse arrives before the cells have completed a full cycle, so although different cells receive the first pulse at a sufficiently randomized phase, we can’t extract their individual phases at the pulse arrival time.

      Indeed it would be possible to design a plate reader experiment for the specific purpose of directly measuring the PRC. However, our current protocol involves continuous manual dilutions, which makes it rather laborious. We are currently working on an automated procedure that will allow us to systematically address this and other interesting suggestions in the future.

      An indirect experimental validation of the PRC is however still possible using available data. See added red points in Fig.5a and reply to point 10 below.

      (8) The discrepancy between the mother machine and plate reader experiments in Figure 5 is explained by a difference in growth rate variability in the two systems. It is not readily obvious how a difference in variability rather than the mean value of the period length can cause a shifted mean phase. It is only hinted in the text that growth rate has two different effects - on the period as well as the amplitude. I hypothesize that because of this period and amplitude correlation, there is a bias contribution to the sum of trajectories that have resulted in a shifted mean phase. Maybe there is another contribution from the asymmetric waveform of the signal, or from the distribution that alpha is sampled from? A direct discussion on that point will make the results much clearer. If the period-amplitude speculation above is right, please add also a panel that shows it. It will also be helpful to show the predicted PRC for the two parameter regimes.

      We thank the reviewer for highlighting this point. In the previous version of the manuscript we omitted the fact that in order to better match experimental signals we chose slightly different values for T_L/T_0 for simulations in Fig. 5d and 5e. We now report the values of all simulation parameters in the revised manuscript. This difference could also contribute to the shift in the mean phase for the two cases. We added this information in the main text.

      “The bottom panel in Fig. 5d shows the result of a numerical simulation with the same parameters as in Fig. 1b and the addition of a periodic light stimulation, with period T_L/T_0 = 1 [...] For the simulations in the lower panel of Fig. 5e, all parameters remained the same as in Fig. 5d with the exception of the period of the light pulses (T_L/T_0 = 0.97) and the standard deviation of the growth rate distribution, which was increased from 0.034 h^-1 to 0.071 h^-1 to better reproduce the experimental observations in the mother machine.”

      Additionally, we added a supplementary figure (Supplementary Fig. 9) demonstrating the correlation between period and amplitude of the oscillations, for simulations with varying growth rate.

      (9) The results from the detuning experiments are really nice, especially the decomposition in high frequency shown in Figure 6c. However, the experiments explore only the very high forcing amplitude conditions. Is there any way to test the weaker forcing regimes, as these are expected to uncover the interesting areas in between the Arnold tongues? If this is experimentally difficult, it would be interesting to include at least the model prediction.

      We thank the reviewer for stimulating us to go in this direction. We have performed simulations to explore model predictions for areas between the Arnold tongues. We find the onset of entrainment as the amplitude increases and also the existence of intermediate plateaus at fractional frequency ratios. These results are now included in Supplementary Fig. 8.
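
      The appearance of locking plateaus with increasing forcing amplitude can be illustrated with the sine circle map, a generic textbook model of a periodically forced oscillator. This is a sketch under stated assumptions and not the authors' repressilator model: here `omega` stands for the phase advance of the unforced oscillator per forcing cycle (playing the role of T_L/T_0) and `k` for the forcing amplitude, both hypothetical names.

```python
import math


def circle_map_step(theta, omega, k):
    """One forcing cycle of the sine circle map (phase measured in cycles)."""
    return theta + omega - (k / (2.0 * math.pi)) * math.sin(2.0 * math.pi * theta)


def winding_number(omega, k, n_transient=500, n_iter=2000):
    """Mean phase advance per forcing cycle, after discarding a transient."""
    theta = 0.0
    for _ in range(n_transient):
        theta = circle_map_step(theta, omega, k)
    start = theta
    for _ in range(n_iter):
        theta = circle_map_step(theta, omega, k)
    return (theta - start) / n_iter


if __name__ == "__main__":
    # Scan the detuning for three forcing amplitudes and print the staircase.
    for k in (0.0, 0.5, 1.0):
        row = [round(winding_number(i / 20.0, k), 3) for i in range(11)]
        print(f"k={k}: {row}")
```

      Without forcing (k = 0) the winding number simply equals omega. As k grows, intervals of omega appear over which the winding number stays fixed at a rational value; these intervals are the cross-sections of the Arnold tongues, and the regions between them correspond to unlocked dynamics.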

      (10) Another prediction from the Arnold's tongue would be the relative phase of entrainment in different f/v0 conditions. The text refers to it very briefly, but this is a quantitative prediction that can be demonstrated clearly in a figure - how well do they match? It can be shown, for example, by a plot with f/v0 on the x-axis, the phase difference between the pulse and peak expression on the y-axis, a curve representing the model prediction for that function, and dots (with error bars) representing the calculated values from the experimental data.

      Generally, when suitable, this kind of direct comparison is more useful to the reader than the way the authors chose to compare simulation and experiments throughout the manuscript.

      We thank the reviewer for this very interesting suggestion. We have completely rewritten the discussion on entrainment, commenting on how the same PRC (phase shift vs pulse arrival phase) can be interpreted as a plot of T_L/T_0 - 1 vs phase difference. Indeed, in the new Fig. 5a we plot, over the theoretical PRC curve, the values of the relative phase of entrainment for three values of the period of the light pulses (from the data in Fig. 6b). The agreement is remarkably good, providing a further experimental validation of the predicted PRC.
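
      The link between the PRC and the phase of entrainment can be made concrete with a short sketch. It assumes a hypothetical sinusoidal PRC (not the one computed in the manuscript), phases measured in cycles, and a sign convention in which a positive shift is a phase advance. With this convention, 1:1 locking requires the shift at the locked phase to equal 1 - T_L/T_0; with the opposite convention (delays positive) the condition reads T_L/T_0 - 1.

```python
import math


def prc(phi, amplitude=0.2):
    """Hypothetical PRC: phase shift (in cycles) for a pulse arriving at phase phi."""
    return -amplitude * math.sin(2.0 * math.pi * phi)


def locked_phase(period_ratio, phi0=0.3, n_pulses=200):
    """Iterate the pulse-to-pulse phase map and return the arrival phase reached.

    period_ratio is T_L/T_0: the free-running phase advance between two pulses.
    """
    phi = phi0
    for _ in range(n_pulses):
        # Shift caused by the pulse, then free running until the next pulse.
        phi = (phi + prc(phi) + period_ratio) % 1.0
    return phi


if __name__ == "__main__":
    for ratio in (0.9, 1.1):
        phi_star = locked_phase(ratio)
        print(ratio, phi_star, prc(phi_star), 1.0 - ratio)
```

      Each period ratio inside the locking range selects one stable arrival phase, so plotting the locked phase against the detuning retraces the stable branch of the PRC. This is why entrainment data at a few pulse periods can serve as an indirect check of the PRC shape.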

      (11) The raw data can be valuable for the community for reanalysis and further hypothesis testing. Hence, it will be very useful to make all of the data (e.g., the fluorescence signal quantification tables from all the experiments) publicly available.

      We prepared files with all raw data, to be made available to the community.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Reviews):

      Summary: 

      The authors use a combination of biochemistry and cryo-EM studies to explore a complex between the cap-binding complex and an RNA binding protein, ALYREF, that coordinates mRNA processing and export.

      Strengths: 

      The biochemistry and structural biology are supported by mutagenesis which tests the model in vitro. The structure provides new insight into how key events in RNA processing and export are likely to be coordinated.

      Weaknesses: 

      The authors provide biochemical studies to confirm the interactions that they identify; however, they do not perform any studies to test these models in cells or explore the consequences of mRNA export from the nucleus. In fact, several of the amino acids that they identified in ALYREF that are critical for the interaction, as determined by their own biochemical studies, are conserved in budding yeast Yra1 (residues E124/E128 are E/Q in budding yeast and residues Y135/V138/P139 are F/S/P), where the impact on poly(A) RNA export from the nucleus could be readily evaluated. The authors could at least mention this point as part of the implications and the need for future studies. No one seems to have yet targeted any of these conserved residues, so this would be a logical extension of the current work.

      We thank the reviewer for the feedback on our work. ALYREF coordinates pre-mRNA processing and export through interactions with a plethora of mRNA biogenesis factors including the DDX39B subunit of the TREX complex, CBC, EJC, and 3’ processing factors. ALYREF mediates the recruitment of the TREX complex on nascent transcripts which depends on its interactions with both CBC and EJC. Our work and studies by others indicate that ALYREF uses overlapping interfaces including both the N-terminal WxHD motif and the RRM domain to bind CBC and EJC. Thus, ALYREF mutants deficient in CBC interaction will also disrupt the ALYREF-EJC interaction and are not ideal for functional studies. In addition, the CBC plays important roles in multiple steps of mRNA metabolism through interactions with a plethora of factors, which often interact competitively with CBC. Identification of separation-of-function mutations on CBC or ALYREF that specifically disrupt their interaction but not other cellular complexes containing CBC or ALYREF would be an important future area to test the model in cells. 

      We appreciate the reviewer’s insightful comments regarding yeast Yra1. Thus far, the physical and functional connection between Yra1 and CBC in yeast has not been demonstrated. There are major differences between yeast Yra1 and human ALYREF. Given the lack of an EJC in S. cerevisiae, it is unclear whether Yra1 acts in a similar manner as human ALYREF. In addition, Yra1 does not contain a WxHD motif in its N-terminal unstructured region, which is involved in CBC and EJC interactions in ALYREF. Characterization of the Yra1-CBC interaction will be an interesting future direction. We now include a discussion about yeast Yra1 in the newly added “Conclusion and perspectives” section. 

      Specific suggestions:

      The authors could put their work in context by speculating how some of the amino acids that they identify as being critical for the interactions they identify could contribute to cancer. For example, they mention mutations of interacting residues in NCBP2 are associated with human cancers, pointing out that NCBP2 R105C amino acid substitution has been reported in colorectal cancer and the NCBP2 I110M mutation has been found in head and neck cancer. Do the authors speculate that these changes would decrease the interaction between NCBP2 and ALYREF and, if so, how would this contribute to cancer? They also mention a K330N mutation in NCBP1 found in human uterine corpus endometrial carcinoma, noting that Y135 on the α2 helix of mALYREF2 makes a hydrogen bond with K330 of NCBP1. How do they speculate loss of this interaction would contribute to cancer?

      In the revised manuscript, we include a discussion about these CBC mutants found in human cancers in the “Conclusion and perspectives” section. We think some of these CBC mutants, such as NCBP1 K330N, could reduce interaction with ALYREF. Compromised CBC-ALYREF interaction will affect the recruitment of the TREX complex on nascent transcripts and cause dysregulation of mRNA export. In addition, it could change the partitioning of CBC and ALYREF among different cellular complexes and perturb various steps in mRNA biogenesis that are regulated by CBC and ALYREF. Thus far, it is unclear whether and how loss of the CBC-ALYREF interaction directly contributes to cancer. Our work and that of others provide molecular insights to test in future studies. 

      Reviewer #2 (Public Reviews):

      Summary: 

      In this manuscript, Bradley and his colleagues presented the cryo-EM structure of the nuclear cap-binding complex (CBC) in complex with an mRNA export factor, ALYREF, providing a structural basis for understanding how the CBC regulates gene expression.

      Strengths: 

      The authors successfully modeled the N-terminal region and the RRM domain of ALYREF (residues 1-183) within the CBC-ALYREF structure, which revealed that both the NCBP1 and NCBP2 subunits of the CBC interact with the RRM domain of ALYREF. Further mutagenesis and pull-down studies provided additional evidence for the observed CBC-ALYREF interface. Additionally, the authors engaged in a comprehensive discussion regarding other cellular complexes containing CBC and/or ALYREF components. They proposed potential models that elucidated coordinated events during mRNA maturation. This study provided good evidence to show how CBC effectively recruits mRNA export factor machinery, enhancing our understanding of how CBC regulates gene expression during mRNA transcription, splicing, and export. 

      Weaknesses: 

      No in vivo or in vitro functional data are provided to validate and support the structural observations and the proposed models in this study. Cryo-EM data processing and structural representation need to be strengthened. 

      We appreciate the reviewer’s comments and suggestions. The fact that ALYREF uses highly overlapped binding interfaces for CBC and EJC interactions prevents us from a clear functional dissection of the ALYREF-CBC interaction using in vitro assays or in cells at the current stage. Please also see our response to Reviewer 1. 

      In this revised manuscript, we have reprocessed the cryo-EM data using a different strategy which yields significantly improved maps. We have made improvements to the presentation of the structural work based on the reviewer’s specific comments. 

      Reviewer #3 (Public Reviews):

      Summary: 

      The authors carried out structural and biochemical studies to investigate the multiple functions of CBC and ALYREF in RNA metabolism.

      Strengths: 

      For the structural study part, the authors successfully revealed how NCBP1 and NCBP2 subunits interact with mALYREF (residues 1-155). Their binding interface was then confirmed by biochemical assays (mutagenesis and pull-down assays) presented in this study. 

      Weaknesses: 

      The authors did not provide functional data to support their proposed models. The authors should include more details regarding the workflow of their cryo-EM data processing in the figure. 

      We thank the reviewer for the comments. We completely agree that testing the proposed models in cells would be ideal. However, as we also note in our responses to the other reviewers, functional studies are premature at the current stage because both ALYREF and CBC are components of many cellular complexes that regulate mRNA metabolism. Separation-of-function mutations in CBC or ALYREF first need to be identified in future studies for further investigation. Please also see our response to Reviewer 1. 

      As suggested by the reviewer, we have included more details of the cryo-EM workflow in this revised manuscript. We have also included various validation measures including 3DFSC analyses, map vs model FSC curves, and representative density maps at various protein-protein binding interfaces. 

      Recommendations for the Authors:

      Reviewer #1 (Recommendations for the Authors):

      Major points:

      The authors should take advantage of Figure 1, which shows the domain structures of NCBP1, NCBP2, and ALYREF to indicate for the reader specifically which protein domains are included in the biochemical and structural analyses. In the current version of the manuscript, there is plenty of space to indicate below each domain structure precisely what regions are included.

      In this revised manuscript, we have revised Figure 1A to indicate the protein constructs used in this work. 

      Although it is fine to combine the Results and Discussion, the authors should really offer a concluding paragraph to highlight the novel results from this study and put the results in context.

      We thank the reviewer for the recommendation. We now include a “Conclusion and perspectives” section in this revised manuscript.  

      Minor comments:

      Page 5, last sentence (and others) starts a sentence with the word "Since" when likely "As" which does not imply a time element to the phrase, is the correct word.

      "Since the ALYREF/mALYREF2 interaction with the CBC is conserved and mALYREF2 exhibits better solubility, we focused on mALYREF2 in the cryo-EM investigations."

      Would be more correct as: "As the ALYREF/mALYREF2 interaction with the CBC is conserved and mALYREF2 exhibits better solubility, we focused on mALYREF2 in the cryo-EM investigations."

      We thank the reviewer for the comments. We have made the corrections. 

      The word 'data' is plural so the sentence at the bottom of p.9 that includes the phrase "...in vivo data shows.." should read "..in vivo data show.." 

      Corrected in the revised manuscript.

      Reviewer #2 (Recommendations for the Authors):

      Major points:

      (1) The authors claimed improved solubility of mouse ALYREF2 (mALYREF2, residues 1-155) compared to the previously employed ALYREF construct. However, human ALYREF has already been purified successfully for pull-down assays, indicating that soluble human ALYREF can be obtained. Why not use human ALYREF directly? Please clarify. 

      Pull-down studies were performed with GST-tagged ALYREF. For cryo-EM studies, untagged ALYREF is preferred to avoid potential issues that may arise from the expression tag. However, untagged ALYREF is less soluble than GST-tagged ALYREF and is not amenable to structural studies. We have revised the text to clarify this point. 

      (2) The authors confirmed CBC-ALYREF interfaces through mutagenesis and pull-down assays in vitro. However, it would be more informative and interesting to include functional assays in vitro or/and in vivo with mutagenesis. 

      We completely concur with the reviewer that testing the proposed models in vitro and in vivo would be important. However, as we pointed out in our response to public reviews, the highly overlapped binding interfaces on ALYREF for CBC and EJC interactions pose a great challenge for functional studies. Furthermore, both ALYREF and CBC are multifunctional factors and interact with a number of partners. Ideally, separation-of-function mutants that specifically disrupt the CBC-ALYREF interaction but not others need to be identified in future studies in order to perform functional studies. 

      (3) About cryo-EM data processing and structural representation:

      (1) In the description of the cryo-EM data processing, the authors claimed they did heterogeneous refinement, homogeneous refinement, and then local refinement. This reviewer is puzzled by this process because the normal procedure should be non-uniform refinement following homogeneous refinement. If the authors did not perform non-uniform refinement, they should do it because it would significantly improve the quality and resolution of cryo-EM maps. In addition, the right local refinement should include mask files and only show the density/map of the local region. 

      We thank the reviewer for the suggestions. In response to the reviewer’s comment on the preferred orientation issue (point 5 below), we reprocessed the cryo-EM data and obtained significantly improved cryo-EM maps. In this revised manuscript, the CBC-mALYREF2 map was refined using homogeneous refinement; the CBC map was refined using homogeneous refinement followed by non-uniform refinement. Refinement masks are included in Figure 2-figure supplement 1. 

      (2) Further local refinements with signal subtraction should be performed to improve the density and resolution of mALYREF2. 

      We tested local refinement with or without signal subtraction using masks covering mALYREF2 and various regions of CBC. Unfortunately, this approach did not improve the density of mALYREF2. We suspect that the small size of mALYREF2 (77 residues for the RRM domain) and the intrinsic flexibility of CBC are the limiting factors in these attempts. 

      (3) Figures with cryoEM map showing the side chains of the residues on the CBC-mALYREF2 interface should be included to strengthen the claims. Authors could add the map to Figure 3b/c or present it as a supplementary figure.

      We include new supplementary figures (Figure 3-figure supplement 1) to show the electron densities corresponding to the views in Figure 3B and 3C. Residues labeled in Figure 3B and 3C are shown in sticks in these supplementary figures.

      (4) For cryo-EM data processing, the authors have omitted lots of important details. Could the authors elaborate on the data processing with more details in the corresponding Figure and Methods Sections? Only one ab-initio model from the picked good particles was displayed in the figure. Are there any other different conformations of 3D classes for the dataset? In addition, too few classes have been considered in 3D classification; more classes may give a class with better density and resolution.

      We thank the reviewer for the comments. We have reprocessed the cryo-EM data. A major change is the use of Topaz for particle picking. We now include more details of the data processing in Figure 2-figure supplement 1 and the Methods section. The cryo-EM sample is relatively uniform. Ab-initio reconstruction and heterogeneous refinement yielded only one good class; the other classes are “junk” classes (omitted in the workflow figure). No major conformational changes were observed throughout the multiple rounds of heterogeneous refinement for both CBC and CBC-mALYREF2. In this revised manuscript, we have been able to obtain significantly improved maps through the new data processing strategy employing Topaz, as illustrated in Figure 2-figure supplement 1 to 5.

      (5) Angular distribution plots should be included to show if there is a preferred orientation issue. Based on the presented maps in validation reports, there may exist a preferred orientation issue for the reported two cryo-EM maps. Detailed 3D-Histogram and directional FSC plots for all the cryo-EM maps using 3DFSC web server should be presented to show the overall qualities (https://www.nature.com/articles/nmeth.4347 and https://3dfsc.salk.edu/).

      We thank the reviewer for the recommendations. In response to the reviewer’s comment on the preferred orientation issue, we reprocessed the cryo-EM data. Topaz was used for particle picking instead of template picking. 3DFSC analyses indicate that the new CBC-mALYREF2 map has a sphericity of 0.946, a significant improvement over the previous map, which has a sphericity of 0.815. Consistently, the maps presented in this revised manuscript show significantly improved densities. We now include angular distribution and 3DFSC analyses of the EM maps (Figure 2-figure supplement 2 and 4). 

      (6) Figures of model-to-map FSCs need to be presented to demonstrate the quality of the models, and the corresponding values (model resolution when FSC=0.5) should also be included in Table 1. The accuracy of the model is important for structural explanations and description.

      The model-to-map FSCs are now included in Figure 2-figure supplement 3A and 5A. The model resolutions of CBC-mALYREF2 and CBC are estimated to be 3.5 Å and 3.6 Å at an FSC of 0.5. These numbers are now included in Table 1. 
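      As a rough illustration of how a resolution is read off an FSC curve at the 0.5 criterion, the minimal sketch below linearly interpolates the first crossing of the threshold. The function name and the toy curve are hypothetical and are not taken from the manuscript or from any cryo-EM package; real software may apply masking and smoothing that this sketch ignores.

```python
def resolution_at_threshold(spatial_freqs, fsc_values, threshold=0.5):
    """Return the resolution (in Å) at which the FSC curve first drops
    below `threshold`, using linear interpolation between shells.

    spatial_freqs: shell frequencies in 1/Å, in increasing order.
    fsc_values:    FSC value for each shell.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc_values)):
        f0, f1 = spatial_freqs[i - 1], spatial_freqs[i]
        c0, c1 = fsc_values[i - 1], fsc_values[i]
        if c0 >= threshold > c1:
            # Interpolate the frequency where FSC equals the threshold.
            freq = f0 + (c0 - threshold) * (f1 - f0) / (c0 - c1)
            return 1.0 / freq
    return None


# Hypothetical toy curve (not data from the study).
toy_freqs = [0.1, 0.2, 0.3, 0.4]
toy_fsc = [1.0, 0.9, 0.6, 0.2]
print(resolution_at_threshold(toy_freqs, toy_fsc))
```

      The same function can be called with threshold=0.143 for half-map FSC curves, where that criterion is conventionally used instead of 0.5.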

      (7) In addition, figures of local density maps with different regions of the models, showing side chains, are necessary and important to justify the claimed resolutions. 

      We now include density maps overlaid with residue side chains at various regions. For the CBC-mALYREF2 map, density maps are shown at the mALYREF2-NCBP1 interfaces (Figure 3-figure supplement 1A and 1B), mALYREF2-NCBP2 interface (Figure 3-figure supplement 1C), NCBP1-NCBP2 interface (Figure 2-figure supplement 5B), and the region near m7G (Figure 2-figure supplement 5C). For the CBC map, density maps are shown at the NCBP1-NCBP2 interface (Figure 2-figure supplement 3B) and the region near m7G (Figure 2-figure supplement 3C). 

      Minor points:

      (1) A figure superimposing the models from the CBC-mALYREF2 map and mALYREF2 alone map is necessary to show that there are no other binding-induced conformational changes in CBC except those claimed by the authors. In addition, a figure showing the density of m7GpppG should be included as well.  

      An overlay of the CBC and CBC-mALYREF2 models is now presented in Figure 2-figure supplement 3D. Comparing CBC and CBC-mALYREF2, NCBP1 and NCBP2 have RMSDs of 0.32 Å and 0.30 Å, respectively. The density maps near the m7G cap analog are shown in Figure 2-figure supplement 3C for CBC and Figure 2-figure supplement 5C for CBC-mALYREF2. 
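      For readers unfamiliar with the RMSD values quoted above, the following minimal sketch shows the calculation for two sets of matched atomic coordinates that are assumed to be already superimposed. The function name and the toy coordinates are hypothetical; the sketch does not perform the structural alignment that dedicated software carries out before reporting an RMSD, and it does not reproduce the authors' procedure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. Å)
    between two equally long lists of (x, y, z) tuples.

    Assumes atoms are matched by index and the structures are
    already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates: every atom is shifted by 1 Å along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model_a, model_b))
```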

      (2) Authors obtained the two maps from one dataset, so "we first determined" and "we next determined" (page 6) should be replaced with something like "One class of 3D cryo-EM map revealed' and "Another class of 3D cryo-EM map defined". 

      We have revised the text as suggested by the reviewer.  

      (3) In 'Abstract', 'a mRNA export factor' should be 'an mRNA export factor'. 

      Corrected in the revised manuscript.

      (4) In 'Abstract', the final sentence 'Comparison of CBC- ALYREF to other CBC and ALYREF containing cellular complexes provides insights into the coordinated events during mRNA transcription, splicing, and export' doesn't read smoothly, I would suggest revising it to 'Comparing CBC-ALYREF with other cellular complexes containing CBC and/or ALYREF components provides insight into the coordinated events during mRNA transcription, splicing, and export.' 

      We thank the reviewer for the recommendation and have revised accordingly. 

      (5) In paragraph 'CBC-ALYREF and viral hijacking of host mRNA export pathway', line 6, the sentences preceding and following the term 'However' indicate a progressive or parallel relationship, rather than a transitional one. To enhance the coherence, I would suggest replacing 'However' with 'Furthermore' or 'In addition'. 

      Corrected in the revised manuscript.

      (6) In both Figure 5 and Figure 6, the depicted models are proposed and constructed exclusively through the comparison of the CBC-partial ALYREF with other cellular complexes containing components of CBC and/or ALYREF, which need to be confirmed by more studies. To prevent potential confusion and misunderstandings, it is recommended to replace the term 'model' with 'proposed model'. 

      Corrected in the revised manuscript.

      Reviewer #3 (Recommendations for the Authors):

      Major points:

      (1) In the Results and Discussion section, the authors mentioned "Recombinant human ALYREF protein was shown to interact with the CBC in RNase-treated nuclear extracts." However, they used mouse ALYREF for cryo-EM investigations. Can the authors include an explanation for this choice during the revision?  

      In our work, we used a mixture of glutamic acid and arginine to increase the solubility of GST-ALYREF. For cryo-EM studies, we use untagged ALYREF to avoid potential issues that may arise from the expression tag. However, untagged ALYREF is less soluble than GST-tagged ALYREF and is not suitable for structural studies in standard buffers. We have made further clarification on this point in this revised manuscript. 

      (2) In the paragraph on "CBC-ALYREF interfaces", the authors stated "For example, E97 forms salt bridges with K330 and K381 of NCBP1. Y135 on the α2 helix of mALYREF2 makes a hydrogen bond with K330 of NCBP1. The importance of this interface between ALYREF and NCBP1 is highlighted by a K330N mutation found in human uterine corpus endometrial carcinoma." I fail to see a strong connection between their structural observations and previous findings regarding the role of a K330N mutation found in human uterine corpus endometrial carcinoma. The authors should add more words to thread these two parts.  

      In response to the reviewer’s comment, we now move the discussion of these CBC mutants to the newly added “Conclusion and perspectives” section. 

      (3) The authors should include side chains of the residues in their figure of Local resolution estimation and FSC curves, especially when they are presenting the binding interface between two components. 

      We have now included density maps that are overlaid with structural models showing side chains of critical residues. These maps include the NCBP1-mALYREF2 interfaces (Figure 3-figure supplement 1A and 1B), NCBP2-mALYREF2 interface (Figure 3-figure supplement 1C), NCBP1-NCBP2 interface (Figure 2-figure supplement 3B and 5B), and the m7G cap region (Figure 2-figure supplement 3C and 5C). 

      Minor points: 

      (1) Some grammatical mistakes need to be corrected. For example, it is "an mRNA" instead of "a mRNA".  

      Corrected in the revised manuscript.

      (2) The authors can provide more information for the audience to know better about ALYREF when it first appears in the 5th line in the Abstract section. For example, "It promotes mRNA export through direct interaction with ALYREF, a key mRNA export factor, ...". 

      We have revised the sentence based on the reviewer’s comment.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Some of the data is problematic and does not always support the authors' conclusions:

      (1) Fig. 1K and H are identical.

      Thank you for pointing out this problem in the manuscript. We apologize for this unintentional mistake and have replaced Fig. 1K.

      (2) The graph in Figure 2B contradicts the text. It is not obvious how the image was quantified to produce the histological score graph.

      We thank the reviewer for pointing out this problem in the manuscript. As the reviewer suggested, we have replaced Figure 2B.

      (3) In Figures 2C and D, there is no clear pattern of changes in pro-inflammatory or anti-inflammatory cytokines, despite the authors' claims in the text.

      We appreciate the comment. We think the reason is that the level of cytokines in the tissue is low, so the pattern of changes is not obvious.

      (4) It is unclear why the anti-dsDNA antibody does not stain the nucleus in Figure 4B. The staining with anti-dsDNA and DAPI does not match well. Figure 5H shows there is still lots of cytosolic DNA in OGT-/- HCF-1-C, measured by DAPI. These data do not support the authors' conclusion that HCFC600 eliminates cytosolic DNA accumulation (line 229). There is no support for the authors' claim that HCF-1 restrains the cGAS-STING pathway (line 330).

      We thank the reviewer for these insightful comments. The most critical step in staining cytosolic DNA is a mild permeabilization that allows the antibody to cross the cell membrane but not the nuclear membrane; that is why the anti-dsDNA antibody does not stain the nucleus. In Figure 5H, we think that we used a highly concentrated DAPI solution for the staining, so the nuclear DNA was strongly stained and looks like cytosolic DNA. 

      (5) In Figure 5B, there is no increase in HCF-1 cleavage after OGT over-expression.

      We appreciate the reviewer’s comment. We think the reason is that we used a cell line stably overexpressing OGT-GFP and may have missed the time point at which the increase in HCF-1 cleavage occurred, so no large increase is seen. However, there is a significant increase in Figure 5C.

      (6) In Figure 7, the TNF-a staining does not inspire confidence.

      We thank the reviewer for his/her comment. In both Figure 7K (MC38 tumor model) and Figure 7N (LLC tumor model), we observed a significant increase in TNF-α+ CD8+ T cells in the group treated with the combination of OSMI-1 and anti-PD-L1 compared to the control group, as evidenced by the clear clustering.

      The writing needs significant improvement:

      (1) There are multiple English grammar mistakes throughout the paper. It is recommended that the authors run the manuscript through an editing service.

      We thank the reviewer for his/her suggestion. We apologize for the poor language of our manuscript. We worked on the manuscript for a long time and the repeated addition and removal of sentences and sections obviously led to poor readability. We have now worked on both language and readability and have also involved native English speakers for language corrections. We really hope that the flow and language level have been substantially improved.

      (2) Some passages are misleading -- lines 161-162, line 217, lines 241-242, 263-264, 299-300. They need to be changed substantially.

      We apologize for these mistakes and have changed them.

      (3) Figure legends should be rewritten. Currently, they are too abbreviated to be understood.

      We apologize for that and have rewritten them.

      (4) Discussion should also be thoroughly reworked. Currently, it is merely restating the authors' findings. The authors should put their findings in the broader context of the field.

      We apologize for that. For a better understanding of our study, we have reworked the discussion.

      Reviewer #2 (Recommendations For The Authors):

      (1) Previous studies (DOI: 10.1093/nar/gkw663, 10.1016/j.jgg.2015.07.002, 10.1016/j.dnarep.2022.103394) have suggested that OGT deficiency triggers DNA damage, connecting it to DNA repair and maintenance through various mechanisms. This should be acknowledged in the manuscript. Conversely, the role of HCF1 and its cleaved products in maintaining genomic integrity hasn't been previously shown. The authors investigate HCF1's role solely in the context of OGT inhibition. It is unclear whether this is also true under other stimuli that trigger DNA damage, whether fragments of HCF1 specifically reduce DNA damage, or if HCF1 is involved in the basal machinery that would be defective only in the absence of OGT.

      We have acknowledged the studies mentioned above in the manuscript. In this paper we focused on the OGT function that is related to HCF1. The role of HCF1 and its cleaved products in maintaining genomic integrity is an interesting topic, and we may focus on it in our next project.

      (2) In villin-CRE-deficient mice, the authors observe generic inflammation in the intestine unrelated to tumor development. It's unclear if this also occurs in the presence of OGT inhibitors in mice, whether these inhibitors induce a systemic inflammatory (Type I interferon) response, or if certain tissues like the intestine or proliferating tumor cells are more susceptible to such a response.

      We thank the reviewer for the comment. Investigating whether OGT inhibitors induce an inflammatory response, either systemically or tissue-specifically, is a very interesting project. However, in our current paper, we use a genetic method to identify the role of OGT deficiency in intestinal inflammation-induced tumor development. This approach provides convincing evidence for our hypothesis. We may test the effect of OGT inhibitors on inflammation and tumor development in our next project.

      (3) Another critical observation is the magnitude of the interferon response triggered by DNA damage in the OGT-deficient models. While it's known that DNA damage can activate cGAS-STING, the response's extent in the absence of OGT prompts the question of whether additional OGT-specific features could explain this phenomenon. For example, Lamin A, essential for nuclear envelope integrity and shown to be O-glycosylated (DOI: 10.3390/cells7050044), and other components of the nuclear envelope or its repair might be affected by OGT. The impact of OGT inhibition on nuclear envelope integrity compared to other DNA-damaging agents could be explored.

      We appreciate the comment. In this project, we identified an OGT-binding protein, HCF1, through an LC–MS/MS assay; it was the top candidate in the binding profiles, so we focused on it. Lamin A and other components of the nuclear envelope are still good targets to check, and we may explore them in our next project.

      (4) The authors also demonstrate a correlation between OGT expression in tumors compared to healthy tissues. However, the reason is unclear, raising questions about whether this is a consequence of proliferation or metabolic deregulation in the cancer. The authors should address this aspect.

      We appreciate the reviewer’s insightful point. This is a very good question and an interesting direction for research. However, in this paper we focused on how OGT influences its downstream molecules to promote tumor development; we did not examine why OGT is increased in tumors, which is beyond the scope of the current work. We would love to investigate it in the future.

      Minor points

      Please add the legend to Figures S2, S3 and S5.

      We thank the reviewer for the comment and have added the legends to Figures S2, S3 and S5.

      The sentence line 137 should be clarified as OGT deficiency seems more related to increased inflammation in this model.

      We thank the reviewer for the comment and have corrected the sentence at line 137.

      Line 732 has a ( typo before the number 34.

      We thank the reviewer for the comment and have corrected the typo at line 732.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this important study, the authors manually assessed randomly selected images published in eLife between 2012 and 2020 to determine whether they were accessible for readers with deuteranopia, the most common form of color vision deficiency. They then developed an automated tool designed to classify figures and images as either "friendly" or "unfriendly" for people with deuteranopia. While such a tool could be used by publishers, editors or researchers to monitor accessibility in the research literature, the evidence supporting the tools' utility was incomplete. The tool would benefit from training on an expanded dataset that includes different image and figure types from many journals, and using more rigorous approaches when training the tool and assessing performance. The authors also provide code that readers can download and run to test their own images. This may be of most use for testing the tool, as there are already several free, user-friendly recoloring programs that allow users to see how images would look to a person with different forms of color vision deficiency. Automated classifications are of most use for assessing many images, when the user does not have the time or resources to assess each image individually.

      Thank you for this assessment. We have responded to the comments and suggestions in detail below. One minor correction to the above statement: the randomly selected images published in eLife were from articles published between 2012 and 2022 (not 2020).

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors of this study developed a software application, which aims to identify images as either "friendly" or "unfriendly" for readers with deuteranopia, the most common color-vision deficiency. Using previously published algorithms that recolor images to approximate how they would appear to a deuteranope (someone with deuteranopia), authors first manually assessed a set of images from biology-oriented research articles published in eLife between 2012 and 2022. The researchers identified 636 out of 4964 images as difficult to interpret ("unfriendly") for deuteranopes. They claim that there was a decrease in "unfriendly" images over time and that articles from cell-oriented research fields were most likely to contain "unfriendly" images. The researchers used the manually classified images to develop, train, and validate an automated screening tool. They also created a user-friendly web application of the tool, where users can upload images and be informed about the status of each image as "friendly" or "unfriendly" for deuteranopes.

      Strengths:

      The authors have identified an important accessibility issue in the scientific literature: the use of color combinations that make figures difficult to interpret for people with color-vision deficiency. The metrics proposed and evaluated in the study are a valuable theoretical contribution. The automated screening tool they provide is well-documented, open source, and relatively easy to install and use. It has the potential to provide a useful service to the scientists who want to make their figures more accessible. The data are open and freely accessible, well documented, and a valuable resource for further research. The manuscript is well written, logically structured, and easy to follow.

      We thank the reviewer for these comments.

      Weaknesses:

      (1) The authors themselves acknowledge the limitations that arise from the way they defined what constitutes an "unfriendly" image. There is a missed chance here to have engaged deuteranopes as stakeholders earlier in the experimental design. This would have allowed [them] to determine to what extent spatial separation and labelling of problematic color combinations responds to their needs and whether setting the bar at a simulated severity of 80% is inclusive enough. A slightly lowered barrier is still a barrier to accessibility.

      We agree with this point in principle. However, different people experience deuteranopia in different ways, so it would require a large effort to characterize these differences and provide empirical evidence about many individuals' interpretations of problematic images in the "real world." In this study, we aimed to establish a starting point that would emphasize the need for greater accessibility, and we have provided tools to begin accomplishing that. We erred on the side of simulating relatively high severity (but not complete deuteranopia). Thus, our findings and tools should be relevant to some (but not all) people with deuteranopia. Furthermore, as noted in the paper, an advantage of our approach is that "by using simulations, the reviewers were capable of seeing two versions of each image: the original and a simulated version." We believe this step is important in assessing the extent to which deuteranopia could confound image interpretations. Conceivably, this could be done with deuteranopes after recoloration, but it is difficult to know whether deuteranopes would see the recolored images in the same way that non-deuteranopes see the original images. It is also true that images simulating deuteranopia may not perfectly reflect how deuteranopes see those images. It is a tradeoff either way. We have added comments along these lines to the paper.

      (2) The use of images from a single journal strongly limits the generalizability of the empirical findings as well as of the automated screening tool itself. Machine-learning algorithms are highly configurable but also notorious for their lack of transparency and for being easily biased by the training data set. A quick and unsystematic test of the web application shows that the classifier works well for electron microscopy images but fails at recognizing red-green scatter plots and even the classical diagnostic images for color-vision deficiency (Ishihara test images) as "unfriendly". A future iteration of the tool should be trained on a wider variety of images from different journals.

      Thank you for these comments. We have reviewed an additional 2,000 images, which were randomly selected from PubMed Central. We used our original model to make predictions for those images. The corresponding results are now included in the paper.

      We agree that many of the images identified as being "unfriendly" are microscope images, which often use red and green dyes. However, many other image types were identified as unfriendly, including heat maps, line charts, maps, three-dimensional structural representations of proteins, photographs, network diagrams, etc. We have uploaded these figures to our Open Science Framework repository so it's easier for readers to review these examples. We have added a comment along these lines to the paper.

      The reviewer mentioned uploading red/green scatter plots and Ishihara test images to our Web application and that it reported they were friendly. Firstly, it depends on the scatter plot. Even though some such plots include green and red, the image's scientific meaning may be clear. Secondly, although the Ishihara images were created as informal tests for humans, these images (and ones similar to them) are not in eLife journal articles (to our knowledge) and thus are not included in our training set. Thus, it is unsurprising that our machine-learning models would not classify such images correctly as unfriendly.

      (3) Focusing the statistical analyses on individual images rather than articles (e.g. in figures 1 and 2) leads to pseudoreplication. Multiple images from the same article should not be treated as statistically independent measures, because they are produced by the same authors. A simple alternative is to instead use articles as the unit of analysis and score an article as "unfriendly" when it contains at least one "unfriendly" image. In addition, collapsing the counts of "unfriendly" images to proportions loses important information about the sample size. For example, the current analysis presented in Fig. 1 gives undue weight to the three images from 2012, two of which came from the same article. If we perform a logistic regression on articles coded as "friendly" and "unfriendly" (rather than the reported linear regression on the proportion of "unfriendly" images), there is still evidence for a decrease in the frequency of "unfriendly" eLife articles over time.

      Thank you for taking the time to provide these careful insights. We have adjusted these statistical analyses to focus on articles rather than individual images. For Figure 1, we treat an article as "Definitely problematic" if any image in the article was categorized as "Definitely problematic." Additionally, we no longer collapse the counts to proportions, and we use logistic regression to summarize the trend over time. The overall conclusions remain the same.
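      As an illustration only (not the code used for the paper), the article-level analysis described above can be sketched as follows. The sketch marks an article as problematic when any of its images is labeled "Definitely problematic" and then fits a one-predictor logistic regression of that flag on publication year. The function names, the tuple layout, and the Newton-Raphson fitting routine are assumptions made for this example; a statistics package would normally be used instead.

```python
import math


def aggregate_by_article(images):
    """Collapse image-level labels to one record per article.

    `images` is an iterable of (article_id, year, label) tuples
    (a hypothetical layout). An article is flagged as problematic
    if at least one of its images is "Definitely problematic".
    """
    articles = {}
    for article_id, year, label in images:
        first_year, flagged = articles.get(article_id, (year, False))
        articles[article_id] = (first_year, flagged or label == "Definitely problematic")
    return articles


def fit_logistic(xs, ys, iterations=25):
    """Fit y ~ intercept + slope * x by Newton-Raphson (illustrative only)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of X^T W X
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        # Newton step: beta += (X^T W X)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1
```

      Using years expressed as an offset from the first year (for example, 0 for 2012) keeps the fit numerically stable. A negative slope corresponds to a declining share of problematic articles over time.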

      Another issue concerns the large number of articles (>40%) that are classified as belonging to two subdisciplines, which further compounds the image pseudoreplication. Two alternatives are to either group articles with two subdisciplines into a "multidisciplinary" group or recode them to include both disciplines in the category name.

      Thank you for this insight. We have modified Figure 2 so that it puts all articles that have been assigned two subdisciplines into a "Multidisciplinary" category. The overall conclusions remain the same.

      (4) The low frequency of "unfriendly" images in the data (under 15%) calls for a different performance measure than the AUROC used by the authors. In such imbalanced classification cases the recommended performance measure is precision-recall area under the curve (PR AUC: https://doi.org/10.1371%2Fjournal.pone.0118432) that gives more weight to the classification of the rare class ("unfriendly" images).

      We now calculate the area under the precision-recall curve and provide these numbers (and figures) alongside the AUROC values (and figures). We agree that these numbers are informative; both metrics lead to the same overall conclusions.
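      To make the distinction between the two metrics concrete, here is a minimal sketch (an illustration, not the evaluation code used for the paper) that computes both from scratch. The function names and the toy data are assumptions; the precision-recall area is approximated by average precision, and tied scores are not specially handled in that function.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Equals the mean of the precision values measured at the rank
    of each positive example (assumes at least one positive).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / true_positives


# Toy imbalanced example: 2 positives ("unfriendly") among 10 images.
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
print(auroc(labels, scores), average_precision(labels, scores))
```

      In this toy case a single negative ranked above one positive lowers average precision more than it lowers AUROC, which is why the precision-recall view is often preferred when the positive class is rare.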

      Reviewer #2 (Public Review):

      Summary:

      An analysis of images in the biology literature that are problematic for people with a color-vision deficiency (CVD) is presented, along with a machine learning-based model to identify such images and a web application that uses the model to flag problematic images. Their analysis reveals that about 13% of the images could be problematic for people with CVD and that the frequency of such images decreased over time. Their model yields an AUC score of 0.89. It is proposed that their approach could help make the biology literature accessible to diverse audiences.

      Strengths:

      The manuscript focuses on an important yet mostly overlooked problem, and makes contributions both in expanding our understanding of the extent of the problem and in developing solutions to mitigate the problem. The paper is generally well-written and clearly organized. Their CVD simulation combines five different metrics. The dataset has been assessed by two researchers and is likely to be of high quality. The machine learning algorithm used (convolutional neural network, CNN) is an appropriate choice for the problem. The evaluation of various hyperparameters for the CNN model is extensive.

      We thank the reviewer for these comments.

      Weaknesses:

      The focus seems to be on one type of CVD (deuteranopia) and it is unclear whether this would generalize to other types.

      We agree that it would be interesting to perform similar analyses for protanopia and other color-vision deficiencies. But we leave that work for future studies.

      The dataset consists of images from eLife articles. While this is a reasonable starting point, whether this can generalize to other biology/biomedical articles is not assessed.

      This is an important point. We have reviewed an additional 2,000 images, which were randomly selected from PubMed Central, and used our original model to make predictions for those images. The corresponding results are now included in the paper.

      "Probably problematic" and "probably okay" classes are excluded from the analysis and classification, and the effect of this exclusion is not discussed.

      We now address this in the Discussion section.

      Machine learning aspects can be explained better, in a more standard way.

      Thank you. We address this comment in our responses to your comments below.

      The evaluation metrics used for validating the machine learning models seem lacking (e.g., precision, recall, F1 are not reported).

      We now provide these metrics (in a supplementary file).

      The web application is not discussed in any depth.

      The paper includes a paragraph about how the Web application works and which technologies we used to create it. We are unsure which additional aspects should be addressed.

      Reviewer #3 (Public Review):

      Summary:

      This work focuses on accessibility of scientific images for individuals with color vision deficiencies, particularly deuteranopia. The research involved an analysis of images from eLife published in 2012-2022. The authors manually reviewed nearly 5,000 images, comparing them with simulated versions representing the perspective of individuals with deuteranopia, and also evaluated several methods to automatically detect such images including training a machine-learning algorithm to do so, which performed the best. The authors found that nearly 13% of the images could be challenging for people with deuteranopia to interpret. There was a trend toward a decrease in problematic images over time, which is encouraging.

      Strengths:

      The manuscript is well organized and written. It addresses inclusivity and accessibility in scientific communication, and reinforces that there is a problem and that in part technological solutions have potential to assist with this problem.

      The number of manually assessed images for evaluation and training an algorithm is, to my knowledge, much larger than any existing survey. This is a valuable open source dataset beyond the work herein.

      The sequential steps used to classify articles follow best practices for evaluation and training sets.

      We thank the reviewer for these comments.

      Weaknesses:

      I do not see any major issues with the methods. The authors were transparent with the limitations (the need to rely on simulations instead of what deuteranopes see), only capturing a subset of issues related to color vision deficiency, and the focus on one journal that may not be representative of images in other journals and disciplines.

      We thank the reviewer for these comments. Regarding the last point, we have reviewed an additional 2,000 images, which were randomly selected from PubMed Central, and used our original model to make predictions for those images. The corresponding results are now included in the paper.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      N/A

      Thank you.

      Reviewer #2 (Recommendations For The Authors):

      - The web application link can be provided in the Abstract for more visibility.

      We have added the URL to the Abstract.

      - They focus on deuteranopia in this paper. It seems that protanopia is not considered. Why? What are the challenges in considering this type of CVD?

      We agree that it would be interesting to perform similar analyses for protanopia and other color-vision deficiencies. But we leave that work for future studies. Deuteranopia is the most common color-vision deficiency, so we focused on the needs of these individuals as a starting point.

      - The dataset is limited to eLife articles. More discussion of this limitation is needed. Couldn't one also include some papers from PMC open access dataset for comparison?

      We have reviewed an additional 2,000 images, which we randomly selected from PubMed Central, and used our original model to make predictions for those images. The corresponding results are now included in the paper.

      - An analysis of the effect of selecting a severity value of 0.8 can be included.

      We agree that this would be interesting, but we leave it for future work.

      - "Probably problematic" and "probably okay" classes are excluded from analysis, which may oversimplify the findings and bias the models. It would have been interesting to study these classes as well.

      We agree that this would be interesting, but we leave it for future work. However, we have added a comment to the Discussion on this point.

      - Some machine learning aspects are discussed in a non-standard way. Class weighting or transfer learning would not typically be considered hyperparameters. "corpus" is not a model. Description of how fine-tuning was performed could be clearer.

      We have updated this wording to use more appropriate terminology to describe these different "configurations." Additionally, we expanded and clarified our description of fine tuning.

      - Reporting performance on the training set is not very meaningful. Although I understand this is cross-validated, it is unclear what is gained by reporting two results. Maybe there should be more discussion of the difference.

      We used cross validation to compare different machine-learning models and configurations. Providing performance metrics helps to illustrate how we arrived at the final configurations that we used. We have updated the manuscript to clarify this point.

      - True positives, false positives, etc. are described as evaluation metrics. Typically, one would think of these as numbers that are used to calculate evaluation metrics, like precision (PPV), recall (sensitivity), etc. Furthermore, they say they measure precision, recall, precision-recall curves, but I don't see these reported in the manuscript. They should be (especially precision, recall, F1).

      We have clarified this wording in the manuscript.
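      For readers less familiar with the distinction drawn in this comment, the following minimal sketch (illustrative only; the function name and the example counts are assumptions) shows how the raw counts feed into the derived metrics.

```python
def precision_recall_f1(tp, fp, fn):
    """Derive precision (PPV), recall (sensitivity), and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
print(precision_recall_f1(8, 2, 8))
```

      True negatives do not enter any of the three metrics, which is one reason they are informative when "unfriendly" images are the rare class.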

      - There are many figures in the supplementary material, but not much interpretation/insights provided. What should we learn from these figures?

      We have revised the captions and now provide more explanations about these figures in the manuscript.

      - CVD simulations are mentioned (line 312). It is unclear whether these methods could be used for this work and if so, why they were not used. How do the simulations in this work compare to other simulations?

      This part of the manuscript refers to recolorization techniques, which attempt to make images more friendly to people with color vision deficiencies. For our paper, we used a form of recolorization that simulates how a deuteranope would see a figure in its original form. Therefore, unless we misunderstand the reviewer's question, these two types of simulation have distinct purposes and thus are not comparable.

      - relu -> ReLU

      We have corrected this.

      Reviewer #3 (Recommendations For The Authors):

      The title can be more specific to denote that the survey was done in eLife papers in the years 2012-2022. Similarly, this should be clear in the abstract instead of only "images published in biology-oriented research articles".

      Thank you for this suggestion. Because we have expanded this work to include images from PubMed Central papers, we believe the title is acceptable as it stands. We updated the abstract to say, "images published in biology- and medicine-oriented research articles"

      Two mentions of existing work that I did not see are to Jambor and colleagues' assessment on color accessibility in several fields: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8041175/, and whether this work overlaps with the 'JetFighter' tool (https://elifesciences.org/labs/c2292989/jetfighter-towards-figure-accuracy-and-accessibility).

      Thank you for bringing these to our attention. We have added a citation to Jambor, et al.

      We also mention JetFighter and describe its uses.

      Similarly, on Line 301: Significant prior work has been done to address and improve accessibility for individuals with CVD. This work can be generally categorized into three types of studies: simulation methods, recolorization methods, and estimating the frequency of accessible images.

      - One might mention education as prior work as well, which might in part be contributing to a decrease in problematic images (e.g., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8041175/)

      We now suggest that there are four categories and include education as one of these.

      Line 361, when discussing resources to make figures suitable, the authors may consider citing this paper about an R package for single-cell data: https://elifesciences.org/articles/82128

      Thank you. We now cite this paper.

      The web application is a good demonstration of how this can be applied, and all code is open so others can apply the CNN in their own use cases. Still, by itself, it is tedious to upload individual image files to screen them. Future work can implement this into a workflow more typical to researchers, but I understand that this will take additional resources beyond the scope of this project. The demonstration that these algorithms can be run with minimal resources in the browser with tensorflow.js is novel.

      Thank you.

      General:

      It is encouraging that 'definitely problematic' images have been decreasing over time in eLife. Might this have to do with eLife policies? I could not quickly find if eLife has checks in place for this, but given that JetFighter was developed in association with eLife, I wonder if there is an enhanced awareness of this issue here vs. other journals.

      This is possible. We are not aware of a way to test this formally.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Tang et al. present an important manuscript focused on endogenous virus-like particles (eVLP) for cancer vaccination with solid in vivo studies. The authors designed eVLP with high protein loading and transfection efficiency by PEG10 self-assembling while packaging neoantigens inside for cancer immunotherapy. The eVLP was further modified with CpG-ODN for enhanced dendritic cell targeting. The final vaccine ePAC was proven to elicit strong immune stimulation with increased killing effect against tumor cells in 2 mouse models. Below are my specific comments:

      Thank you very much for describing our work as “important”. We sincerely appreciate the reviewer's extremely helpful comments, which have significantly improved the quality of our manuscript.

      (1) The figures were well prepared with minor flaws, such as missed scale bars in Figures 4B, 4K, 5B, and 5C. The author should also add labels representing statistical analysis for Figures 3C, 3D, and 3E. In Figure 6G, the authors should label which cell type is the data for.

      Thanks very much for these helpful suggestions. Scale bars have been added to Figures 4B, 4K, 5B, and 5C, and statistical annotations have been added to Figures 3C, 3D, and 3E. For Figure 6G, we have added “CD44+ CD62L- in CD8+ T cells” to indicate the cell type.

      (2) In Figure 3H, the antigen-presenting cells (APCs) increased significantly, but there was also a non-negligible 10% of APCs found in the control group, indicating some potential unwanted immune response; the authors need to explain this phenomenon or add a cytotoxic test on the normal liver or other cell lines for confirmation.

      Thanks very much for this extremely helpful suggestion. The antigen-presenting cells (APCs) in Figure 3H were isolated from mouse bone marrow and then cultured in vitro for about 5 days with cytokine stimulation (IL-4 and GM-CSF). Owing to the stimulatory effects of IL-4 and GM-CSF, a small proportion of the APCs (~10%) in the control group tended to mature (co-expressing CD80 and CD86), as pointed out by the reviewer. Similarly, in Figure 3I, these ~10% of activated APCs can activate T cells in vitro and exhibit some cytotoxicity. Because APCs must be induced and cultured in vitro before being used in this experiment, the background cytotoxicity induced by cytokines is unavoidable, and this has been well documented in the literature.

      (3) In Figure 3I, the ePAC seems to have a very similar effect on cytotoxic T-cell tumor killing compared to the peptides + CpG group. If the concentrations were also the same, based on that, questions will arise as to what is the benefit of using the compact vector other than just free peptide and CpG? Please explain and elaborate.

      Thanks very much for the comment. In vitro experiments indeed demonstrated that peptides + CpG had the same T cell-activating ability as ePAC, as pointed out by the reviewer. However, because the peptides are unstable and lack targeting, peptides + CpG activate the immune system significantly less efficiently than ePAC after subcutaneous injection in vivo, as shown in Figure 3D and Figure 2A. As expected, the antitumor efficacy induced by peptides + CpG is therefore significantly lower than that induced by ePAC in Figure 5. We have provided a detailed description in the “Results” section of “Antitumor effect of ePAC in subcutaneous HCC model” as follows: “Furthermore, ePAC with the ability to target DCs and increased stability by encapsulating peptides, exhibited significantly higher tumor growth inhibition efficiency (p=0.0002) comparing with the eVLP + CpG-ODN treated group similar to the simple mixture of neoantigen peptides and adjuvant (Figures 5B and 5C). Meanwhile, the Kaplan-Meier analysis of tumor progression free survival (PFS) also clearly demonstrated the therapeutic advantages of our ePAC (p=0.0194, Figure 5B).”

      (4) In the animal experiment in Figures 4F to L, the activation effect of APCs was similar between ePAC and CpG-only groups with no significance, but when it comes to the HCC mouse model in Figure 5, the anti-tumor effect was significantly increased between ePAC and CpG-only group. The authors should explain the difference between these two results.

      Thanks very much for the comment. Since PEG10 protein does not have an adjuvant effect, the adjuvant effect of ePAC mainly comes from the modified CpG. Therefore, although ePAC can effectively deliver tumor neoantigens, it does not have a significant advantage over free CpG in activating APCs. However, CpG only possesses the adjuvant effect and does not carry neoantigens. While it can promote the maturation of APCs, it cannot generate neoantigen-specific T cells. Consequently, the antitumor effect of CpG-only is much lower than that of ePAC in Figure 5.

      Reviewer #2 (Public Review):

      Summary:

      The authors provided a novel antigen delivery system that showed remarkable efficacy in transporting antigens to develop cancer therapeutic vaccines.

      Strengths:

      This manuscript was innovative, meaningful, and had a rich amount of data.

      Weaknesses:

      There are still some issues that need to be addressed and clarified.

      Thank you very much for describing our work as “innovative”. We sincerely appreciate the reviewer's extremely helpful comments, which have significantly improved the quality of our manuscript, and all of the listed weaknesses have been carefully addressed.

      (1) The format of images and data should be unified. Specifically, as follows: a. The presentation of flow cytometry results; b, The color schemes for different groups of column diagrams.

      Thanks very much. Following the reviewer’s comment, we have unified the format of all images and data as suggested.

      (2) The P-value should be provided in Figures, including Figure 1F, 1H, 3C, 3D, and 3E.

      Thanks very much. We have provided the corresponding P-values in Figure 1F, 1H, 3C, 3D, and 3E.

      (3) The quality of Figure 1C was too low to support the conclusion. The author should provide higher-quality images with no obvious background fluorescent signal. Meanwhile, the fluorescent image results of "Egfp+VSVg" group were inconsistent with the flow cytometry data. Additionally, the reviewer recommends that the authors use a confocal microscope to repeat this experiment to obtain a more convincing result.

      Thanks very much for this comment. Following the reviewer’s suggestion, we uniformly adjusted the original images in Figure 1C to reduce background interference and improve their quality. After eliminating background interference, the fluorescence image of the "Egfp+VSVg" group was consistent with the flow cytometry result.

      (4) The survival situation of the mouse should be provided in Figure 5, Figure 6, and Figure 7 to support the superior tumor therapy effect of ePAC.

      Thanks very much for the extremely helpful comment. Following the reviewer’s suggestion, we have added the progression-free survival (PFS) of mice in Figure 5 and described this result in the “Results” section of “Antitumor effect of ePAC in subcutaneous HCC model” as follows: “Meanwhile, the Kaplan-Meier analysis of tumor progression free survival (PFS) also clearly demonstrated the therapeutic advantages of our ePAC (p=0.0194, Figure 5B).” For Figure 6 and Figure 7, because we needed to promptly detect the immune changes in the tumor microenvironment after vaccination, we were unable to conduct long-term observations on tumor-bearing mice and therefore did not provide survival curves. However, we monitored the tumor volume changes in real time, which can also serve as an important measure for evaluating antitumor efficacy.

      (5) To demonstrate that ePAC could trigger a strong immune response, the positive control group in Figure 4K should be added.

      Thanks very much for this very helpful comment. Following the reviewer’s suggestion, a mouse anti-CD3 antibody was used as the positive control to activate splenic T cells in vitro for the ELISPOT assay, and the corresponding results have been added to the revised Figure 4K. We have also added the following description to the “Figure legends” section under “Figure 4. ePAC delivery and immune activation in vivo”: “The mouse anti-CD3 antibody was used to activate splenic T cells in vitro as the positive control for ELISPOT assay.”

      (6) In Figure 6G-I and other figures, the author should indicate the time point of detection. Meanwhile, there was no explanation for the different numbers of mice in Figure 6G-I. If the mouse was absent due to death, it may be necessary to advance the detection time to obtain a more convincing result.

      Thanks very much for the comment. The samples for Figure 6G-I were collected and analyzed on day 28 after the start of treatment. Following the reviewer’s suggestion, we have specifically marked the time point of “Sacrifice for sampling” in Figure 6A. We have also added the following description to the “Figure legends” section under “Figure 6. Evaluation ePAC antitumor efficacy in orthotopic HCC model by αTIM-3 combination”: “The mice were sacrificed and sampled for analysis on the day of 28 after initiating treatment.” In addition, we have clearly indicated the sample size for each group in Figure 6G-I. Although three mice in the PBS group died, we still obtained enough samples for statistical analysis (n>3).

      (7) In Figure 6B, the rainbow color bar with an accurate number of maximum and minimum fluorescence intensity should be provided. In addition, the corresponding fluorescence intensity in Figure 6B should be noted.

      Thanks very much for this very helpful comment. Following the reviewer’s suggestion, we have added the rainbow color bar with the exact maximum and minimum fluorescence intensity values, as well as the statistical results, to the revised Figure 6B.

      (8) The quality of images in Figure 1D and Figure S1B could not support the author's conclusion; please provide higher-quality images.

      Thanks very much. In Figure 1D and Figure S1B, to ensure the authenticity of the results, we tried our best to improve the quality of the pictures and provided the WB results with the full membrane scan. Although some non-specific bands appeared in the results, the target bands remained prominent. Additionally, we used two tags (HA and eGFP) for verification, which fully guarantees the reliability of our findings.

      (9) In Figure 2F, the bright field in the overlay photo may disturb the observation. Meanwhile, the scale bar should be provided in enlarged images.

      Thanks very much. Following the reviewer’s suggestion, we have deleted the bright field in revised Figure 2F and added the scale bar in the enlarged images.

      Reviewer #3 (Public Review):

      Summary:

      The authors harnessed the potential of mammalian endogenous virus-like proteins to encapsulate virus-like particles (VLPs), enabling the precise delivery of tumor neoantigens. Through meticulous optimization of the VLP component ratios, they achieved remarkable stability and efficiency in delivering these crucial payloads. Moreover, the incorporation of CpG-ODN further heightened the targeted delivery efficiency and immunogenicity of the VLPs, solidifying their role as a potent tumor vaccine. In a diverse array of tumor mouse models, this novel tumor vaccine, termed ePAC, exhibited profound efficacy in activating the murine immune system. This activation manifested through the stimulation of dendritic cells in lymph nodes, the generation of effector memory T cells within the spleen, and the infiltration of neoantigen-specific T cells into tumors, resulting in robust anti-tumor responses.

      Strengths:

      This study delivered tumor neoantigens using VLPs, pioneering a new method for neoantigen delivery. Additionally, the gag protein of VLP is derived from mammalian endogenous virus-like protein, which offers greater safety compared to virus-derived gag proteins, thereby presenting a strong potential for clinical translation. The study also utilized a humanized mouse model to further validate the vaccine's efficacy and safety. Therefore, the anti-tumor vaccine designed in this study possesses both innovation and practicality.

      Thank you very much for describing our work as “novel” and as possessing both “innovation” and “practicality”. We sincerely appreciate the reviewer's extremely helpful comments, which have significantly improved the quality of our manuscript.

      Weaknesses:

      (1) CpG-ODN is an FDA-approved adjuvant with various sequence structures. Why was CpG-ODN 1826 directly chosen in this study instead of other types of CpG-ODN? Additionally, how does DEC-205 recognize CpG-ODN 1826, and can DEC-205 recognize other types of CpG-ODN?

      Thanks very much for the comment. CpG-ODNs are classified into three main types based on their structural composition: A, B, and C. Among them, only the B-class CpG-ODNs 1668, 1826, and 2006 have been directly proven to effectively bind DEC-205 and activate DC cells [1]. Therefore, in this study, B-class CpG-ODN 1826 was chosen as the ligand targeting DEC-205 on the surface of DC cells. DEC-205 primarily binds sequences containing the CpG motif core in a pH-dependent manner, thus theoretically allowing DEC-205 to bind a wide range of CpG-ODNs.

      [1] Lahoud MH et al. DEC-205 is a cell surface receptor for CpG oligonucleotides. PNAS. 2012

      (2) Why was it necessary to treat DCs with virus-like particles three times during the in vitro activation of T cells? Can this in vitro activation method effectively obtain neoantigen-responsive T cells?

      Thanks very much for the comment. DCs need to be pre-stimulated before being used to activate T cells. Although a single DC stimulation can activate T cells, the activation efficiency is insufficient. Current research suggests that three DC-T interactions can activate T cells more effectively [2]. Therefore, we used DCs stimulated with virus-like particles to stimulate the T cells three times so as to fully activate them. Our results in Figures 3I and 7D also demonstrate that three rounds of stimulation effectively activated antigen-specific T cells, resulting in stronger tumor cell killing effects.

      [2] Ali M et al. Induction of neoantigen-reactive T cells from healthy donors. Nature Protocols. 2019.

      (3) In the humanized mouse model, the authors used Hepa1-6 cells to construct the tumor model. To achieve the vaccine's anti-tumor function, these Hepa1-6 cells were additionally engineered to express HLA-A0201. However, in the in vitro experiments, the authors used the HepG2 cell line, which naturally expresses HLA-A0201. Why did the authors not continue to use HepG2 cells to construct the tumor model, instead of Hepa1-6 cells?

      Thanks very much for the comment. HepG2 cells are derived from human liver cancer. When directly implanted into immunocompetent mice, they are cleared by the mouse immune system and do not form tumors. Therefore, we did not continue to use HepG2 cells to construct the tumor model.

      (4) The advantages of low immunogenicity viruses as vaccines compared with conventional adenovirus and lentivirus, etc. should be discussed.

      Thanks very much for this helpful suggestion. In the Introduction, starting from line 76, we first described the structure and function of lentiviruses and discussed the design and application of virus-like particles (VLPs) based on lentiviruses. To provide a more comprehensive comparison, we have added a discussion of VLPs, lentiviruses, and adenoviruses to the Discussion section (from line 441 to line 447) as follows: “Furthermore, comparing to the virus-based delivery vectors, the lentiviruses although can stably integrate into the host genome but carry risks of insertional mutagenesis; adenoviruses although have high transduction efficiency but strong immunogenicity, which leads to fast clearance by the immune system of the host and affects the efficiency of the secondary injection. Instead, our VLPs offer low immunogenicity and superior safety, making them more suitable for repeated use and vaccine development.”

      (5) In Figure 6B, the authors should provide statistical results.

      Thanks very much. We have provided the statistical results in revised Figure 6B following the reviewer’s suggestion.

      (6) The entire article demonstrates a clear logical structure and substantial content in its writing. However, there are still some minor errors, such as the misspelling of "Spleenic" in Figure 3B, and the sentence from line 234 should be revised.

      Thanks very much. We have carefully checked and corrected the typos throughout the whole manuscript as much as possible.

      (7) The authors demonstrated the efficiency of CpG-ODN membrane modification by varying the concentration of DBCO, ultimately determining the optimal modification scheme for eVLP as 3.5 nmol of DBCO. However, in Figure 2B, the author did not provide the modification efficiency when the DBCO concentration is lower than 3.5 nmol. These results should be provided.

      Thanks very much for the suggestion. We have repeated the experiment and reduced the concentration of DBCO to 2.1 nmol and 0.7 nmol, respectively. The results showed that in a 200 µl eVLP reaction system, 3.5 nmol DBCO achieved the highest modification efficiency. We have provided a detailed description in “Results” section of “Envelope decoration of neoantigen-loaded eVLP” as follows: Furthermore, by varying the concentration of DBCO-C6-NHS Ester from 0 to 14 nmol, ePAC exhibited different CpG-ODN loading efficiency as evidenced by agarose gel electrophoresis (Figure 2B and Figure S3). And the results showed that in a 200 µl eVLP reaction system, 3.5 nmol DBCO achieved the highest modification efficiency.

      (8) In Figure 3, the authors presented a series of data demonstrating that ePAC can activate mouse DC2.4 cells and BMDCs in vitro. However, in Figure 7, there is no evidence showing whether human DC cells can be activated by ePAC in vitro. This data should be provided.

      Thanks very much for this very helpful suggestion. We used ePAC to activate human DCs, and the results indicate that, compared to the blank control group, both eVLP and ePAC increased the co-expression of CD80 and CD86 in DCs, with ePAC being the most efficient. We have provided a detailed description in the “Results” section of “Antitumor effect by HLA-A*0201 restricted vaccine” as follows: “After the stimulation, the DCs in ePAC treated group showed the highest level of maturation comparing to the eVLP treated group and control group (Figure S4), by using flow cytometry analysis.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) Figure 2B and 2D: unlike what is written in the results part, the results are not consistent, but opposite: LSS has higher activity in 2B, less in 2D. 

      The activities in Figure 2B come from NMR kinetic experiments with pGly, whereas Figure 2D reports on activity towards whole S. aureus cells. The LytM and LSS activities in these two experiments are indeed not directly comparable, but served to highlight the fact that simple pentaglycine is a poor model substrate for M23 enzymes. We carried out a turbidity assay with pristine enzymatic preparation and indeed it is highly consistent both with the kinetic assay using pentaglycine (Fig. 2B) as well as with larger PG fragments (Fig. 2K) indicating that the catalytic domain of LSS is significantly more efficient than LytM in hydrolyzing cells from community acquired methicillin resistant S. aureus strain USA300 as well as synthetic PG fragments.  The corresponding paragraph in Results has now been updated and rephrased.

      (2) Figure 2, panel K missing statistical analysis, which makes it difficult to appreciate if the difference is significant. If it is a one-time experiment or a single value, the value should be presented as a table. The corresponding text in the results part is confusing. The fold change or drop in percentage is unclear in the figure. 

      We have added a table (panel L) to Figure 2, which shows absolute values of LSS and LytM hydrolysis rates. Indeed, most of the values are from single NMR kinetic measurements, however, PG fragment (2) for LSS and PG fragment (3) for LytM were measured as duplicates to verify the reproducibility of the data. This is now mentioned in Figure 3 legend and in the Materials and Methods. Also, the corresponding text in the Results has been updated and rephrased.

      (3) Figure 3H: the cleavage of D-ala-gly is unclear, the cleavage products need to be labeled and quantified. The experiment used purified PG treated with mutanolysin. Presumably, mixed monomers, dimers, trimers, and multimers are used. It would be helpful to show the HPLC profile of the purified muropeptide. It would be informative to analyze which fractions generate D-ala-gly. In addition, the intact murein sacculus should be included. 

      For the sake of clarity, we have moved the 13C-HMBC spectra presented in Figure 3H to Fig. S7 in the Supplementary Material. The full carbonyl carbon region of the reference (prior to addition of enzyme) 13C-HMBC spectrum together with larger expansions of spectra acquired from enzyme-treated muropeptides are now shown. Furthermore, graphical presentations of identified PG fragments due to LSS/LytM activity are included. No HPLC analysis of the muropeptides was performed at this stage. Being insoluble, the intact murein sacculus is not amenable to liquid-state NMR studies, but we envisage studies of this remarkably complex structure also with solid-state NMR.

      Reviewer #2 (Recommendations For The Authors): 

      Overall, the experiments address the question asked by the authors and no additional experiments are required to strengthen the conclusions drawn. 

      Abstract: 

      The abstract is not well written and more specific (and accurate) information should be provided by the authors. 

      We are grateful for the constructive and helpful comments to improve our manuscript. The abstract has now been modified by taking into account the Reviewer’s suggestions.

      Introduction 

      The intro is relatively long and wordy. It could most certainly be shortened and written in a more simple way to make it more impactful.

      The introduction has now been modified by taking into account the Reviewer’s suggestions.

      (2) One of the peptide stems in Figure 1 is missing a pentaglycine side chain; I would recommend increasing the font size; the peptide stem looks like it is attached to the carbon in position 2, it may be a good idea to move it to the left? 

      We thank the Reviewer for this comment. Figure 1 has been improved, the frameshift has been fixed and the non-cross-linked pGly bridge has been added to the lysine side-chain in tetraStem.

      Results 

      Figure 2 is a bit overwhelming and its description is sketchy. Fig 2B shows a much higher activity of LSS on pGly as compared to LytM whilst 2K shows a very similar rate. 

      We have rearranged Figures 1 and 2 by moving the original panel J in Figure 2 (structures of PG fragments) to Figure 1 panel C. The bar graph in Figure 2J now shows absolute rates of substrate hydrolysis for 2 mM LSS and LytM. These indicate that LSS is much more efficient against PG fragments in vitro in comparison to LytM. Rates normalized with respect to pGly are shown in Figure 2K. Also, a table showing absolute rates of hydrolysis for 2 mM LSS and 50 mM LytM has been included in Figure 2, panel L. In this table, the values for PG fragments 2 and 3 were determined by two independent measurements to test and confirm the reproducibility of the method. This is also now elaborated further in the Materials and Methods.

      Figure 3 is impressive and very informative but again hard to follow. 

      - Panels 3A and 3B are nicely conceived but the resolution is rather poor and it is difficult to know exactly where the arrows point. 

      We very much value suggestions given by the Reviewer to improve readability of our manuscript. In the case of Figure 3, we have now greatly enhanced the resolution and readability of the figure by horizontal scaling of panels A and B.

      Figure 4 shows a comparative analysis of catalytic rate using various substrates, the authors may want to present graphs with the same y-axis to get the most out of the comparison between substrates. 

      The scaling of the y-axis is the same for all the substrates now. In addition, we have reorganized the panels in the figure to enhance readability.

      Figure 5: - The same remark as above, please cite all panels in alphabetical order. 

      The citation of the Figure 5 panels has now been revised.

      Material and methods: 

      - How were the peptide concentrations determined? It may be useful to indicate if specific conditions were required to solubilize some peptides, pGly is particularly insoluble in aqueous solutions. 

      - Page 19, replace cpm by rpm; biological or technical replicates?

      These have now been added and edited accordingly.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      After reviewing the authors' response letter and the revised manuscript, I believe they have done a commendable job in addressing my comments.

      Additionally, I concur with the concerns raised by Reviewer #2 regarding several potential confounding factors that require better control in their experimental design. These include the differences in physical properties between vocal and nonvocal stimuli, as well as the infant's exposure to the speech/auditory environment. These concerns should be thoroughly and explicitly discussed in the manuscript, ensuring a clearer understanding for the readers.

      Thank you for the suggestion. We have discussed these limitations in our revised manuscript. In this round of revision, we have tempered our conclusion in light of these limitations.

      Reviewer #2:

      The revised manuscript does discuss the limitations of the control stimuli, as well as the limitations with regard to conclusions that can be drawn from this data set. I therefore expected the authors to temper a bit their recommendation that this could be a 'screening' signal for autism because these data are not sufficiently strong to make that recommendation. Also, in the same vein, perhaps the title might be adjusted somewhat to suggest less certainty, for example, by using the word "change" rather than "milestone"? The data are of interest, but the limitations are genuine limitations.

      Thank you for your expert comments and considerations. We have moderated our recommendation for autism screening and softened the statement of “milestone” throughout the manuscript. Please see the updated article title, abstract, significance statement, and discussion.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      A nice study trying to identify the relationship between E. coli O157 from cattle and humans in Alberta, Canada.

      Strengths:

      (1) The combined human and animal sampling is a great foundation for this kind of study.

      (2) Phylogenetic analyses seem to have been carried out in a high-quality fashion.

      Weaknesses:

      I think there may be a problem with the selection of the isolates for the primary analysis. This is what I'm thinking:

      (1) Transmission analyses are strongly influenced by the sampling frame.

      (2) While the authors have randomly selected from their isolate collections, which is fine, the collections themselves are not random.

      (3) The animal isolates are likely to represent a broad swathe of diversity, because of the structured sampling of animal reservoirs undertaken (as I understand it).

      (4) The human isolates are all from clinical cases. Clinical cases of the disease are likely to be closely related to other clinical cases, because of outbreaks (either detected, or undetected), and the high ascertainment rate for serious infections.

      (5) Therefore, taking an equivalent number of animal and clinical isolates, will underestimate the total diversity in the clinical isolates because the sampling of the clinical isolates is less "independent" (in the statistical sense) than sampling from the animal isolates.

      (6) This could lead to over-estimating of transmission from cattle to humans.

      We appreciate the reviewer’s careful thoughts about our sampling strategy. We agree with points (1) and (2), and we will provide additional details on the animal collections as requested.

      We agree with point (3) in theory but not in fact. As shown in Figure 3a, the cattle isolates were very closely related, despite the temporal and geographic breadth of sampling within Alberta. The median SNP distance between cattle sequences was 45 (IQR 36-56), compared to 54 (IQR 43-229) SNPs between human sequences from cases in Alberta during the same years. Additionally, as shown in Figure 2, only clade A and B isolates – clades that diverge substantially from the rest of the tree – were dominated by human cases in Alberta. We will better highlight this evidence in the revision.
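
      The median and IQR of pairwise SNP distances quoted above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the toy alignment are hypothetical, and the sequences are assumed to be already aligned to equal length.

```python
from itertools import combinations
from statistics import median, quantiles


def snp_distance(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def summarize_pairwise(seqs):
    """Return (median, Q1, Q3) over all pairwise SNP distances."""
    dists = [snp_distance(a, b) for a, b in combinations(seqs, 2)]
    q1, _, q3 = quantiles(dists, n=4, method="inclusive")
    return median(dists), q1, q3


# Hypothetical toy alignment, not data from the study.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "TCGAACGA"]
print(summarize_pairwise(toy))
```

      Running the same summary separately on the cattle and the human sequences would give the two figures being compared in the response.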

      We agree with the reviewer in point (4) that outbreaks can be an important confounder of phylogenetic inference. This is why we down-sampled outbreaks (based on genetic relatedness, not external designation) in our extended analyses (lines 192-194). We did not do this in the primary analysis, because there were no large clusters of identical isolates. Figure 3b shows a limited number of small clusters; however, clustered cattle isolates outnumbered clustered human isolates, suggesting that any bias would be in the opposite direction the reviewer suggests. Regarding severe cases being oversampled among the clinical isolates, this is absolutely true and a limitation of all studies utilizing public health reporting data. We will make this limitation to generalizability clearer in the discussion. However, as noted above, clinical isolates were more variable than cattle isolates, so it does not appear to have heavily biased the analysis.

      We disagree with the reviewer on point (5). While the bias toward severe cases could make the human isolates less independent, the relative sampling proportions are likely to induce greater distance between clinical isolates than cattle isolates, which is exactly what we observe (see response to point (3) above). Cattle are E. coli O157:H7’s primary reservoir, and humans are incidental hosts not able to sustain infection chains long-term. Not only are the bacteria prevalent among cattle, but cattle are also highly prevalent in Alberta. Thus, even with 89 sampling points, we are still capturing a small proportion of the E. coli O157:H7 in the province. Being able to sample only a small proportion of cattle’s E. coli O157:H7 increases the likelihood of only sampling from the center of the distribution, making extreme cases, such as that shown at the very bottom of the tree in Figure 3b, rare and important. In comparison, sampling from human cases constitutes a higher proportion of human infections relative to cattle, and is therefore more representative of the underlying distribution, including extremes. We will add this point to the limitations. As with the clustering above, if anything, this outcome would have biased the study away from identifying cattle as the primary reservoir. Additionally, the relatively small proportion of cattle sampled makes our finding that 15.7% of clinical isolates were within 5 SNPs of a cattle isolate, the distance most commonly used to indicate transmission for E. coli O157:H7, all the more remarkable.
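
      The "within 5 SNPs of a cattle isolate" statistic can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names and toy sequences are hypothetical, and the toy example uses a threshold of 1 only because the sequences are short.

```python
def snp_distance(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def fraction_within(clinical, cattle, threshold=5):
    """Fraction of clinical sequences within `threshold` SNPs of any cattle sequence."""
    if not clinical:
        raise ValueError("need at least one clinical sequence")
    close = sum(
        1
        for c in clinical
        if any(snp_distance(c, k) <= threshold for k in cattle)
    )
    return close / len(clinical)


# Hypothetical toy sequences, not data from the study.
cattle = ["AAAAAAAA", "CCCCCCCC"]
clinical = ["AAAAAAAT", "AAAATTTT", "CCCCCCCC", "GGGGGGGG"]
print(fraction_within(clinical, cattle, threshold=1))
```

      Because the measure takes the nearest cattle isolate for each clinical isolate, sampling fewer cattle isolates can only lower it or leave it unchanged, which is the direction of bias the response describes.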

      Because of the aforementioned points, we disagree with the reviewer’s conclusion in point (6). We believe transmission from cattle-to-humans is likely underestimated for the reasons given above. Not only do all prior studies indicate ruminants as the primary reservoirs of E. coli O157:H7, and humans as only incidental hosts, our specific data do not support the reviewer’s individual contentions. That said, we will conduct a sensitivity analysis as recommended to determine the impact of sampling and inclusion of the small clusters on our primary findings.

      (7) "We hypothesize that the large proportion of disease associated with local transmission systems is a principal cause of Alberta's high E. coli O157:H7 incidence" - this seems a bit tautological. There is a lot of O157 because there's a lot of transmission. What part of the fact it is local means that it is a principal cause of high incidence? It seems that they've observed a high rate of local transmission, but the reasons for this are not apparent, and hence the cause of Alberta's incidence is not apparent. Would a better conclusion not be that "X% of STEC in Alberta is the result of transmission of local variants"? And then, this poses a question for future epi studies of what the transmission pathway is.

      The reviewer is correct, and the suggestion for the direction of future studies was our intent with this statement. We will revise it.

      Reviewer #2 (Public Review):

      This study identified multiple locally evolving lineages transmitted between cattle and humans persistently associated with E. coli O157:H7 illnesses for up to 13 years. Furthermore, this study mentions a dramatic shift in the local persistent lineages toward strains with the more virulent stx2a-only profile. The authors hypothesized that the large proportion of disease associated with local transmission systems is a principal cause of Alberta's high E. coli O157:H7 incidence. These opinions more effectively explain the role of the cattle reservoir in the dynamics of E. coli O157:H7 human infections.

      (1) The authors acknowledge the possibility of intermediate hosts or environmental reservoirs playing a role in transmission. Further discussion on the potential roles of other animal species commonly found in Alberta (e.g., sheep, goats, swine) could enhance the understanding of the transmission dynamics. Were isolates from these species available for analysis? If not, the authors should clearly state this limitation.

      We will expand the discussion of other species in Alberta, as suggested, including other livestock, wildlife, and the potential role of birds and flies. Unfortunately, we did not have sequences available from other species, and we will add this to the limitations. Sequences from other species may be available from sequences collected by others, which as we note in the limitations do not have sufficient metadata to assign them to Alberta vs. the rest of Canada. While we have requested this data, we have been unsuccessful in obtaining it. We will continue to pursue it.

      (2) The focus on E. coli O157:H7 is understandable given its prominence in Alberta and the availability of historical data. However, a brief discussion on the potential applicability of the findings to non-O157 STEC serogroups, and the limitations therein, would be beneficial. Are there reasons to believe the transmission dynamics would be similar or different for other serogroups?

      We appreciate this comment and will expand our discussion of relevance to non-O157 STEC. Other authors have proposed that transmission dynamics differ, and studies of STEC risk factors, including our own, support this. However, there has been very little direct study of non-O157 transmission dynamics and there is even less cross-species genomic and metadata available for non-O157 isolates of concern.

      (3) The authors briefly mention the need for elucidating local transmission systems to inform management strategies. A more detailed discussion on specific public health interventions that could be targeted at the identified LPLs and their potential reservoirs would strengthen the paper's impact.

      We agree with the reviewer that this would be a good addition to the manuscript. The public health implications for control are several and extend to non-STEC reportable zoonotic enteric infections, such as Campylobacter and Salmonella. We will add a discussion of these.

      (4) Understanding the relationship between specific risk factors and E. coli O157:H7 infections is essential for developing effective prevention strategies. Have case-control or cohort studies been conducted to assess the correlation between identified risk factors and the incidence of E. coli O157:H7 infections? What methodologies were employed to control for potential confounders in these studies?

      Yes, there have been several case-control studies of reported cases. Many of these are referenced in the discussion in terms of the contribution of different sources to infection. However, we will add a more explicit discussion of risk factors.

      (5) The study's findings are noteworthy, particularly in the context of E. coli O157:H7 epidemiology. However, the extent to which these results can be replicated across different temporal and geographical settings remains an open question. It would be constructive for the authors to provide additional data that demonstrate the replication of their sampling and sequencing experiments under varied conditions. This would address concerns regarding the specificity of the observed patterns to the initial study's parameters.

      We appreciate the reviewer’s comment, as we are currently building on this analysis with an American dataset with different types of data available than were used in this study. We will add a discussion of this. We will also be adding a sensitivity analysis to the manuscript simulating a different sampling approach, which should also be informative to this question.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Weaknesses:

      The authors need to discuss their study in the context of previous papers that have shown an important role for E. tarda flagellin in inflammasome activation and test whether flagellin and/or E. tarda T3SSs needle or rod can activate NLRC4.

      We will add discussions on E. tarda flagellin and examine whether E. tarda flagellin or T3SS needle/rod can activate NLRC4.

      The authors show that eseB and its homologs can activate NLRC4, but there are also other translocon proteins, such as YopB or PopB, that are very different and share little homology with eseB. It would be nice to include a section comparing the different type 3 secretion systems. Are there two different families of T3SSs, those that feature translocon components that are recognized by NAIP-NLRC4 and those that cannot be recognized?

      The reviewer raises an interesting question. We will explore this question and provide relevant discussions/hypothesis in the revised manuscript.

      Reviewer #2 (Public Review):

      Weaknesses:

      The functional assessment of EseB homologues is limited to inflammasome activation at the protein level but does not include the effects on cell viability as shown for E. tarda EseB. Confirmation that EseB homologues have similar effects on cell death would strengthen this portion of the manuscript.

      According to the reviewer’s suggestion, we plan to examine the effects of representative EseB homologs on cell death.

    1. Author response:

      The following is the authors’ response to the current reviews.

      The concerns raised during the review have been incorporated into the discussion of the results, and the need for further research is acknowledged in the paper. This is not possible in the present study, as the clinical project has been completed and further patients cannot be enrolled without starting a new project. We are confident that the results are scientifically valid and that the methodology was scientifically sound and up to date. They were obtained on a dataset that was obviously large enough to allow 20% of it to be set aside and a machine-learned classifier to be trained on the remaining 80%, which then assigned samples to neuropathy with an accuracy better than guessing.

      Furthermore, our results are at least tentatively replicated in a completely independent data set from another patient cohort. The strengths and limitations of the study design, in particular the latter, are discussed in the necessary depth. In summary, the machine-learned results provided major hits on one side and probably unimportant lipids on the other side of the variable importance scale. Both could be verified in vitro. We are therefore confident that we have contributed to the advancement of knowledge about cancer therapy-associated neuropathy and look forward to further developments in this area.


      The following is the authors’ response to the original reviews.

      Weaknesses Reviewer 1: 

      There are a number of weaknesses in the study. The small sample size is a significant limitation of the study. Out of 31 patients, only 17 patients were reported to develop neuropathy, with significant neuropathy (grade 2/3) in only 5 patients. The authors acknowledge this limitation in the results and discussion sections of the manuscript, but it limits the interpretation of the results. Also acknowledged is the limited method used to assess neuropathy. 

      We agree with the reviewer that the cohort size and assessment of neuropathy are limitations of our study, as we already described in the corresponding section of the manuscript. However, occurrence and grade of the neuropathy are in line with results reported from previous studies. From these studies, the expected occurrence of neuropathy with our therapeutic regimen is around 50-70% (54.9% in our cohort), and most patients (80-90%) are expected to experience Grade 1 neuropathy after 12 weeks (1-3). In these studies, neuropathy is assessed by using questionnaires or by grading via NCI-CTCAE, as in our study. In summary, assessment and occurrence of neuropathy in our reported cohort are in line with previous reports.

      Potentially due to this small number of patients with neuropathy, the machine learning algorithms could not distinguish between samples with and without neuropathy. Only selected univariate analyses identified differences in lipid profiles potentially related to neuropathy.  

      The data analysis consistently followed a "mixture of experts" approach, as this seems to be the most successful way to deal with omics data. We have elaborated on this in the Methods section, including several supporting references. Regarding the quoted sentence from the results section, after rereading it, we realized that it was somewhat awkwardly worded. What we mean is now better worded in the results section, namely “Although the three algorithms detected neuropathy in new cases, unseen during training, at balanced accuracy of up to 0.75, while only the guess level of 0.5 was achieved when using permuted data for training, the 95% CI of the performance measures was not separated from guess level”. Therefore, multivariate feature selection was not considered a valid approach, since it requires that the algorithms from which the feature importance is read can successfully perform their task of class assignment (4). Therefore, univariate methods (Cohen's d, FPR, FWE) were preferred, as well as a direct hypothesis transfer of the top hits from the abovementioned day1/2 assessments to neuropathy. Classical statistics consisting of direct group comparisons using Kruskal-Wallis tests (5) were performed.” 
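
      The balanced accuracy and the guess level of 0.5 mentioned above can be illustrated with a small sketch. It is not the authors' analysis code: the function name and the toy labels are hypothetical, and balanced accuracy is taken here as the mean of the per-class recalls.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical labels: 1 = neuropathy, 0 = no neuropathy.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
always_majority = [0] * len(y_true)      # uninformative classifier
informative = [1, 1, 0, 0, 0, 0, 0, 1]   # one error in each class

print(balanced_accuracy(y_true, always_majority))
print(balanced_accuracy(y_true, informative))
```

      A classifier that ignores the input scores at the guess level of 0.5 on this measure regardless of class imbalance, which is why it serves as the reference against which the confidence interval is judged.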

      It was our approach to investigate the data set in an unbiased manner by different machine learning algorithms and select those lipids that the majority of the algorithms considered important for distinguishing the patient groups (majority voting). This way, the inconsistencies and limitations of a single evaluation method, such as regression analysis, that occur in some datasets, can be mitigated. 
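
      The majority-voting idea can be sketched as below. This is only an illustration: the method names and lipid identifiers are hypothetical placeholders, and the rule "selected by more than half of the methods" is an assumed reading of "majority".

```python
from collections import Counter


def majority_vote(selections):
    """Keep the features selected by more than half of the methods."""
    counts = Counter(f for sel in selections for f in set(sel))
    needed = len(selections) / 2
    return sorted(f for f, n in counts.items() if n > needed)


# Hypothetical selections from three feature-selection methods.
selections = {
    "method_a": ["lipid_1", "lipid_2", "lipid_3"],
    "method_b": ["lipid_2", "lipid_3"],
    "method_c": ["lipid_3", "lipid_4"],
}
print(majority_vote(list(selections.values())))
```

      Features picked by only one method are dropped, so an idiosyncratic result from a single algorithm does not drive the final list.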

      Three sphingolipid mediators including SA1P differed between patients with and without neuropathy at the end of treatment. These sphingolipids were elevated at the end of treatment in the cohort with neuropathy, relative to those without neuropathy. However, across all samples from pre to post-paclitaxel treatment, there was a significant reduction in SA1P levels. It is unclear from the data presented what the underlying mechanism for this result would be. 

      We agree with the reviewer that our study does not identify the mechanism by which paclitaxel treatment alters sphingolipid concentrations in the plasma of patients. It has been reported before that paclitaxel may increase expression and activity of serine palmitoyltransferase (SPT), which is the crucial enzyme and rate-limiting step in the de novo synthesis of sphingolipids. This may be associated with a shift towards increased synthesis of 1-deoxysphingolipids and a decrease of “classical” sphingolipids (6) and may explain the general reduction of SA1P and other sphingolipid levels after paclitaxel treatment in our study. 

      It is also conceivable that paclitaxel reduces the release of sphingolipids into the plasma. Paclitaxel is a microtubule stabilizing agent (7) that may interfere with intracellular transport processes and release of paracrine mediators. 

      The mechanistic details of paclitaxel involvement in sphingolipid metabolism or transport are highly interesting but identifying them is beyond the scope of our manuscript.

      If elevated SA1P is associated with neuropathy development, it would be expected to increase in those who develop neuropathy from pre to post-treatment time points. 

      There is a general trend of reduced plasma SA1P concentrations following paclitaxel treatment. Nevertheless, patients experiencing neuropathy exhibit significantly elevated SA1P levels post-treatment. 

      It has been shown before in a preclinical study that paclitaxel-induced neuropathic pain requires activation of the S1P1 receptor (8). Moreover, a meta-analysis of genome-wide association studies (GWAS) from two clinical cohorts identified multiple regulatory elements and increased activity of S1PR1 associated with paclitaxel-induced neuropathy (9). These data imply that enhanced S1P receptor activity and signaling are key drivers of paclitaxel-induced neuropathy. It seems that increased levels of the sphingolipid ligands, in combination with enhanced expression and activity of S1P receptors, can potentiate paclitaxel-induced neuropathy in patients. This explains why even decreased SA1P concentrations after paclitaxel treatment can still enhance neuropathy via the S1PR-TRPV1 axis in sensory neurons.

      We added this paragraph to the discussions section of our manuscript.

      Primary sensory neuron cultures were used to examine the effects of SA1P application.

      SA1P application produced calcium transients in a small proportion of sensory neurons. It is not clear how this experimental model assists in validating the role of SA1P in neuropathy development as there is no assessment of sensory neuron damage or other hallmarks of peripheral neuropathy. These results demonstrate that some sensory neurons respond to SA1P and that this activity is linked to TRPV1 receptors. However, further studies will be required to determine if this is mechanistically related to neuropathy.

      As we detected elevated levels of SA1P in the plasma of PIPN patients, we can assume higher concentrations in the vicinity of sensory neurons. These neurons are the main drivers of neuropathy and neuropathic pain, and their activity is strongly affected by paclitaxel (10-15). TRPV1 also shows altered activity patterns in response to paclitaxel treatment (16). Because of its relevance for nociception and pathological pain, TRPV1 activity is a suitable and representative readout for pathological pain states in peripheral sensory neurons (17, 18), which is why we investigated it.

      We would like to point out the potency of SA1P to increase capsaicin-induced calcium transients in sensory neurons at submicromolar concentrations.

      We also agree with the reviewer that further studies need to investigate the underlying mechanisms in more detail. We added this sentence to the final paragraph in the discussion section of our manuscript.

      Weaknesses Reviewer 2: 

      The article is poorly written, hindering a clear understanding of core results. While the study's goals are apparent, the interpretation of sphingolipids, particularly SA1P, as key mediators of paclitaxel-induced neuropathy lacks robust evidence. 

      We agree that the relevance of SA1P as a key mediator of paclitaxel-induced neuropathy might be overstated and changed the wording throughout the manuscript accordingly. However, we would like to point out the potency of this lipid to increase capsaicin-induced calcium transients in sensory neurons at submicromolar concentrations.

      Also, the lipid signature in the plasma of PIPN patients shows a unique pattern and sphingolipids are the group that showed the strongest alterations when comparing the patient groups. We also measured eicosanoids, such as prostaglandins, linoleic acid metabolites, endocannabinoids and other lipid groups that have previously been associated with influences on pain perception or nociceptor sensitization. However, none of these lipids showed significant differences in their concentrations in patient plasma. This is why we consider sphingolipids as contributors to or markers of paclitaxel-induced neuropathy in patients.

      We also revised the entire article to improve its clarity.

      The introduction fails to establish the significance of general neuropathy or peripheral neuropathy in anticancer drug-treated patients, and crucial details, such as the percentage of patients developing general neuropathy or peripheral neuropathy, are omitted. This omission is particularly relevant given that only around 50% of patients developed neuropathy in this study, primarily of mild Grade 1 severity with negligible symptoms, contradicting the study's assertion of CIPN as a significant side effect. 

      As we already described in the introduction, CIPN is a serious dose- and therapy-limiting side effect, which affects up to 80% of treated patients. This depends on the dose and combination of chemotherapeutic agents. For paclitaxel, therapeutic doses range from 80 to 225 mg/m². As CIPN symptoms are dose-dependent, PIPN is more frequent among patients receiving a high paclitaxel dose than among those receiving a low dose.

      In our study, we mainly used low-dose paclitaxel, because this therapeutic regimen is the most widely used paclitaxel monotherapy. From previous studies, the expected occurrence of neuropathy with this therapeutic regimen is around 50-70%, and most patients (80-90%) are expected to experience Grade 1 neuropathy after 12 weeks (1-3).

      Our results are within the range reported by these studies (54.9% of patients with neuropathy). Also, as we highlight in Table S1, the neuropathy symptoms persist in most cases for several years after chemotherapy and affect the quality of life of these patients, which makes them far from negligible.

      We added some more information concerning PIPN in the introduction section in which we emphasize the clinical problem.

      The lack of clarity in distinguishing results obtained by lipidomics using machine learning methods and conventional methods adds to the confusion. The poorly written results section fails to specify SA1P's downregulation or upregulation, and the process of narrowing down to sphingolipids and SA1P is inadequately explained. 

      We have tried to keep the machine learning part in the main manuscript short and moved major parts of it to a supplement. However, as this has been claimed to have led to a lack of clarity, we have expanded the description of the data analysis and added extensive explanations and supporting references for the mixed expert approach that was used throughout the analysis. We hope this is now clear.

      Integrating a significant portion of the discussion section into the results section could enhance clarity. An explanation of the utility of machine learning in classifying patient groups over conventional methods and the citation of original research articles, rather than relying on review articles, may also add clarity to the usefulness of the study. 

      As suggested by the reviewer, we moved the relevant parts from the discussion to the results section in the revised version of our manuscript.

      Reviewer #1 (Recommendations For The Authors): 

      Figure 2 should be better explained or removed. In its current form, it does not add to the interpretation of the manuscript.  

      As mentioned above, we have expanded the description of the ESOM/U-matrix method in the Methods section and rewritten the figure legend. In addition, we have annotated the U-matrix in the figure. The method has been reported extensively in the computer science and biomedical literature, and a more detailed description in the referenced papers would go beyond the current focus on lipidomics. However, we believe that this discussion is sufficiently detailed for the readers of this report: "… a second unsupervised approach was used to verify the agreement between the lipidomics data structure and the prior classification, implemented as self-organizing maps (SOM) of artificial neurons (19). In the special form of an “emergent” SOM (ESOM (20)), the present map consisted of 4,000 neurons arranged on a two-dimensional toroidal grid with 50 rows and 80 columns (21, 22). ESOM was used because it has been repeatedly shown to correctly detect subgroup structures in biomedical data sets comparable to the present one (20, 22, 23). The core principle of SOM learning is to adjust the weights of neurons based on their proximity to input data points. In this process, the best matching unit (BMU) is identified as the neuron closest to a given data point. The adaptation of the weights is determined by a learning rate (η) and a neighborhood function (h), both of which gradually decrease during the learning process. Finally, the groups are projected onto separate regions of the map. On top of the trained ESOM, the distance structure in the high-dimensional feature space was visualized in the form of a so-called U-matrix (24) which is the canonical tool for displaying the distance structures of input data on ESOM (21). 

      The visual presentation facilitates data group separation by displaying the distances between BMUs in high-dimensional space in a color-coding that uses a geographical map analogy, where large "heights" represent large distances in feature space, while low "valleys" represent data subsets that are similar. "Mountain ranges" with "snow-covered" heights visually separate the clusters in the data. Further details about ESOM can be found in (24)."
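
      The learning rule and U-matrix described in the quoted passage can be illustrated with a short Python sketch. It is a generic online SOM on a toroidal grid, not the Umatrix toolbox code used in the study; the function names, the linear decay schedules, the Gaussian neighborhood, and all default parameter values are assumptions chosen for illustration.

```python
import numpy as np


def toroidal_grid_distance(rows, cols, bmu):
    """Grid distance from every neuron to the BMU, with wrap-around edges."""
    dr = np.abs(np.arange(rows)[:, None] - bmu[0])
    dc = np.abs(np.arange(cols)[None, :] - bmu[1])
    dr = np.minimum(dr, rows - dr)  # wrap vertically
    dc = np.minimum(dc, cols - dc)  # wrap horizontally
    return np.sqrt(dr ** 2 + dc ** 2)


def find_bmu(weights, x):
    """Best matching unit: the neuron whose weight vector is closest to x."""
    dist = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(dist), dist.shape)


def train_som(data, rows=50, cols=80, epochs=20,
              eta0=0.5, eta_end=0.01, radius0=24.0, radius_end=1.0, seed=0):
    """Online SOM training; eta and the neighborhood radius shrink linearly.

    All defaults are illustrative assumptions, not the study's settings.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = data.shape
    weights = rng.normal(size=(rows, cols, n_features))
    for epoch in range(epochs):
        frac = epoch / max(epochs - 1, 1)
        eta = eta0 + frac * (eta_end - eta0)
        radius = radius0 + frac * (radius_end - radius0)
        for x in data[rng.permutation(n_samples)]:
            bmu = find_bmu(weights, x)
            grid_dist = toroidal_grid_distance(rows, cols, bmu)
            # Gaussian neighborhood function h centered on the BMU
            h = np.exp(-(grid_dist ** 2) / (2.0 * radius ** 2))
            weights += eta * h[:, :, None] * (x - weights)
    return weights


def u_matrix(weights):
    """Mean feature-space distance of each neuron to its 4 grid neighbors."""
    heights = np.zeros(weights.shape[:2])
    for axis in (0, 1):
        for shift in (1, -1):
            neighbor = np.roll(weights, shift, axis=axis)  # toroidal neighbor
            heights += np.linalg.norm(weights - neighbor, axis=2)
    return heights / 4.0
```

      In this sketch, large U-matrix values correspond to the "mountain ranges" that separate clusters and small values to the "valleys" of similar data. A call such as train_som(X) followed by u_matrix(weights) would yield a height map that can be plotted with a geographical color scale.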

      The second patient cohort is only included in the discussion - with cohort details in the supplementary material and figures included in the main text. Perhaps these data should be removed entirely. The findings are described as trends and not statistically significant and multiple issues with this second cohort are mentioned in the discussion. 

      We agree with the reviewer that including the second patient cohort in the discussion is inadequate. Of course, there are differences between the patient cohorts that do not allow direct comparison and that are highlighted in the section on limitations of the study. However, we still think it is interesting and relevant to show these data, because we used our algorithms trained on the first patient cohort to analyze the second cohort. And these data support the main results. 

      We therefore moved the entire paragraph to the results section of our manuscript to improve coherence. The passage was introduced with the subheading: “Support of the main results in an independent second patient cohort”.

      The title does not reflect the content of the paper and should be changed to better reflect the content and its significance. 

      We changed the title to “Machine learning and biological validation identify sphingolipids as potential mediators of paclitaxel-induced neuropathy in cancer patients” to avoid overstating the results, as suggested by the reviewer.

      Further, the discussion should be modified to avoid overstating the results. 

      As the reviewer suggests, we changed the wording to avoid overstating the results. 

      Reviewer #2 (Recommendations For The Authors): 

      Please address the absence of clear neuropathy in the majority of patients after treatment with paclitaxel in your discussion. 

      As stated above, the occurrence and grade of the neuropathy are in line with the results from previous studies. From these studies, the expected occurrence of neuropathy with our therapeutic regimen is around 50-70% (the variability is due to differences in the assessment methods), and most patients (80-90%) are expected to experience Grade 1 neuropathy after 12 weeks (1-3).

      We added this information in the discussion section of the revised manuscript.

      Line 65: Kindly replace review articles with original research articles for proper citation. 

      We replaced the review articles with original publications, focusing on clinical observations. We added the following publications: Jensen et al., Front Neurosci 2020; Chen et al., Neurobiol Aging 2018; Igarashi et al., J Alzheimers Dis. 2011; Kim et al., Oncotarget 2017 as references 17-20 in the revised version of our manuscript.

      Line 260: The mention of SA1P is introduced here without prior reference (do not use words like "again", or "see above", if it is not previously mentioned). Adjust the text for coherence.

      We agree with the reviewer that the introduction of SA1P in this passage is incoherent. We replaced the sentence in line 260 with:

      “The small set of lipid mediators emerging from all three methods as informative for neuropathy included the sphingolipid sphinganine-1-phosphate (SA1P), also known as dihydrosphingosine-1-phosphate (DH-S1P)…”

      Lines 301-315: Consider relocating several lines from this section to the results section for improved clarity. 

      We moved lines 309-312, which explain the algorithm selection and validation success, to the corresponding results section (Lipid mediators informative for assigning post-paclitaxel therapy samples to neuropathy).

      Lines 382-396: Move this content to the results section to enhance the organization and coherence of the manuscript. 

      We moved the entire paragraph to the results section of our manuscript to improve coherence. The passage was introduced with the subheading: “Support of the main results in an independent second patient cohort”.

      References

      (1) Barginear M, Dueck AC, Allred JB, Bunnell C, Cohen HJ, Freedman RA, et al. Age and the Risk of Paclitaxel-Induced Neuropathy in Women with Early-Stage Breast Cancer (Alliance A151411): Results from 1,881 Patients from Cancer and Leukemia Group B (CALGB) 40101. Oncologist. 2019;24(5):617-23.

      (2) Mauri D, Kamposioras K, Tsali L, Bristianou M, Valachis A, Karathanasi I, et al. Overall survival benefit for weekly vs. three-weekly taxanes regimens in advanced breast cancer: A meta-analysis. Cancer Treat Rev. 2010;36(1):69-74.

      (3) Budd GT, Barlow WE, Moore HC, Hobday TJ, Stewart JA, Isaacs C, et al. SWOG S0221: a phase III trial comparing chemotherapy schedules in high-risk early-stage breast cancer. J Clin Oncol. 2015;33(1):58-64.

      (4) Lötsch J, and Ultsch A. Pitfalls of Using Multinomial Regression Analysis to Identify Class-Structure-Relevant Variables in Biomedical Data Sets: Why a Mixture of Experts (MOE) Approach Is Better. BioMedInformatics. 2023;3(4):869-84.

      (5) Kruskal WH, and Wallis WA. Use of Ranks in One-Criterion Variance Analysis. J Am Stat Assoc. 1952;47(260):583-621.

      (6) Kramer R, Bielawski J, Kistner-Griffin E, Othman A, Alecu I, Ernst D, et al. Neurotoxic 1-deoxysphingolipids and paclitaxel-induced peripheral neuropathy. FASEB J. 2015;29(11):4461-72.

      (7) Field JJ, Diaz JF, and Miller JH. The binding sites of microtubule-stabilizing agents. Chem Biol. 2013;20(3):301-15.

      (8) Janes K, Little JW, Li C, Bryant L, Chen C, Chen Z, et al. The development and maintenance of paclitaxel-induced neuropathic pain require activation of the sphingosine 1-phosphate receptor subtype 1. J Biol Chem. 2014;289(30):21082-97.

      (9) Chua KC, Xiong C, Ho C, Mushiroda T, Jiang C, Mulkey F, et al. Genomewide Meta-Analysis Validates a Role for S1PR1 in Microtubule Targeting Agent-Induced Sensory Peripheral Neuropathy. Clin Pharmacol Ther. 2020;108(3):625-34.

      (10) Kawakami K, Chiba T, Katagiri N, Saduka M, Abe K, Utsunomiya I, et al. Paclitaxel increases high voltage-dependent calcium channel current in dorsal root ganglion neurons of the rat. J Pharmacol Sci. 2012;120(3):187-95.

      (11) Pittman SK, Gracias NG, Vasko MR, and Fehrenbacher JC. Paclitaxel alters the evoked release of calcitonin gene-related peptide from rat sensory neurons in culture. Exp Neurol. 2013.

      (12) Luo H, Liu HZ, Zhang WW, Matsuda M, Lv N, Chen G, et al. Interleukin-17 Regulates Neuron-Glial Communications, Synaptic Transmission, and Neuropathic Pain after Chemotherapy. Cell Rep. 2019;29(8):2384-97 e5.

      (13) Pease-Raissi SE, Pazyra-Murphy MF, Li Y, Wachter F, Fukuda Y, Fenstermacher SJ, et al. Paclitaxel Reduces Axonal Bclw to Initiate IP3R1-Dependent Axon Degeneration. Neuron. 2017;96(2):373-86 e6.

      (14) Duggett NA, Griffiths LA, and Flatters SJL. Paclitaxel-induced painful neuropathy is associated with changes in mitochondrial bioenergetics, glycolysis, and an energy deficit in dorsal root ganglia neurons. Pain. 2017.

      (15) Li Y, Adamek P, Zhang H, Tatsui CE, Rhines LD, Mrozkova P, et al. The Cancer Chemotherapeutic Paclitaxel Increases Human and Rodent Sensory Neuron Responses to TRPV1 by Activation of TLR4. J Neurosci. 2015;35(39):13487-500.

      (16) Hara T, Chiba T, Abe K, Makabe A, Ikeno S, Kawakami K, et al. Effect of paclitaxel on transient receptor potential vanilloid 1 in rat dorsal root ganglion. Pain. 2013;154(6):882-9.

      (17) Jardin I, Lopez JJ, Diez R, Sanchez-Collado J, Cantonero C, Albarran L, et al. TRPs in Pain Sensation. Front Physiol. 2017;8:392.

      (18) Julius D. TRP Channels and Pain. Annu Rev Cell Dev Biol. 2013;29:355-84.

      (19) Kohonen T. Self-Organized Formation of Topologically Correct Feature Maps. Biol Cybern. 1982;43(1):59-69.

      (20) Lötsch J, Lerch F, Djaldetti R, Tegeder I, and Ultsch A. Identification of disease-distinct complex biomarker patterns by means of unsupervised machine-learning using an interactive R toolbox (Umatrix). Big Data Analytics. 2018;3(1):5.

      (21) Ultsch A. 2003.

      (22) Lötsch J, Geisslinger G, Heinemann S, Lerch F, Oertel BG, and Ultsch A. Quantitative sensory testing response patterns to capsaicin- and ultraviolet-B-induced local skin hypersensitization in healthy subjects: a machine-learned analysis. Pain. 2018;159(1):11-24.

      (23) Lötsch J, Thrun M, Lerch F, Brunkhorst R, Schiffmann S, Thomas D, et al. Machine-Learned Data Structures of Lipid Marker Serum Concentrations in Multiple Sclerosis Patients Differ from Those in Healthy Subjects. Int J Mol Sci. 2017;18(6).

      (24) Lötsch J, and Ultsch A. Cham: Springer International Publishing; 2014:249-57.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Wu et al. introduce a novel approach to reactivate the Müller glia cell cycle in the mouse retina by simultaneously reducing p27Kip1 and increasing cyclin D1 using a single AAV vector. The approach effectively promotes Müller glia proliferation and reprogramming without disrupting retinal structure or function. Interestingly, reactivation of the Müller glia cell cycle downregulates the IFN pathway, which may contribute to the induced retinal regeneration. The results presented in this manuscript may offer a promising approach for developing Müller glia cell-mediated regenerative therapies for retinal diseases.

      Strengths:

      The data are convincing and supported by appropriate, validated methodology. These results are both technically and scientifically exciting and are likely to appeal to retinal specialists and neuroscientists in general.

      Weaknesses:

      There are some data gaps that need to be addressed.

      (1) Please label the time points of AAV injection, EdU labeling, and harvest in Figure 1B.

      We thank the reviewer for highlighting the lack of clarity in our experimental design. We will label all experiment timelines in the figures where appropriate in the revised version.

      (2) What fraction of Müller cells were transduced by AAV under the experimental conditions?

      We apologize for not clearly conveying the transduction efficiency. The retinal region adjacent to the injection site, typically near the central retina, exhibits a transduction efficiency of nearly 100%. In contrast, the peripheral retina shows a lower transduction efficiency compared to the central region. We will include the quantification of AAV transduction efficiency in the revised manuscript.

      The quantification of EdU+ MG or other markers was conducted in the area with the highest transduction efficiency.

      (3) It seems unusually rapid for MG proliferation to begin as early as the third day after CCA injection. Can the authors provide evidence for cyclin D1 overexpression and p27 Kip1 knockdown three days after CCA injection?

      In our pilot study, we tested the onset time of GFP expression from AAV-GFAP-GFP following intravitreal injection. We observed GFP expression in MG as early as two days post-infection. These findings will be included in the revised manuscript. Additionally, we plan to perform qPCR or Western blot analysis to confirm cyclin D1 overexpression and p27kip1 knockdown at the onset of Müller glia proliferation, which will also be included in the revised manuscript.

      (4) The authors reported that MG proliferation largely ceased two weeks after CCA treatment. While this is an interesting finding, the explanation that it might be due to the dilution of AAV episomal genome copies in the dividing cells seems far-fetched.

      We believe that the lack of durability in high Cyclin D1 and low p27kip1 levels in MG contributes to the cessation of their proliferation. A potential reason for the loss of high Cyclin D1 overexpression and p27kip1 knockdown during MG proliferation could be the dilution of the AAV episomal genome. However, testing this hypothesis is challenging. Instead, we plan to provide direct evidence in the revised manuscript by examining the levels of Cyclin D1 and p27kip1 in the retina treated with CCA before and after the peak of MG proliferation.

      Reviewer #2 (Public Review):

      This manuscript by Wu, Liao et al. reports that simultaneous knockdown of P27Kip1 with overexpression of Cyclin D can stimulate Müller glia to re-enter the cell cycle in the mouse retina. There is intense interest in reprogramming mammalian Müller glia into a source for neurogenic progenitors, in the hopes that these cells could be a source for neuronal replacement in neurodegenerative diseases. Previous work in the field has shown ways in which mouse Müller glia can be neurogenically reprogrammed and these studies have shown cell cycle re-entry prior to neurogenesis. In other works, typically, the extent of glial proliferation is limited, and the authors of this study highlight the importance of stimulating large numbers of Müller glia to re-enter the cell cycle with the hopes they will differentiate into neurons. While the evidence for stimulating proliferation in this study is convincing, the evidence for neurogenesis in this study is not convincing or robust, suggesting that stimulating cell cycle re-entry may not be associated with increasing regeneration without another proneural stimulus.

      Below are concerns and suggestions.

      Intro:

      (1) The authors cite past studies showing "direct conversion" of MG into neurons. However, these studies (PMID: 34686336; 36417510) show EdU+ MG-derived neurons suggesting cell cycle re-entry does occur in these strategies of proneural TF overexpression.

      We thank the reviewer for pointing this out. We will revise the statement to "MG neurogenesis," which encompasses both direct conversion and Müller glia proliferation followed by neuronal differentiation.

      (2) Multiple citations are incorrectly listed, using the authors' first name only (i.e. Yumi, et al; Levi, et al;). Studies are also incompletely referenced in the references.

      We apologize for the mistakes in the references. We will fix them in the revised version.

      Figure 1:

      (3) When are these experiments ending? On Figure 1B it says "analysis" on the end of the paradigm without an actual day associated with this. This is the case for many later figures too. The authors should update the paradigms to accurately reflect experimental end points.

      We thank the reviewer for highlighting the lack of clarity in our experimental design. We will label all experiment timelines in the figures where appropriate in the revised version.

      (4) Are there better representative pictures between P27kd and CyclinD OE, the EdU+ counts say there is a 3 fold increase between Figure 1D&E, however the pictures do not reflect this. In fact, most of the Edu+ cells in Figure 1E don't seem to be Sox9+ MG but rather horizontally oriented nuclei in the OPL that are likely microglia.

      We thank the reviewer for pointing this out. We will replace the Cyclin D1 image with a more representative one.

      (5) Is the infection efficacy of these viruses different between different combinations (i.e. CyclinD OE vs. P27kd vs. control vs. CCA combo)? As the counts are shown in Figure 1G only Sox9+/Edu+ cells are shown not divided by virus efficacy. If these are absolute counts blind to where the virus is and how many cells the virus hits, if the virus efficacy varies in efficiency this could drive absolute differences that aren't actually biological.

      Because the AAV-GFAP-Cyclin D1 and AAV-GFAP-Cyclin D1-p27kip1 shRNA viruses do not carry a fluorescent reporter gene, we cannot easily measure viral efficacy in the same experiment. We believe that variations in viral efficacy cannot account for the significant differences in MG proliferation for two reasons: 1) We injected the same titer for all three viruses, and 2) Viral infection efficacy is very high, approaching 100% in the central retina. Nonetheless, to rule out the possibility that the differences in MG proliferation among the Cyclin D overexpression, p27kip1 knockdown, and CCA groups are due to variations in viral efficacy, we will include the p27kip1 knockdown and Cyclin D1 overexpression efficiencies for all four groups using qPCR and/or Western blot analysis in the revised manuscript.

      (6) According to the Jax laboratories, mice aren't considered aged until they are over 18months old. While it is interesting that CCA treatment does not seem to lose efficacy over maturation I would rephrase the findings as the experiment does not test this virus in aged retinas.

      We thank the reviewer for bringing this to our attention. We will avoid using “aged mice” in our revised manuscript.

      (7) Supplemental Figure 2c-d. These viruses do not hit 100% of MG, however 100% of the P27Kip staining is gone in the P27sh1 treatment, even the P27+ cell in the GCL that is likely an astrocyte has no staining in the shRNA 1 picture. Why is this?

      For Supplementary Figure 2c-d, we focused on the central area where knockdown efficiency was high, approaching 100%. We will replace this image with one that includes both high and low Müller glia transduction efficiency regions, clearly demonstrating the complete loss of p27kip1 staining in the area of high transduction efficiency.

      Figure 2

      (8) Would you expect cells to go through two rounds of cell cycle in such a short time? The treatment of giving Edu then BrdU 24 hours later would have to catch a cell going through two rounds of division in a very short amount of time. Again the end point should be added graphically to this figure.

      We thank the reviewer for raising this important point. While the typical cell cycle time for human cells is approximately 24 hours, we hypothesized that 24 hours would be the most likely timepoint to capture cells continuously progressing through the cell cycle. However, we acknowledge that we cannot exclude the possibility of some cells entering a second cell cycle at much later timepoints.

      In the revised manuscript, we will carefully qualify our conclusion to state that the majority of MG do not immediately undergo another cell division, rather than making a definitive statement. This more cautious phrasing will better reflect the limitations of the 24-hour timepoint and allow for the potential of a small subset of cells proceeding through additional rounds of division at later stages.

      Figure 3

      (9) I am confused by the mixing of ratios of viruses to indicate infection success. I know mixtures of viruses containing CCA or control GFP or a control LacZ was injected. Was the idea to probe for GFP or LacZ in the single cell data to see which cells were infected but not treated? This is not shown anywhere?

      The virus infection was not uniform across the entire retina. To mark the infection hotspots, we added 10% GFP virus to the mixture. Regions of the retina with low infection efficiency were removed by dissection and excluded from the scRNA-seq analysis. We apologize for not clearly explaining this methodological detail in the original text, and will update the Methods section accordingly.

      (10) The majority of glia sorted from TdTomato are probably not infected with virus. Can you subset cells that were infected only for analysis? Otherwise it makes it very hard to make population judgements like Figure 3E-H if a large portion are basically WT glia.

      This question is related to the last one. Since the regions with high virus infection efficiency were selectively dissected and isolated for analysis, the percentage of CCA-infected MG should constitute the majority in the scRNA-seq data.

      (11) Figure 3C you can see Rho is expressed everywhere which is common in studies like this because the ambient RNA is so high. This makes it very hard to talk about "Rod-like" MG as this is probably an artifact from the technique. Most all scRNA-seq studies from MG-reprogramming have shown clusters of "rods" with MG hybrid gene expression and these had in the past just been considered an artifact.

      We agree that the low levels of Rho in other MG clusters (such as quiescent, reactivated, and proliferating MG) are likely due to RNA contamination. However, the level of Rho in the rod-like MG is significantly higher than in the other clusters, indicating that this is unlikely to be solely due to contamination.

      As shown in Supplementary Figure 7A-C, a cluster of MG-rod hybrid cells (cluster C4) was present in all three experimental groups at similar ratios, and this hybrid cluster was excluded from further analysis. In contrast, the rod-like Müller glia (cluster C3) were predominantly found in the CCA and CCANT groups, suggesting a genuine response to CCA treatment.

      Furthermore, we will conduct Rho and Gnat1 RNA in situ hybridization on the dissociated retinal cells to further support the conclusion that rod-specific genes are upregulated in a subset of MG in the revised manuscript.

      (12) It is mentioned the "glial" signature is downregulated in response to CCA treatment. Where is this shown convincingly? Figure H has a feature plot of Glul, which is not clear it is changed between treatments. Otherwise MG genes are shown as a function of cluster not treatment.

      We will add box plots of several MG-specific genes to better illustrate the downregulation of the glial signature in the relevant cell cluster in the revised manuscript.

      Figure 4

      (13) The authors should be commended for being very careful in their interpretations. They employ the proper controls (Er-Cre lineage tracing/EdU-pulse chasing/scRNA-seq omics) and were very careful to attempt to see MG-derived rods. This makes the conclusion from the FISH perplexing. The few puncta dots of Rho and GNAT in MG are not convincing to this reviewer, Rho and GNAT dots are dense everywhere throughout the ONL and if you drew any random circle in the ONL it would be full of dots. The rigor of these counts also comes into question because some dots are picked up in MG in the INL even in the control case. This is confusing because baseline healthy MG do not express RNA-transcripts of these Rod genes so what is this picking up? Taken together, the conclusion that there are Rod-like MG are based off scRNA-seq data (which is likely ambient contamination) and these FISH images. I don't think this data warrants the conclusion that MG upregulate Rod genes in response to CCA.

      We performed RNA in situ hybridization on retinal sections because we aimed to correlate cell localization with rod gene expression. We understand the reviewer’s concern that the punctate signals of Rho and GNAT1 in the ONL MG may actually originate from neighboring rods. In the revised manuscript, we will conduct RNAscope on dissociated retinal cells to avoid this issue.

      Figure 5

      (14) Similar point to above but this Glul probe seems odd, why is it throughout the ONL but completely dark through the IPL, this should also be in astrocytes can you see it in the GCL? These retinas look cropped at the INL where below is completely black. The whole retinal section should be shown. Antibodies exist to GS that work in mouse along with many other MG genes, IHC or western blots could be done to better serve this point.

      Indeed, the GCL was cropped out in Figure 5 A-B. We have other images with all retinal layers, which we will use in the revised manuscript. Additionally, we will perform the GS antibody staining to demonstrate partial MG dedifferentiation following CCA treatment.

      Figure 6

      (15) Figure 6D is not a co-labeled OTX2+/ TdTomato+ cell, Otx2 will fill out the whole nucleus as can be seen with examples from other MG-reprogramming papers in the field (Hoang, et al. 2020; Todd, et al. 2020; Palazzo, et al. 2022). You can clearly see in the example in Figure 6D the nucleus extending way beyond Otx2 expression as it is probably overlapping in space. Other examples should be shown, however, considering less than 1% of cells were putatively Otx2+, the safer interpretation is that these cells are not differentiating into neurons. At least 99.5% are not.

      We have additional examples of Otx2+ Tdt+ EdU+ cells, which suggest that MG neurogenesis to Otx2+ cells does occur, despite the low efficiency. We will include these images in the revised manuscript.

      (16) Same as above Figure 6I is not convincingly co-labeled HuC/D is an RNA-binding protein and unfortunately is not always the clearest stain but this looks like background haze in the INL overlapping. Other amacrine markers could be tested, but again due to the very low numbers, I think no neurogenesis is occurring.

      We have additional examples of HuC/D+ Tdt+ Edu+ cells, which we will show in the revised manuscript.

      (17) In the text the authors are accidentally referring to Figure 6 as Figure 7.

      We thank the reviewer for pointing out the mistake. We will correct the mistake in the revised manuscript.

      Figure 7

      (18) I like this figure and the concept that you can have additional MG proliferating without destroying the retina or compromising vision. This is reminiscent of the chick MG reprogramming studies in which MG proliferate in large numbers and often do not differentiate into neurons yet still persist de-laminated for long time points.

      General:

      (19) The title should be changed, as I don't believe there is any convincing evidence of regeneration of neurons. Understanding the barriers to MG cell-cycle re-entry is important, and I believe the authors did a good job in that respect; however, it is an oversell to report regeneration of neurons from these data.

      We thank the reviewer for the suggestion. We will consider changing the title in the revised manuscript.

      (20) This paper uses multiple mouse lines and it is often confusing when the text and figures switch between models. I think it would be helpful to readers if the mouse strain was added to graphical paradigms in each figure when a different mouse line is employed.

      We will label the mouse lines used in each experiment in the figures where appropriate.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      Shore et al. report important effects of a heterozygous mutation in the KCNT1 potassium channel on ion currents and firing behavior of excitatory and inhibitory neurons in the cortex of KCNT1-Y777H mice. The authors provide solid evidence of physiological differences between this heterozygous mutation and their previous work with homozygotes. The reviewers appreciated the inclusion of recordings in ex vivo slices and dissociated cortical neurons, as well as the additional evidence showing an increase in persistent sodium currents (INaP) in parvalbumin-positive interneurons in heterozygotes. However, they were unclear regarding the likelihood of the increased sodium influx through INaP channels increasing sodium-activated potassium currents in these neurons.

      Regarding the last sentence of the eLife assessment, we’ve added a new paragraph to the Discussion section of the paper to address this concern. Please see the response to comment 1B of Reviewer #1 below for more details. We feel that the question of whether an increase in INaP would further increase KCNT1 activity is a valid discussion point but not a limitation of the importance or rigor of the work itself.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript reports the effects of a heterozygous mutation in the KCNT1 potassium channels on the properties of ion currents and firing behavior of excitatory and inhibitory neurons in the cortex of mice expressing KCNT1-Y777H. In humans, this mutation as well as multiple other heterozygotic mutations produce very severe early-onset seizures and produce a major disruption of all intellectual function. In contrast, in mice, this heterozygous mutation appears to have no behavioral phenotype or any increased propensity to seizures. A relevant phenotype is, however, evident in mice with the homozygous mutation, and the authors have previously published the results of similar experiments with the homozygotes. As perhaps expected, the neuronal effects of the heterozygous mutation presented in this manuscript are generally similar but markedly smaller than the previously published findings on homozygotes. There are, however, some interesting differences, particularly on PV+ interneurons, which appear to be more excitable than wild type in the heterozygotes but less excitable in the homozygotes. This raises the interesting question, which has been explicitly discussed by the authors in the revised manuscript, as to whether the reported changes represent homeostatic events that suppress the seizure phenotype in the mouse heterozygotes or simply changes in excitability that do not reach the threshold for behavioral outcomes.

      Strengths and Weaknesses:

      (1) The authors find that the heterozygous mutation in PV+ interneurons increases their excitability, a result that is opposite from their previous observation in neurons with the corresponding homozygous mutation. They propose that this results from the selective upregulation of a persistent sodium current INaP in the PV+ interneurons. These observations are very interesting ones, and they raised some issues in the original submission:

      A) The protocol for measuring the INaP current could potentially lead to results that could be (mis)interpreted in different ways in different cells. First, neither K currents nor Ca currents are blocked in these experiments. Instead, TTX is applied to the cells relatively rapidly (within 1 second) and the ramp protocol is applied immediately thereafter. It is stated that, at this time, Na currents and INaP are fully blocked but that any effects on Na-activated K currents are minimal. In theory this would allow the pre- to post- difference current to represent a relatively uncontaminated INaP. This would, however, only work if activation of KNa currents following Na entry is very slow, taking many seconds. A good deal of literature has suggested that the kinetics of activation of KNa currents by Na influx vary substantially between cell types, such that single action potentials and single excitatory synaptic events rapidly evoke KNa currents in some cell types. This is, of course, much faster than the time of TTX application. Most importantly, the kinetics of KNa activation may be different in different neuronal types, which would lead to errors that could produce different estimates of INaP in PV+ interneurons vs other cell types.

      In their revised manuscript, the authors have provided good data demonstrating that, at least for the PV and SST neurons, loss of KNa currents after TTX application is slow relative to the time course of loss of INaP, justifying the use of this protocol for these neuronal types.

      B) As the authors recognize, INaP current provides a major source of cytoplasmic sodium ions for the activation of KNa channels. An expected outcome of increased INaP is, therefore, further activation of KNa currents, rather than a compensatory increase in an inward current that counteracts the increase in KNa currents, as is suggested in the discussion.

      The authors comment in the rebuttal that, despite the fact that sodium entry through INaP is known to activate KNa channels, an increase in INaP does not necessarily imply increased KNa current. This issue should be addressed directly somewhere in the text, perhaps most appropriately in the discussion.

      We’ve added the following new paragraph to the Discussion section of the manuscript to address this concern:

      “As the persistent sodium current has been shown to act as a source of cytoplasmic sodium ions for KCNT1 channel activation in some neuron types (Hage & Salkoff, 2012), one might expect that the compensatory increase in INaP in YH-HET PV neurons would further increase, rather than counteract, KNa currents. Unfortunately, there is insufficient information on the relative locations of the INaP and KCNT1 channels, as well as the kinetics of sodium transfer to KCNT1 channels, among cortical neuron subtypes, and even less is known in the context of KCNT1 GOF neurons; thus, it is difficult to predict how alterations in one of these currents may affect the other. One plausible reason that increased INaP would not alter KNa currents in YH-HET PV neurons is that the particular sodium channels that are responsible for the increased INaP are not located within close proximity to the KCNT1 channels. Moreover, homeostatic mechanisms that modify the length and/or location of the sodium channel-enriched axon initial segment (AIS) in neurons in response to altered excitability are well described (Grubb & Burrone, 2010; Kuba et al., 2010); thus, it is possible that in YH-HET PV neurons, the length or location of the AIS is altered, leading to uncoupling of the sodium channels that are responsible for the increased INaP to the KCNT1 channels. Future studies will aim to further investigate potential mechanisms of neuron-type-specific alterations in NaP and KNa currents downstream of KCNT1 GOF.”  

      C) The numerical simulations, in general, provide a very useful way to evaluate the significance of experimental findings. Nevertheless, while the in-silico modeling suggests that increases in INaP can increase firing rate in models of PV+ neurons, there is as yet insufficient information on the relative locations of the INaP channels and the kinetics of sodium transfer to KNa channels to evaluate the validity of this specific model.

      The authors have now put in all of the appropriate caveats on this very nicely in the revised manuscript.

      (2) The effects of the KCNT1 channel blocker VU170 on potassium currents are somewhat larger and different from those of TTX, suggesting that additional sources of sodium may contribute to activating KCNT1, as suggested by the authors. Because VU170 is, however, a novel pharmacological agent, it may be appropriate to make more careful statements on this. While the original published description of this compound reported no effect on a variety of other channels, there are many that were not tested, including Na and cation channels that are known to activate KCNT1, raising the possibility of off-target effects.

      In the revised version, the authors have added more to the manuscript on this issue and have added a very clear discussion of this to the text (in the discussion section).

      This is a very clear and thorough piece of work, and the authors are to be congratulated on this. My one remaining suggestion would be to make an explicit statement about whether increased sodium influx through INaP channels, which is thought to activate KNa channels, would be likely to increase KNa current in these neurons (see comment 1B).

      Please see response to comment 1B.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Shore et al. investigate the consequent changes in excitability and synaptic efficacy of diverse neuronal populations in an animal model of juvenile epilepsy. Using electrophysiological patch-clamp recordings from dissociated neuronal cultures, the authors find diverging changes in two major populations of inhibitory cell types, namely somatostatin (SST)- and parvalbumin (PV)-positive interneurons, in mice expressing a variant of the KCNT1 potassium channel. They further suggest that the differential effects are due to a compensatory increase in the persistent sodium current in PV interneurons in pharmacological and in silico experiments. It remains unclear why this current is selectively enhanced in PV-interneurons.

      Strengths:

      (1) Heterozygous KCNT1 gain of function variant was used which more accurately models the human disorder.

      (2) The manuscript is clearly written, and the flow is easy to follow. The authors explicitly state the similarities and differences between the current findings and the previously published results in the homozygous KCNT1 gain of function variant.

      (3) This study uses a variety of approaches including patch clamp recording, in silico modeling and pharmacology that together make the claims stronger.

      (4) Pharmacological experiments are fraught with off-target effects and thus it bolsters the authors' claims when multiple channel blockers (TTX and VU170) are used to reconstruct the sodium-activated potassium current.

      Weaknesses:

      (1) This study mostly relies on recordings in dissociated cortical neurons. Although specific WT interneurons showed intrinsic membrane properties like those reported for acute brain slices, it is unclear whether the same will be true for those cells expressing KCNT1 variants, especially when the excitability changes are thought to arise from homeostatic compensatory mechanisms. The authors do confirm that mutant SST-interneurons are hypoexcitable using an ex vivo slice preparation, which is consistent with work for other KCNT1 gain of function variants (e.g. Gertler et al., 2022). However, the key missing evidence is the excitability state of mutant PV-interneurons, given the discrepant result of reduced excitability of PV cells reported by Gertler et al. in acute hippocampal slices.

      Reviewer #3 (Public Review):

      Summary:

      The present manuscript by Shore et al. entitled "Reduced GABAergic Neuron Excitability, Altered Synaptic Connectivity, and Seizures in a KCNT1 Gain-of-Function Mouse Model of Childhood Epilepsy" describes in vitro and in silico results obtained in cortical neurons from mice carrying the KCNT1-Y777H gain-of-function (GOF) variant in the KCNT1 gene encoding for a subunit of the Na+-activated K+ (KNa) channel. This variant corresponds to the human Y796H variant found in a family with Autosomal Dominant Nocturnal Frontal Lobe Epilepsy. The occurrence of GOF variants in potassium channel encoding genes is well known, and among potential pathophysiological mechanisms, impaired inhibition has been documented as responsible for KCNT1-related DEEs. Therefore, building on a previous study by the same group performed in homozygous KI animals, and considering that the large majority of pathogenic KCNT1 variants in humans occur in heterozygosis, the Authors have investigated the effects of heterozygous Kcnt1-Y777H expression on KNa currents and neuronal physiology among cortical glutamatergic and the 3 main classes of GABAergic neurons, namely those expressing vasoactive intestinal polypeptide (VIP), somatostatin (SST), and parvalbumin (PV), crossing KCNT1-Y777H mice with VIP-, SST- and PV-cre mouse lines, and recording from GABAergic neurons identified by their expression of mCherry (but negative for GFP used to mark excitatory neurons).

      The results obtained revealed heterogeneous effects of the variant on KNa and action potential firing rates in distinct neuronal subpopulations, ranging from no change (glutamatergic and VIP GABAergic) to decreased excitability (SST GABAergic) to increased excitability (PV GABAergic). In particular, modelling and in vitro data revealed that an increase in persistent Na current occurring in PV neurons was sufficient to overcome the effects of KCNT1 GOF and cause an overall increase in AP generation.

      Strengths:

      The paper is very well written, the results clearly presented and interpreted, and the discussion focuses on the most relevant points.

      The recordings performed in distinct neuronal subpopulations (both in primary neuronal cultures and, for some subpopulations, in cortical slices) are a clear strength of the paper. The finding that the same variant can cause opposite effects and trigger specific homeostatic mechanisms in distinct neuronal populations is very relevant for the field, as it narrows the existing gap between experimental models and clinical evidence.

      Weaknesses:

      My main concern regarding the epileptic phenotype of the heterozygous mice investigated has been clarified in the revision, where the infrequent occurrence of seizures is more clearly stated. Also, a more detailed statistical analysis of the modeled neurons has been added in the revision.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This is a very clear and thorough piece of work, and the authors are to be congratulated on this. My one remaining suggestion would be to make an explicit statement about whether increased sodium influx through INaP channels, which is thought to activate KNa channels, would be likely to increase KNa current in these neurons (see comment 1B).

      Please see response to comment 1B.

      Reviewer #2 (Recommendations For The Authors):

      This revised manuscript is significantly improved and addresses most of my concerns. However, I would still recommend including the ex vivo slice recordings in mutant PV-interneurons as the authors proposed in their rebuttal. The I-V recordings using sequential TTX and VU170 blockade in WT SST and PV-interneurons that are provided in the rebuttal are interesting and may point to a preferential expression of persistent sodium currents in PV-interneurons normally. It would be helpful to readers as a supplemental figure.

      As proposed in the rebuttal, we are currently recording PV neurons using ex vivo slice preparations from WT and Kcnt1-YH Het mice. We look forward to including those data in a future manuscript.

      We agree with the reviewer that the differences in INaP between WT PV and SST neurons are notable. The data provided in the rebuttal were only from 5 neurons/group, and they were meant to illustrate a side-by-side comparison of TTX and VU170 subtraction methods to assess KNa currents. However, in Figure 7 of the manuscript, we performed more robust measurements of INaP and observed differences in the current between WT PV and SST neurons. Thus, we’ve added the following sentence to the Results section:

      “Interestingly, the mean peak amplitude of INaP in WT PV neurons was 70% larger than that in WT SST neurons (-1.42 ± 0.16 vs. -0.85 ± 0.07 pA/pF; Fig. 7B and 7D), suggesting there may be differences in sodium channel expression, localization, or regulation inherent to each neuron type that confer their differential response to KCNT1 GOF.”

      References

      Grubb, M. S., & Burrone, J. (2010). Activity-dependent relocation of the axon initial segment fine-tunes neuronal excitability. Nature, 465(7301), 1070-1074. https://doi.org/10.1038/nature09160

      Hage, T. A., & Salkoff, L. (2012). Sodium-activated potassium channels are functionally coupled to persistent sodium currents. J Neurosci, 32(8), 2714-2721. https://doi.org/10.1523/JNEUROSCI.5088-11.2012

      Kuba, H., Oichi, Y., & Ohmori, H. (2010). Presynaptic activity regulates Na(+) channel distribution at the axon initial segment. Nature, 465(7301), 1075-1078. https://doi.org/10.1038/nature09087

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors show for the first time that deleting GLS from rod photoreceptors results in the rapid death of these cells. The death of photoreceptor cells could result from loss of synaptic activity because of a decrease in glutamate, as has been shown in neurons, changes in redox balance, or nutrient deprivation. 

      Strengths: 

      The strength of this manuscript is that the author shows a similar phenotype in the mice when Gls was knocked out early in rod development or the adult rod. They showed that rapid cell death is through apoptosis, and there is an increase in the expression of genes responsive to oxidative stress. 

      We thank the reviewer for their time reviewing the manuscript and their comments regarding the potential mechanism(s) by which rod photoreceptors rapidly degenerate upon knockout of GLS.

      Weaknesses: 

      In this manuscript, the authors show a "metabolic dependency of photoreceptors on glutamine catabolism in vivo". However, there is a potential bias in their thinking that glutamine metabolism in rods is similar to cancer cells where it feeds into the TCA cycle. They should consider that as in neurons, GLS1 activity provides glutamate for synaptic transmission. The modest rescue shown by providing α-ketoglutarate in the drinking water suggests that glutamine isn't a key metabolic substrate for rods when glucose is plentiful. The ERG studies performed on the iCre-Glsflox/flox mice showed a large decrease in the scotopic b wave at saturating flashes which could indicate a decrease in glutamate at the rod synapse as stated by the authors. While EM micrographs of wt and iCre-Glsflox/flox mice were shown for the outer retina at p14, the synapse of the rods needs to be examined by EM. 

      We agree with the reviewer that in the presence of sufficient glucose, it appears a lack of GLS-driven glutamine (Gln) catabolism does not drastically alter the levels of TCA cycle metabolites or mitochondrial function as we demonstrated in Figure 4, and supplementation with alpha-ketoglutarate improved outer nuclear layer thickness by only a small amount as observed in Figure 5e. Hence, as we stated in the Results and Discussion, at least in the mouse where Gls is selectively deleted from rod photoreceptors by crossing Glsfl/fl mice with Rho-Cre mice (Glsfl/fl; Rho-Cre+, cKO), Gln’s role in supporting the TCA cycle is not the major mechanism by which rod photoreceptors utilize Gln to suppress apoptosis.

      With regards to GLS-driven Gln catabolism providing glutamate (Glu) for synaptic transmission, we again agree with the reviewer that Glu is an important excitatory neurotransmitter, but it is also a key metabolite necessary for the synthesis of glutathione, amino acids, and proteins. As noted and discussed at length in the manuscript, a lack of GLS-driven Gln catabolism in rod photoreceptors leads to reduced levels of oxidized glutathione (Figure 4D) possibly signaling an overall reduction in the biosynthesis of glutathione as Glu is directly and indirectly responsible for its synthesis. Furthermore, Gln and GLS-derived Glu play a central role in the biosynthesis of several nonessential amino acids and proteins. To this end, we see a reduction in the level of Glu, which is the product of the GLS reaction and further confirms the loss of GLS function. We also noted a significant decrease in aspartate (Asp), which can be constructed from the carbons and nitrogens of Gln as discussed at length in the manuscript (Figure 6A). Finally, we noted a significant decrease in global protein synthesis in the cKO retina as compared to the wild-type animal as well (Figure 6E). Therefore, the data suggest that GLS-driven Gln catabolism is critical for amino acid metabolism and protein synthesis and to some degree redox balance; although, the small but statistically significant changes in oxidized glutathione, NADP/NADPH, and redox gene expression may not fully account for the rapid and complete photoreceptor degeneration observed. Future studies are necessary to shed light on the role of redox imbalance in this novel transgenic mouse model.

      Glu also plays a role in synaptic transmission, and we considered this scenario as described in Figure 1 – figure supplement 5. Here, the synaptic connectivity between photoreceptors and the inner retina did not demonstrate significant differences in the labeling of photoreceptor synaptic membranes in the outer plexiform layer nor alterations in the labeling of a key protein (Bassoon) in ribbon synapses. These data suggest that the synaptic connectivity between photoreceptors and second-order neurons was unaltered at P14 in the cKO retina, which is the time just prior to rapid photoreceptor degeneration. We agree, though, that to obtain greater insight into the alterations in the ribbon synapse, EM images can be examined. The EM images shown in Figure 1 – figure supplement 4 are from P21 and will be utilized to assess the ribbon synapse for the revised version of the article.

      With regards to the ERG changes noted in Figure 2, we agree with the reviewer that a large decrease was noted in the scotopic b-wave at P21 and P42 in the cKO. However, an even larger reduction in the scotopic a-wave was noted at these ages as well. In animal models that disrupt photoreceptor synaptic function (Dick et al. Neuron. 2003; Johnson et al. J Neuroscience. 2007; Haeseleer et al. Nature Neuroscience. 2004; Chang et al. Vis Neurosci. 2006), a more negative ERG pattern is typically observed with the b-wave altered to a much larger degree than the a-wave. Additionally, in these models that disrupt photoreceptor synaptic transmission, the overall structure of the retina with respect to thickness is maintained (Dick et al. Neuron. 2003) or noted to have modest changes in the outer plexiform layer within the first two months of age with the outer nuclear layer not significantly altered until 8-10 months of age (Haeseleer et al. Nature Neuroscience. 2004). In contrast, a rapid decline in the outer nuclear layer thickness was observed in the cKO retina after P14 likely contributing to the ERG changes noted in Figure 2.  Also, Gln is catabolized to Glu primarily by GLS as suggested by the approximately 50% reduction in Glu levels in the cKO retina (Figure 6A), but other enzymes are also capable of catabolizing Gln to Glu, so Glu levels in the rod photoreceptors are unlikely to be zero. Coupling this with the fact that rods are equipped with a self-sufficient Glu recollecting system at their synaptic terminals (Hasegawa et al. Neuron. 2006; Winkler et al. Vis Neurosci. 1999) and that GLS activity is at least two-fold higher in the photoreceptor inner segments, which support energy production and metabolism, than any other layer in the retina (Ross et al. Brain Res. 1987) suggests that altered synaptic transmission secondary to reduced levels of Glu likely does not account in full for the rapid and robust photoreceptor degeneration observed in the cKO retina.

      The authors note that the outer segments are shorter but they do not address whether there is a decrease in the number of cones. 

      The number of cones will be assessed and provided in the revised version of the article.

      Rod-specific, inducible Gls knockout mice (IND-cKO) were generated by crossing Pde6g-CreERT2 mice with mice homozygous for either the WT or floxed Gls allele. In Figure 3 the authors document by western blots and antibody labeling that GLS1 expression is lost in the IND-cKO 10 days post tamoxifen. OCT images show a decrease in the thickness of the outer nuclear layer between 17 and 38 days post-TAM. ERGs should be performed on the animals at 10 and 30 days post TAM, before and after major structural changes in rod photoreceptor cells, to determine if changes in light-stimulated responses are observed. These studies could help to parse out the cause of photoreceptor cell death.

      We agree with the reviewer that the IND-cKO is a useful tool to help parse out the cause of photoreceptor cell death in this model as well as shed light on the role of GLS-driven Gln catabolism in photoreceptor synaptic transmission as discussed at length above. Hence, ERG analyses will be provided for these animals in the revised version of the article.

      The studies in Figure 4 were all performed on iCre-Glsflox/flox and control mice at p14. Why weren't the IND-cKO mice used for these studies, since the findings would not be confounded by development?

      To gain further insight into the role of GLS-driven Gln catabolism in the maintenance of rod photoreceptors as compared to their development/maturation, we will provide ERG and targeted metabolomic analyses of the IND-cKO retina in the revised version of the article.

      In all rescue studies, the endpoint was an ONL thickness, which only addressed rod cell death. The authors should also determine whether there are small improvements in the ERG, which would distinguish the role of GLS in preventing oxidative stress. 

      Optical coherence tomography (OCT) provides a sensitive in vivo method to detect small changes in retinal thickness without potential artifacts incurred through histological processing. Considering the Gls cKO retina demonstrates significant and rapid photoreceptor degeneration, we wanted to assess pathways that may be critical to photoreceptor survival downstream of GLS-driven Gln catabolism using rescue experiments with pharmacologic treatment or metabolite supplementation. That said, disruption of GLS-driven Gln catabolism may also significantly alter rod photoreceptor function beyond that which is secondary to photoreceptor cell death. As such, changes in ERG will be examined and provided in the revised version of the article for certain rescue experiments that demonstrated a robust change in ONL thickness.

      Reviewer #2 (Public Review): 

      Summary: 

      Photoreceptor neurons are crucial for vision, and discovering pathways necessary for photoreceptor health and survival can open new avenues for therapeutics. Studies have shown that metabolic dysfunction can cause photoreceptor degeneration and vision loss, but the metabolic pathways maintaining photoreceptor health are not well understood. This is a fundamental study that shows that glutamine catabolism is critical for photoreceptor cell health using in vivo model systems. 

      Strengths: 

      The data are compelling, and the consideration of potential confounding factors (such as glutaminase 2 expression) and additional experiments to examine the synaptic connectivity and inner retina added strength to this work. The authors were also careful not to overstate their claims, but to provide solid conclusions that fit the results and data provided in their study. The findings linking asparagine supplementation and the inhibition of the integrated stress response to glutamine catabolism within the rod photoreceptor cell are intriguing and innovative. Overall, the authors provide convincing data to highlight that photoreceptors utilize various fuel sources to meet their metabolic needs, and that glutamine is critical to these cells for their biomass, redox balance, function, and survival. 

      We greatly appreciate the reviewer’s thoughtful comments and time spent reviewing this manuscript.

      Weaknesses: 

      Recent studies have explored the metabolic "crosstalk" that exists within the mammalian retina, where metabolites are transferred between the various retinal cells and the retinal pigment epithelium. It would be of interest to test whether the conditional knockout mice have changes in metabolism (via qPCR such as shown in Figure 4 - Supplemental Figure 1) within the retinal pigment epithelium that may be contributing to the authors' findings in the neural retina. Additionally, the authors have very compelling data to show that inhibition of eIF2a or supplementation with asparagine can delay photoreceptor death via OCT measurements in their conditional knockout mouse model (Figure 6G, H). However, does inhibition of eIF2a or asparagine adversely impact the WT retina? It would also be impactful to know whether this has a prolonged effect, or if it is short-term, as this would provide strength to potential therapeutic targeting of these pathways to maintain photoreceptor health. 

      We agree with the reviewer that metabolic communication in the outer retina is crucial to the function and survival of both photoreceptors and RPE. We will perform qRT-PCR on the eyecups of these mice to assess any changes in the expression of metabolic genes. This data will be provided in the revised manuscript.

      We have data demonstrating systemic treatment with ISRIB does not adversely impact the anatomy of the wild-type retina; this data will be included in the revised manuscript as a supplement to Figure 6. Additionally, we have recent data to suggest that the effect of ISRIB extends beyond P21 in the cKO mouse. This data will be included in the revised manuscript.

      Reviewer #3 (Public Review): 

      Summary: 

      The authors explored the role of GLS, a glutaminase, which is an enzyme that catalyzes the conversion of glutamine to glutamate, in rod photoreceptor function and survival. The loss of GLS was found to cause rapid autonomous death of rod photoreceptors. 

      Strengths: 

      Interesting and novel phenotype. Two types of cre-lines were rigorously used to knockout the Gls gene in rods. Both of the conditional knockouts led to a similar phenotype, i.e. rod death. Histology and ERG were carefully done to characterize the loss of rods over specific ages. A necessary metabolomic study was performed and appreciated. Some rescue experiments were performed and revealed possible mechanisms. 

      We thank the reviewer for their comments and appreciation of the methods utilized herein to address the role of GLS-driven Gln catabolism in rod photoreceptors.

      Weaknesses: 

      No major weaknesses were identified. The mechanism of GLS-loss-induced rod death seems not fully elucidated by this study but could be followed up in the future, and the same for GLS's role in cones.

      We agree with the reviewer that the downstream metabolic and molecular mechanisms by which Gln catabolism impacts rod photoreceptor health are not fully elucidated. Defining these mechanisms will advance our understanding of photoreceptor metabolism and identify therapeutic targets promoting photoreceptor resistance to stress. Future studies are underway to uncover these mechanisms. Additionally, while outside the scope of the current manuscript, we have generated mice lacking GLS in cone photoreceptors specifically and are currently elucidating the role of GLS in cone photoreceptor metabolism, function, and survival. These results will be published in a separate manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to reviewers

      A general comment was that this study left several key questions unanswered, in particular the causal mechanism for the reported ribosomal distributions.  We have been interested in the evolution of asymmetric bacterial growth and aging for many years. However, a motivational difference is that we are more interested in the evolutionary process, and evolution by natural selection works on the phenotype.  Thus, we wanted to start with the phenotype closest to fitness, appropriately defined for the conditions, and work downwards.  We examined first the asymmetry of elongation rates in single cells, then gene products, and now ribosomes.  As we have pointed out, our demonstration of ribosomal asymmetry shows that the phenomenon was not peculiar and unique to the gene products we examined.  Rather, the asymmetry is acting higher up in the metabolic network and likely affecting all genes.  We find such conceptual guidance to be important.  In an ideal world, of course, we would have liked to work out the causal mechanisms in one swoop.  In a less than ideal situation, it is a subjective decision as to where to stop.  We believe that the publication of this manuscript is more than appropriate at this juncture.  We work at the interface of evolutionary theory and microbiology.  Our results could appeal to both fields.  If we attract new researchers, progress could be accelerated.  Could the delay caused by publishing only completed stories slow the rate of discovery?  These questions are likely as old as science (e.g., https://telliamedrevisited.wordpress.com/2021/01/28/how-not-to-write-a-response-to-reviewers/).

      We present below our response to specific comments by reviewers.  We have not added a new discussion of papers suggested by Reviewer #1 because we feel that the speculations would have been too unfocused.  We were already criticized for speculation in the Discussion about a link between aggregate size and ribosomal density.

      Response to major comments by Reviewer #1.

      a) Fig. 1 only shows 2 divisions (rather than 3 as per Rev1) to avoid an overly elaborate figure.  We have added text to the figure legend that the old and new poles and daughters in the subsequent 3, 4, 5, 6, and 7 generations can be determined by following the same notations and tracking we presented for generations 1 and 2 in Fig. 1.  For example, if we know the old and new poles of any of the four daughters after 2 divisions (as in Fig. 1), and allow that daughter to elongate, become a mother, and divide to produce 2 “grand-daughters”, the polarity of the grand-daughters can also be determined.

      b) Because division times were normalized and analyzed as quartiles, the raw values were never used.  Rather than annotating unused values, we have provided the mean division times in the Material and Methods section on normalization to provide representative values.

      c) We did not quantify in our study the changes over generations for three reasons.  First, the sample sizes for the first generations (cohorts of 1, 2, 4, and 8 cells) are statistically small.  Second, and most importantly, cells on an agar pad in a microscope slide, despite being inoculated as fresh exponentially growing cells, experience a growth lag, as do all cells transferred to a new physiological condition.  Thus, to be safe, we do not collect data from cohorts 1, 2, 4, and 8 to ensure that our cells are as physiologically uniform as possible.  Lastly, as we noted in the Material and Methods, cells also slow down after 7 generations (128 cells).  Thus, we have collected ribosome and length measurements primarily from cohorts 16, 32, 64, and 128.  Measurable cells from the 128 cohort are actually rare because a colony with that many cells often starts to form double layers, which are not measurable.  Most of our measurements came from the 16, 32, and 64 cohorts, in which case a time series would not be meaningful.  Some of these details were not included in our manuscript but have been added to the Material and Methods (Microscopy and time-lapse movies).  For these reasons we have not added a time series as requested by the reviewer.

      d) We have added the additional figure as requested, but as a supplement rather than in the main article (Supplemental Materials Fig. S1).  This figure shows the normalized density of ribosomes along the normalized length of old and new daughters.  The density is presented as a continuous variable rather than as quartiles.  This figure was included in the original manuscript, but readers recommended that it be removed because all the analyses had been done with quartiles.  Readers felt misled and confused.
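
      The cohort sizes mentioned in point (c) follow from each cell dividing once per generation, so a colony founded by one cell has 2^n cells after n generations. The sketch below only illustrates that arithmetic; the function name and the hard-coded bounds of 16 and 128 are our own labels for the cohorts named in the response, not part of the authors' analysis pipeline.

```python
def cohort_sizes(generations):
    """Colony size after each generation, starting from a single cell."""
    return [2 ** g for g in range(generations + 1)]


# Seven generations take a single cell to a cohort of 128 cells.
sizes = cohort_sizes(7)

# Cohorts the response describes as the primary source of measurements:
# the early cohorts (1, 2, 4, 8) are skipped because of the growth lag.
measured = [n for n in sizes if 16 <= n <= 128]

print(sizes)
print(measured)
```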

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study presents careful biochemical experiments to understand the relationship between LRRK2 GTP hydrolysis parameters and LRRK2 kinase activity. The authors report that incubation of LRRK2 with ATP increases the KM for GTP and decreases the kcat. From this they suppose an autophosphorylation process is responsible for enzyme inhibition. LRRK2 T1343A showed no change, consistent with it needing to be phosphorylated to explain the changes in G-domain properties. The authors propose that phosphorylation of T1343 inhibits kinase activity and influences monomer-dimer transitions.
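
      For readers less familiar with the two kinetic parameters, the following is a brief reminder under the assumption that GTP hydrolysis is described by standard Michaelis-Menten kinetics (the summary reports KM and kcat but does not restate the rate law; the symbols $[E]_0$ and $[S]$ are introduced here for illustration only):

```latex
v = \frac{k_{\mathrm{cat}}\,[E]_0\,[S]}{K_M + [S]},
\qquad
\text{catalytic efficiency} = \frac{k_{\mathrm{cat}}}{K_M}
```

      Under this rate law, a higher $K_M$ and a lower $k_{\mathrm{cat}}$ both reduce the hydrolysis rate at any given GTP concentration $[S]$, which is why the two reported changes together are read as inhibition.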

      Strengths:

      Strengths of the work are the very careful biochemical analyses and interesting result for wild type LRRK2.

      Weaknesses:

      The conclusions related to involvement of a monomer-dimer transition are, to this reviewer, premature and an independent method needs to be utilized to bolster this aspect of the story.

      The monomer-dimer transition has been described in detail in our recent preprint Guaitoli et al., 2023 (doi: 10.1101/2023.08.11.549911), where, in addition to mass photometry, we used blue-native PAGE. Furthermore, to better elucidate the mechanistic impact of the phosphorylation, we have provided AlphaFold3 models. As the new AlphaFold version allows PTMs as well as small molecules to be considered, we compared the models of the GDP vs the GTP-state of pT1343 LRRK2. Interestingly, the AF3 model suggests that the phosphate of pT1343 is orientated inwards, thereby substituting for the gamma phosphate (see Supplementary Figure 5). This finding is in good agreement with MD simulations published recently (Stormer et al., 2023, doi: 10.1042/BCJ20230126). As we are determining GTP hydrolysis in a multiple-turnover situation, the pT1343 might hamper the hydrolysis by competing with GTP re-binding. Final models have been deposited on Zenodo (https://doi.org/10.5281/zenodo.11242230).

      Reviewer #2 (Public Review):

      As discussed in the original review, this manuscript is an important contribution to a mechanistic understanding of LRRK2 kinase. Kinetic parameters for the GTPase activity of the ROC domain have been determined in the absence/presence of kinase activity. A feedback mechanism from the kinase domain to GTP/GDP hydrolysis by the ROC domain is convincingly demonstrated through these kinetic analyses. However, a regulatory mechanism directly linking the T1343 phosphosite and a monomer/dimer equilibrium is not fully supported. The T1343A mutant has reduced catalytic activity and can form similar levels of dimer as WT. The revised manuscript does point out that other regulatory mechanisms can also play a role in kinase activity and GTP/GDP hydrolysis (Discussion section). The environmental context in cells cannot be captured from the kinetic assays performed in this manuscript, and the introduction contains some citations regarding these regulatory factors. This is not a criticism, the detailed kinetics here are rigorous, but it is simply a limitation of the approach. Caveats concerning effects of membrane localization, Rab/14-3-3 proteins, WD40 domain oligomers, etc... should be given more prominence than a brief (and vague) allusion to 'allosteric targeting' near the end of the Discussion.

      We thank the reviewer for the evaluation of the manuscript and the suggestions made. With respect to the mentioned caveats regarding the complex regulation of LRRK2 in its native cellular environment by effectors, localization and effector binding, we have revised the discussion accordingly. Nevertheless, we want to emphasize that the phospho-null mutant T1343A leads to an increase in Rab10 phosphorylation in cells, demonstrating the relevance of this regulatory mechanism under near-physiological conditions (shown in Figure 6). In addition, to further elucidate the molecular mechanisms of the P-loop phosphorylation at T1343, we have performed AlphaFold3 modelling, which allows phosphoresidues to be included (see comment above, Supplemental Figure 5).

      Specific comments

      (1) The revised version is better organized with respect to the significance of monomer/dimer equilibrium and the relevance of the GTP-binding region of ROC domain that encompasses the T1343 phospho-site. The relevance of monomers/dimers of LRRK2 from previous studies is better articulated and readers are able to follow the reasoning for the various mutations.

      We thank the reviewer for the positive feedback. 

      (2) As a suggestion I would change the following on page 6 to clarify for readers: "...would show no change in kcat and KM values upon in vitro ATP treatment" to:

      "...would show no change in kcat and KM values for GTP hydrolysis upon in vitro ATP treatment"

      (3) The levels of dimer in WT (+ATP) and T1343A (+/- ATP) are the same, about 40-45%. These data are cited when the authors state that ATP-induced monomerization is 'abolished' (page 6). My suggestion is to re-phrase this conclusion for consistency with data (Fig 5). For example, one can state that 'ATP incubation does not affect the percentage of dimer for the T1343A variant of LRRK2'. This would be similar to the authors' description of these data on page 8 - 'no difference in dimer formation upon ATP treatment'.

      We thank the reviewer for the suggestions. We revised the manuscript accordingly. Changes have been highlighted in the version provided for reviewing purposes.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Minor revisions

      -change 'Although functional work on LRRK2 has been made significant progress...' to 'Although there is significant progress toward functional characterization of LRRK2...'

      -change 'exact mechanisms' to 'precise mechanisms', and similarly 'exact interplay' to 'precise interplay'

      -change 'On a contrary' to 'On the contrary' in Discussion

      -change 'remained to be unchanged' to 'remains unchanged', page 8

      We thank the reviewer for having noticed this. We have revised the manuscript accordingly.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this paper, the researchers aimed to address whether bees causally understand string-pulling through a series of experiments. I first briefly summarize what they did:

      - In experiment 1, the researchers trained bees without string and then presented them with flowers in the test phase that either had connected or disconnected strings, to determine what their preference was without any training. Bees did not show any preference.

      - In experiment 2, bees were trained to have experience with string and then tested on their choice between connected vs. disconnected string.

      - Experiment 3 was similar except that instead of having one option which was an attached string broken in the middle, the string was completely disconnected from the flower.

      - In experiment 4, bees were trained on green strings and tested on white strings to determine if they generalize across color.

      - In experiment 5, bees were trained on blue strings and tested on white strings.

      - In experiment 6, bees were trained where black tape covered the area between the string and the flower (i.e. so they would not be able to see/ learn whether it was connected or disconnected).

      - In experiments 2-6, bees chose the connected string in the test phase.

      - In experiment 7, bees were trained as in experiment 3 and then tested where the string was either disconnected or coiled i.e. still being 'functional' but appearing different.

      - In experiment 8, bees were trained as before and then tested on a string that was in a different coiled orientation, either connected or disconnected.

      - In experiments 7 and 8 the bees showed no preference.

      Strengths:

      I appreciate the amount of work that has gone into this study and think it contains a nice, thorough set of experiments. I enjoyed reading the paper and felt that overall it was well-written and clear. I think experiment 1 shows that bees do not have an untrained understanding of the function of the string in this context. The rest of the experiments indicate that with training, bees have a preference for unbroken over broken string and likely use visual cues learned during training to make this choice. They also show that as in other contexts, bees readily generalize across different colors.

      Weaknesses:

      (1) I think there are 2 key pieces of information that can be taken from the test phase - the bees' first choice and then their behavior across the whole test. I think the first choice is critical in terms of what the bee has learned from the training phase - then their behavior from this point is informed by the feedback they obtain during the test phase. I think both pieces of information are worth considering, but their behavior across the entire test phase is giving different information than their first choice, and this distinction could be made more explicit. In addition, while the bees' first choice is reported, no statistics are presented for their preferences.

      We agree with the reviewer that the first choice is critical in terms of what the bumblebees have learned from the training phase. We report the analysis of the bees’ first choices in Table 1, and we have added the test videos. Both the connected and the disconnected strings were glued to the floor in their entirety, so the bees were unable to move either string, which prevented learning during the tests. We added the data for each bee's individual choices to the Supplementary table.

      (2) It seemed to me that the bees might not only be using visual feedback but also motor feedback. This would not explain their behavior in the first test choice, but could explain some of their subsequent behavior. For example, bees might learn during training that there is some friction/weight associated with pulling the string, but in cases where the string is separated from the flower, this would presumably feel different to the bee in terms of the physical feedback it is receiving. I'd be interested to see some of these test videos (perhaps these could be shared as supplementary material, in addition to the training videos already uploaded), to see what the bees' behavior looks like after they attempt to pull a disconnected string.

      We added supplementary videos of the testing phase. As noted in General Methods, both connected and disconnected strings were glued to the floor to prevent the air flow generated by flying bumblebees’ wings from changing the position of the string during the testing phase. The bees were unable to move either the connected or disconnected strings during the tests, and only attempted to pull them. Therefore, a difference in the friction/weight of pulling the two strings cannot be a factor in the test.

      (3) I think the statistics section needs to be made clearer (more in private comments).

      We changed the statistical analysis section as suggested by the reviewer.

      (4) I think the paper would be made stronger by considering the natural context in which the bee performs this behavior. Bees manipulate flowers in all kinds of contexts and scrabble with their legs to achieve nectar rewards. Rather than thinking that it is pulling a string, my guess would be that the bee learns that a particular motor pattern within their usual foraging repertoire (scrabbling with legs), leads to a reward. I don't think this makes the behavior any less interesting - in fact, I think considering the behavior through an ecological lens can help make better sense of it.

      Here we respectfully disagree. The solving of a Rubik’s cube by humans could be said to be a version of the finger movements naturally required to open nuts or remove ticks from fur, but this is somewhat beside the point: it’s not the motor sequences that are of interest, but the cognition involved. A general approach in work on animal intelligence and cognition is to deliberately choose paradigms that are outside the animals’ daily routines; this is what we have done here, in asking whether there is means-end comprehension in bee problem solving. Like comparable studies on this question in other animals, the experiments are designed to probe this question, not one of ecological validity.

      Reviewer #2 (Public Review):

      Summary:

      The authors wanted to see if bumblebees could succeed in the string-pulling paradigm with broken strings. They found that bumblebees can learn to pull strings and that they have a preference to pull on intact strings vs broken ones. The authors conclude that bumblebees use image matching to complete the string-pulling task.

      Strengths:

      The study has an excellent experimental design and contributes to our understanding of what information bumblebees use to solve a string-pulling task.

      Weaknesses:

      Overall, I think the manuscript is good, but it is missing some context. Why do bumblebees rely on image matching rather than causal reasoning? Could it have something to do with their ecology? And how is the task relevant for bumblebees in the wild? Does the test translate to any real-life situations? Is pulling a natural behaviour that bees do? Does image matching have adaptive significance?

      We appreciate the valuable comment from the reviewer. Our explanation, which we have now added to the manuscript, is as follows:

      “Different flower species offer varying profitability in terms of nectar and pollen to bumblebees; they need to make careful choices and learn to use floral cues to predict rewards (Chittka, 2017). Bumblebees can easily learn visual patterns and shapes of flowers (Meyer-Rochow, 2019); they can detect stimuli and discriminate between differently coloured stimuli when presented as briefly as 25 ms (Nityananda et al., 2014). In contrast, causal reasoning involves understanding and responding to causal relationships. Bumblebees might favor, or be limited to, a visual approach, likely due to the efficiency and simplicity of processing visual cues to solve the string-pulling task.”

      As above, it is worth noting that our work is not designed as an ecological study, but one about the question of whether causal reasoning can explain how bees solve a string-pulling puzzle. We have a cognitive focus, in line with comparable studies on other animals. We deliberately chose a paradigm that is to some extent outside of the daily challenges of the animal.

      Reviewer #3 (Public Review):

      Summary:

      This paper presents bees with varying levels of experience with a choice task where bees have to choose to pull either a connected or unconnected string, each attached to a yellow flower containing sugar water. Bees without experience of string pulling did not choose the connected string above chance (experiment 1), but with experience of horizontal string pulling (as in the right-hand panel of Figure 4) bees did choose the connected string above chance (experiments 2-3), even when the string colour changed between training and test (experiments 4-5). Bees that were not provided with perceptual-motor feedback (i.e., they could not observe that each pull of the string moved the flower) during training still learned to string pull and then chose the connected string option above chance (experiment 6). Bees with normal experience of string pulling then failed to discriminate between connected and unconnected strings when the strings were coiled or looped, rather than presented straight (experiments 7-8).

      Weaknesses:

      The authors have only provided video of some of the conditions where the bees succeeded. In general, I think a video explaining each condition and then showing a clip of a typical performance would make it much easier to follow the study designs for scholars. Videos of the conditions bees failed at would be highly useful in order to compare different hypotheses for how the bees are solving this problem. I also think it is highly important to code the videos for switching behaviours. When solving the connected vs unconnected string tasks, when bees were observed pulling the unconnected string, did they quickly switch to the other string? Or did they continue to pull the wrong string? This would help discriminate the use of perceptual-motor feedback from other hypotheses.

      We added the test videos as suggested by the reviewer, and we added the data for each bee's choice. However, both connected and disconnected strings were glued to the floor, and therefore perceptual-motor feedback was equal and irrelevant between the choices during the test.

      The experiments are also not described well; for my comments below I have assumed that different groups of bees were tested for experiments 1-8, and that experiment 6 was run as described in line 331, where bees were given string-pulling training without perceptual feedback rather than how it is described in Figure 4B, which describes bees as receiving string pulling training with feedback.

      We have now added illustrations of Experiments 6 and 7 to Figure 1B, and we state that different groups of bees were tested for Experiments 1-9.

      The authors suggest the bees' performance is best explained by what they term 'image matching'. However, experiment 6 does not seem to support this without assuming retroactive image matching after the problem is solved. The logic of experiment 6 is described as "This was to ensure that the bees could not see the familiar "lollipop shape" while pulling strings....If the bees prefer to pull the connected strings, this would indicate that bees memorize the arrangement of strings-connected flowers in this task." I disagree with this second sentence, removing perceptual feedback during training would prevent bees memorising the lollipop shape, because, while solving the task, they don't actually see a string connected to a yellow flower, due to the black barrier. At the end of the task, the string is now behind the bee, so unless the bee is turning around and encoding this object retrospectively as the image to match, it seems hard to imagine how the bee learns the lollipop shape.

      We agree with the reviewer that while solving the task in the last step during training, the bees don't actually see a string connected to a yellow flower, due to the black barrier. The full shape is only visible after the pulling is completed, so this would require the bee to “check back” on the entire display after feeding and, in effect, to conclude “this is the shape that I need to be looking for later”.

      Another possibility is that bumblebees remember the image of the “lollipop shape” from the first step of training, in which the “lollipop shape” was directly presented to the bumblebee.

      We added the experiment suggested by the reviewer, and the result showed that when a green table was placed behind the string to obscure the “lollipop shape” at any point during the training phase, the bees were unable to identify the connected string. The result further supports that bumblebees learn to choose the connected string through image matching.

      Despite this, the authors go on to describe image matching as one of their main findings. For this claim, I would suggest the authors run another experiment, identical to experiment 6 but with a black panel behind the bee, such that the string the bee pulls behind itself disappears from view. There is now no image to match at any point from the bee's perspective so it should now fail the connectivity task.

      Strengths:

      Despite these issues, this is a fascinating dataset. Experiments 1 and 2 show that the bees are not learning to discriminate between connected and unconnected stimuli rapidly in the first trials of the test. Instead, it is clear that experience in string pulling is needed to discriminate between connected and unconnected strings. What aspect of this experience is important? Experiment 6 suggests it is not image matching (when no image is provided during problem-solving, but only afterward, bees still attend to string connectivity) and casts doubt on perceptual-motor feedback (unless from the bee's perspective, they do actually get feedback that pulling the string moves the flower, video is needed here). Experiments 7 and 8 rule out means-end understanding because if the bees are capable of imagining the effect of their actions on the string and then planning out their actions (as hypotheses such as insight, means-end understanding and string connectivity suggest), they should solve these tasks. If the authors can compare the bees' performance in a more detailed way to other species, and run the experiment suggested, this will be a highly exciting paper.

      We appreciate the valuable comment from the reviewer. We compared the bees' performance to other species, and conducted the experiment as suggested by the reviewer.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Smaller comments:

      Line 64: is the word 'simple' needed here? It could also be explained by more complex forms of associative learning, no?

      We deleted “simple”.

      Methods:

      Line 230: was it checked that this was high-contrast for the bees?

      We added the relevant reference in the revised manuscript.

      Line 240: how much sucrose solution was present in the flowers?

      We added 25 microliters of sucrose solution to the flowers. We added this information to the revised manuscript.

      Line 266: check grammar.

      We corrected the sentence as follows: “During tests, both strings were glued to the floor of the arena to prevent the air flow generated by flying bumblebees’ wings from changing the position of the string.”

      Statistical analysis:

      - What does it mean that "Bees identity and colony were analyzed with likelihood ratio tests"?

      Bee identity and colony were set as random variables. We changed the analysis methods in the revised manuscript, and the results of all the experiments did not change.

      - Line 359: do you mean proportion rather than percentage?

      We mean the percentage.

      - "the number of total choices as weights" - this should be explained further. This is the number of choices that each bee made? What was the variation and mean of this number? If bees varied a lot in this metric, it might make more sense to analyze their first choice (as I see you've done) and their first 10 choices or something like that - for consistency.

      This refers to the total number of choices made by each bumblebee. We added the mean and standard error of each bee’s number of choices in Table 1. Some bees pulled the string fewer than 10 times; we chose to include all choices made by each bee.

      - More generally I think the first test is more informative than the subsequent choices, since every choice after their first could be affected by feedback they are getting in that test phase. Or rather, they are telling you different things.

      All the bees were tested only once; however, you might be referring to the first choice. We used a chi-square test to analyze the bumblebees’ first choices in the test. It is worth noting that both connected and disconnected strings were glued to the floor. The bees were unable to move either the connected or disconnected strings during the tests, and only attempted to pull them. Therefore, the feedback from pulling either the connected or disconnected strings is the same.

      - Line 362: I think I know what you mean, but this should be re-phrased because the "number of" sounds more appropriate for a Poisson distribution. I think what you are testing is whether each individual bee chose the connected or the disconnected string - i.e. a 0 or 1 response for each bee?

      We agree with the reviewer that each bee chose the connected or the disconnected string - i.e. a 0 or 1 response for each bee, but not the number. We clarify this as: “The total number of the choices made by each bee was set as weights.” 

      - Line 364-365: here and elsewhere, every time you mention a model, make it clear what the dependent and independent variables are. i.e. for the mixed model, the 'bee' is the random factor? Or also the colony that the bee came from? Were these nested etc?

      We clarify this in the revised manuscript. Bee identity and colony are the random factors in the mixed model.

      - Line 368: "Latency to the first choice of each bee was recorded" - why? What were the hypotheses/ predictions here?

      The latency to the first choice was recorded to assess whether the bumblebees were familiar with the test pattern. A shorter latency might indicate that the bumblebees were more familiar with the pattern.

      - Line 371: "Multiple comparisons among experiments were.." - do you mean 'within' experiments? It seems that treatments should not be compared between different experiments.

      We mean multiple comparisons among different experiments; we clarify this in the revised manuscript.

      Results

      Experiment 1: From the methods, it sounded like you both analyzed the bees' first choice and their total no. of choices, but in the results section (and Figure 1) I only see the data for all choices combined here.

      In table 1 and in the text you report the number of bees that chose each option on their first choice, but there are no statistical results associated with these results. At the very least, a chi square or binomial test could be run.

      Line 138: "Interestingly, ten out of fifteen bees pulled the connected string in their first choice" - this is presented like it is a significant majority of bees, but a chi-square test of 10 vs 5 has a p-value = 0.1967

      We used the chi-square test to analyze the bees’ first choices. We also added the results of this analysis to Table 1.
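      For readers who want to check the reviewer's figure, the sketch below reproduces the first-choice comparison (10 bees choosing the connected string vs. 5 choosing the disconnected one) with a chi-square goodness-of-fit test against a 50:50 expectation, and adds the exact two-sided binomial test that the reviewer mentions as an alternative. It is an illustration only, not the authors' analysis script; the function names are hypothetical, and the 50:50 null expectation and the absence of a continuity correction are assumptions.

```python
from math import comb, erfc, sqrt


def chi_square_gof_two_groups(a, b):
    """Chi-square goodness-of-fit test of counts a vs b against a 50:50 split (df = 1)."""
    expected = (a + b) / 2
    chi2 = (a - expected) ** 2 / expected + (b - expected) ** 2 / expected
    # With one degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


def binomial_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test: sum the probabilities of all outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# First choices from the reviewer's comment: 10 connected, 5 disconnected.
chi2, chi_p = chi_square_gof_two_groups(10, 5)
binom_p = binomial_two_sided(10, 15)

print(f"chi-square = {chi2:.4f}, p = {chi_p:.4f}")
print(f"exact binomial two-sided p = {binom_p:.4f}")
```

      Under these assumptions the chi-square p-value agrees with the 0.1967 quoted by the reviewer, and the exact binomial test is more conservative still, so neither test supports treating ten of fifteen first choices as a significant majority.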

      Line 143: "It makes sense because the bees could see the "lollipop shape" once they pulled it out from the table." - this feels more like interpretation (i.e. Discussion) rather than results.

      We moved the sentence to the discussion.

      Line 162: again this feels more like interpretation/ conjecture than results.

      We removed the sentence from the results.

      Line 184: check grammar.

      We checked the grammar. We changed “task” to “tasks”.

      Figures

      I really appreciated the overview in Figure 5 - though I think this should be Figure 1? Even if the methods come later in eLife, I think it would be nice to have that cited earlier on (e.g. at the start of the results) to draw the reader's attention to it quickly, since it's so helpful. It also then makes the images at the bottom of what is currently Figure 1 make more sense. I also think that the authors could make it clearer in Figure 5 which strings are connected vs disconnected in the figure (even if it means exaggerating the distance more than it was in real life). I had to zoom in quite a bit to see which were connected vs. not. Alternatively, you could have an arrow to the string with the words "connected" "disconnected" the first time you draw it - and similar labels for the other string conditions.

      We appreciate the valuable comment from the reviewer. We changed Figure 5 to Figure 2, and Figure 4 to Figure 1. We cited the Figures at the start of the results. We also changed the gap distance between the disconnected strings. Additionally, we added arrows to indicate “connected” and “disconnected” strings in the Figure.

      Figure 1 - I think you could make it clearer that the bars refer to experiments (e.g. have an x-axis with this as a label). Also, check the grammar of the y-axis.

      We added the experiment numbers to the Figures. Additionally, we checked the grammar of the y-axis. We changed “percentages” to “percentage”.

      I also think it's really helpful to see the supplementary videos but I think it would be nice to see some examples of the test phase, and not just the training examples.

      We added Supplementary videos of the testing phase.

      Reviewer #2 (Recommendations For The Authors):

      Below are also some minor comments:

      L40: "approaches".

      We changed “approach” to “approaches”.

      L42: but likely mainly due to sampling bias of mammals and birds.

      We changed the sentence as follows: String pulling is one of the most extensively used approaches in comparative psychology to evaluate the understanding of causal relationships (Jacobs & Osvath, 2015), with most research focused on mammals and birds, where a food item is visible to the animal but accessible only by pulling on a string attached to the reward (Taylor, 2010; Range et al., 2012; Jacobs & Osvath, 2015; Wakonig et al., 2021).

      L64: remove "in this study"

      We removed “in this study”.

      L64: simple associative learning of what? Isn't your image matching associative too?

      We removed “simple”.

      L97: remove "a" before "connected".

      We removed “a” before “connected”.

      L136-138: but maybe they could still feel the weight of the flower when pulling?

      Because both strings were glued to the floor in the test phase, the feedback was the same and therefore irrelevant. This information is noted in the General Methods.

      L161: what are these numbers?

      We removed the latency in the revised manuscript.

      L167/ Table 1: I realise that the authors never tried slanted strings to check if bumblebees used proximity as a cue. Why?

      This was simply because we wanted to focus on whether bumblebees could recognize the connectivity of the string.

      Discussion: Why did you only control for colour of the string? What if you had used strings with different textures or smells? Unclear if the authors controlled for "bumblebee smell" on the strings, i.e., after a bee had used the string, was the string replaced by a new one or was the same one used multiple times?

      We used different colors to investigate featural generalization of the visual display of the string connected to the flower in this task. We controlled for color because it is a feature that bumblebees can easily distinguish.

      Both the flowers and the strings were used only once, to prevent the use of chemosensory cues. We clarify this in the revised manuscript.

      L182: since what?

      We deleted “since” in the revised manuscript.

      L182-188: might be worth mentioning that some crows and parrots known for complex cognition perform poorly on broken strings (e.g., https://doi.org/10.1098/rspb.2012.1998 ; https://doi.org/10.1163/1568539X-00003511 ; https://doi.org/10.1038/s41598-021-94879-x ) and Australian magpies use trial and error (https://doi.org/10.1007/s00265-023-03326-6).

      We added the following sentences as suggested by the reviewer: “It is worth noting that some crows and parrots known for complex cognition perform poorly on the broken string task without perceptual feedback or learning. For example, New Caledonian crows use perceptual feedback strategies to solve the broken string-pulling task, and no individual showed a significant preference for the connected string when perceptual feedback was restricted (Taylor et al., 2012). Some Australian magpies and African grey parrots can solve the broken string task, but they required a high number of trials, indicating that learning plays a crucial role in solving this task (Molina et al., 2019; Johnsson et al., 2023).”

      L193: maybe expand on this to put the task into a natural context?

      We added the following sentences as suggested by the reviewer:

      “Different flower species offer varying profitability in terms of nectar and pollen to bumblebees; they need to make careful choices and learn to use floral cues to predict rewards (Chittka, 2017). Bumblebees can easily learn visual patterns and shapes of flowers (Meyer-Rochow, 2019); they can detect stimuli and discriminate between differently coloured stimuli when presented as briefly as 25 ms (Nityananda et al., 2014). In contrast, causal reasoning involves understanding and responding to causal relationships. Bumblebees might favor, or be limited to, a visual approach, likely due to the efficiency and simplicity of processing visual cues to solve the string-pulling task.”

      L204: is causal understanding the same as means-end understanding?

      Means-end understanding is expressed as goal-directed behavior, which involves the deliberate and planned execution of a sequence of steps to achieve a goal. It includes some understanding of the causal relationship (Jacobs & Osvath, 2015; Ortiz et al., 2019).

      L235: this is a very big span of time. Why not control for motivation? Cognitive performance can vary significantly across the day (at least in humans).

      Bumblebee motivation is understood to be rather consistent, as those that were trained and tested came to the flight arena of their own volition and were foragers looking to fill their crop load each time to return it to the colony.

      L232: what is "(w/w)" ? This occurs throughout the manuscript.

      “w/w” represents the weight-to-weight percentage of sugar.

      L250: this sentence sounds odd. "containing in the central well.." ?? Perhaps rephrase? Unclear what central well refers to? Did the flowers have multiple wells?

      We rephrased the sentence as follows: For each experiment, bumblebees were trained to retrieve a flower with an inverted Eppendorf cap at the center, containing 25 microliters of 50% sucrose solution, from underneath a transparent acrylic table.

      L268: why euthanise?

      The reason for euthanizing the bees is that new foragers typically become active only after the current ones are removed from the hive.

      L270: chemosensory cues answer my concern above. Maybe make it clear earlier.

      We moved this sentence earlier in the result.

      L273: did different individuals use different pulling strategies? Do you have the data to analyse this? This has been done on birds and would offer a nice comparison.

      We analyzed the string-pulling strategies among different individuals, and provided Supplementary Table 1 to display the performances of each individual in different string-pulling experiments.

      L365: unclear why both models. Would be nice to see a GLM output table.

      The durations of pulling different kinds of strings were first tested with the Shapiro-Wilk test to assess normality. Duration data that conformed to a normal distribution were compared using linear mixed-effects models (LMM), while data that deviated from normality were examined with generalized linear mixed models (GLMM). We added a GLM and GLMM output table in the revised manuscript.

      L377: should be a space between the "." and "This".

      We added a space between the “.” and “This”.

      L383-390: some commas and semicolons are in the wrong places.

      We carefully checked the commas and semicolons in this sentence.

      Reviewer #3 (Recommendations For The Authors):

      Minor comments

      Line 32: seems to be missing a word, suggest "the bumblebees' ability to distinguish".

      We added “the” in the revised manuscript.

      Line 47: it would be good to reference other scholars here, this is the central focus of all work in comparative psychology.

      We added the reference in the revised manuscript.

      Line 50-61: I think the string-pulling literature could be described in more detail here, with mention of perceptual-motor feedback loops as a competing hypothesis to means-end understanding (see Taylor et al 2010, 2012). It seems a stretch to suggest that "String-pulling studies have directly tested means-end comprehension in various species", when perceptual-motor feedback is a competing hypothesis that we have positive evidence for in several species.

      We mentioned perceptual-motor feedback in the introduction as follows:

      “Multiple mechanisms can be involved in the string-pulling task, including the proximity principle, perceptual feedback and means-end understanding (Taylor et al., 2012; Wasserman et al., 2013; Jacobs & Osvath, 2015; Wang et al., 2020). The principle of proximity refers to animals preferring to pull the reward that is closest to them (Jacobs & Osvath, 2015). Taylor et al. (2012) proposed that the success of New Caledonian crows in string-pulling tasks is based on a perceptual-motor feedback loop, where the reward gradually moves closer to the animal as they pull the strings. If the visual signal of the reward approaching is restricted, crows with no prior string-pulling experience are unable to solve the broken string task (Taylor et al., 2012).

      However, when a green table was placed behind the string to obscure the “lollipop” structure during the training, the bees could not see the “lollipop” during the initial training stage or after pulling the string from under the table. In this situation, the bees were unable to identify the connected string, further proving that bumblebees chose the connected string based on image matching.

      Line 68: suggest remove 'meticulously'.

      We removed “meticulously”.

      Line 99: This is an exciting finding, can the authors please provide a video of a bee solving this task on its first trial?

      We added videos in the supplementary materials.

      Line 133: perceptual-motor feedback loops should be introduced in the introduction.

      We introduced perceptual-motor feedback loops in the revised manuscript.

      Line 136: please clarify the prior experience of these bees, it is not clear from the text.

      We clarified the prior experience of these bees as follows: Bumblebees were initially attracted to feed on yellow artificial flowers, and then trained with transparent tables covered by black tape (S7 video) through a four-step process.

      Line 138: from the video it is not possible to see the bee's perspective of this occlusion. Do the authors have a video or image showing the feedback the bees received? I think this is highly important if they wish to argue that this condition prevents the use of both image matching and a perceptual-motor feedback loop.

      We prevented the use of image matching in this condition: the bees were unable to see the flower moving towards them above the table during the training phase. However, the bees may still have received a visual image both after pulling the string out from the table and in the initial stages of training.

      Line 147: please clarify what experience these bees had before this test.

      We added the prior experience of the bumblebees before testing as follows: We therefore designed further experiments based on Taylor et al. (2012) to test this hypothesis. Bumblebees were first trained to feed on yellow artificial flowers, and then trained with the same procedure as Experiment 2, but the connected strings were coiled in the test.

      Line 155: This is a highly similar test to that used in Taylor et al 2012, have the authors seen this study?

      We mentioned the reference in the revised manuscript as follows: We therefore designed further experiments based on Taylor et al. (2012) to test this hypothesis.

      Line 183: This sentence needs rewriting "Since the vast majority of animals, including dogs (Osthaus et al., 2005), cats (Whitt et al., 2009), western scrub-jays (Hofmann et al., 2016) and azure-winged magpies (Wang et al., 2019) are failing in such tasks spontaneously".

      We changed the sentence as suggested by the reviewer as follows: Some animals, including dogs (Osthaus et al., 2005), cats (Whitt et al., 2009), western scrub-jays (Hofmann et al., 2016) and azure-winged magpies (Wang et al., 2019), fail such tasks spontaneously.

      Line 186: "complete comprehension of the functionality of strings is rare" I am not sure the evidence in the current literature supports any animal showing full understanding, can the authors explain how they reach this conclusion?

      We wished to say that few animal species could distinguish between connected and disconnected strings without trial and error learning. We revised the sentence as follows:

      It is worth noting that some crows and parrots known for complex cognition perform poorly on the broken string task without perceptual feedback or learning. For example, New Caledonian crows use perceptual feedback strategies to solve the broken string-pulling task, and no individual showed a significant preference for the connected string when perceptual feedback was restricted (Taylor et al., 2012). Some Australian magpies and African grey parrots can solve the broken string task, but they required a high number of trials, indicating that learning plays a crucial role in solving this task (Molina et al., 2019; Johnsson et al., 2023).

      Line 190: the authors need to clarify which part of their study provides positive evidence for this conclusion.

      We added the evidence for this conclusion as follows: Our findings suggest that bumblebees with experience of string pulling prefer the connected strings, but they failed to identify the interrupted strings when the string was coiled in the test.

      Line 265: was the far end of the string glued only?

      The entire string was glued to the floor, not just the far ends of the string.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review):

      Summary: 

      In this paper, the authors used target-agnostic MBC sorting and activation methods to identify B cells and antibodies against sexual stages of Plasmodium falciparum. While they isolated some mAbs against Pfs48/45 and Pfs230, two well-known candidates for "transmission blocking" vaccines, these antibodies' efficacies, as measured by TRA, did not match those of other known antibodies. They also isolated one cross-reactive mAb to proteins containing glutamic acid-rich repetitive elements, which are expressed at different stages of the parasite life cycle. They then determined the structure of the Fab with the highest protein binder they could determine through protein microarray, RESA, and observed homotypic interactions.

      Strengths: 

      -  Target agnostic B cell isolation (although not a novel methodology). 

      -  New cross-reactive antibody with some "efficacy" (TRA) and mechanism (homotypic interactions) as demonstrated by structural data and other biophysical data. 

      Weaknesses: 

      The paper lacks clarity at times and could benefit from more transparency (showing all the data) and explanations. 

      We have added the oocyst count data from the SMFA experiments as Supplementary Table 2, and ELISA binding curves underlying Figure 4B as Supplementary Figure 5.

      In particular: 

      - define SIFA 

      - define TRAbs 

      We have carefully gone through the manuscript and have introduced abbreviations at first use, removed unnecessary abbreviations and removed unnecessary jargon to increase readability.

      - it is not possible to read the Figure 6B and C panels. 

      We regret that the labels in Supplementary Figures 6 and 7 were of poor quality and have now included higher resolution images to solve this issue.

      Reviewer #2 (Public Review): 

      This manuscript by Amen, Yoo, Fabra-Garcia et al describes a human monoclonal antibody B1E11K, targeting EENV repeats which are present in parasite antigens such as Pfs230, RESAs, and 11.1. The authors isolated B1E11K using an initial target agnostic approach for antibodies that would bind gamete/gametocyte lysate, from which they made 14 mAbs. Following a suite of highly appropriate characterization methods from Western blotting of recombinant proteins to native parasite material, use of knockout lines to validate specificity, ITC, peptide mapping, SEC-MALS, negative stain EM, and crystallography, the authors have built a compelling case that B1E11K does indeed bind EENV repeats. In addition, using X-ray crystallography they show that two B1E11K Fabs bind to a 16 aa RESA repeat in a head-to-head conformation using homotypic interactions and provide a separate example from CSP, of affinity-matured homotypic interactions.

      There are some minor comments and considerations identified by this reviewer. These include that one of the main conclusions in the paper is the binding of B1E11K to RESAs which are blood stage antigens that are exported to the infected parasite surface. It would have been interesting if immunofluorescence assays with B1E11K mAb were performed with blood-stage parasites to understand its cellular localization in those stages.

      In the current manuscript, we provide multiple lines of evidence that B1E11K binds (with high affinity) to repeats that are present in RESAs, i.e. through micro-array studies, in vitro binding experiments such as Western blot, ELISA and BLI, and through X-ray crystallography studies on B1E11K – repeat peptide complexes. Taken together, we think we provide compelling evidence that B1E11K binds to repeats present in RESA proteins. We do agree that studies on the function of this mAb against other stages of the parasite could be of interest, but as our manuscript focuses on the sexual stage of the parasite, we feel that this is beyond the scope of the current work. However, this line of inquiry will be strongly considered in follow-up studies.

      Reviewer #3 (Public Review): 

      The manuscript from Amen et al reports the isolation and characterization of human antibodies that recognize proteins expressed at different sexual stages of Plasmodium falciparum. The isolation approach was antigen agnostic and based on the sorting, activation, and screening of memory B cells from a donor whose serum displays high transmission-reducing activity. From this effort, 14 antibodies were produced and further characterized. The antibodies displayed a range of transmission-reducing activities and recognized different Pf sexual stage proteins. However, none of these antibodies had substantially lower TRA than previously described antibodies. 

      The authors then performed further characterization of antibody B1E11K, which was unique in that it recognized multiple proteins expressed during sexual and asexual stages. Using protein microarrays, B1E11K was shown to recognize glutamate-rich repeats, following an EE-XX-EE pattern. An impressive set of biophysical experiments was performed to extensively characterize the interactions of B1E11K with various repeat motifs and lengths. Ultimately, the authors succeeded in determining a 2.6 Å resolution crystal structure of B1E11K bound to a 16AA repeat-containing peptide. Excitingly, the structure revealed that two Fabs bound simultaneously to the peptide and made homotypic antibody-antibody contacts. This had only previously been observed with antibodies directed against CSP repeats.

      Overall I found the manuscript to be very well written, although there are some sections that are heavy on field-specific jargon and abbreviations that make reading unnecessarily difficult. For instance, 'SIFA' is never defined. 

      We have carefully gone through the manuscript and have introduced abbreviations at first use, removed unnecessary abbreviations and removed unnecessary jargon to increase readability.

      Strengths of the manuscript include the target-agnostic screening approach and the thorough characterization of antibodies. The demonstration that B1E11K is cross-reactive to multiple proteins containing glutamate-rich repeats, and that the antibody recognizes the repeats via homotypic interactions, similar to what has been observed for CSP repeat-directed antibodies, should be of interest to many in the field. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Figure 1 - why only gametes ELISA and Spz or others?  

      The volumes of the single B cell supernatants were too small to screen against multiple antigens/parasite stages. As we aimed to isolate antibodies against the sexual stages of the parasite, our assay focused on this stage and supernatants were not tested against other stages. Furthermore, we screened for reactivity against gametes as TRA mAbs likely target gametes rather than other forms of sexual stage parasites.

      Figure 2 A 

      (a) Wild type (WT) and Pfs48/45 knock-out (KO) gametes.

      (b) I am a bit confused about what GMT is vs Pfs48/45 

      We have changed the column titles in Figure 2A to “wild-type gametes” and “Pfs48/45 knockout gametes” to improve clarity.  

      (c) Binding is high % why is it red? 

      We chose to present the results in a heatmap format with a graded color scale, from strong binders in red to weak binders in green. It has now been clarified in the legend of the figure. 

      Please state acronyms clearly 

      TRA - transmission reducing activity 

      SMFA - standard membrane feeding assay 

      We have added the full terms to clarify the acronyms.

      1123- VRC01 (not O1)

      We have corrected this.

      Figure 2 C bottom panels, clarify which ones are TRAbs (Assuming the Mabs with over 80% TRA at 500 ug/ml) (right gel) and the ones that are not (left gel)? 

      In the Western blot in Figure 2c, we have marked the antibodies with >80% TRA with an asterisk.

      Furthermore, we have replaced ‘TRAbs’ by ‘mAbs with >80% TRA at 500 µg/mL’ in the figure legend.

      ITC show the same affinity of the Fab to the 2 peptides but not the ELISA, not the BLI/SPR would be more appropriate. Any potential explanation?  

      The way binding affinity is determined across various techniques can result in slight differences in the determined values. For instance, ELISAs utilize long incubation times with extensive washing steps and involve a spectroscopic signal, isothermal titration calorimetry (ITC) uses a calorimetric signal at different concentration equilibria to extract a KD, and BLI determines kinetic parameters for KD determination. Discrepancies in binding affinities between orthogonal techniques have indeed been observed previously in the context of peptide-antibody binding (e.g. PMID: 34788599).

      Despite this, regardless of technique, the relative relationship in all three sets of data is the same: higher binding affinity is observed to the longer P2 peptide. This is the main takeaway of the section. As the reviewer suggests, BLI is likely the most appropriate readout here and is the only value explicitly mentioned in the main text. We primarily use ITC to support our proposed binding stoichiometry which is important to substantiate the SEC-MALS and nsEM data in Figure 4H-I. We added the following sentences to help reinforce these points: “The determined binding affinity from our ITC experiments (Table 1) differed from our BLI experiments (Fig. 4D and 4E), which can occur when measuring antibody-peptide interactions. Regardless, our data across techniques all trend toward the same finding in which a stronger binding affinity is observed toward the longer RESA P2 (16AA) peptide.”

      Figure 5C - would be helpful to have the peptide sequence above referring to what is E1, E2 etc... 

      We added two panels (Figure 5C-D) showcasing the binding interface that shows the peptide numbering in the context of the overall complex. We hope that this will help better orient the reader. 

      Figure S4 - maybe highlight in different colors the EENVV, EEIEE, Etc, etc 

      Repeats found in the sequence of the various proteins in Figure S4 have now been highlighted with different colors.

      Line 163 - why 14 mabs if 11 wells? Isn't it 1 B cell per well? The authors should explain right away that some wells have more than 1 B cell and some have 1 HC, 1LC, and 1 KC. 

      We agree that this was somewhat confusing and have modified the text which now reads: “We obtained and cloned heavy and light chain sequences for 11 out of 84 wells. For three wells we obtained a kappa light chain sequence and for five wells a lambda light chain sequence. For three wells we obtained both a lambda and kappa light chain sequence suggesting that either both chains were present in a single B cell or that two B cells were present in the well. For all 11 wells we retrieved a single heavy chain sequence. Following amplification and cloning, 14 mAbs, from 11 wells, were expressed as full human IgG1s (Table S1) (Dataset S1).”

      Line 166-167 - were they multiple HC (different ones) as well when Lambda and kappa were present?

      This is not clear at first. 

      We clarified this point in the text, see also comment above.

      Line 177 - expressed Pfs48/45 and Pfs230, is it lacking both or just Pfs48/45 (as stated on line 172)? 

      Pfs48/45 binds to the gamete surface via a GPI anchor, while Pfs230 is retained on the surface through binding to Pfs48/45. Hence, the Pfs48/45 knockout parasite will also lack surface-bound Pfs230. We have added a sentence to the Results clarifying this: “The mAbs were also tested for binding to Pfs48/45 knock-out female gametes, which lack surface-bound Pfs48/45 and Pfs230”.

      Show the ELISA data used to calculate EC50 in Figure 3. 

      ELISA binding curves are now shown as Figure S5.

      Line 313-315 - what if you reverse, capture the Fab (peptide too small even if biotinylated?) 

      As anticipated by the Reviewer, immobilizing the Fab and dipping into peptide did not yield appreciable signal for kinetic analysis and thus the experiment from this setup is not reported. 

      Line 341 - add crystal structure 

      This has now been added.

      There is a bit too much speculation in the discussion. For e.g. "The B1C5L and B1C5K mAbs were shown to recognize Domain 2 of Pfs48/45 and exhibited moderate potency, as previously described for Abs with such specificity (27). These 2 mAbs were isolated from the same well and shared the same heavy chain; their three similar characteristics thus suggest that their binding is primarily mediated by the heavy chain". Actual data will reinforce this statement. 

      As B1C5L and B1C5K recognized domain 2 of Pfs48/45 with similar affinity, this strongly suggests that binding is mediated through the heavy chain. Structural analysis could confirm this statement, but this is out of the scope of this study.

      Reviewer #2 (Recommendations For The Authors): 

      Figure 1: This figure provides a description of the workflow. To make it more relevant for the paper, the authors could add relevant numbers as the workflow proceeds. 

      (a) For example, how many memory B cells were sorted, how many supernatants were positive, and then how many mAbs were produced? These numbers can be attached to the relevant images in the workflow. 

      We modified the figure to include the numbers. 

      (b) For the "Supernatant screening via gamete extract ELISA", please change to "Supernatant screening via gamete/gametocyte extract ELISA". 

      We modified the statement as suggested. 

      Line 155: The manuscript states that 84 wells reacted with gamete/gametocyte lysate. The following sentence states that "Out of the 21 supernatants that were positive...". Can the authors provide the summary of data for all 84 wells or why focus on only 21 supernatants? 

      We screened all supernatants against gamete lysate, and only a subset against gametocyte lysate. In total, we found 84 positive supernatants that were reactive to at least one of the two lysates. 21 of those 84 positive were screened against both lysates. We have modified the text to clarify the numbers:

      “After activation, single cell culture supernatants potentially containing secreted IgGs were screened in a high-throughput 384-well ELISA for their reactivity against a crude Pf gamete lysate (Fig. S1B). A subset of supernatants was also screened against gametocyte lysate (Fig. S1C). In total, supernatants from 84 wells reacted with gamete and/or gametocyte lysate proteins, representing 5.6% of the total memory B cells. Of the 21 supernatants that were screened against both gamete and gametocyte lysates, six recognized both, while nine appeared to recognize exclusively gamete proteins, and six exclusively gametocyte proteins.”

      Please note that all 84 positive wells were taken forward for B cell sequencing and cloning. 

      Line 171: SIFA is introduced for the first time and should be completely spelled out.

      We have corrected this. 

      Figure 2: 

      (a) In Figure 2A, can you change the column title from "% pos KO GMT" to "% pos Pfs48/45 KO GMT"?

      We have changed the column titles.  

      (b) In Figure 2B, the SMFA results have been converted to %TRA. Can the authors please provide the raw data for the oocyst counts and number of mosquitoes infected in Supplementary Materials? 

      We have added oocyst count data in Table S2, to which we refer in the figure legend. 

      (c) For Figure 2F, the authors do have other domains to Pfs230 as described in Inklaar et al, NPJ Vaccines 2023. An ELISA/Western to the other domains could identify the binding site for B2C10L, though we appreciate this is not the central result of this manuscript. 

      We thank the reviewer for this suggestion. We are indeed planning to identify the target domain of B2C10L using the previously described fragments, but agree with the reviewer that this is not the focus of the current manuscript and therefore decided not to include it in the current report.

      Line 116: The word sporozoites appears in subscript and should be corrected to be normal text. 

      We have corrected this.

      Line 216: Typo "B1E11K" 

      We have corrected this.

      Materials and Methods: 

      (a) PBMC sampling: Please add the ethics approval codes in this section. 

      Donor A visited the hospital with a clinical malaria infection and provided informed consent for collection of PBMCs. We have modified the method section to clarify this. 

      “Donor A had lived in Central Africa for approximately 30 years and reported multiple malaria infections during that period. At the time of sampling PBMCs, Donor A had recently returned to the Netherlands and visited the hospital with a clinical malaria infection. After providing informed consent, PBMCs were collected, but gametocyte prevalence and density were not recorded.”

      (b) Gamete/Gametocyte extract ELISA: Can the authors please provide the concentration of antibodies used for the positive and negative controls (TB31F, 2544, and 399) 

      We have added the concentrations for these mAbs in the methods section.

      Recombinant Pfs48/45 and Pfs230 ELISA: Please state the concentration or molarity used for the coating of recombinant Pfs48/45 and Pfs230CMB. 

      We have added the concentrations, i.e. 0.5 µg/mL, to the methods section.

      Western Blotting: The protocol states that DTT was added to gametocyte extracts (Line 594), but Western Blots in Figures 2 and 3 were performed in non-reducing conditions. Please confirm whether DTT was added or not. 

      Thank you for noting this. We did not use DTT for the western blots and have removed this line from the methods section.

      Reviewer #3 (Recommendations For The Authors): 

      Below are a few minor comments to help improve the manuscript. 

      (1) In Figure 4E, are the BLI data fit to a 1:1 binding model? The fits seem a bit off, and from ITC and X-ray studies it is known that 2 Fabs bind 1 peptide. The second Fab should presumably have higher affinity than the first Fab since the second Fab will make interactions with both the peptide and the first Fab. It may be better to fit the BLI data to a 2:1 binding model. 

      The 2:1 (heterogeneous ligand) model assumes that there are two different, independent binding sites. However, the second binding event described here depends on the first, so this model does not accurately reflect the system either. Given that there is no ideal model to fit, we are instead careful about the language used in the main text to describe these results. We have also added a sentence to the Results section to ensure that the proper findings and interpretations are highlighted: “…our data all trend toward the same finding in which a stronger binding affinity is observed toward the longer RESA P2 (16AA) peptide.”
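      For readers unfamiliar with the model under discussion, the following is a minimal sketch of the standard 1:1 (Langmuir) association curve. It is not the fitting procedure used for Figure 4E, and the function names and all parameter values are hypothetical. It assumes a single, independent binding site, which is exactly the assumption that a sequential two-Fab interaction violates.

```python
import math


def req_1to1(conc, kd_eq, rmax):
    """Equilibrium response for a 1:1 interaction: Rmax * C / (C + KD)."""
    return rmax * conc / (conc + kd_eq)


def association_1to1(t, conc, ka, kd, rmax):
    """Response at time t during the association phase of a 1:1 model.

    ka is the association rate constant (1/(M*s)) and kd the dissociation
    rate constant (1/s); the observed rate is kobs = ka * C + kd.
    """
    kobs = ka * conc + kd
    return req_1to1(conc, kd / ka, rmax) * (1.0 - math.exp(-kobs * t))


if __name__ == "__main__":
    # Hypothetical constants chosen only for illustration (KD = 10 nM).
    ka, kd, rmax = 1e5, 1e-3, 1.0
    for t in (0, 100, 500, 2000):
        print(t, round(association_1to1(t, 1e-8, ka, kd, rmax), 4))
```

      When the analyte concentration equals KD, the curve plateaus at half of Rmax; systematic deviations of measured traces from this shape are one sign that a single-site model is not an ideal description.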

      (2) The sidechain interactions shown in Figures 5C and D could probably be improved. The individual residues are just 'floating' in space, causing them to lack context and orientation. 

      We added two panels (Fig. 5C-D) that show the binding interface, with the peptide numbering, in the context of the overall complex. We hope that this will help orient the reader.

      (3) The percentage of Ramachandran outliers should be listed in Table 2. Presumably, the value is 0.2%, but this is omitted in the current table. 

      Table 2 has been modified to include the requested information explicitly.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      The work is well performed and thoroughly convincing. 

      However, a few points could be improved, by adjusting the manuscript: 

      (1) The wording of the abstract is confusing for the casual reader. The initial impression is that the 2-copy complexes contain the majority of the PSD95 copies. This is not the case, as shown in panel cii. It would be important for the authors to explain in the abstract the exact percentage of molecules found within 2-copy complexes. 

      We have now amended the abstract, making it clear that it’s not most of the complexes.  

      (2) Did the authors find a sizeable population of 2-copy complexes by investigating wild-type proteins, using nanobody labeling (Figure S2)? It would be important to quantify and discuss these data. 

      It was not possible to perform this analysis on the wild-type proteins. The quantification would rely on all the PSD95 molecules being bound by the antibody, which we cannot guarantee. Furthermore, the nanobody labeling would need to be stoichiometric. 

      (3) The authors quote the separation value of 12.7 nm throughout their text, including the abstract. This may be somewhat misleading since the authors investigate the PSD95-GFP molecules, labeled using anti-GFP nanobodies. The large size of the two GFP molecules (~3 nm), and that of the nanobodies, will influence the readout. Two groups have already reported a separation of ~7-8 nm between neighboring PSD95 molecules in synapses, using PSD95 nanobodies, to minimize the linkage error: https://doi.org/10.1101/2022.08.03.502284 and https://doi.org/10.1101/2023.10.18.562700

      The difference observed here is consistent with an effect of the additional GFP moieties; the authors should cite these works (albeit they are now only provided as bioRxiv pre-prints) and should mention this discrepancy, and its potential tagging-related explanation. 

      We have now referenced the work and referred to this in the discussion.

      (4) The authors may want to re-check the manuscript; some minor problems should be corrected, such as the mislabeling of Figure 2 and "Figure 5". 

      This has now been corrected.  

      Reviewer #2 (Recommendations For The Authors): 

      The authors suggest that the stability of the PSD95 dimeric complex correlates with memory formation. However, the turnover experiments were conducted only on three-month-old animals, which can be considered to be at a stage of lower synaptic functionality turnover. It would be appropriate to study dimer turnover during the memory formation period at three to four weeks of age, for example in comparison to the oldest mice. 

      Alternatively, it might be interesting to study the turnover in the hippocampus following exposure to a memory test. 

      Whilst potentially useful, these experiments are outside of the scope of this manuscript.   

      It is not clear whether the different turnover identified in various brain areas is statistically significant, as apparently no statistical analysis has been conducted. 

      The findings were significant, and the SI table containing the p-values has been emphasized further in the manuscript.  

      Reviewer #3 (Recommendations For The Authors): 

      (1) In the last paragraph of the Results section, it could be made clearer what the nature is of the correlation between PSD95 half-life and mixed supercomplexes to understand how to interpret this correlation. In the discussion, it is concluded that stable synapses have long protein lifetimes and slow replacement of scaffolding proteins. However, this is based on the correlation of protein lifetime and mixed supercomplexes in the cortex, which does not provide any evidence that this relation is true in single synapses or is specific for stable synapses. To make this statement, the authors could for instance directly correlate the stoichiometry of supercomplexes with the protein lifetime and size of individual synapses. 

      Unfortunately, we can’t directly measure the lifetime of each complex, and so it’s only possible to compare region-to-region. In doing so, we found that there was a correlation between the protein lifetime and the “mixed” population.  

      (2) Some essential parts seem missing: the materials and methods and Figure 2 are not included. Also, the numbering of figures is incorrect. Both in the figure legends and the text. 

      This has been added. 

      (3) Figure 1a could contain more details of the experimental procedures. For example, it could be made clearer how PSD95 supercomplexes are isolated from brain homogenate. 

      This is now presented in the Methods.

      (4) In Figure 1c, single molecules of PSD95 are identified using PALM with a resolution of 30 nm. However, in Figure 1d it is shown that PSD95 molecules reside on average 13 nm apart, indicating that a resolution of 30 nm is not sufficient to resolve single PSD95 molecules. In addition, it would be of interest to show the distribution of fluorophore separation (assessed with MINFLUX) of only the supercomplexes with two PSD95 molecules, since only these were used to calculate the average distance. 

      The 13 nm distance was measured using MINFLUX, as stated in the text. The fluorophore separation distances are shown in Figure 1dii.

      (5) In the introduction, the authors could be more explicit in their explanation of memory formation and storage and how the presented study contributes to that field. 

      We thank the reviewer for the suggestion, but feel that such a discussion in the introduction would detract from the main points of the manuscript.  

      (6) Throughout the manuscript the authors prominently cite their own work, but relevant literature on synaptic plasticity and synapse nanostructure (EM and super-resolution studies) is lacking. 

      Further references have now been added.  

      (7) The results depicted in Figure 4b would be easier to interpret if a stacked histogram (including error bars) was used. 

      We agree that the data could be presented in such a way, but that would prevent the results from the biological repeats, along with the variation, being presented.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Intro. 

      47-48 rewrite sentence

      This sentence has been rewritten as: Photoreceptor synapses are specialized with a vesicle-associated ribbon organelle and postsynaptic neurites of horizontal and bipolar cells that invaginate deep within the terminal

      Results 

      Major comment. Lines 100-103 

      The new rod data presented here looks like an n = 1. Neither the Results section nor Supp Fig S1 describes the number of cells used. Nor do the authors offer a statistical description with averages, etc. In addition, the single traces are much improved over their previous study (Maddox et al eLife 2020), but the authors have not described any new approach or trick that improved their rod Ica. Neither the Methods section nor the Supp section describes the procedure for patching rods (solutions, or Vh which is critical for assessing T-type currents). 

      Suggestion, if more data exists, then present it. Otherwise, drop the argument. 

      The methodology for recording from rods was similar to that for cones, and this has been clarified in the Methods section (lines 725-752). Averaged data (n = at least 5 per group) and statistical analyses have been added to Fig. S1 (renamed Figure 2-Figure Supplement 1), and clearly show that no Ca2+ currents are present in the KI rods.

      Supp Fig S2. The legend needs to be fixed. Conversion to PDF file may have created these formatting errors. 

      This has been corrected (renamed Figure 3-Figure Supplement 2).

      Fig 8 a. The position of the light stimulus bar in the KO panel appears to be out of place, shifted too far to the left. 

      This has been corrected.

      Major comments. 219-221 

      The use of Fluo3-AM is not properly stated here. The text reads "cone pedicles filled with the Ca2+ indicator Fluo3". The wording used could be wrongly interpreted as: whole-cell filling of the cones via patch electrode. However, the Methods section describes bathing the retina in Fluo3-AM, which presumably fills PRs, HCs tips, Mueller glia and bpc dendrites. The Results section should acknowledge that the retina was loaded with Fluo3-AM. 

      The cell types, and their processes (Muellers, HCs, bpc, PRs), present in a cone pedicle ROI will likely contribute to the Fluo3 readout of Ca2+ in the OPL, because 1) the EM images in Fig 7 highlight how interdigitated the processes are with the presynapse, 2) all express Cav channels, and many if not all express L-Type Cavs in their processes (glia, HC, on-bcs and PRs), and 3) all are depolarized with the addition of high extracellular KCl. The inclusion of Isradipine will inhibit L-type Cavs on pre- and post-synaptic targets, failing to specifically isolate PR Ca2+. Furthermore, Glu Receptor blockers are used here, which would be a great idea if the cones were stimulated with light; however, KCl bypasses the excitatory synaptic pathway and depolarizes all processes within the ROI. Hence, all cellular parts in the ROI will potentially contribute to Fluo3-Ca2+ signals. 

      Suggestions for presentation of these findings. Ultimately your conclusion is suitable " 233 to 234...... Taken together, our results suggest that Cav3 channels nominally support Ca2+ signals and synaptic transmission in cones of G369i KI mice". The dramatic reduction in Fluo3-Ca2+ signals in the OPL G369i retinas (Fig 9) is a valuable finding for the following reasons: 1) the results do not show a clear compensation from intracellular stores that could potentially supersede the T-type currents in the G369i (which is an argument you make), and 2) there is a massive loss of Ca2+ influx in the OPL of G369i retinas. Since G369i is specific to the PRs, and only cones are present in the mutant G369i, the loss of Fluo3-Ca2+ signal in the mutant ROI reflects in large part loss of cone Fluo3-Ca2+ signals. Your findings illustrate the severity of the mutation, which has also been addressed in the various electro-physio sections of the MS. 

      Figure 9 also needs to be more clear about 1) the loading of the cells with AM-dye, and 2) the presence of glia, HCs and bc dendrites in the PNA demarcated ROIs. 

      We regret that we did not make this clearer, but our Fluo3 loading protocol of whole retina followed by vertical slicing allowed for loading primarily of photoreceptors in the portion of the outer retina that we imaged. We clarified this with the following edit to the text (lines 220-226):

      “To test if the diminished HC light responses correlated with lower presynaptic Ca2+ signals in G369i KI cones, we performed 2-photon imaging of vertical slices prepared from whole retina that was incubated with the Ca2+ indicator Fluo3-AM and Alexa-568-conjugated peanut agglutinin (PNA) to demarcate regions of interest (ROIs) corresponding to cone pedicles. With this approach Fluo3 fluorescence was detected only in photoreceptors and ganglion cells and not inner retinal cell-types (e.g., horizontal cells, bipolar cells, Mueller cell soma). Thus, Ca2+ signals reported by Fluo3 fluorescence near PNA-labeling originated primarily from cones.”

      We also note that given the considerably larger volume of the cone pedicle relative to the postsynaptic neurites of horizontal and bipolar cells, as well as neighboring glia, it seems unlikely that the latter would contribute significantly to the isradipine-sensitive Ca2+ signal measured in the ROI above the PNA labeling. Moreover, to our knowledge the contribution of Cav1 L-type channels to postsynaptic Ca2+ signals in the dendritic tips of horizontal cells and bipolar cells has not been demonstrated.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review):

      Major shortcomings include the unusual normalization strategies used for many experiments and the lack of quantification/statistical analyses for several experiments. Because of these omissions, it is difficult to conclude that the data justify the conclusions. The significance of the data presented is overstated, as many of the experiments presented confirm/support previously published work. The study provides a modest advance in the understanding of the complex issue of SHH membrane extraction.

      Major shortcomings include the unusual normalization strategies used for many experiments and the lack of quantification/statistical analysis for several experiments.

      This statement is not correct for the revised manuscript: The normalization strategies used are clearly described in the manuscript and are not unusual. Each experiment is now statistically analyzed.

      The significance of the data presented is overstated, as many of the experiments presented confirm/support previously published work.

      As reviewer 2 correctly points out, there are many competing models for Hedgehog release. Our study cannot possibly support them all, and the reviewer's statement is therefore misleading. In fact, our careful biochemical analysis of the mechanism of Dispatched-mediated Shh export supports only two of them: the model of proteolytic processing of Shh lipid anchors (shedding) and the model of lipoprotein-mediated Shh transport. In contrast, our study does not support the predominant model of Dispatched-mediated extraction of dual-lipidated Shh and delivery to Scube2, which is currently thought to act as a soluble Shh chaperone. Nor do our data support Dispatched function in Shh endocytic recycling and cytoneme loading, or any of the other models such as exosome-mediated or micelle Shh transport.

      Reviewer #2 (Public Review):

      A novel and surprising finding of the present study is the differential removal of Shh N- or C- terminal lipid anchors depending on the presence of HDL and/or Disp. In particular, the identification of a non-palmitoylated but cholesterol-modified Shh variant that associates with lipoproteins is potentially important. The authors use RP-HPLC and defined controls to assess the properties of processed forms of Shh, but their precise molecular identity remains to be defined. One caveat is the heavy reliance on overexpression of Shh in a single cell line. The authors detect Shh variants that are released independently of Disp and Scube2 in secretion assays, but these are excluded from interpretation as experimental artifacts. Therefore, it would be important to demonstrate key findings in cells that endogenously secrete Shh.

      We would like to respond as follows:

      The authors use RP-HPLC and defined controls to assess the properties of processed forms of Shh, but their precise molecular identity remains to be defined.

      This is the reviewer's original statement regarding our original manuscript submission. We believe that the biochemical and functional data presented in the VOR clearly describe the molecular identity of solubilized Shh: it is monolipidated, lipoprotein-associated, and highly biologically active in two established Shh bioassays.

      One caveat is the heavy reliance on overexpression of Shh in a single cell line.

      As stated by reviewer 1, the strength of our work is the use of a bicistronic SHH-Hhat system to consistently generate doubly lipidated ligand to determine the amount and lipidation status of SHH released into cell culture media. This unique system therefore eliminates the artifacts of protein overexpression. We have also added two other cell lines to our VOR that produce the same results (including Panc1 cells that endogenously produce Shh, Supplementary Figure 1).

      The authors detect Shh variants that are released independently of Disp and Scube2 in secretion assays, but these are excluded from interpretation as experimental artifacts.

      As the reviewer correctly points out, these variants are released independently of Disp and Scube2, both of which are known as essential release factors in vivo. These variants are therefore by definition experimental artifacts. The forms we have included in our analysis are the alternative forms that are clearly dependent on Dispatched and Scube2 for their release, as shown in the first figure of the manuscript and in nearly every subsequent figure.


      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Key shortcomings include the unusual normalization strategies used for many experiments and the lack of quantification/statistical analyses for several experiments.

      In the updated version of the paper, we have addressed all of this reviewer's criticisms. Most importantly, we have performed several additional experiments to address the concern that unusual normalization strategies were used in our paper and that quantification and statistical analyses were lacking for several experiments. We have now analyzed the full set of release conditions for Shh and engineered proteins from Disp-expressing n.t. control cells and Disp-/- cells both in the presence and absence of Scube2 (Figure 1A'-D', Figure 2E added to the paper, Figure 3B'-D', Figure 5C and Figure S2F-H). Previously, we had only quantified protein release from n.t. controls and Disp-/- cells in the presence but not in the absence of Scube2 under serum-depleted conditions. Quantifications of serum-free protein release and Shh release under conditions ranging from 0.05% FCS to 10% FCS were completely missing from the earlier versions of the manuscript, but have now been added to our paper. In addition, we have reanalyzed all of the data sets in the above figures, as well as Figures 2C and S1B, to address the issue of "unusual normalization strategies": unlike previous assays in which the highest amount of protein detected in the media was set to 100% and all other proteins in that experiment were expressed relative to that value, we now directly compare the relative amounts of cellular and corresponding solubilized proteins as a method to quantify release without the need for data normalization (Figs. 1A'-D', 2C,E, 3B'-D', E, 5C, Fig. S1B, S2F-H).
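      The difference between the two quantification schemes described above can be illustrated with a minimal sketch. The function names and band intensities below are hypothetical and are not taken from the manuscript; the sketch only contrasts expressing each signal relative to the strongest signal in an experiment with directly comparing solubilized and cellular protein from the same sample.

```python
def percent_of_max(signals):
    """Earlier scheme: set the strongest signal to 100% and scale the rest."""
    top = max(signals.values())
    return {name: 100.0 * value / top for name, value in signals.items()}


def release_ratio(media, cellular):
    """Revised scheme: solubilized protein relative to the corresponding
    cellular protein, with no reference to other samples."""
    return {name: media[name] / cellular[name] for name in media}


if __name__ == "__main__":
    # Hypothetical densitometry values (arbitrary units).
    media = {"control": 40.0, "Disp-/-": 10.0}
    cellular = {"control": 80.0, "Disp-/-": 100.0}
    print(percent_of_max(media))           # depends on the strongest sample
    print(release_ratio(media, cellular))  # each sample stands on its own
```

      In the first scheme every value changes if the strongest sample changes, whereas in the second each condition is quantified from its own cellular and media fractions.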

      We have also repeated the qPCR analyses in C3H10T1/2 cells and now show that the same Shh/C25AShh activities can be observed when using another Shh-responsive cell line, NIH3T3 cells (Fig. 4B, 6B, Fig. S5B).

      We would like to point out that if the criticism refers to the presentation of our RP-HPLC and SEC data, the normalization of the strongest eluted protein signal to 100% for all proteins tested is necessary to put their behavior in a clearer relationship. This is because only the relative positions of protein elution, and not their amounts, are important in these experiments.

      The significance of the data provided is overstated because many of the presented experiments confirm/support previously published work.

      To mitigate the first reviewer's comment that the significance of the data presented is overstated, we now clearly distinguish between our novel results and the known aspect of Hh release on lipoproteins throughout our paper. We now clearly describe what is new and important in our paper: First, contrary to the general perception in the field, Disp and Scube2 are not sufficient to solubilize Shh, casting doubt on the currently accepted model that Scube2 accepts dual-lipidated Shh from Disp and transports it to the receptor Ptch. Second, lipoproteins shift dual Shh processing to N-terminal peptide processing only to generate different soluble Hh forms with different activities (as shown in Figure 4C). Third, and again contrary to popular belief, this new release mode does not inactivate Shh, as we now show in two established cellular assays for Hh biofunction (Figures 4A-C, 5B'', 6B and S5C-G). Fourth, and most importantly, we show that spatiotemporally controlled, Disp-, Scube2- and HDL-mediated Shh release absolutely requires dual lipidation of the membrane-associated Shh precursor prior to its release. This finding (as shown in Figures 1 and S2) changes the interpretation of previously published in vivo data that have long been interpreted as evidence for the requirement of dual Shh lipidation for full receptor binding and activation.

      The study provides a modest advance in our understanding of the complex issue of Shh membrane extraction.

      Although we agree that our results integrate our novel observations into previously established concepts of Hh release and trafficking, we also hope that our data cast well-founded doubt on the current view that the issue of Hh release and trafficking is largely resolved by the model of Disp-mediated Shh hand-over to Scube2 and then to Ptch, which requires interactions with both Shh lipids. Our data show that this is clearly not the case in the presence of lipoproteins. Thus, the significance of our data is that models of Shh lipid-regulated signaling to Ptch obtained using the dual-lipidated Shh precursor prior to its Disp- and Scube2-mediated conversion into a delipidated or monolipidated, HDL-associated soluble ligand are likely to describe a non-physiological interaction. Instead, our work describes a highly bioactive soluble ligand with only one lipid still attached, which has not been described before in the literature. The in vivo endpoint analyses presented in Fig. S8 suggest that this new protein variant is likely to play an important role during development.

      Reviewer #2 (Public Review):

      The precise molecular identity (of the released Shh) remains to be defined.

      We would like to respond that the direct comparison of soluble proteins and their well-defined double-lipidated precursors side-by-side in the same experiment, as shown in our paper, determines all relevant molecular changes in the Shh release process. Most importantly, we show by SDS-PAGE and RP-HPLC that HDL restricts Shh processing to the N-terminus and that the absence of HDL results in double processing of Shh during its release. We also show by SEC that the C-terminus binds the protein to HDL. In addition, the fly experiments confirm the requirement for N-terminal Hh processing, but not for processing of the C-terminal peptide, and suggest that the N-terminal Cardin-Weintraub sequence replaced by the functionally blocking tag represents the physiological cleavage site.

      It would be important to demonstrate key findings in cells that secrete Shh endogenously.

      We now confirm the key findings of our study in Panc1 cells that endogenously produce and secrete Shh: As shown in Fig. S1D, we find that soluble proteins are processed but retain the C-cholesterol, which we now directly confirm by RP-HPLC (Fig. S4F-H). The in vivo analyses shown in Fig. S8 suggest that the key finding - that N-terminal but not C-terminal Hh shedding is required for release - can be supported, at least in the fly: here, Hh variants impaired in their ability to be processed N-terminally strongly repress the endogenous protein, whereas the same protein impaired in its ability to be processed C-terminally does not.

      The authors detect Shh variants that are expressed independently of Disp and Scube2 in secretion assays, but are excluded from interpretation as experimental artifacts.

      We agree with the reviewer's criticism that the amounts of Shh released independently of Disp and Scube2 in secretion assays were not quantified and analyzed statistically to justify their proposed status as not physiologically relevant. We now show that these forms are indeed secretion artifacts (Fig. 3E and Fig. S2F-H show quantification of the lower electrophoretic mobility protein fraction (i.e., the "top" band representing the double-lipidated soluble protein fraction)) because this fraction is released independently of Disp and Scube2.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This interesting study explores the mechanism behind an increased susceptibility of daf-18/PTEN mutant nematodes to paralyzing drugs that exacerbate cholinergic transmission. The authors use state-of-the-art genetics and neurogenetics coupled with locomotor behavior monitoring and neuroanatomical observations using gene expression reporters to show that the susceptibility occurs due to low levels of DAF-18/PTEN in developing inhibitory GABAergic neurons early during larval development (specifically, during the larval L1 stage). DAF-18/PTEN is convincingly shown to act cell-autonomously in these cells upstream of the PI3K-PDK-1-AKT-DAF-16/FOXO pathway, consistent with its well-known role as an antagonist of this conserved signaling pathway. The authors exclude a role for the TOR pathway in this process and present evidence implicating selectivity towards developing GABAergic neurons. Finally, the authors show that a diet supplemented with a ketogenic body, β-hydroxybutyrate, which also counteracts the PI3K-PDK-1-AKT pathway, promoting DAF-16/FOXO activity, partially rescues the proper development (morphology and function) of GABAergic neurons in daf-18/PTEN mutants, but only if the diet is provided early during larval development. This strongly suggests that the critical function of DAF-18/PTEN in developing inhibitory GABAergic neurons is to prevent excessive PI3K-PDK-1-AKT activity during this critical and particularly sensitive period of their development in juvenile L1 stage worms. Whether or not the sensitivity of GABAergic neurons to DAF-18/PTEN function is a defining and widespread characteristic of this class of neurons in C. elegans and other animals, or rather a particularity of the unique early-stage GABAergic neurons investigated remains to be determined.

      Strengths:

      The study reports interesting and important findings, advancing the knowledge of how daf-18/PTEN and the PI3K-PDK-1-AKT pathway can influence neurodevelopment, and providing a valuable paradigm to study the selectivity of gene activities towards certain neurons. It also defines a solid paradigm to study the potential of dietary interventions (such as ketogenic diets) or other drug treatments to counteract (prevent or revert?) neurodevelopment defects and stimulate DAF-16/FOXO activity.

      Weaknesses:

      (1) Insufficiently detailed methods and some inconsistencies between Figure 4 and the text undermine the full understanding of the work and its implications.

      The incomplete methods presented, the imprecise display of Figure 4, and the inconsistency between this figure and the text, make it presently unclear what are the precise timings of observations and treatments around the L1 stage. What exactly do E-L1 and L1-L2 mean in the figure? The timing information is critical for the understanding of the implications of the findings because important changes take place with the whole inhibitory GABAergic neuronal system during the L1 stage into the L2 stage. The precise timing of the events such as neuronal births and remodelling events are well-described (e.g., Figure 2 in Hallam and Jin, Nature 1998; Fig 7 in Mulcahy et al., Curr Biol, 2022). Likewise, for proper interpretation of the implication of the findings, it is important to describe the nature of the defects observed in L1 larvae reported in Figure 1E - at present, a representative figure is shown of a branched commissure. What other types of defects, if any, are observed in early L1 larvae? The nature of the defects will be informative. Are they similar or not to the defects observed in older larvae?

      We thank the reviewer for highlighting these areas for improvement. We have updated and clarified the timing of observation in the text, figures, and methodology section accordingly.

      All experiments were conducted using age-synchronized animals. Gravid worms were placed on NGM plates and removed after two hours. The assays were then carried out on animals that hatched from the eggs laid during this specific timeframe.

      Regarding the detailed timings outlined in the original Figure 4 (now Figure 5 in the revised version), we provided the following information in the revised version: For experiments involving continuous exposure to βHB throughout development, the gravid worms were placed on NGM plates containing the ketone body and removed after two hours. Therefore, this exposure covered the ex-utero embryonic development period up to the L4-Young adult stage when the experiments were conducted.

      In experiments involving exposure at different developmental stages, such as those depicted in Figure 4 of the original version (now Figure 5, revised version), animals were transferred between plates with and without βHB as required. We exposed daf-18/PTEN mutant animals to βHB-supplemented diets for 18-hour periods at different developmental stages (Figure 5A, revised version). The earliest exposure occurred during the 18 hours following egg laying, covering ex-utero embryonic development and the first 8-9 hours of the L1 stage. The second exposure period encompassed the latter part of the L1 stage, the entire L2 stage, and most of the L3 stage. The third exposure spanned the latter part of the L3 stage (~1-2 hours), the entire L4 stage, and the first 6-7 hours of the adult stage.

      All this information has been included in Figure 5, the text (Page 13, lines 259-276), and the methodology (Page 4, lines 85-90, Revised Methods and Supplementary Information) of the revised manuscript.

      In response to the reviewer's suggestion, we have also included photos of daf-18 worms at the L1 stage (30 min/1h post-hatching). Defects are already present at this early stage, such as handedness and abnormal branching commissures, which are also observed in adult worm neurons (see Supplementary Figure 4, revised version). 

      These defects manifest in DD neurons shortly after larval birth. The prevalence of animals with errors is higher in L4 worms (when both VDs and DDs are formed) compared to early L1s (Figures 3 C-E and Supplementary Figure 4, revised version). This suggests that defects in VD neurons also occur in daf-18 mutants. Indeed, when we analyzed the neuronal morphology of several wild-type and daf-18 mutant animals, we found defects in the commissures corresponding to both DD and VD neurons (Supplementary Figure 3, revised version). 

      These data are now included in the revised version (Results (Page 10, lines 177-196), Discussion (Pages 14-16), Main Figure 3, and Supplementary Figures 3, 4 and 7, revised version).

      (2) The claim of proof of concept for a reversal of neurodevelopment defects is not fully substantiated by data.

      The authors state that the work "constitutes a proof of concept of the ability to revert a neurodevelopmental defect with a dietary intervention" (Abstract, Line 56), however, the authors do not present sufficient evidence to distinguish between a "reversal" or prevention of the neurodevelopment defect by the dietary intervention. This clarification is critical for therapeutic purposes and claims of proof-of-concept. To the best of my understanding, reversal formally means the defect was present at the time of therapy, which is then reverted to a "normal" state with the therapy. On the other hand, prevention would imply an intervention that does not allow the defect to develop to begin with, i.e., the altered or defective state never arises. In the context of this study, the authors do not convincingly show reversal. This would require showing "embryonic" GABAergic neuron defects or showing convincing data in newly hatched L1 (0-1h), which is unclear if they do so or not, as I have failed to find this information in the manuscript. Again, the method description needs to be improved and the implications can be very different if the data presented in Figure 2D-E regard newly born L1 animals (0-1h) or L1 animals at say 5-7h after hatching. This is critical because the development of the embryonically-born GABAergic DD neurons, for instance, is not finalized embryonically. Their neurites still undergo outgrowth (albeit limited) upon L1 birth (see DataS2 in Mulcahy et al., Curr Biol 2022), hence they are susceptible to both committing developmental errors and to responding to nutritional interventions to prevent them. In contrast to embryonic GABAergic neurons, embryonic cholinergic neurons (DA/DB) do not undergo neurite outgrowth post-embryonically (Mulcahy et al., Curr Biol 2022), a fact which could provide some mechanistic insight considering the data presented. However, neurites from other post-embryonically-born neurons also undergo outgrowth post-embryonically, but mostly during the second half of the L1 stage following their birth up to mid-L2, with significant growth occurring during the L1-L2 transition. These are the cholinergic (VA/VB and AS neurons) and GABAergic (VD) neurons. The fact that AS neurons undergo a similar amount of outgrowth as VD neurons is informative if VD neurons are or are not susceptible to daf-18/PTEN activity. Independently, DD neurons are still quite unique on other aspects (see below), which could also bring insight into their selective response.

      Finally, even adjusting the claim to "constitutes a proof-of-concept of the ability of preventing a neurodevelopmental defect with a dietary intervention" would not be completely precise, because it is unclear how much this work "constitutes a proof of concept". This is because, unless I misunderstood something, dietary interventions are already applied to prevent neurodevelopment defects, such as when folic acid supplementation is recommended to pregnant women to prevent neural tube defects in newborns.

      Thank you very much for pointing out this issue and highlighting the need to further investigate the ameliorative capacity of βHB on GABAergic defects in daf-18 mutants. In the revised version, we have included experiments to address this point.

      Our microscopy analyses strongly indicate that the development of DD neurons is affected, with errors observed as early as one hour post-hatching (Main Figure 3, and Supplementary Figures 4 and 7, revised version). Additionally, based on the position of the commissures in L4s, our results strongly suggest that VD neurons are also affected (Supplementary Figure 3, revised version). Both the frequency of animals with errors and the number of errors per animal are higher in L4s compared to L1 larvae (Main Figure 3, and Supplementary Figures 4 and 7, revised version). It is very likely that the errors in VD neurons, which are born in the late L1 stage, are responsible for the higher frequency of defects observed in L4 animals.

      As the reviewer noted, GABAergic DD neurons, which are born embryonically, do not complete their development during the embryonic stages. Some defects in DD neurons may arise during the postembryonic period. Following the reviewer's suggestion, we analyzed L1 larvae at different times before the appearance of VDs (1 hour post-hatching and 6 hours post-hatching). We did not observe an increase in error prevalence, suggesting that DD defects in daf-18 mutants are mostly embryonic (Supplementary Fig 4B, Revised Version). 

      Our findings suggest that βHB's enhancement is not due to preventive effects in DDs, as defects persist in newly hatched larvae regardless of βHB presence (Supplementary Figure 7, revised version), and postembryonic DD growth does not introduce new errors (Supplementary Figure 4, revised version). This lack of preventive effect could be due to βHB's limited penetration into the embryonic environment. Unlike in early L1s, significant improvement occurs in L4s upon early βHB exposure (Supplementary Figure 7, revised version). This could be explained by a reversing effect on malformed DD neurons and/or a protective influence on VD neuron development. While we cannot rule out the first option, even if all errors in DDs in L1 were repaired (which is very unlikely), it would not explain the level of improvement in L4 (Supplementary Figure 7, revised version). Therefore, we speculate that VDs may be targeted by βHB. The notion that exposure to βHB during early L1 can ameliorate defects in neurons primarily emerging in late L1s (VDs) is intriguing. We hypothesize that residual βHB or a metabolite from prior exposure could forestall these defects in VD neurons. Notably, βHB has demonstrated a capacity for long-lasting effects through epigenetic modifications (Reviewed in He et al, 2023, https://doi.org/10.1016%2Fj.heliyon.2023.e21098). More work is needed to elucidate the fundamental mechanisms underlying the ameliorating effects of βHB supplementation. We now discuss these possibilities in the Discussion (Page 17, lines 369-383, revised version).

      We agree with the reviewer that the term "reversal" is not accurate, and we have avoided using this terminology throughout the text. Furthermore, in the title, we have decided to change the word "rescue" to "ameliorate," as our experiments support the latter term but not the former. Additionally, the reviewer is correct that folic acid administration to pregnant women is already a metabolic intervention to prevent neural tube defects. In light of this, we have avoided claiming this as proof of concept in the revised manuscript.

      (3) The data presented do not warrant the dismissal of DD remodeling as a contributing factor to the daf-18/PTEN defects.

      Inhibitory GABAergic DD neurons are quite unique cells. They are well-known for their very particular property of remodeling their synaptic polarity (DD neurons switch the nature of their pre- and postsynaptic targets without changing their wiring). This process is called DD remodeling. It starts in the second half of the L1 stage and finishes during the L2 stage. Unfortunately, the fact that the authors find a specific defect in early GABAergic neurons (which are very likely these unique DD neurons) is not explored in sufficient detail and depth. The facts that these neurons are not fully developed at L1, that they still undergo limited neurite growth, and that they are poised for striking synaptic plasticity in a few hours set them apart from the other explored neurons, such as early cholinergic neurons, which show a more stable dynamics and connectivity at L1 (see Mulcahy et al., Curr Biol 2022).

      The authors use their observation that daf-18/PTEN mutants present morphological defects in GABAergic neurons prior to DD remodeling to dismiss the possibility that the DAF-18/PTEN-dependent effects are "not a consequence of deficient rearrangement during the early larval stages". However, DD remodeling is just another cell-fate-determined process and as such, its timing, for instance, can be affected by mutations in genes that affect cell fates and developmental decisions, such as daf-18 and daf-16, which affect developmental fates such as those related with the dauer fate. Specifically, the authors do not exclude the possibility that the defects observed in the absence of either gene could be explained by precocious DD remodeling. Precocious DD remodeling can occur when certain pathways, such as the lin-14 heterochronic pathway, are affected. Interestingly, lin-14 has been linked with daf-16/FOXO in at least two ways: during lifespan determination (Boehm and Slack, Science 2005) and in the L1/L2 stages via the direct negative regulation of an insulin-like peptide gene ins-33 (Hristova et al., Mol Cell Bio 2005). It is likely that the prevention of DD dysfunction requires keeping insulin signaling in check (downregulated) in DD neurons in early larval stages, which seems to coincide with the critical timing and function of daf-18/PTEN. Hence, it will be interesting to test the involvement of these genes in the daf-18/daf-16 effects observed by the authors.

      This is another interesting point raised by the reviewer. We have demonstrated that defects manifest in early L1 (30 min-1 hour post-hatching) which corresponds to a pre-remodeling time in wild-type worms.

      We acknowledge the possibility of early remodeling in specific mutants as pointed out by the reviewer.

      However, the following points suggest that the effects of these mutations may extend beyond the particularity of DD remodeling: i) Our experiments also show defects in VD neurons in daf-18 mutants (Supplementary Figure 3, revised version), as discussed in our previous response. These neurons do not undergo significant remodeling during their development. ii) DAF-18 and DAF-16 deficiencies produce neurodevelopmental alterations in other non-remodeling neurons: severe neurite defects in neurons that are nearly fully formed at larval hatching, such as AIY, have been previously reported in daf-18 and daf-16 mutants (Christensen et al., 2011). Additionally, the migration of another neuron, HSN, is severely affected in these mutants (Kennedy et al., 2013). iii) To the best of our knowledge, DD remodeling only alters synaptic polarity without forming new commissures or significantly altering the trajectory of the formed ones. Thus, it is unlikely (though not impossible) for remodeling defects to cause the observed commissural branching and handedness abnormalities in DD neurons. Therefore, we think that the impact of daf-18 mutations on GABAergic neurons is not primarily linked to DD remodeling but extends to various neuron types. The apparent resilience of cholinergic motor neurons to these mutations is intriguing and requires further exploration. This resilience is not limited to daf-18/PTEN animals, since mutants in certain genes expressed in both neuron types (such as the neuronal integrin ina-1 or eel-1, the C. elegans ortholog of HUWE1) alter the function or morphology of GABAergic neurons but not cholinergic motor neurons (Kowalski, J. R. et al. Mol Cell Neurosci 2014; Oliver, D. et al. J Dev Biol 2019; Opperman, K. J. et al. Cell Rep 2017). These points are discussed in the manuscript (Discussion, page 15, lines 311-322, revised version) and point to the existence of compensatory or redundant mechanisms in these excitatory neurons, rendering them much more resistant to both morphological and functional abnormalities.

      Discussion on the impact of the work on the field and beyond:

      The authors significantly advance the field by bringing insight into how DAF-18/PTEN affects neurodevelopment, but fall short of understanding the mechanism of selectivity towards GABAergic neurons, and most importantly, of properly contextualizing their findings within the state-of-the-art C. elegans biology.

      For instance, the authors do not pinpoint which type of GABAergic neuron is affected, despite the fact that there are two very well-described populations of ventral nerve cord inhibitory GABAergic neurons with clear temporal and cell fate differences: the embryonically-born DD neurons and the postembryonically-born VD neurons. The time point of the critical period apparently defined by the authors (pending clarifications of methods, presentation of all data, and confirmation of inconsistencies between the text and figures in the submitted manuscript) could suggest that DAF-18/PTEN is required in either or both populations, which would have important and different implications. An effect on DD neurons seems more likely because an image is presented (Figure 2D) of a defect in an L1 daf-18/PTEN mutant larva with 6 neurons (which means the larva was processed at a time when VD neurons were not yet born or expressing pUnc-47, so supposedly it is an image of a larva in the first half of the L1 stage (0-~7h?)). DD neurons are also likely the critical cells here because the neurodevelopment errors are partially suppressed when the ketogenic diet is provided at an "early" L1 stage, but not later (e.g., from L2-L3, according to the text, L2-L4 according to the figure? ).

      Thank you for this insightful input. As previously mentioned, we conducted experiments in this revision to clarify the specificity of GABAergic errors in daf-18/PTEN mutants, in particular, whether they affect DDs, VDs, or both. Our results suggest that commissural defects are not limited to DD neurons but also occur in VD neurons (Supplementary Figure 3). Regarding the effect of βHB, our findings suggest that VD neurons are targets of βHB action. As mentioned in the previous response and the discussion section (Page 17, lines 369-383, revised version), we might speculate that lingering βHB or a metabolite from prior exposure could mitigate these defects in VD neurons that are born in Late L1s-Early L2s. Additionally, βHB has been noted for its capacity to induce long-term epigenetic changes. Therefore, it could act on precursor cells of VD neurons, with the resulting changes manifesting during VD development independently of whether exposure has ceased. All these possibilities are now discussed in the manuscript.

      Acknowledging that our work raises several questions that we aim to address in the future, we believe our manuscript provides valuable information regarding how the PI3K pathway modulates neuronal development and how dietary interventions can influence this process.

      This study brings important contributions to the understanding of GABAergic neuron development in C. elegans, but unfortunately, it is justified and contextualized mostly in distantly-related fields - where the study has a dubious impact at this stage rather than in the central field of the work (post-embryonic development of C. elegans inhibitory circuits) where the study has stronger impact. This study is fundamentally about a cell fate determination event that occurs in a nutritionally-sensitive developmental stage (post-embryonic L1 larval stage) yet the introduction and discussion are focused on more distantly related problems such as excitatory/inhibitory (E/I) balance, pathophysiology of human diseases, and treatments for them. Whereas speculation is warranted in the discussion, the reduced in-depth consideration of the known biology of these neurons and organisms weakens the impact of the study as redacted. For instance, the critical role of DAF-18/PTEN seems to occur at the early L1 larval stage, a stage that is particularly sensitive to nutritional conditions. The developmental progression of L1 larvae is well-known to be sensitive to nutrition - e.g., L1 larvae arrest development in the absence of food, something that is explored in nematode labs to synchronize animals at the L1 stage by allowing embryos to hatch into starvation conditions (water). Development resumes when they are exposed to food. Hence, the extensive postembryonic developmental trajectory that GABAergic neurons need to complete is expected to be highly susceptible to nutrition. Is it? The sensitivity towards the ketogenic diet intervention seems to favor this. In this sense, the attribution of the findings to issues with the nutrition-sensitive insulin-like signaling pathway seems quite plausible, yet this possibility seems insufficiently considered and discussed.

      We greatly appreciate the reviewer's emphasis on the sensitivity of the L1 stage to nutritional status. As the reviewer points out, C. elegans adjusts its development based on food availability, potentially arresting development in L1 in the absence of food. It is therefore reasonable that both the completion of DD neuron trajectories and the initial development steps of VD neurons are particularly sensitive to dietary modulation of the insulin pathway, in which both DAF-18 and DAF-16 play roles. This important point has also been included in the discussion (Page 18, lines 384-407, revised version).

      Finally, the fact that imbalances in excitatory/inhibitory (E/I) inputs are linked to Autism Spectrum Disorders (ASD) is used to justify the relevance of the study and its findings. Maybe at this stage, the speculation would be more appropriate if restricted to the discussion. In order to be relevant to ASD, for instance, the selectivity of PTEN towards inhibitory neurons should occur in humans too. However, at present, the E/I balance alteration caused by the absence of daf-18/PTEN in C. elegans could simply be a coincidence due to the uniqueness of the post-embryonic developmental program of GABAergic neurons in C. elegans. To be relevant, human GABAergic neurons should also pass through a unique developmental stage that is critically susceptible to the PI3K-PDK1-AKT pathway in order for DAF-18/PTEN to have any role in determining their function. Is this the case? Hence, even in the discussion, where the authors state that "this study provides universally relevant information on.... the mechanisms underlying the positive effects of ketogenic diets on neuronal disorders characterized by GABA dysfunction and altered E/I ratios", this claim seems unsubstantiated as written, particularly without acknowledging/mentioning the criteria that would have to be fulfilled and demonstrated for this claim to be true.

      Our results suggest that defects in GABAergic neurons are not limited to DDs, which, as the reviewer rightly notes, are quite unique in their post-embryonic development primarily due to the synaptic remodeling process they undergo. These defects also extend to VD neurons, which do not exhibit significant developmental peculiarities once they are born. Therefore, we think that the defects are not specific to the developmental program of DD neurons but are more related to all GABAergic motoneurons. Additionally, the observation of defects in non-GABAergic neurons in C. elegans daf-18 mutants supports the hypothesis that the role of daf-18 is not limited to DD neurons (Christensen et al., 2011; Kennedy et al., 2013).

      In mammals, Pten conditional knockout (cKO) animals have been extensively studied for synaptic connectivity and plasticity, revealing an imbalance between synaptic excitation and inhibition (E/I balance) (Reviewed in Rademacher and Eickholt, 2019, Cold Spring Harbor Perspect Med, https://doi.org/10.1101%2Fcshperspect.a036780). This imbalance is now widely accepted as a key pathological mechanism linked to the development of ASD-related behavior (Lee et al, 2017, Biological Psychiatry, https://doi.org/10.1016/j.biopsych.2016.05.011). The importance of PTEN in the development of GABAergic neurons in mammals is well-documented. For instance, embryonic PTEN deletion from inhibitory neurons impacts the establishment of appropriate numbers of parvalbumin and somatostatin-expressing interneurons, indicating a central role for PTEN in inhibitory cell development (Vogt et al, 2015, Cell Rep, https://doi.org/10.1016%2Fj.celrep.2015.04.019). Additionally, conditional PTEN knockout in GABAergic neurons is sufficient to generate mice with seizures and autism-related behavioral phenotypes (Shin et al, 2021, Molecular Brain, https://doi.org/10.1186%2Fs13041-02100731-8). Moreover, while mice in which PV GABAergic neurons lacked both copies of Pten experienced seizures and died, heterozygous animals (PV-Pten+/−) showed impaired formation of perisomatic inhibition (Baohan et al, 2016, Nature Comm, DOI: 10.1038/ncomms12829). Therefore, there is substantial evidence in mammals linking PTEN mutations to neurodevelopmental disorders in general and to defects in GABAergic neurons in particular. Hence, we believe that the role of daf-18/PTEN in GABAergic development could be a more widespread phenomenon across the animal kingdom rather than a specific process unique to C. elegans.

      Beyond the points discussed, we have addressed the reviewer's comment regarding the last sentence of the abstract. We have revised it to more cautiously frame the relationship between our findings, ASD, and mammalian neurodevelopmental disorders.

      Reviewer #2 (Public Review):

      Summary:

      Disruption of the excitatory/inhibitory (E/I) balance has been reported in Autism Spectrum Disorders (ASD), with which PTEN mutations have been associated. Giunti et al choose to explore the impact of PTEN mutations on the balance between E/I signaling using as a platform the C. elegans neuromuscular system where both cholinergic (E) and GABAergic (I) motor neurons regulate muscle contraction and relaxation. Mutations in daf-18/PTEN specifically affect morphologically and functionally the GABAergic (I) system, while leaving the cholinergic (E) system unaffected. The study further reveals that the observed defects in the GABAergic system in daf-18/PTEN mutants are attributed to reduced activity of DAF-16/FOXO during development.

      Moreover, ketogenic diets (KGDs), known for their effectiveness in disorders associated with E/I imbalances such as epilepsy and ASD, are found to induce DAF-16/FOXO during early development. Supplementation with β-hydroxybutyrate in the nematode at early developmental stages proves to be both necessary and sufficient to correct the effects on GABAergic signaling in daf-18/PTEN mutants.

      Strengths:

      The authors combined pharmacological, behavioral, and optogenetic experiments to show the GABAergic signaling impairment at the C. elegans neuromuscular junction in DAF-18/PTEN and DAF-16/FOXO mutants. Moreover, by studying the neuron morphology, they point towards neurodevelopmental defects in the GABAergic motoneurons involved in locomotion. Using the same set of experiments, they demonstrate that a ketogenic diet can rescue the inhibitory defect in the daf-18/PTEN mutant at an early stage.

      Weaknesses:

      The morphological experiments hint towards a pre-synaptic defect to explain the GABAergic signaling impairment, but it would have also been interesting to check the post-synaptic part of the inhibitory neuromuscular junctions such as the GABA receptor clusters to assess if the impairment is only presynaptic or both post and presynaptic.

      Moreover, all observations done at the L4 stage and/or adult stage don't discriminate between the different GABAergic neurons of the ventral nerve cord, i.e., the DDs, which are born embryonically and undergo remodeling at the late L1 stage, and the VDs, which are born post-embryonically at the end of the L1 stage. Those additional elements would provide information on the mechanism of action of the FOXO pathway and the ketone bodies.

      Thank you for your insightful suggestions. 

      This is an initial study that serves as a cornerstone, demonstrating the sensitivity of GABAergic neuron development to alterations in the PI3K pathway and how these alterations can be mitigated by a dietary intervention with a ketone body. While we have determined that the transcription factor DAF-16/FOXO is essential in the neurodevelopmental process and is the target of ketone bodies to alleviate defects, there are still underlying mechanisms to be elucidated. This is only the first step that opens many avenues for further investigation, including the study of post-synaptic partners.

      While our current study primarily focuses on neuronal alterations without delving into potential postsynaptic effects, we do plan to investigate this aspect in future research. This includes examining GABAergic receptors as well as cholinergic receptors, as exacerbation of cholinergic signaling cannot be ruled out. To conduct a comprehensive study of post-synaptic structure and functionality, we would need strains with fluorescent markers for both pre- and post-synaptic components (such as rab-3, unc-49, unc-29, or acr-16 fused to GFP or mCherry). Unfortunately, most of these strains are not currently available in our laboratory. Unlike in the US or Europe, acquiring these strains from the C. elegans CGC repository in Argentina is challenging due to common customs delays, which require significant time and resources to navigate. Discussions with CGC administrators, such as Ann Rougvie, were initiated at the Latin American C. elegans conference to address this issue, but a solution has not been reached yet. Additionally, to analyze post-synaptic functionality in depth, studying the response to perfusion with various agonists using electrophysiology would be beneficial. We are in the process of acquiring the capability to conduct electrophysiology experiments in our laboratory, but progress is slow due to limited funding.

      While we believe these experiments would be very informative, they will require a considerable amount of time due to our current circumstances. We consider them non-essential to the primary message of the paper, which focuses on neuronal developmental defects leading to functional alterations in daf-18/PTEN mutants and the novel finding that these can be mitigated by supplementing food with β-hydroxybutyrate. We will study the structure and functionality of the post-synapse in our future projects and also plan to extend this investigation to mutants with deficiencies in genes closely related to neurodevelopmental defects, such as neuroligin, neurexin, or shank-3, which have been implicated in synaptic architecture.

      We also agree that discriminating between DD and VD neurons provides significant insights into the neurodevelopmental phenomena dependent on the FOXO pathway and the action of βHB. In this revised version, we present evidence that not only DD neurons are affected but also VD neurons (see Supplementary Figure 3, revised version). This allows us to suggest that daf-18 affects the development of GABAergic neurons regardless of whether they are born embryonically (DDs) or post-embryonically (VDs) (see also our response to the previous reviewer). We hope to distinguish the defects observed in each type of neuron in future studies. For this, we would need to use strains specifically marked in one neuronal type or another, which, for the same reasons mentioned earlier, would take a considerable amount of time under current conditions.

      Conclusion:

      Giunti et al provide fundamental insights into the connection between PTEN mutations and neurodevelopmental defects through DAF-16/FOXO and shed light on the mechanisms through which ketogenic diets positively impact neuronal disorders characterized by E/I imbalances.  

      Reviewer #3 (Public Review):

      Summary:

      This is a conceptually appealing study by Giunti et al in which the authors identify a role for PTEN/daf-18 and daf-16/FOXO in the development of inhibitory GABA neurons, and then demonstrate that a diet rich in ketone body β-hydroxybutyrate partially suppresses the PTEN mutant phenotypes. The authors use three assays to assess their phenotypes: (1) pharmacological assays (with levamisole and aldicarb); (2) locomotory assays and (3) cell morphological assays. These assays are carefully performed and the article is clearly written. While neurodevelopmental phenotypes had been previously demonstrated for PTEN/daf-18 and daf-16/FOXO (in other neurons), and while KB β-hydroxybutyrate had been previously shown to increase daf-16/FOXO activity (in the context of aging), this study is significant because it demonstrates the importance of KB β-hydroxybutyrate and DAF-16 in the context of neurodevelopment. Conceptually, and to my knowledge, this is the first evidence I have seen of a rescue of a developmental defect with dietary metabolic intervention, linking, in an elegant way, the underpinning genetic mechanisms with novel metabolic pathways that could be used to circumvent the defects.

      Strengths:

      What their data clearly demonstrate, is conceptually appealing, and in my opinion, the biggest contribution of the study is the ability of reverting a neurodevelopmental defect with a dietary intervention that acts upstream or in parallel to DAF-16/FOXO.

      Weaknesses:

      The model shows AKT-1 as an inhibitor of DAF-16, yet their studies show no differences from wildtype in akt-1 and akt-2 mutants. AKT is not a major protein studied in this paper, and it can be removed from the model to avoid confusion, or the result can be discussed in the context of the model to clarify interpretation.

      Thank you very much for the suggestion. We agree with the reviewer that our study of AKT's action is too limited to draw conclusions that would justify its inclusion in the proposed model. Therefore, following the reviewer's suggestion, we have removed this protein from our model.

      When testing additional genes in the DAF-18/FOXO pathway, there were no significant differences from wild-type in most cases. This should be discussed. Could there be an alternate pathway via DAF-18/DAF16, excluding the PI3K pathway or are there variations in activity of PI3K genes during a ketogenic diet that are hard to detect with current assays?

      Thank you for bringing up this point. Our pharmacological experiments indeed demonstrate that all mutants associated with an exacerbation of the PI3K pathway, which typically inhibits nuclear translocation and activity of the transcription factor DAF-16, lead to imbalances in E/I (excitation/inhibition) that manifest as hypersensitivity to cholinergic drugs. This includes the gain of function of pdk-1 and the loss of function of daf-18 and daf-16 itself. In our subsequent experiments, we demonstrate that this exacerbation of the PI3K pathway leads to errors in the neurodevelopment of GABAergic neurons, which explains the hypersensitivity to aldicarb and levamisole.

      As the reviewer remarks, it is intriguing that mutants inhibiting this pathway do not differ from wild-type animals in their sensitivity to cholinergic drugs. We can speculate, for instance, that during neurodevelopment there is a critical period in which the PI3K pathway must remain at very low activity (or even deactivated) for proper development of GABAergic neurons. This could explain why there are no differences in sensitivity to cholinergic drugs between mutants that inhibit the PI3K pathway and the wild type. The PI3K pathway depends on insulin-like signals, which are in turn positively modulated by molecules associated with the presence of food. Interestingly, larval stage 1 is particularly sensitive to nutritional status, being able to completely arrest development in the absence of food. Therefore, dietary intervention with βHB may generate a signal of dietary restriction (as seen in mammals) and, as a consequence of this dietary restriction, the PI3K pathway is inhibited, resulting in increased DAF-16 activity. This could restore the proper neurodevelopment of GABAergic neurons. However, this is mere speculation, and more in-depth experiments than the pharmacological assays performed here, using mutants in different genes within the PI3K pathway, may shed light on this point.

      Following the reviewer's suggestion, this point has been discussed in the revised version of the manuscript. (Discussion Page 18, Lines 384-407).

      The consequence of SOD-3 expression in the broader context of GABA neurons was not discussed. SOD3 was also measured in the pharynx but measuring it in neurons would bolster the claims.

      SOD-3 is a known target of DAF-16. Previous studies have shown that βHB induces SOD-3 expression through the induction of DAF-16 (Edwards et al, 2014, Aging, https://doi.org/10.18632%2Faging.100683). The highest levels of SOD-3 expression are typically observed in the pharynx or intestine (DeRosa et al, 2019 https://doi.org/10.1038/s41586-019-1524-5; Zheng et al., 2021, PNAS, https://doi.org/10.1073/pnas.2021063118), and it is often used as a measure of general upregulation of DAF-16. Therefore, we used this parameter as a measure of βHB upregulating systemic DAF-16 activity. While we agree with the reviewer that observing variations in SOD-3 expression in neurons would further support our conclusions, unfortunately, we did not detect measurable signals of SOD-3 in motor neurons in either the control condition or the daf-18 background, even upon stress or βHB exposure. This may be because SOD-3 is a minor target of DAF-16 in these neurons, or its modulation may not correspond to the timing of fluorescence measurements (L4-adults).

      Despite this, our genetic experiments and neuron-specific rescue experiments lead us to conclude that DAF-16 must act autonomously in GABAergic neurons to ensure proper neurodevelopment.

      If they want to include AKT-1, seeing its effect on SOD-3 expression could be meaningful to the model.

      Thank you for this suggestion. We believe that even measuring SOD-3 levels in akt mutant backgrounds would still provide limited information to give it a predominant value in our work. Additionally, to have a complete understanding of the total role of AKT, it would be necessary to measure it in a double mutant background of akt-1; akt-2, and these double mutants generate 100% dauers even at 15°C (Oh et al., PNAS 2005, https://doi.org/10.1073/pnas.0500749102; Quevedo et al., Current Biology 2007, http://dx.doi.org/10.1016/j.cub.2006.12.038; Gatzi et al., PLOS ONE 2014, https://doi.org/10.1371/journal.pone.0107671), greatly complicating the execution of these experiments. Therefore, following the first advice of this reviewer, we have decided to modify our model by excluding AKT.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      ⁃ Please include earlier in the main text the rationale for using unc-25 as a control/reference already when mentioning Figure 1A.

      Thank you for pointing out the need to reference this control earlier. We have included the following paragraph in the description of Figure 1 (Page 5, line 71, revised version):

      “Hypersensitivity to cholinergic drugs is typical of animals with an increased E/I ratio in the neuromuscular system, such as mutants in unc-25 (the C. elegans orthologue for glutamic acid decarboxylase, an essential enzyme for synthesizing GABA). While daf-18/PTEN mutants become paralyzed earlier than wild-type animals, their hypersensitivity to cholinergic drugs is not as severe as that observed in animals completely deficient in GABA synthesis, such as unc-25 null mutants (Figures 1B and 1C), indicating a less pronounced imbalance between excitatory and inhibitory signals.”

      ⁃ Please discuss the greater sensitivity of pdk-1(gf) animals to levamisole than to aldicarb.

      Thank you for bringing up this subtle point.  We understand that the reviewer is referring to the paralysis curve in response to aldicarb in pdk-1(gf), which is closer to unc-25 than the curve for levamisole (in both cases, they are more sensitive than the wild type). Therefore, pdk-1(gf) animals seem to be more sensitive to aldicarb than to levamisole. These results are now shown in Figure 1D (revised version).

      The PI3K pathway does not only act in neurons but also in muscles. Gain of function in pdk-1 has been shown to modulate muscle protein degradation (Szewczyk et al, EMBO Journal, 2008. https://doi.org/10.1038/sj.emboj.7601540). In contrast,  no effect on protein degradation has been reported for null mutants in this gene. Several studies have demonstrated that protein degradation levels can differentially affect receptor subunits, particularly acetylcholine receptors (Reviewed in Crespi et al, Br J Pharmacol, 2018). C. elegans is characterized by a wide repertoire of AChR subunits, and there are at least two subtypes of ACh receptors in muscles (one multimeric sensitive to levamisole and one homomeric (ACR-16) insensitive to levamisole) (Richmond et al, 1999 Nature Neuroscience http://dx.doi.org/10.1038/12160; Touroutine D, JBC 2005 https://doi.org/10.1074/jbc.M502818200).

      Interestingly, acr-16 null mutants are hypersensitive to aldicarb (Zeng et al, JCB, 2023, https://doi.org/10.1083/jcb.202301117) while the electrophysiological response to levamisole in this mutant remains similar to that of wild-type (Touroutine et al, 2005). Therefore, it may be that the gain of function in pdk-1 induces a change in the expression of AChR subtypes in muscle that differentially affects sensitivity to levamisole and ACh. This is purely speculative, and there may be many other explanations. While it would be interesting to explore this difference further, it goes far beyond the scope of this study. The cholinergic drug sensitivity assay is purely exploratory and allowed us to delve into the GABAergic and cholinergic signals in daf-18 mutants. In this sense, the hypersensitivity of pdk-1(gf) to both drugs supports the idea that an increase in PI3K signaling leads to an increased E/I ratio.

      ⁃ Please explain the rationale to perform akt-1 and akt-2 assays separately. Why not test double mutants? Has their lack of redundancy been determined?

      Our pharmacological assays are conducted at the L4 larval stage, making it impossible to analyze the potential redundancy of akt-1 and akt-2 in sensitivity to levamisole and aldicarb. This impossibility arises because the akt-1;akt-2 double mutant exhibits nearly 100% arrest as dauer even at 15°C, as reported in several prior studies (Oh et al., PNAS 2005, https://doi.org/10.1073/pnas.0500749102; Quevedo et al., Current Biology 2007, http://dx.doi.org/10.1016/j.cub.2006.12.038; Gatzi et al., PLOS ONE 2014, https://doi.org/10.1371/journal.pone.0107671). While the increased dauer arrest in the double mutant compared to the single mutants might suggest redundant functions in dauer entry, there are also reports indicating the absence of redundancy in other processes, such as vulval development (Nakdimon et al., PLOS Genetics 2012, https://doi.org/10.1371%2Fjournal.pgen.1002881).

      The complete dauer arrest likely underlies why other studies focusing on the role of the PI3K pathway in neurodevelopment utilize both mutants separately (Christensen et al, Development 2011, https://doi.org/10.1242/dev.069062). While determining the potential redundancy of these genes is not feasible for this assay, we utilized various mutants of the pathway (age-1, pdk-1, daf-18, daf-16 and daf-16;daf-18 in addition to the akt-s) that support the conclusion, which is that exacerbating the PI3K pathway activity makes animals hypersensitive to cholinergic drugs.

      In response to the reviewer's concern, we have added a sentence in the text explaining the impossibility of performing the assay in the akt-1;akt-2 double mutant (Page 6, lines 90-92).

      Figure 1C and D (This applies to all similarly presented bar figures). Please show data points and dispersion (preferably data, median ± 25-75% or average ± SD).

      Thank you. Done

      ⁃ Line 112 -maybe "and resumes"? 

      Thank you. Done (Line 126, revised version)

      ⁃ Figure 1E and F. Please present mean ± SD (not SEM) of fluctuations. Please change slightly the tones so that the dispersion is easier to distinguish on the "blue light on" box.

      Thank you for the suggestion. We have adjusted the tones as recommended to enhance the visualization of the "blue light on" box. For visualization purposes, we present the shading of the standard error of the mean (SEM), as is usual in these types of optogenetic experiments where traces of animal length variations are measured (Liewald et al, Nature Methods, 2008, doi: 10.1038/nmeth.1252; Schultheis et al, J. Neurophysiology, 2011, doi: 10.1152/jn.00578.2010; Koopman et al, BMC Biology 2021, https://doi.org/10.1186/s12915-021-01085-2; Seidenthal et al, microPublication Biology, 2022, https://doi.org/10.17912%2Fmicropub.biology.000607).

      For the revised version, we have also included bar graphs for each optogenetic experiment, representing the mean of the length average of each worm measured from the first second after the blue light was turned on until the second before the light was turned off (in the graph, this corresponds to the period between seconds 6 and 9 of the traces). These graphs include the standard deviation and the corresponding significance levels. All of this has been included in the new legend (Figure 2D, 2E, 4E-J).
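      As a rough illustration of the summary described above, the following minimal sketch averages each worm's body length over the illumination window and then reports the group mean and standard deviation. It is not the authors' Fiji macro or analysis code: the function names, the trace format (time in seconds paired with normalized body length), and the example values are all hypothetical.

```python
from statistics import mean, stdev


def light_window_mean(trace, start_s=6.0, end_s=9.0):
    """Average body length over samples with start_s <= t <= end_s.

    `trace` is a list of (time_s, normalized_length) pairs. The 6-9 s window
    follows the response text; the data layout itself is an assumption.
    """
    values = [length for t, length in trace if start_s <= t <= end_s]
    if not values:
        raise ValueError("trace has no samples inside the analysis window")
    return mean(values)


def summarize_group(traces, start_s=6.0, end_s=9.0):
    """Return (group mean, group SD, per-worm means) for a list of traces."""
    per_worm = [light_window_mean(tr, start_s, end_s) for tr in traces]
    return mean(per_worm), stdev(per_worm), per_worm


# Hypothetical traces for two worms (illustrative values, not real data).
worm_a = [(5.0, 1.00), (6.0, 1.04), (7.0, 1.06), (8.0, 1.06), (9.0, 1.04), (10.0, 1.01)]
worm_b = [(5.0, 1.00), (6.0, 1.02), (7.0, 1.02), (8.0, 1.04), (9.0, 1.04), (10.0, 1.00)]

group_mean, group_sd, per_worm = summarize_group([worm_a, worm_b])
print(f"mean = {group_mean:.3f}, SD = {group_sd:.3f}")
```

      Averaging within each worm first makes the animal, rather than the individual time point, the unit of observation for the standard deviation.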

      ⁃ Figure 1A&1B & Supplementary Figure 1D x Supplementary Figure 1E&1F. What is the difference between these experiments? Whereas the unc-25 mutants paralyze in the same amount of time, the WT animals paralyze ~1 h later in Supplementary Figure 1E-1F in response to either drug. Please revise experimental conditions to see if anything can be learned eg, maybe this is a nutritional response from experiments done at different timepoints? Maybe different food recipes affected sensitivity to paralysis?

      Thank you for pointing this out. While the experiments with daf-18 (in both alleles) and daf-16 were conducted at the beginning of this project (2019-2020), the assays with the other mutants in the PI3K and mTOR pathways were performed years later. Changes in the reagents used (agar, peptone, cholesterol, etc.) to grow the worms have occurred, potentially altering the animals' response directly or through the nutritional quality of the bacteria they grow on. In addition, the difference may be attributed to the fact that experiments at the project's outset were conducted by one author, while more recent experiments were carried out by another. The goal is to quantify paralysis in non-responsive worms after touch stimulation. The force of this probing or the thickness of the hair used for touching can be slightly operator-dependent and can lead to variable responses. For this reason, wild-type and unc-25 strains are always included as internal controls in every experiment. Moreover, despite this user-dependent variation, the experiments were always conducted blindly (except for unc-25, whose uncoordinated phenotype is easily identifiable), so we trust the outcomes.

      ⁃ Supplementary Figure 1G - Length and Width appear to be switched in both left and right panels - please revise and include a description of N and of statistics depicted. 

      Unfortunately, we don't see the switching error that the reviewer mentioned. In the left panel, we demonstrate that optogenetic activation of GABAergic neurons leads to an increase in length without modifying the width of the animal. Therefore, we conclude that the increase in area, as observed in our Fiji macro for optogenetic response analysis, is due to an increase in the animal's length. In the cholinergic activation shown in the right panel, the animal shortens (decreasing length) without modifying the width, resulting in the reduction of the total body area. 

      We have included information about N (sample size) and the statistical test used in the legends as suggested. These graphs are now shown as Figures 2F and G, revised version.

      ⁃ Supplementary Figure 1G legend lines 779-780. Please describe the post-hoc test applied following ANOVA to obtain the denoted p values. This applies to all datasets where ANOVA or Kruskal-Wallis tests were applied.

      Following the reviewer's suggestion, all the post-hoc tests applied after ANOVA or Kruskal-Wallis analysis have been included in the legend of each figure and in Materials and Methods (statistical analysis section).

      ⁃ Line 174 maybe "arises *from* the hyperactivation" instead of *for*?

      Corrected. Thank you. Line 190, revised version.

      ⁃ Supplementary Figure 4. On line 816 it says n=40-90, but please check the n of the daf-18, daf-16 samples, which seem to have less than 40 animals.

      We understand that the reviewer is referring to Supplementary Figure 3 from the original version (now Supplementary Figure 5 in the revised version). We have now included the number of observations below each data point cloud to clearly indicate the sample size for each condition.

      ⁃ Supplementary Figure 4 - please state what are the bars on the graphs. Please state which post-hoc test was performed after Kruskal-Wallis and present at least the p values obtained between treated controls and each genotype. Alternatively, present the whole truth table in supplementary data.

      We understand that the reviewer is referring to Supplementary Figure 3 from the original version (now Supplementary Figure 5 in the revised version). There was an error in the original legend (thank you for bringing this to our attention): the statistics in this case were not performed using Kruskal-Wallis; rather, each treated condition was compared to its own untreated control using the Mann-Whitney test. We have now added the p-values to the graph. All raw data for this figure, as well as for all other figures, are available in Open Science Framework (https://osf.io/mdpgc/?view_only=3edb6edf2298421e94982268d9802050).
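      To make the comparison scheme concrete (each treated condition tested against its own untreated control, rather than one omnibus test across all groups), here is a minimal sketch of a Mann-Whitney U computation. It is an illustration under stated assumptions, not the authors' analysis: the function name, genotype labels, and values are hypothetical, and the two-sided p-value uses the plain normal approximation without tie or continuity correction, which is only rough for small samples. Dedicated statistics software should be preferred for real data.

```python
import math


def mann_whitney_u(treated, control):
    """Return (U, approximate two-sided p) for treated vs. control.

    U counts the (treated, control) pairs in which the treated value is
    larger, with ties counted as 0.5. The p-value uses the normal
    approximation without tie or continuity correction (an assumption
    made here for brevity).
    """
    n1, n2 = len(treated), len(control)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for x in treated:
        for y in control:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


# Hypothetical data: genotype -> (untreated control, treated) values.
data = {
    "genotype_A": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "genotype_B": ([1.0, 2.0], [1.0, 2.0]),
}

# Each treated group is compared only with its own untreated control.
results = {g: mann_whitney_u(treated, control) for g, (control, treated) in data.items()}
for genotype, (u, p) in results.items():
    print(f"{genotype}: U = {u}, p = {p:.3f}")
```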

      ⁃ Please cite the figure panels in order: eg, Figure 3E is mentioned in the text after panels Figure 3F-K.

      Done. We have rearranged the figures to adapt them to the text order (Figure 4, revised version).

      ⁃ Figure 4 - line 610 please revise "(n=20-30 (n: 20-25 animals per genotype/trial)."

      Thank you. Corrected.

      ⁃ Figure 4 - there appears to be an inconsistency in the figure with the text (lines 223-225). In figures it says E-L1, but in the text, it says "solely in L1". Does E-L1 include the whole L1 stage? If not- E-L1 can be interpreted only as during the embryonic stage, hence, no exposure to betaHB due to the impermeable chitin eggshell. Then there is L1-L2, which should cover the L1 stage and the L2 or something else. Please revise. The text mentions L2-L3 or L3-L4 and these categories are not in the figures. This clarification is key for the interpretation of the results. The precise developmental time of the exposures is not defined either in the methods or in the figures. Please provide precise times relative to hours and/or molts and revise the text/figure for consistency.

      The reviewer is entirely correct in pointing out the lack of relevant data regarding the exposure time to βHB. We have now clarified this information. For the revised version, we have adjusted the nomenclature of each exposure period to precisely reflect the developmental stages involved.

      For the experiments involving continuous exposure to βHB throughout development, the NGM plate contained the ketone body. Therefore, the exposure encompassed, in principle, the ex-utero embryonic development period up to L4-Young adults (E-L4/YA, in Figure 5A), when the experiments were conducted. Since the chitin eggshell may restrict drug penetration (see Supplementary Figure 7), we can only ensure βHB exposure from hatching onward.

      In experiments involving exposure at different developmental stages, such as those depicted in Figure 4 of the original version (now Figure 5), animals were transferred between plates with and without βHB as required. We exposed daf-18/PTEN mutant animals to βHB-supplemented diets for 18-hour periods at different developmental stages (Figure 5A). The earliest exposure occurred during the 18 hours following egg laying, covering ex-utero embryonic development and the first 8-9 hours of the L1 stage (this period is called E-L1 in Figure 5, revised version). The second exposure period encompassed the latter part of the L1 stage, the entire L2 stage, and most of the L3 stage (L1-L3). The third exposure spanned the latter part of the L3 stage (~1-2 hours), the entire L4 stage, and the first 6-7 hours of the adult stage (L3-YA).

      All this information has been included in Figure 5 (and its legend), the text (Page 13, lines 259-276), and the Materials and Methods of the revised manuscript.

      ⁃ Some methods are not sufficiently well described. Specifically, how the animals were exposed to treatments and how stages were obtained for each experiment. Was synchronization involved? If so, in which experiments and how exactly was it performed?

      As mentioned in previous responses, all the experiments were performed in age-synchronized animals. We have included the following sentences in Materials and Methods (C. elegans culture and maintenance section): “All experiments were conducted on age-synchronized animals. This was achieved by placing gravid worms on NGM plates and removing them after two hours. The assays were performed on the animals hatched from the eggs laid in these two hours”.

      Reviewer #2 (Recommendations For The Authors):

      Major points

      (1) To complete the study on the GABAergic signaling at the NMJs, it would be interesting to assess the status of the post-synaptic part of the synapse such as the GABAR clustering. It would also tell if the impairment is only presynaptic or both post and presynaptic.

      Thank you for your insightful suggestion. We agree that exploring post-synaptic elements can shed light on whether the impairment is solely presynaptic or involves both pre and post-synaptic components.

      While our current study primarily focuses on neuronal alterations without delving into potential postsynaptic effects, we do plan to investigate this aspect in the future. This includes not only examining GABAergic receptors but also exploring cholinergic receptors, as exacerbation of cholinergic signaling cannot be ruled out. To conduct a comprehensive study of post-synaptic structure and functionality, we would need strains with fluorescent markers for both pre and post-synaptic components (rab-3, unc-49, unc-29, acr-16 driving GFP or mCherry). However, most of these strains are not currently available in our laboratory. Unlike the US or Europe, acquiring these strains from the C. elegans CGC repository in Argentina is challenging due to common customs delays, requiring significant time and resources to navigate. Discussions at the Latin American C. elegans conference with CGC administrators, such as Ann Rougvie, have been initiated to address this issue, but a solution has not been reached yet. 

      Additionally, to analyze post-synaptic functionality in-depth, studying the response to perfusion with various agonists using electrophysiology would be beneficial. We are in the process of acquiring the capability to conduct electrophysiology experiments in our laboratory, but progress is slow due to limited funding.

      While we believe these experiments are very informative, they will require a considerable amount of time due to our current circumstances. We consider them non-essential to the primary message of the paper, which focuses on neuronal morphological defects leading to functional alterations in daf-18/PTEN mutants.

      We will include these experiments in our future projects, also planning to extend this investigation to mutants with deficiencies in genes closely related to neurodevelopmental defects, such as neuroligin, neurexin, or shank-3, which have been implicated in synaptic architecture.

      (2) The author always referred to unc-47 promoter or unc-17 promoter, never specifying where those promoters are driving the expression (and in the Materials & Methods, no information on the corresponding sequence). Depending on the promoters they may not only be expressed in the motoneurons involved in locomotion (VA, VB, DA, DB, VD, and DD), but they could also be expressed in other neurons which could be of importance for the conclusions of the optogenetic assays but also the daf-18 expression in GABAergic neurons.

      We appreciate the reviewer's insight regarding the broader expression patterns of the unc-17 and unc-47 promoters in all cholinergic and GABAergic neurons, respectively. The strains expressing constructs with these promoters were obtained from the CGC or other labs and have been widely used in previous papers (Liewald et al, Nature Methods, https://www.nature.com/articles/nmeth.1252 (2008); Byrne, A. B. et al. Neuron 81, 561-573, doi:10.1016/j.neuron.2013.11.019 (2014)).

      Regarding the optogenetic assays, the readout utilized (body length elongation or contraction) is primarily associated with the activity of cholinergic and GABAergic motor neurons and has been used in numerous studies to measure motor neuron functionality (Liewald et al, Nature Methods, https://www.nature.com/articles/nmeth.1252 (2008); Hwang, H. et al. Sci Rep 6, 19900, doi:10.1038/srep19900 (2016); Schultheis et al, J Neurophysiol 106, 817-827, doi:10.1152/jn.00578.2010 (2011); Koopman, M., Janssen, L. & Nollen, E. A. BMC Biol 19, 170, doi:10.1186/s12915-021-01085-2 (2021)). It has previously been established that the shortening observed after optogenetic activation driven by the unc-17 promoter, although this promoter is also active in various interneurons, depends on the activity of cholinergic motor neurons (Liewald et al., Nature Methods, https://www.nature.com/articles/nmeth.1252 (2008)). This was demonstrated by examining transgenic worms expressing ChR2-YFP from another cholinergic, motoneuron-specific but weaker promoter, Punc-4. Those authors observed contraction and coiling upon illumination, albeit to a milder degree.

      In terms of GABAergic neurons, only 3 do not directly synapse to body wall muscles (AVL, DVB, and RIS) and are primarily involved in defecation. Of the 23 GABAergic motor neurons, 19 are D-type motoneurons, while the remaining 4 innervate head muscles (Pereira et al, eLife 2015, https://doi.org/10.7554/eLife.12432). It is therefore expected that while there may be some contribution from these latter neurons to the elongation after optogenetic activation in animals containing punc-47::ChR2, the main contribution should be from the D-type neurons. Likewise, while there may be some influence on D-type neuron development due to daf-18 rescue in neurons like RME, DVB or AVL, the most direct explanation for the rescue is that daf-18 acts autonomously in D-type cells. Finally, we have pharmacological and behavioral assays that support the findings of optogenetics and enable us to reach final conclusions.

      (3) DD neurons are born during embryogenesis and newborn L1s have neurites even though less than at a later stage. If possible, it would be interesting to take a look at them to see if βHB has an effect or not. It will corroborate the hypothesis that βHB action is prevented by the impermeable eggshell on a system that can respond at a later stage. Moreover, using a specific DD, DA, and DB promoter, it would be possible to check if there is a difference in the morphological defects between embryonic and post-embryonic neurons.

      This is a very interesting point raised by the reviewer. We conducted experiments to analyze the morphology of GABAergic neurons in animals exposed to βHB only during ex-utero embryonic development (that is, as laid eggs). We observed that this incubation was not sufficient to rescue the defects in GABAergic neurons (Supplementary Figure 7, revised version). As reported by other authors and discussed in our paper, the chitinous eggshell might act as an impermeable barrier to most drugs. However, we cannot rule out that incubation during this period is necessary but not sufficient to mitigate the defects. We have included these experiments in Supplementary Figure 7 and in the text (Page 13, lines 272-276).

      Additionally, we analyzed confocal images where, based on their position, we could identify and assess errors in DD (embryonic) and VD (Post-embryonic) neurons (Supplementary Figure 3, revised version). These experiments show that the effects are observed in both types of neurons, and we did not observe any differential alterations in neuronal morphology between the two types of neurons.

      Minor points

      (1)   Expression of daf-18/PTEN in muscle or hypodermis, could it ensure a proper development? It could give insights into the action mechanism of βHB.

      The reviewer's observation is indeed very intriguing. Previous studies from the Grishok lab (Kennedy et al, 2013) have demonstrated that the expression of daf-18 or daf-16 in extraneuronal tissues, specifically in the hypodermis, can rescue migratory defects in the serotoninergic neuron HSN in daf-18 or daf-16 null mutants of C. elegans. Clearly, this could also be an option for rescuing the morphological and functional defects of GABAergic motoneurons.

      However, the fact that the expression of daf-18 in GABAergic neurons rescues these defects strongly suggests an autonomous effect. In this regard, autonomous effects of DAF-18 or DAF-16 on neurodevelopmental defects have also been reported in interneurons in C. elegans (Christensen et al, 2011). This is included in the discussion (Page 15, lines 330-335).

      (2) Re-organise the introduction. The paragraph on ketogenic diets (lines 35-38) is not logically linked.

      Following the reviewer's suggestion, we have reorganized the introduction and changed the order of explanation regarding the significance of ketogenic diets, linking it with their proven effectiveness in alleviating symptoms of diseases with E/I imbalance (Lines 23-60, revised version).

      (3) Incorporate titles in the result section to guide the reader.

      Done. Thank you

      (4) Systematically add PTEN or FOXO when daf-18 or daf-16 are mentioned (for example lines 69, 84, 85).

      Done. Thank you  

      (5) Strain lists: lines 646 to 653: some information is missing on the different transgenes used in this study (integrated (Is) or extrachromosomal (Ex) with their numbers).

      Thank you for bringing this to our attention. We have now included all the information regarding the different transgenes used in this study, including whether they are integrated (Is) or extrachromosomal (Ex) and their respective numbers. This information can be found in the revised version of the manuscript (Materials and Methods, C. elegans culture and maintenance section highlighted in yellow).

      Reviewer #3 (Recommendations For The Authors):

      In Figure 1, some experiments were done with the unc-25 control while others, such as the optogenetic experiments, were done without those controls.

      Thank you for pointing this out. In the optogenetic experiments, we waited for the worm to move forward for 5 seconds at a sustained speed before exposing it to blue light to standardize the experiment, as the response can vary if the animal is in reverse, going forward, or stationary. Due to the severity of the uncoordinated movement in unc-25 mutants, achieving this forward movement before exposure is very difficult. Additionally, this lack of coordination prevents these animals from performing the escape response tests, as they barely move. Therefore, we limited the use of this severe GABAergic-deficient control to pharmacological or post-prodding shortening experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Additional experiments to characterize what this novel cell type becomes in older animals would be ideal to strengthen the manuscript, but the authors should at least address this in the Discussion.

      The manuscript could be significantly improved if the authors included, for example, a timeline and/or cartoon contextualizing these cells relative to the formation of other CN neurons and their locations, perhaps as a summary figure at the end. Furthermore, the logic of each figure could be enhanced if the authors graphically show - again, perhaps with a schematic/cartoon - the question being tested for each figure. Furthermore, making the figure titles less descriptive and more explanatory would also help a reader follow the logic of the experiments.

      These are indeed valid and important questions for our research, and understanding the distribution, fate, and connectivity of this new cell type in the cerebellar nuclei postnatally is a focus of ongoing investigation in our lab. To address these questions, we are currently utilizing SNCA-GFP mice, a project led by a PhD student in my lab. While this work will be the subject of a full-length research paper, we have added a sentence to the paper concerning a recent report about the presence of SNCA neurons in the adult CN. Specifically, we have included a reference to the postnatal expression of SNCA (“In adult mice, postnatal expression of SNCA has been reported in medial CN neurons. PMID: 32639229”) on page 8 of our manuscript (highlighted in yellow). In addition, we have included a cartoon as a summary figure (Fig. 9) illustrating the origin of cerebellar nuclei from the caudal and rostral ends in both Atoh1+/+ and Atoh1-/- mice. Thank you once again; we have also revised and improved the figure titles accordingly.

      Reviewer #2 (Recommendations For The Authors):

      Figure 3:

      (1) If most SNCA+ cells are OTX2+ based on the IHCs, why are there so many SNCA+ Otx2- cells in the sort?

      In each group, 350,000 cells were sorted. Due to the relatively small population size of this subset of cerebellar nuclei neurons, the sorting procedure could not perfectly mirror our immunohistochemistry results. However, it is noteworthy that a portion of sorted cells expressed SNCA or Otx2 while a smaller population co-expressed both Otx2 and SNCA in the cerebellar primordium.

      (2) Panel 3F: FACS graphs - the resolution of the figures is too poor on the PDF to read any of the text of these graphs. What are the axes?

      We thank the reviewer for this comment. In the revision, a high-resolution version of the FACS graph has replaced the lower-quality graph in panel 3F. The axes and text of this panel are now clearly legible.

      Figure 4:

      (1) Arrowheads are marking a subset of + cerebellar cells - why? Not defined in the legend.

      The population of cells indicated by the arrowheads is now defined in the legend. We have added the statement “Examples of Otx2 expressing cells are indicated by arrowheads in panels B, D, E, and F.”

      (2) The orientation of panels E and F is unclear - please provide low mag panel insets.

      An orientation marker (r-c and d-v; rostral-caudal and dorsal-ventral, respectively) has been added to panel A, which applies to all panels, including panels E and F. Furthermore, the isthmus is noted with an “i” to provide further orientation.

      (3) G - and throughout the paper - whisker plots (not simple box plots) are required. Also, it is unclear from the methods how Otx2+ cells were counted - how many embryos/age? The description of 10 sections across 3 slides is incomplete. Are these cells distributed equally across the mediolateral axis of the anlage? Where are comparable M/L sections compared across ages? Is the increase in # across time because these cells are proliferative or are more migrating into the anlage?

      The plot has been replaced with whisker plots. A more detailed description of the method used has been added on page 15: “To assess the number of OTX2-positive cells, we conducted immunohistochemistry (IHC) labeling on slides containing serial sections from embryonic days 12, 13, 14, and 15 (n=3 at each timepoint). Under the microscope, we systematically counted OTX2-positive cells within the cerebellar primordium. This analysis encompassed a minimum of 10 sections, spread across at least 3 slides, ensuring comprehensive coverage of OTX2 expression along the mediolateral axis of the cerebellar primordium. For each slide, the counts of OTX2-positive cells from all sections were cumulatively calculated to determine the total number of positive cells per slide. Subsequently, statistical analysis was employed to compare the results obtained at different developmental time points.”

      Figure 5:

      The use of confocal microscopy creates clear data re Otx2-GFP expression, but I cannot understand the origin of the panels. How do they relate to E/F and H/I? Different sections?

      In Figure 5, panels A-D display Otx2 expressing cells in the cerebellar primordium of Otx2-GFP transgenic mice, whereas panels E-J depict RNAscope fluorescence in situ hybridization (FISH) for the Otx2 probe in wild type mice. These represent complementary approaches to map Otx2+ cells in the developing cerebellum. This is made clear in a revised legend in Fig 5.

      Figure 6:

      The justification for the in-culture experiments, particularly the long (4 and 21DIV) times is unclear and needs to be strengthened or the in vitro data should be removed.

      We thank the reviewer for this comment. Panels E-H, which show the co-expression of SNCA and p75NTR, highlight a significant role in the differentiation of specific neuronal populations during development. These findings validate our previous results (PMID: 31509576) and are consistent with the results of our current study. Therefore, we have chosen to keep these panels. However, in line with the reviewer's suggestion, we have removed panels I-L from Fig. 6.

      Figure 7:

      SNCA expression in panels A and G is not specific nor is the Otx2 staining in panel B making the data in panels C and I uninterpretable and these panels need to be replaced. The Meis2 data however is much better and I agree this data shows that the dorsal RL-derived cells are deleted in Atoh1-/- while the SNCA+ cells remain. This is strong data supporting the dual origins of NTZ.

      Thank you for these points. Panels A and G have been replaced with high-resolution images. In addition, panels A-C have been carefully cropped to focus on the NTZ area and to improve the quality and visibility of the panels. For further clarity, we have included a summary figure (Fig. 9).

      Figure 8:

      The diI experiments are a key addition to this paper and clearly show the direct movement of some cells from the mesencephalon into the developing cerebellum, but data presentation must be considerably strengthened.

      (1) What is the inset in panel A? Low mag of embryo? Perhaps conversion of image to PDF degraded resolution - add a description in the legend. Arrowhead and arrow identities are reversed in the legend. The arrow points to the isthmus.

      Thank you for the comment. For clarification, we have included this information in the Fig. legend (highlighted in yellow). In addition, the issues with the arrows have been addressed and corrected.

      (2) Panels B and C are also shown in Supplementary Figure 2 with arrows indicating rostral and caudal movement - these arrows need to be added here. There is no need to replicate these same panels in the supplement.

      Thanks, arrows have been added in panels B, C of Fig. 8.

      (3) The text states that "almost all DiI cells migrated caudally into the cerebellum" and refers to Figure 8E and Supplemental 3 but there is no evidence/support shown for this, just a few + cells in 8E and some very difficult-to-see positive cells in sections in Supplement E-F. Given the importance of this data, I am surprised that the authors chose bright field/phase microscopy to show this. This section's data is not convincing at all. I find it very difficult to see specific staining. These panels must be improved. This is key data for paper conclusions.

      These are valid points, and we acknowledge that this experiment alone may not provide conclusive evidence regarding the subset of CN originating from mesencephalon. At this stage of the study, we do not claim definitively that the SNCA/OTX2/MEIS2 positive cells originate from the mesencephalon. As stated in our manuscript, "In conclusion, our study indicates that the SNCA+/ OTX2+/ MEIS2+/ p75NTR+/ LMX1A- rostroventral subset of CN neurons do not originate from the well-known distinct germinative zones of the cerebellar primordium. Instead, our findings suggest the existence of a previously unidentified extrinsic germinal zone, potentially the mesencephalon." We have also discussed embryonic culture approaches in the manuscript, which could involve the use of other agents such as plasmid/viral vectors, hinting at the possibility of origin from the mesencephalon. While tracing the origin from the mesencephalon in vivo and in vitro is promising and on our to-do list, the data will not be available for this manuscript. To prevent confusion, we have eliminated the panels of Fig. 8 that were redundant with Supplementary Figs. 2 and 3. However, if the reviewer deems it necessary to remove the remaining panels, we are prepared to do so.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations For The Authors):

      The revised manuscript addressed my minor concerns adequately, and the manuscript is now further improved. I have no remaining criticisms.

      Reviewer #2 (Recommendations For The Authors):

      Abstract:

      line 45 The abbreviation "SytI" should perhaps be introduced above.

      done

      Results:

      line 139 "RRP kinetics" should perhaps read "RRP depletion kinetics" or "secretion kinetics".

      We replaced “RRP kinetics” with “RRP secretion kinetics”

      line 325ff and Figure 8

      As far as I understand, SytI R233Q ki cells shown in violet express wt CplxII. Perhaps this should be explicitly stated?

      To accommodate this suggestion, we now state on page 13, line 302: “Overexpression of the CpxII DN mutant in SytI R233Q ki cells, which is expected to outcompete the function of endogenous CpxII in these cells (Dhara et al., 2014), further slowed down the rate of synchronized release and restored the EB size to the wt level (Figure 7C, D)”

      line 332ff and Figure 8

      What is plotted in Figure 8B bottom and in Figure 8D is not a "rate" but rather a "unitary rate", more commonly referred to as a "rate constant".

      The y-axis label of Figures 8B and 8D should therefore better be changed to "rate constant". See also line 528 of the Discussion.

      Figure (y-axis label) and text were changed accordingly

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Glaser et al present ExA-SPIM, a light-sheet microscope platform with large volumetric coverage (Field of view 85mm^2, working distance 35mm), designed to image expanded mouse brains in their entirety. The authors also present an expansion method optimized for whole mouse brains and an acquisition software suite. The microscope is employed in imaging an expanded mouse brain, the macaque motor cortex, and human brain slices of white matter. 

      This is impressive work and represents a leap over existing light-sheet microscopes. As an example, it offers a fivefold higher resolution than mesoSPIM (https://mesospim.org/), a popular platform for imaging large cleared samples. Thus while this work is rooted in optical engineering, it manifests a huge step forward and has the potential to become an important tool in the neurosciences. 

      Strengths: 

      - ExA-SPIM features an exceptional combination of field of view, working distance, resolution, and throughput. 

      - An expanded mouse brain can be acquired with only 15 tiles, lowering the burden on computational stitching. That the brain does not need to be mechanically sectioned is also seen as an important capability. 

      - The image data is compelling, and tracing of neurons has been performed. This demonstrates the potential of the microscope platform. 

      Weaknesses: 

      - There is a general question about the scaling laws of lenses, and expansion microscopy, which in my opinion remained unanswered: In the context of whole brain imaging, a larger expansion factor requires a microscope system with larger volumetric coverage, which in turn will have lower resolution (Figure 1B). So what is optimal? Could one alternatively image a cleared (non-expanded) brain with a high-resolution ASLM system (Chakraborty, Tonmoy, Nature Methods 2019, potentially upgraded with custom objectives) and get a similar effective resolution as the authors get with expansion? This is not meant to diminish the achievement, but it was unclear if the gains in resolution from the expansion factor are traded off by the scaling laws of current optical systems. 

      Paraphrasing the reviewer: Expanding the tissue requires imaging larger volumes and allows lower optical resolution. What has been gained?

      The answer to the reviewer’s question is nuanced and contains four parts. 

      First, optical engineering requirements are more forgiving for lenses with lower resolution. Lower resolution lenses can have much larger fields of view (in real terms: the number of resolvable elements, proportional to ‘etendue’) and much longer working distances. In other words, it is currently more feasible to engineer lower resolution lenses with larger volumetric coverage, even when accounting for the expansion factor. 

      Second, these lenses are also much better corrected compared to higher resolution (NA) lenses. They have a flat field of view, negligible pincushion distortions, and constant resolution across the field of view. We are not aware of comparable performance for high NA objectives, even when correcting for expansion.

      Third, although clearing and expansion render tissues ‘transparent’, there still exist refractive index inhomogeneities which deteriorate image quality, especially at larger imaging depths. These effects are more severe for higher optical resolutions (NA), because the rays entering the objective at higher angles have longer paths in the tissue and will see more aberrations. For lower NA systems, such as ExA-SPIM, the differences in paths between the extreme and axial rays are relatively small and image formation is less sensitive to aberrations. 

      Fourth, aberrations are proportional to the index of refraction inhomogeneities (dn/dx). Since the index of refraction is roughly proportional to density, scattering and aberration of light decrease as M^3, where M is the expansion factor. In contrast, the imaging path length through the tissue only increases as M. This produces a huge win for imaging larger samples with lower resolutions. 
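
      One way to read the fourth point is as a back-of-the-envelope scaling estimate. The expression below is only a sketch, and it assumes (this is not stated in the response) that the aberration accumulated along a ray is roughly the product of the local inhomogeneity strength and the path length through the tissue:

```latex
\[
\underbrace{\frac{dn}{dx} \;\propto\; M^{-3}}_{\text{inhomogeneity}}
\qquad
\underbrace{L \;\propto\; M}_{\text{path length}}
\qquad\Longrightarrow\qquad
\frac{dn}{dx}\cdot L \;\propto\; M^{-3}\cdot M \;=\; M^{-2}
\]
```

      Under that assumption, the accumulated aberration would still fall roughly as the square of the expansion factor even though the sample, and hence the imaging path, becomes M times longer.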

      To our knowledge there are no convincing demonstrations in the literature of diffraction-limited ASLM imaging at a depth of 1 cm in cleared mouse brain tissue, which would be equivalent to the ExA-SPIM imaging results presented in this manuscript.  

      In the discussion of the revised manuscript we discuss these factors in more depth. 

      - It was unclear if 300 nm lateral and 800 nm axial resolution is enough for many questions in neuroscience. Segmenting spines, distinguishing pre- and postsynaptic densities, or tracing densely labeled neurons might be challenging. A discussion about the necessary resolution levels in neuroscience would be appreciated. 

      We have previously shown good results in tracing the thinnest (100 nm thick) axons over cm scales with 1.5 µm axial resolution. It is the contrast (SNR) that matters, and the ExA-SPIM contrast exceeds the block-face 2-photon contrast, not to mention imaging speed (> 10x).

      Indeed, for some questions, like distinguishing fluorescence in pre- and postsynaptic structures, higher resolutions will be required (0.2 µm isotropic; Rah et al Frontiers Neurosci, 2013). This could be achieved with higher expansion factors.

      This is not within the intended scope of the current manuscript. As mentioned in the discussion section, we are working towards ExA-SPIM-based concepts to achieve better resolution through the design and fabrication of a customized imaging lens that maintains a high volumetric coverage with increased numerical aperture.  

      - Would it be possible to characterize the aberrations that might be still present after whole brain expansion? One approach could be to image small fluorescent nanospheres behind the expanded brain and recover the pupil function via phase retrieval. But even full width half maximum (FWHM) measurements of the nanospheres' images would give some idea of the magnitude of the aberrations. 

      We have now included a supplementary figure highlighting images of small axon segments within distal regions of the brain.

      Reviewer #2 (Public Review)

      Summary: 

      In this manuscript, Glaser et al. describe a new selective plane illumination microscope designed to image a large field of view that is optimized for expanded and cleared tissue samples. For the most part, the microscope design follows a standard formula that is common among many systems (e.g. Keller PJ et al Science 2008, Pitrone PG et al. Nature Methods 2013, Dean KM et al. Biophys J 2015, and Voigt FF et al. Nature Methods 2019). The primary conceptual and technical novelty is to use a detection objective from the metrology industry that has a large field of view and a large area camera. The authors characterize the system resolution, field curvature, and chromatic focal shift by measuring fluorescent beads in a hydrogel and then show example images of expanded samples from mouse, macaque, and human brain tissue. 

      Strengths: 

      I commend the authors for making all of the documentation, models, and acquisition software openly accessible and believe that this will help assist others who would like to replicate the instrument. I anticipate that the protocols for imaging large expanded tissues (such as an entire mouse brain) will also be useful to the community. 

      Weaknesses: 

      The characterization of the instrument needs to be improved to validate the claims. If the manuscript claims that the instrument allows for robust automated neuronal tracing, then this should be included in the data. 

      The reviewer raises a valid concern. Our assertion that the resolution and contrast are sufficient for robust automated neuronal tracing is overstated based on the data in the paper. We are hard at work on automated tracing of datasets from the ExA-SPIM microscope, and we have demonstrated full reconstruction of axonal arbors encompassing >20 cm of axonal length. However, including these methods and results is outside the scope of the current manuscript. 

      The claims of robust automated neuronal tracing have been appropriately modified.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Smaller questions to the authors: 

      - Would a multi-directional illumination and detection architecture help? Was there a particular reason the authors did not go that route?

      Despite the clarity of the expanded tissue, and the lower numerical aperture of the ExA-SPIM microscope, image quality still degrades slightly towards the distal regions of the brain relative to both the excitation and detection objective. Therefore, multi-directional illumination and detection would be advantageous. Since the initial submission of the manuscript, we have undertaken re-designing the optics and mechanics of the system. This includes provisions for multi-directional illumination and detection. However, this new design is beyond the scope of this manuscript. We now mention this in L254-255 of the Discussion section.

      - Why did the authors not use the same objective for illumination and detection, which would allow isotropic resolution in ASLM? 

      The current implementation of ASLM requires an infinity corrected objective (i.e. conjugating the axial sweeping mechanism to the back focal plane). This is not possible due to the finite conjugate design of the ExA-SPIM detection lens.

      More fundamentally, pushing the excitation NA higher would result in a shorter light sheet Rayleigh length, which would require a smaller detection slit (shorter exposure time, lower signal to noise ratio). For our purposes an excitation NA of 0.1 is an excellent compromise between axial resolution, signal to noise ratio, and imaging speed. 

      For other potentially brighter biological structures, it may be possible to design a custom infinity corrected objective that enables ASLM with NA > 0.1.
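
      The trade-off between excitation NA and light-sheet Rayleigh length described above can be illustrated with the standard paraxial Gaussian-beam relation. The sketch below is illustrative only: the 561 nm wavelength is the excitation wavelength quoted elsewhere in this response, while the comparison NA of 0.30 and the refractive index of 1.0 are assumptions chosen for the example and are not values from the manuscript.

```python
import math


def rayleigh_length_um(wavelength_um, na, refractive_index=1.0):
    """Paraxial Gaussian-beam Rayleigh length: z_R = n * lambda0 / (pi * NA^2)."""
    return refractive_index * wavelength_um / (math.pi * na ** 2)


wavelength_um = 0.561  # 561 nm excitation, as quoted in the response

z_r_low = rayleigh_length_um(wavelength_um, 0.10)   # excitation NA stated by the authors
z_r_high = rayleigh_length_um(wavelength_um, 0.30)  # hypothetical higher NA, for comparison only

print(f"z_R at NA 0.10: {z_r_low:.1f} um")
print(f"z_R at NA 0.30: {z_r_high:.1f} um")
print(f"ratio: {z_r_low / z_r_high:.1f}x")
```

      Because the Rayleigh length scales as 1/NA^2, tripling the excitation NA shortens the in-focus region of the sheet ninefold in this approximation, which is the reason a higher NA would call for a narrower detection slit.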

      - Have the authors made any attempt to characterize distortions of the brain tissue that can occur due to expansion? 

      We have not systematically characterized the distortions of the brain tissue pre and post expansion. Imaged mouse brain volumes are registered to the Allen CCF regardless of whether or not the tissue was expanded. It is beyond the scope of this manuscript to include these results and processing methods, but we have confirmed that the ExA-SPIM mouse brain volumes contain only modest deformation that is easily accounted for during registration to the Allen CCF. 

      - The authors state that a custom lens with NA 0.5-0.6 can be designed, featuring similar specifications. Is there a practical design? Wouldn't such a lens be more prone to field curvature? 

      This custom lens has already been designed and is currently being fabricated. The lens maintains a similar space bandwidth product as the current lens (increased numerical aperture but over a proportionally smaller field of view). Over the designed field of view, field curvature is <1 µm. However, including additional discussion or results of this customized lens is beyond the scope of this manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      • System characterization: 

      - Please state what wavelength was used for the resolution measurements in Figure 2.

      An excitation wavelength of 561 nm was used. This has been added to the manuscript text.

      - The manuscript highlights that a key advance for the microscope is the ability to image over a very large 13 mm diameter field of view. Can the authors clarify why they chose to characterize resolution over an 8 mm diameter field rather than the full area? 

      The 13 mm diameter field of view refers to the diagonal of the 10.6 x 8.0 mm field of view. The results presented in Figure 1c are with respect to the horizontal x direction and vertical y direction. A note indicating that the 13 mm is with respect to the diagonal of the rectangular imaging field has been added to the manuscript text. The results were presented in this way to present the axial and lateral resolution as a function of y (the axial sweeping direction).
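
      The relationship between the rectangular field of view and the quoted 13 mm figure can be checked with a few lines of arithmetic. This is a minimal sketch using only the 10.6 x 8.0 mm dimensions given in the response; the function names are illustrative.

```python
import math

# Rectangular field of view quoted in the response, in millimetres.
fov_width_mm = 10.6
fov_height_mm = 8.0


def fov_diagonal_mm(width_mm, height_mm):
    """Diagonal of a rectangular field of view."""
    return math.hypot(width_mm, height_mm)


def fov_area_mm2(width_mm, height_mm):
    """Area of a rectangular field of view."""
    return width_mm * height_mm


diagonal_mm = fov_diagonal_mm(fov_width_mm, fov_height_mm)
area_mm2 = fov_area_mm2(fov_width_mm, fov_height_mm)

print(f"diagonal = {diagonal_mm:.2f} mm")
print(f"area     = {area_mm2:.1f} mm^2")
```

      The diagonal comes out at about 13.3 mm and the area at about 85 mm^2, in line with the 13 mm and 85 mm^2 values quoted for the system.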

      - The resolution estimates seem lower than I would expect for a 0.30 NA lens (which should be closer to ~850 nm for 515 nm emission). Could the authors clarify the discrepancy? Is this predicted by the Zemax model and due to using the lens in immersion media, related to sampling size on the camera, or something else? It would be helpful if the authors could overlay the expected diffraction-limited performance together with the plots in Figure 2C. 

      As mentioned previously, the resolution measurements were performed with 561 nm excitation and an emission bandpass of ~573 – 616 nm (595 nm average). Based on this we would expect the full width half maximum resolution to be ~975 nm. The resolution is in fact limited by sampling on the camera. The 3.76 µm pixel size, combined with the 5.0X magnification, results in a sampling of 752 nm. Based on the Nyquist criterion, the resolution is limited to ~1.5 µm. We have added clarifying statements to the text.
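
      The sampling argument in this reply reduces to two steps of arithmetic, sketched below. The pixel size, magnification, and emission band edges are the values quoted in the response; the factor of two is the usual Nyquist criterion (at least two samples per resolvable period), and the variable names are illustrative.

```python
# Values quoted in the response.
pixel_size_um = 3.76                # camera pixel size
magnification = 5.0                 # detection magnification
emission_band_nm = (573.0, 616.0)   # emission bandpass edges


def sample_space_pixel_nm(pixel_um, mag):
    """Camera pixel size projected back into sample space, in nanometres."""
    return pixel_um / mag * 1000.0


def nyquist_limited_resolution_nm(sampling_nm):
    """Smallest resolvable period under Nyquist sampling: two pixels."""
    return 2.0 * sampling_nm


sampling_nm = sample_space_pixel_nm(pixel_size_um, magnification)
nyquist_nm = nyquist_limited_resolution_nm(sampling_nm)
band_center_nm = sum(emission_band_nm) / 2.0

print(f"sample-space pixel: {sampling_nm:.0f} nm")
print(f"Nyquist-limited resolution: {nyquist_nm:.0f} nm")
print(f"emission band centre: {band_center_nm:.1f} nm")
```

      The Nyquist-limited value of roughly 1.5 µm exceeds the ~975 nm optical estimate given by the authors, which is why the camera sampling rather than the lens sets the measured resolution.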

      - I'm confused about the characterization of light sheet thickness and how it relates to the measured detection field curvature. The authors state that they "deliver a light sheet with NA = 0.10 which has a width of 12.5 mm (FWHM)." If we estimate that light fills the 0.10 NA, it should have a beam waist (2wo) of ~3 microns (assuming Gaussian beam approximations). Although field curvature is described as "minimal" in the text, it is still ~10-15 microns at the edge of the field for the emission bands for GFP and RFP proteins. Given that this is 5X larger than the light sheet thickness, how do the authors deal with this? 

      The generated light sheet is flat, with a thickness of ~ 3 µm. This flat light sheet will be captured in focus over the depth of focus of the detection objective. The stated field curvature is within 2.5X the depth of focus of the detection lens, which is equivalent to the “Plan” specification of standard microscope objectives.
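
      The ~3 µm sheet thickness discussed here can be reproduced to first order from the paraxial Gaussian-beam relation the reviewer alludes to. The sketch below assumes an ideal Gaussian beam that fills the 0.10 NA and uses the 561 nm excitation wavelength quoted earlier in the response; the actual beam profile of the instrument is not specified in this text.

```python
import math


def gaussian_waist_diameter_um(wavelength_um, na):
    """1/e^2 waist diameter of a paraxial Gaussian beam: 2*w0 = 2*lambda0 / (pi*NA)."""
    return 2.0 * wavelength_um / (math.pi * na)


def gaussian_fwhm_um(wavelength_um, na):
    """Intensity FWHM at the waist: FWHM = w0 * sqrt(2 * ln 2)."""
    w0 = wavelength_um / (math.pi * na)
    return w0 * math.sqrt(2.0 * math.log(2.0))


wavelength_um = 0.561   # 561 nm excitation, as quoted in the response
excitation_na = 0.10    # light-sheet NA stated in the manuscript quote

waist_diameter_um = gaussian_waist_diameter_um(wavelength_um, excitation_na)
waist_fwhm_um = gaussian_fwhm_um(wavelength_um, excitation_na)

print(f"2*w0 = {waist_diameter_um:.2f} um")
print(f"FWHM = {waist_fwhm_um:.2f} um")
```

      Depending on whether thickness is defined by the 1/e^2 diameter or the FWHM, the estimate falls between roughly 2 and 3.6 µm, which brackets the ~3 µm figure used by both the reviewer and the authors.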

      - In Figure 2E, it would be helpful if the authors could list the exposure times as well as the total voxels/second for the two-camera comparison. It's also worth noting that the Sony chip used in the VP151MX camera was released last year whereas the Orca Flash V3 chosen for comparison is over a decade old now. I'm confused as to why the authors chose this camera for comparison when they appear to have a more recent Orca BT-Fusion that they show in a picture in the supplement (indicated as Figure S2 in the text, but I believe this is a typo and should be Figure S3). 

      This is a useful addition, and we have added exposure times to the plot. We have also added a note that the Orca Flash V3 is an older generation sCMOS camera and that newer variants exist, including the Orca BT-Fusion. The BT-Fusion has a read noise of 1.0 e- rms versus 1.6 e- rms, and a peak quantum efficiency of ~95% vs. 85%. Based on the discussion in Supplementary Note S1, we do not expect that these differences in specifications would dramatically change the data presented in the plot. In addition, the typo in Figure S2 has been corrected to Figure S3.

      - In Table S1, the authors note that they only compare their work to prior modalities that are capable of providing <= 1 micron resolution. I'm a bit confused by this choice given that Figure 2 seems to show the resolution of ExA-SPIM as ~1.5 microns at 4 mm off center (1/2 their stated radial field of view). It also excludes a comparison with the mesoSPIM project which at least to me seems to be the most relevant prior work to this manuscript. This system is designed for imaging large cleared tissues like the ones shown here. While the original publication in 2019 had a substantially lower lateral resolution, a newer variant, Nikita et al bioRxiv (which is cited in general terms in this manuscript, but not explicitly discussed) also provides 1.5-micron lateral resolution over a comparable field of view. 

      We have updated the table to include the benchtop mesoSPIM from Nikita et al., Nature Communications, 2024. Based on this published version of the manuscript, the lateral resolution is 1.5 µm and axial resolution is 3.3 µm. Assuming the Iris 15 camera sensor, with the stated 2.5 fps, the volumetric rate (megavoxels/sec) is 37.41.
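
      The volumetric rate quoted for the benchtop mesoSPIM follows from multiplying the sensor pixel count by the frame rate. In the sketch below, the 2.5 fps frame rate comes from the response, whereas the 5056 x 2960 pixel sensor format is an assumption taken from the camera manufacturer's published specification for the Iris 15 and is not stated in this text.

```python
# Assumed sensor format for the Iris 15 (manufacturer specification, not from the response).
sensor_width_px = 5056
sensor_height_px = 2960

# Frame rate stated in the response.
frames_per_second = 2.5


def volumetric_rate_megavoxels_per_s(width_px, height_px, fps):
    """Voxels acquired per second when each frame is one full-sensor image plane."""
    return width_px * height_px * fps / 1e6


rate_mvox_s = volumetric_rate_megavoxels_per_s(
    sensor_width_px, sensor_height_px, frames_per_second
)

print(f"volumetric rate = {rate_mvox_s:.2f} megavoxels/sec")
```

      With these inputs the rate evaluates to about 37.41 megavoxels/sec, matching the value the authors report for the table.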

      - The authors state that, "We systematically evaluated dehydration agents, including methanol, ethanol, and tetrahydrofuran (THF), followed by delipidation with commonly used protocols on 1 mm thick brain slices. Slices were expanded and examined for clarity under a macroscope." It would be useful to include some data from this evaluation in the manuscript to make it clear how the authors arrived at their final protocol. 

      Additional details on the expansion protocol may be included in another manuscript.

      General comments: 

      • There is a tendency in the manuscript to use negative qualitative terms when describing prior work and positive qualitative terms when describing the work here. Examples include: 

      - "Throughput is limited in part by cumbersome and error-prone microscopy methods". While I agree that performing single neuron reconstructions at a large scale is a difficult challenge, the terms cumbersome and error-prone are qualitative and lacking objective metrics.

      We have revised this statement to be more precise, stating that throughput is limited in part by the speed and image quality of existing microscopy methods.

      - The resolution of the system is described in several places as "near-isotropic" whereas prior methods were described as "highly anisotropic". I agree that the ~1:3 lateral to axial ratio here is more isotropic than the 1:6 ratio of the other cited publications. However, I'm not sure I'd consider 3-fold worse axial resolution than lateral to be considered "near" isotropic.

      We agree that the term near-isotropic is ambiguous. We have modified the text accordingly, removing the term near-isotropic and where appropriate stating that the resolution is more isotropic than that of other cited publications.

      - exposures (which in the caption is described as "modest"). I'd suggest removing these qualitative terms and just stating the values.

      We agree and have changed the text accordingly.

      • The results section for Figure 5 is titled "Tracing axons in human neocortex and white matter". Although this section states "larger axons (>1 um) are well separated... allowing for robust automated and manual tracing" there is no data for any tracing in the manuscript. Although I agree that the images are visually impressive, I'm not sure that this claim is backed by data.

      We have now removed the text in this section referring to automated and manual tracing.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The paper investigates a potential cause of a type of severe epilepsy that develops in early life because of a defect in a gene called KCNQ2. The significance is fundamental because it substantially advances our understanding of a major research question. The strength of the evidence is convincing because appropriate methods are used that are in line with the state-of-the art, although there are some revisions/corrections that would strengthen the evidence further.

      Thank you for the expert, thorough, and helpful review.  We believe that addressing the reviewers’ points has improved our paper greatly.   

      Public Reviews:

      Reviewer #1 (Public Review):

      Abreo et al. performed a detailed multidisciplinary analysis of a pathogenic variant of the KCNQ2 ion channel subunit identified in a child with neonatal-onset epilepsy and neurodevelopmental disorders. These analyses revealed multiple molecular and cellular mechanisms associated with this variant and provided important insights into what distinguishes distinct pathogenic variants of KCNQ2 associated with self-limited familial neonatal epilepsy versus those leading to developmental and epileptic encephalopathy, and how they may mechanistically differ, to result in different extents of developmental impairment.

      The authors first provide a detailed clinical description of the patient heterozygous for a novel pathogenic variant encoding KCNQ2 G256W. They then model the structure of the G256W variant based on recent cryo-EM structures of KCNQ2 and other ion channel subunits and find that while the affected position is quite distinct from the channel pore, it participates in a novel, evolutionarily conserved set of amino acids that form a network of hydrogen bonds that stabilize the structure of the pore domain.

      They then undertake a series of rigorous and quantitative laboratory experiments in which the KCNQ2 G256W variant is coexpressed exogenously with WT KCNQ2 and KCNQ3 subunits in heterologous cells, and endogenously in novel gene-edited mice generated for this study. This includes detailed electrophysiological analyses in the transfected heterologous cells revealing the dominant-negative phenotype of KCNQ2 G256W. They found altered firing properties in hippocampal CA1 neurons in brain slices from the heterozygous KCNQ2 G256W mice.

      They next showed that the expression and localization of KCNQ channels are altered in brain neurons from heterozygous KCNQ2 G256W mice, suggesting that this variant impacts KCNQ2 trafficking and stability.

      Together, these laboratory studies reveal that the molecular and cellular mechanisms shaping KCNQ channel expression, localization, and function are impacted at multiple levels by the variant encoding KCNQ2 G256W, likely contributing to the clinical features of the child heterozygous for this variant relative to patients harboring distinct KCNQ2 pathogenic variants.

      Thank you for the thorough summary and assessment of the initial submission. We are very glad that our approach, analytical methods, and conclusions were convincing.

      Reviewer #2 (Public Review):

      Summary:

      The paper entitled "Plural molecular and cellular mechanisms of pore domain KCNQ2 encephalopathy" by Abreo et al. is a complex and integrated paper that is well-written with a focus on a single gene variant that causes a severe developmental encephalopathy. The paper collates clinical outcomes from 4 individuals and investigates a variant causing KCNQ2-DEE using a wide range of experimental techniques including structural biology, in vitro electrophysiology, generation of genetically modified animal models, immunofluorescence, and brain slice recordings. The overall results provide a plausible explanation of the pathophysiology of the G256W variant and provide important findings to the KCNQ2-DEE field as well as beginning to separate the understanding between seizures and encephalopathies.

      Strengths:

      (1) The authors describe in detail how the structural biology of the channel with a mutation changes the movement of the protein and adds insights into how one variant can change the function of the M-current. The proposed model linking this change to pathogenic consequences should help pave the way for additional studies to further support this type of approach.

      (2) The multiple co-expression ratio experiments drill down to the complex nature of the assembly of channels in over-expression systems and help to move toward an understanding of heterozygosity. It might have been interesting if TEA was tested as a blocker to better understand the assembly of the transfected subunits or possibly use vectors to force desired configurations.

      (3) The immunofluorescent approach to understanding re-distribution is another component of understanding the function of this critical current. The demonstration that Q2 and Q3 are diminished at the AIS is an important finding and a strength to the totality of the data presented in the paper.

      (4) Brain slice work is an important component of studying genetically modified animals as it brings in the systems approach, and helps to explain seizure generation and EEG recordings. The finding that G256W/+ neurons were more sensitive to current injections is a critical component of the paper.

      (5) The strength of this body of work is how the authors integrated different scientific approaches to knitting together a compelling set of experiments to better explain how a single variant, and likely extrapolation to other variants, can cause a severe neonatal developmental encephalopathy with a poor clinical outcome.

      Thank you for the thorough and encouraging reading of our work and its strengths. We are very glad that, apart from the issues mentioned, which we have addressed, our approach and conclusions were convincing.

      Weaknesses:

      (1) Minor comment: Under the clinical history it is unclear whether the mother was on levetiracetam for suspected in-utero seizures or if levetiracetam was given to individual 1. The latter seems more likely, and if so this should be reworded.

      We revised the results text to clarify that the drug was begun postnatally, after epilepsy was diagnosed in the child.   

      (2) As described in the clinical history of patient 1, treatment with ezogabine was encouraging with rapid onset by a parental global impression with difficulty in weaning off the drug. When studying the genetically modified mice, it would have been beneficial to the paper to talk about any ezogabine effects on the genetically modified mice.

      We agree this is of great interest, but sampling and metrics are challenging due to the very low frequency of seizures and delayed mortality in the heterozygous G256W mice. Accordingly, we have not performed ezogabine treatment experiments in the mice described in this study, which model a human variant associated with a brief neonatal window of frequent seizures. We hope to return to this issue using other transgenic mice with higher seizure frequency, but such results are outside the current scope.

      (3) It is a bit surprising that CA1 pyramidal neurons from the heterozygous G256W mice have no difference in resting membrane potential. The discussion section might explore this in a bit more detail.

      Thank you for raising this issue. This combination of outcomes has been seen previously and is interpreted as an outcome of low somatodendritic surface expression of the channels. Relatively higher expression within the AIS membrane, with its relatively small surface area and electrical isolation from the soma, allows the KCNQ2/3 channels to influence AIS excitability with little or (in this instance) undetectable influence on the RMP (see, e.g., Otto et al. 2006, PMID: 16481438, and Singh et al. 2008, PMID: 16481438, for KCNQ2 mutant mice; see Hu and Bean, 2018, figure 2, PMID: 29526554, for explicit testing via focal AIS vs. somatic blocker perfusion). Additionally, in previous work, we did not find any changes to the RMP of CA1 pyramidal neurons in either Kcnq2 knockout mice (PMID: 24719109) or mice expressing a Kcnq2 GOF variant (PMID: 37607817). We modified the discussion, including adding references to prior studies combining experimental and multicompartmental computational models.

      (4) It was mentioned in the paper about a direct comparison between SLFNE and G256W. However, in the slice recordings, there was no comparison. Having these data comparing SLFNE to G256W would have been a more fulsome story and would have added to the concept around susceptibility to action potential firing.

      Thank you for this point. We agree that such side-by-side recordings would be interesting. However, slice recordings were not performed on the SLFNE mice. The study design was based on the fact that extensive prior studies of both haploinsufficient and missense human SLFNE variant mice have been published (Otto et al. 2006 J Neuroscience, PMID: 16481438; Singh et al. 2008, PMID: 16481438; Kim et al. 2020, PMID: 31283873) and show good agreement, but DEE missense variants have not been previously studied. We revised the discussion to place the current DEE model results in the context of the prior SLFNE model slice work. We contrast the similarity of the cellular hyperexcitability phenotype ex vivo (at least in CA1 pyramidal cells) across models with the differences in electrographic and behavioral seizures (i.e., network-level physiology).

      Reviewer #3 (Public Review):

      Summary:

      This manuscript describes the symptoms of patients harboring KCNQ2 mutation G256W, functional changes of the mutant channel in exogenous expression, and phenotypes of G256W/+ mice. The patients presented seizures, the mutation reduced currents of the channel, and the G256W/+ mice showed seizures, increased firing frequency in neurons, reduced KCNQ2 expression, and altered subcellular distribution.

      Strengths:

      This is a large amount of work and all results corroborated the pathogenicity of the mutation in KCNQ2, providing an interesting example of KCNQ2-associated neurological disorder's impact on functions at all levels including molecular, cellular, tissue, animal model, and patients.

      Weaknesses:

      The manuscript described observations of changes in association with the mutation at molecular cellular functions and animal phenotype, but the results in some aspects are not as strong as in others. Nevertheless, the manuscript made overarching conclusions even when the evidence was not sufficiently strong.

      Thank you for your review.  In our revision (as listed in the recommendations to authors section) we have attempted to better justify the conclusions you mention there.

      Recommendations for the authors: 

      Reviewer #1 (Recommendations For The Authors):

      Suggestions for improved or additional experiments, data, or analyses.

      Page 7: the authors' statement that G256 could be intolerant to substitution would be strengthened by a straightforward analysis of available genome- and exome-wide sequencing data to determine the level of genic intolerance at this position in the human population, as has been used previously to highlight critical residues including those impacted by pathogenic variants in many other proteins including ion channels (e.g., Genome Biology 17:9, 2016; Am J Hum Genet 99:1261, 2016; Biochim Biophys Acta Biomemb 1862:183058, 2020).

      Thank you for this suggestion. We have revised the opening of this section to point out the low ratio of benign to pathogenic variants in the region surrounding G256 shown by prior work. We have added citations to the papers describing the MTR and gnomAD tools that highlight these data and calculations.

      The overall interpretation of the CHO cell results would be enhanced by the authors including in their discussion an explicit statement that they did not attempt to evaluate the overall and plasma membrane expression levels of the exogenously expressed WT and mutant KCNQ2 subunits, nor that of KCNQ3, in the transfected CHO cells. They could also highlight that this is an important future experiment to determine whether the dominant negative effects are due to impaired expression/trafficking or impaired function of plasma membrane channels, as this may be an important consideration for designing therapeutic strategies.

      We agree.  We revised the discussion to explicitly mention this additional direction.  We agree this topic has therapeutic implications, especially given our in vivo protein localization results.  We added a mention that combinations of molecules enhancing surface localization with channel openers could be a therapeutic strategy, analogous to approved therapies for cystic fibrosis.  

      The authors conclude that the impact of ezogabine treatment is reduced in the cells expressing G256+/W versus those expressing WT KCNQ2. However, the delta pA/pF graph in panel 3G expresses the effects of ezogabine as absolute increases in current density. Determining the relative increase (i.e., fold change) in current density in ezogabine-treated versus control conditions is a more valid way to analyze these data. This provides a better reflection of the impact of ezogabine as the control currents already have a much larger amplitude than the G256+/W currents. By eye the impact of ezogabine looks comparable or even larger for the G256+/W condition than for WT, fundamentally changing the interpretation of these results.

      Thank you for this helpful comment. The reviewer calls attention to the fact that although mean whole-cell currents from G256W/+ cells are smaller than those from WT cells, both before and after application of ezogabine, it appeared from Fig. 3G that ezogabine enhanced currents to a “proportionally equivalent extent” in G256W/+ and WT cells. We revised panel 3G to make this clearer. It now shows WT currents +/- ezogabine normalized to (WT, no ezogabine at +40 mV), along with G256W/+ currents +/- ezogabine normalized to (G256W/+, no ezogabine at +40 mV). This normalization shows that the mixed population of channels expressed by G256W/+ cells is equally augmented (with a trend toward greater augmentation) compared to controls. This is a striking result given that channels lacking WT KCNQ2 subunits (i.e., the “homozygous heteromer” condition, Fig. 3F) do not respond to ezogabine. Although the underlying data are unchanged, we agree with the reviewer’s conclusion about emphasizing the effect “per channel”. This reframing is mechanistically and clinically important. We have made changes to the results text and discussion to highlight related issues.
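
      The distinction between an absolute increase and a fold change can be illustrated with a minimal sketch. The current densities below are hypothetical values chosen only to show the arithmetic; they are not data from the study, and the function name is illustrative.

```python
def ezogabine_effect(control, treated):
    """Return (absolute increase, fold change) in current density (pA/pF)."""
    return treated - control, treated / control

# Hypothetical current densities at +40 mV (NOT measured values).
wt_abs, wt_fold = ezogabine_effect(control=100.0, treated=200.0)
het_abs, het_fold = ezogabine_effect(control=20.0, treated=44.0)

print(f"WT:      +{wt_abs:.0f} pA/pF, {wt_fold:.2f}-fold")
print(f"G256W/+: +{het_abs:.0f} pA/pF, {het_fold:.2f}-fold")
```

      With these made-up numbers the absolute increase is smaller in the mutant condition, while the fold change is slightly larger. This is the situation in which normalizing each genotype to its own untreated baseline changes the interpretation.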

      Figure 7: it is not clear from the information presented whether the qPCR would only measure WT KCNQ2 mRNA levels or detect levels of both WT and E254fs transcripts. The authors assume nonsense-mediated decay, but they did [not] determine experimentally that this occurred. The sequencing in the supplemental figure shows the presence of E254fs transcripts but does not allow for insights into their abundance. It should be straightforward to develop primer sets that could then be used to selectively amplify WT and E254fs transcripts for quantitation. 

      Thank you for this helpful suggestion. The assay used in the initial submission measures total Kcnq2 mRNA. We developed and performed a new assay where the probe binding site is the WT sequence, centered on the mutations. New Figure 7-Figure supplement 1, panel A is a cartoon showing the differences between the assays. Using the WT allele-selective RT-qPCR assay, both G256W/+ and E254fs/+ samples showed a 50% loss of WT Kcnq2. We now can conclude that NMD is absent for G256W and incomplete for E254fs mRNA. Neither mutant heterozygous line shows a compensatory increase in WT Kcnq2 expression. These conclusions are much more specific than previously, and documenting incomplete NMD of KCNQ2 is novel and of potential clinical significance. The KCNQ2 protein (western blot) and WT mRNA (qPCR) results now agree, both showing ~50% loss.

      For reporting transparency, the authors should provide the sequences of each of the primers used. Perhaps this is in the "key reagents" section, but this was missing from the manuscript. I note the authors use NMD in this section without defining it.

      We have added the assay catalogue numbers to the key reagents table. We eliminated the use of the NMD abbreviation. We added citations to the “incomplete NMD” literature, including an excellent recent review and a directly relevant primary paper. These show how NMD efficiency may differ between genes, transcripts, cells, tissues and, remarkably, between human individuals (see doi: 10.1093/hmg/ddz028, cited in the review; caffeine inhibits NMD!). The revised discussion mentions this and its relevance to future studies of novel KCNQ2 variant pathogenicity and severity prediction.

      Recommendations for improving the writing and presentation.

      I found the presentation of the IHC images deficient in terms of accessibility and transparency. While the movies provided are also useful, it is important the authors also provide conventional static merged images of each of their multiplex labeling images in the body of the paper. This allows a reader to see the labeling with the different antibodies in the context of each other (one of the major advantages of multiplex labeling), instead of trying to remember the pattern each label gave in prior sections of the movie.

      [We queried the reviewer via the eLife editorial staff]: To clarify my suggestion to improve Figure 8, the authors should generate from their movies static images that are basically what they already did in Fig8S3 for the G256W Het panel of the Fig8 movie. This involves revising Fig8S3 to include WT panels, and adding two new supplemental figures that show WT/Het panels with the separate antibodies and then a merged image from Fig8S1 and Fig8S2, just like they did in Fig8S3 for the mutant part of the Fig8 movie.

      Thank you for this comment. As suggested by the reviewer, for each IHC movie (Fig. 8, Fig. 8-figure supplement 1 and Fig. 8-figure supplement 2), we added a new supplementary  figure showing WT and mutant animal static images corresponding to the movies.  For main Figure 8 (CA1, G256W/+ comparison), the new static images enable evaluating the patterns of colocalization by providing selected portions of the images at the highest useful magnification.  These show  each individual antibody in greyscale (best for comparing) and 4 different green-red merged images to show overlap (yellow) vs non-overlap.  The merged images demonstrate colocalization of KCNQ2 and KCNQ3 at the distal portions of AnkG-labelled CA1 pyramidal cell AISs, in agreement with prior publications.  In G256W/+ but not E254fs/+ images, KCNQ2 and KCNQ3 show reduced relative labeling of AISs and increased relative labeling of somata in the pyramidal cell layer.   For CA3, the merged views show the redistributed relative labeling of KCNQ2 and KCNQ3 between stratum lucidum and stratum pyramidale.  

      We also revised Fig. 8 supplement 3 (CA1) to include WT panels. On reexamination, all WT interneurons in the small sample lacked somatic KCNQ2 and KCNQ3 labeling. Some s. oriens and radiatum AISs of both WT and G256W/+ sections showed KCNQ2 and KCNQ3 labeling, as shown in the revised figure. Counting statistics are included in the supporting data. Importantly, our belief that the images shown are representative is supported by the blinded analysis of a much larger sample (Figure 9, unchanged in revision).

      Dragging the movie viewer “slider” allows the viewer to move  back and forth between color channels.  It works well in eLife if used in that way.   This is a way of seeing the “representativeness” of the merges shown in the CA1 conventional static images, which necessarily include a smaller x-y area and include only a few AISs.   We also added a KCNQ2/KCNQ3 merge to the movies. 

      Western blot results in Figure 9 - Supplement 1: for transparency, the authors need to show the entire blot, as they did in Figure 4 - Supplement 2. This is required in many journals, and in the case of KCNQ2 it provides crucial information as to the different forms of KCNQ2 present on SDS gels in these samples that contain different KCNQ2 isoforms. Given the surprising decrease in levels of KCNQ2 monomer in the G256+/W mice, it is important to present and analyze the levels of the monomer, dimer, and higher oligomeric forms of KCNQ in these samples, to determine whether protein "missing" in the monomeric form is not present in the dimeric or higher oligomeric form. This is especially important as the G256W mutant could lead to misfolding and aggregation leading to a higher proportion of both WT and G256W subunits being present in a higher-order oligomeric form. I note that it is odd that the figure legend states "Images of entire filter used for western blot of lysates, probed for KCNQ2 and KCNQ3.", even though only selected portions are shown.

      Thank you for this suggestion. We agree that the wording of the legend needed improvement.  

      In revision, the western blots are renumbered as Figure 10 and Figure 10-Figure supplement 1. In the main figure, monomer bands and densitometry are shown, as previously. In the new Figure 10-Figure supplement 1, we show (1) the ECL image of the entire filter probed with rabbit anti-KCNQ2, (2) the same blot, stripped and reprobed with guinea pig anti-KCNQ3, and (3) the lower portion, probed with mouse anti-tubulin. The revised Fig. 10-fig supplement 1 shows 3 genotypes x 3 individual (male) p21 mice, with all steps performed in parallel from homogenization to ECL detection. As suggested, we performed new analysis of the immunoreactive bands corresponding to (apparent) monomer, dimer, and higher oligomeric forms of KCNQ2. Analysis of the sum of those bands showed loss of KCNQ2 protein in both mutant lines.

      The methods are sufficiently detailed with the exception that there is inconsistent inclusion of catalog numbers and RRIDs. Having these would improve transparency as to specific reagents used and would allow for enhanced reproducibility of the lab research performed here.

      The revised submission includes the key resources table, which we understood was not requested from eLife at initial submission. 

      Minor corrections to the text and figures.

      Typos/mistakes as to antibodies used in the IHC methods section: "anti-AnkG36 N106/36" should be "anti-AnkG N106/36", and "mouse anti-PanNav IgG1 supernatant" should be "mouse anti-PanNav IgG1 purified antibody".

      Thank you, corrections made.

      It would facilitate a reader's interpretation of the IHC results if the authors explicitly stated in the IHC results section that the KCNQ2 antibody used is against the N-terminus and therefore should recognize both mutant isoforms as the mutations are downstream of this.

      We added this point to the results section in relation to Figure 4-figure supplement 2 (western), and in IHC methods.

      PV is not defined when used in the discussion, nor is it explained why knowing that somatic KCNQ2 immunolabeling is present in both PV and non-PV interneurons of WT mice is of value to the reader.

      We revised these sentences for clarity.

      The IHC methods state that "mice were transcardially perfused with....ice cold 2% paraformaldehyde in PBS, freshly prepared from a 20% stock (Electron Microscopy Sciences).". The authors presumably mean "formaldehyde" as paraformaldehyde is the inert polymeric storage form of active depolymerized monomeric formaldehyde that is a fixative.

      The reviewer is correct regarding the chemistry; the manufacturer’s product name is “Paraformaldehyde 20% aqueous solution”.  We revised accordingly.

      Reviewer #3 (Recommendations For The Authors):

      Some comments regarding the presentation are as follows.

      (1) The section "G256W lies atop a dome-shaped hydrogen bond network linking helix S5 to the turret and selectivity filter" is entirely based on structural observations without functional validation. This may be more appropriate in Discussion. The emphasis on the "turret arch" bonding should be tuned down due to the lack of functional support.

      We understand and agree with this concern about the distinction between structural analysis and implied function.  However, we believe that the structural model reinterpretation and phylogenetic sequence analysis in our submission are results.  Structures as complex as those of KCNQ channels necessarily cannot be fully shown or analyzed in an initial publication. To our knowledge, the word “turret” has not appeared in a KCNQ channel cryoEM paper to date.  Bringing clinical motivation to prioritize study of an overlooked spot on the channel is creditworthy. The comprehensive heterologous patch clamp results in our study (including absence of effects on voltage-dependence, evidence of partial functional activity of channels containing one mutant subunit per channel shown for KCNQ2 homomers, KCNQ2/3 heteromers, and via acute ezogabine rescue experiments in the biologically most relevant heteromers) are functional evidence consistent with G256W acting through disruption of the SF.  

      However, we agree that more support is needed. The words “dome” and “arch”, though accurate for describing shape, tend to imply a mechanical “load bearing and distributing” function; our study does not prove this. Accordingly, we have toned down the emphasis by removing the words “keystone”, “turret dome bonding”, and “as a structural novelty” from the abstract. The revised discussion section replaces “arch” with “arch-shaped”, calls the idea that the turret functions as a stabilizing arch a “novel hypothesis”, and proposes next experiments (with relevant citations).

      Section title "Heterozygous G256W mice have neonatal seizures" does not seem to match the results since there was only one mouse that showed neonatal seizures.

      Thank you, we have revised the section title.  The text is transparent regarding sample size. The discussion highlights that these seizures are rare (indeed, not previously shown for any heterozygous missense model, to our knowledge).

      (2) It will be nice for the non-expert readers if the observations of "discrete seizures", "clusters", "diffuse bilateral onset", "unilateral onset" etc. are marked in Figure 1.

      Thank you for making this point. Figure 1 shows key excerpts of one bilateral onset seizure; a unilateral onset example isn’t shown since previous KCNQ2 DEE papers we cite have emphasized and illustrated focal onset seizures (Weckhuysen et al., 2013; Numis et al., 2014).    We revised the results section (p. 4) and Figure 1 and supplement captions to improve clarity for all readers including non-specialists.  

      (3) Figure 5 and page 10 first paragraph. Please specify the number of cells and the number of mice that were studied.

      Thank you, this information has been added to the legend.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      […]

      (1) The authors claim that the negative frequency dependence that maintains polymorphism in their model results from a non-linear relationship between the display trait and sexual success [...] Maybe I missed something, but the authors do not provide support for their claim about the negative frequency-dependence of sexual selection in their simulations. To do so they could (1) extract the relationship between the relative mating success of the two male types from the simulations and (2) demonstrate that polymorphism is not maintained if the relationship between male display trait and mating success is linear.

      We believe that there is a confusion of terminology here. We agree that for the two alleles at a locus impacting male display in our model, the allele conferring inferior display quality will have a fitness that increases as its frequency increases, so this allele displays positive frequency dependent fitness. And, the alternate, display-favoring allele at the locus does display negative frequency dependence. Our use of the terminology ‘negative frequency dependence’ was meant to refer to the negative dependence of the fitness of the display-favoring allele with respect to its own frequency. However, a significant body of literature instead discusses models in which both an allele and its alternate(s) are beneficial when at low frequency and deleterious when at high frequency under the same selective challenge, entailing negative frequency dependence of fitness for all alleles involved. This benefit-when-rare model of a single trait is often described simply as negative frequency dependence, and generates balancing selection at the locus, but is not the model we are presenting here, and does not encompass all models involving negative frequency dependent fitness. This lexical expectation may make the interpretation of our work more difficult, and we have amended the manuscript to make our model clearer (lines 227-231). In this model, we have a negative frequency dependence for the fitness of the display-favoring allele in mate competition, but the net selective disadvantage of this allele at high frequency is due to a cost in another, pleiotropic, fitness challenge: the constant survival effect. So, the alleles are under balancing selection where alternate alleles are favored by selection when rare, but not due solely to selection during mate competition. Instead, our model relies on pleiotropy for an emergent form of frequency-dependent balancing selection (in the sense that each allele is predicted to be beneficial on balance when rare).

      In the reviewer’s model of the success of two alleles at one locus, the ratio of success is vaguely linear with allele frequency for n=3, though it starts quite convex and has an inflection point between convex and concave segments (for the disfavored allele) at p≈0.532. This is visualized easily by plotting the function and its derivatives in Wolfram-Alpha. For n>=4, the fitness function with respect to the display-favoring/disfavoring allele becomes increasingly concave/convex respectively, and this specific nonlinearity is needed to act along with the antagonistic pleiotropy to maintain balancing selection, rather than being maintained by a model that favors any rare allele on the basis of its rarity in some manner. In an attempt to make the importance of the encounter number parameter clearer, we’ve generated new panels for Figure S1 which simulate encounter numbers 2, 3, and 4, and we have updated corresponding text and figure references in lines 335-338.
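
      The shape of this relationship can be explored numerically. The reviewer's exact function is not reproduced in this response, so the sketch below rests on an assumed best-of-n form: each female samples n males at random and mates with a display-favoring male whenever at least one is present. The function names and this model form are illustrative assumptions, not the simulation code used in the manuscript.

```python
def success_ratio(q, n):
    """Per-capita mating success of the display-disfavoring allele relative to
    the display-favoring allele, with the disfavoring allele at frequency q.

    ASSUMED model: a female samples n males at random and mates with a
    display-favoring male whenever at least one is among them.
    """
    p = 1.0 - q                      # frequency of the display-favoring allele
    w_disfavored = q ** (n - 1)      # q**n of matings, shared by a fraction q of males
    w_favored = (1.0 - q ** n) / p   # remaining matings, shared by a fraction p
    return w_disfavored / w_favored


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def inflection_point(n, lo=0.05, hi=0.95, tol=1e-6):
    """Bisect on [lo, hi] for a sign change of the second derivative."""
    f = lambda q: success_ratio(q, n)
    sign_lo = second_derivative(f, lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (second_derivative(f, mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(round(inflection_point(3), 3))
```

      Under this assumed form, the n = 3 curve is convex at low frequencies of the disfavored allele and concave at high frequencies, with the inflection falling close to the value of 0.532 quoted above.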

      For (1-2), it is not clear how to modify the simulation such that the relationship between the trait value and mating success can be perfectly linear - either linear with respect to allele frequency in a one locus model or linear with respect to trait value at a specific population composition, without removing the simulation of mate competition altogether. While it may be of interest to explore a more comprehensive range of biological trade-offs in future studies, we are not able to meaningfully do so within the context of the present manuscript.

      (2) The authors only explore versions of the model where the survival costs are paid by females or by both sexes. We do not know if polymorphism would be maintained or not if the survival cost only affected males, and thus if sexual antagonism is crucial.

      We now present simulations with male costs only as added panels to Figure S1 and mention these results in the main text (lines 334-335). Maintenance of the polymorphism is significantly reduced or completely absent in such simulations.

      (3) The authors assume no cost to aneuploidy, with no justification. Biologically, investment in aneuploid eggs would not be recoverable by Drosophila females and thus would potentially act against inversions when they are rare.

      We did offer some discussion and justification of our decision to model no inherent fitness effect of the inversion mutation itself, specifically aneuploidy, in lines 36-39 and 78-80 of the original reviewed preprint. Previous research suggests that D. melanogaster females may not actually invest in aneuploid eggs generated from crossover within paracentric inversions. While surprising, and potentially limited to a subset of clades, many ‘r-selected’ taxa or those in which maternal investment is spread out over time may have some degree of reproductive compensation for non-viable offspring, which can reduce the costs of generating aneuploids significantly (for example, t-haplotypes in mice). We have added this example and citation to lines 34ff in the current draft.

      (4) The authors appear to define balanced polymorphism as a situation in which the average allele frequency from multiple simulation runs is intermediate between zero and one (e.g., Figure 3). However, a situation where 50% of simulation runs end up with the fixation of allele A and the rest with the fixation of allele B (average frequency of 0.5) is not a balanced polymorphism. The conditions for balanced polymorphism require that selection favors either variant when it is rare.

      We originally chose mean final frequency for presenting the single locus simulations based on the ease of generating a visual plot that included information on fixation vs loss and equilibrium frequency. Figure 3 and related supplemental images have been changed to now also represent the proportion of simulations retaining polymorphism at the locus in the final generation.
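
      The difference between the two summaries can be shown with a small sketch. The replicate outcomes below are hypothetical and are not taken from our simulations; the function name is likewise illustrative.

```python
from statistics import mean


def summarize(final_freqs):
    """Return (mean final frequency, proportion of runs still polymorphic)."""
    polymorphic = [f for f in final_freqs if 0.0 < f < 1.0]
    return mean(final_freqs), len(polymorphic) / len(final_freqs)


# Hypothetical final allele frequencies from ten replicate runs (NOT real results).
fixed_or_lost = [0.0, 1.0] * 5  # every run ends in loss or fixation
balanced = [0.45, 0.50, 0.55, 0.50, 0.50, 0.48, 0.52, 0.50, 0.47, 0.53]

print(summarize(fixed_or_lost))
print(summarize(balanced))
```

      Both hypothetical sets have a mean final frequency of about 0.5, but only the second retains polymorphism in every run, which is why reporting the proportion of polymorphic runs alongside the mean addresses the reviewer's concern.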

      (5) Possibly the most striking result of the experiment is the fact that for 14 out of 16 combinations of inversion x maternal background, the changes in allele frequencies between embryo and adult appear greater in magnitude in females than in males irrespective of the direction of change, being the same in the remaining two combinations. The authors interpret this as consistent with sexually antagonistic pleiotropy in the case of In(3L)Ok and In(3R)K. The frequencies of adult inversion frequencies were, however, measured at the age of 2 months, at which point 80% of flies had died. For all we know, this may have been 90% of females and 70% of males that died at this point. If so, it might well be that the effects of inversion on longevity do not systematically differ between the ages and the difference in Figure 9B results from the fact that the sample includes 30% longest-lived males and 10% longest-lived females.

      This critique deserves some consideration. The aging adults were separated by sex during aging, but while we recorded the number of survivors, we did not record the numbers of eclosed adults and their sexes initially collected out of an interest in maintaining high throughput collection. We therefore cannot directly calculate the associated survival proportions, but we can estimate them. We collected 1960 females and 3156 males, and we can very roughly estimate survival if we assume that equal numbers of each sex eclosed, and that the survivors represent 20% of the original population. That gives 12790 individuals per sex, or 84.7% female mortality and 75.3% male mortality.

      So, we have added a qualification discussing the possibility of stronger selection on females and its influence on observed sex-specific frequency changes, on lines 602-605.

      (6) Irrespective of the above problem, survival until the age of 2 months is arguably irrelevant from the viewpoint of fitness consequences and thus maintenance of inversion polymorphism in nature. It would seem that trade-offs in egg-to-adult survival (as assumed in the model), female fecundity, and possibly traits such as female resistance to male harm would be much more relevant to the maintenance of inversion polymorphisms.

      Adult Drosophila will continue to reproduce in good conditions until mortality, and the estimated age of a mean reproductive event for a Drosophila melanogaster individual is 24 days (Pool 2015), and likewise for D. simulans (Turelli and Hoffman 1995). Given that reproduction is centered around 24 days, we expect sampling at 2 months of age to still be relevant to fitness. In seasonally varying climates, either temperate or with a long dry season, survival through challenging conditions is expected to require several months. In many such cases, females are in reproductive diapause, and so longevity is the main selective pressure. See lines 931-936 in the revised manuscript.

      As we agreed above, it would be of interest to investigate a wider range of trade-offs in future studies. We focused here on the balance between survival and male reproductive success because the latter trait generates negative frequency dependence for display-favoring alleles and a disproportionate skew towards higher quality competitors, whereas many other fitness-relevant traits lack that property.

      (7) The experiment is rather minimalistic in size, with four cages in total; given that each cage contains a different female strain, it essentially means N=1. The lack of replication makes statements like "In(2L)t and In(2R)NS each showed elevated survival with all maternal strains except ZI418N" (l. 493) unsubstantiated because the claimed special effect of ZI418N is based on a single cage subject to genetic drift and sampling error. The same applies to statements on inversion x female background interaction (e.g., l. 550), as this is inseparable from residual variation. It is fortunate that the most interesting effects appear largely consistent across the cages/female backgrounds. Still, I am wondering why more replicates had not been included.

      Our experimental approach might be described as “diversity replication”. Essentially, the four maternal genetic backgrounds are serving dual purposes: both to assess experimental consistency and to ensure that our conclusions are not solely driven by a single non-representative genotype (which, in so many published studies, cannot be ruled out). It would indeed be interesting if we could have quadrupled the size of our experiment by having four replicates per maternal background. However, we suspect the reviewer may not recognize the substantial effort involved in our four existing experiments. Each of these involved collecting 500+ virgin females, hand-picking thousands of embryos during the duration of egg-laying, and repeatedly transferring offspring to maintain conditions during aging, such that cages had to be staggered by more than a month. These four cages took a year of benchwork just to collect frozen samples, before any preparation and quality control of the associated amplicon libraries for sequencing. Adding a further multiplier would take it well beyond the scope of a single PhD thesis. Fortunately, we were able to obtain the key results of interest without that additional effort, even if clearer insights into the role of maternal background would also be of strong interest.

      We do agree that no firm conclusions about maternal background can be reached without further replication, and so we have qualified or removed relevant statements accordingly (lines 568ff, 620-622).

      Reviewer #1 (Recommendations For The Authors):

      The description of the model is confusing and incomplete, e.g., the values of several parameters used to obtain the numerical results are not given. It is first stated (l. 223) that the model is haploid, but text elsewhere talks about homozygotes and heterozygotes. If the model is diploid (this in itself is not clear), what is assumed about dominance?

      We are not presenting results for a mathematical model estimated numerically. We have now clarified our transition from a conceptual depiction of our model, in which we use haploid representations for simplified presentation, to our forward population genetic simulations, which are entirely diploid. More broadly, we have improved our communication of the assumptions and parameters used in our simulations. The scenarios we investigate involve purely additive trait effects within and between loci (except that survival probabilities are multiplicative to avoid negative values). We think that considering other dominance scenarios would be a worthy subject for a follow-up study, whereas the present manuscript is already covering a great deal of ground.   
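      As a minimal illustration of the additive-display, multiplicative-survival scheme described above (not the simulator's actual code), the sketch below scores one diploid individual. The function names, the allele representation, and all effect sizes are hypothetical.

```python
from math import prod

# Hypothetical representation: every allele copy an individual carries
# (across all loci and both homologs) is a pair
# (display_effect, survival_multiplier).

def display_score(alleles):
    """Display trait: effects add within and between loci (no dominance, no epistasis)."""
    return sum(display for display, _ in alleles)

def survival_probability(alleles):
    """Survival: per-allele multipliers in (0, 1] are multiplied, so the result never drops below 0."""
    return prod(survival for _, survival in alleles)

# One individual with two loci: heterozygous at the first, homozygous at the second.
individual = [
    (0.5, 0.90), (0.0, 1.00),   # locus 1: antagonistic allele / neutral allele
    (0.3, 0.95), (0.3, 0.95),   # locus 2: two copies of a weaker antagonistic allele
]

print(display_score(individual))
print(survival_probability(individual))
```

      The multiplicative form matters once many costly alleles accumulate: twelve additive costs of 0.1 would imply a survival probability below zero, whereas twelve multipliers of 0.9 still yield a valid probability.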

      Similarly, it is hard to understand the design (l.442ff). I was confused as to whether a population was set up for each inversion or for all of them and what the unit or replication was. I found the description in Methods (l. 763-771) much clearer and only slightly longer; I suggest the authors transfer it to the Results. Also, Figure 8 should contain the entire crossing scheme; the current version is misleading in that it implies males with only two genotypes.

      All four tested inversions were segregating within the same karyotypically diverse population of males, and were assayed from the same experiments. We have attempted to improve the relevant description. For Figure 8, we had trouble conceiving a graphic update that contained a more complete cross scheme without seeming much more confused and cluttered. We have tried to clarify in the relevant text and the figure caption instead.

      There are a number of small issues that should be addressed:

      - No epistasis for viability assumed - what would be the consequence?

      We explored a model in which we intentionally included no terms for epistatic effects on phenotype. All epistasis with regard to fitness is emergent from competition between individuals with phenotypes composed of non-epistatic, non-dominant genetic effects. So, the simplest model of antagonism would have no epistasis for viability whatsoever. One could explore a model that has emergent viability epistasis in a similar way, by implementing stabilizing selection on a quantitative trait with a gaussian or similar non-linear phenotype-to-fitness map, but that might be better served as a topic for a future study. We have, however, tried to make this intent clearer in the text.

      l. 750 implies that aneuploidy generated by the inversion has no cost (aneuploid gametes are resampled)

      Yes, as addressed in public review item (3). Alternately see lines 34ff, 293, 369, 392 for in-text edits.

      l. 24-25: unclear; is this to mean that there is haplotype x sex interaction for survival?

      l. 25: success in what? (I assume this will be explained in the paper, but the abstract should stand on its own).

      l. 193-4: "producing among most competitive males": something missing or a word too much??

      Figure 1B,C: a tiny detail, but the plots would be more intuitive if the blue (average) bars were after (i.e., to the right of) the male and female ones, given that the average is derived from the two sex-specific values.

      Each of the above has been edited or implemented as suggested.

      l. 205. It is convex function, but I do not understand what the authors mean by "convex distribution".

      Hopefully the updated text is clearer: “yielding a distribution of male reproductive output that follows a relatively convex trend”.

      l. 223ff: some references to Fig 1 panels in this paragraph seem off by one letter (i.e., A should be B, etc.).

      l. 231 "fitness...are equally fit": rephrase 

      l. 260: maybe "thrown out" is not the most fortunate term, maybe "eliminated" would be better?

      Each of the above has been edited or implemented as suggested.

      Figure 3: I do not understand the meaning of "additive" and "multiplicative" in the case of a single locus haploid model

      All presented simulations are diploid, and these refer to the interactions between the two alleles at the locus. Hopefully the language is overall clearer in this draft.

      l. 274: "Mutation of new nucleotide" meaning what? Or is it mutation _to_ a new nucleotide?

      Hopefully the revised text is clearer.

      Figure 5. The right panel of figure 5A implies that, with the inversion, the population evolves to an extreme display trait that is so costly that it kills 95% of all individuals (or of all females? What is assumed about this here?). Apart from the biological realism of this result, what does it say about the accumulation of polymorphism and maintenance of the inversion? The graphs in fig 5B do plot a divergence between haplotypes, but it is not clear how they relate to those in panel A - the parameter values used to generate these plots are again not listed. Furthermore, from the viewpoint of the polymorphism, it would be good to report the frequencies at the steady-state.

      We have now clarified the figure description, including the parameter values used. The distribution of frequencies at the end of the simulation is represented in figure 6. Given that we set up the simulation with assumptions that are otherwise common to population models, what biological process would prevent this extreme? Why isn’t this extreme observed in natural populations? One possible explanation is that they become sex chromosomes, with increasing likelihood as the cost increases. Or other compensatory changes may occur that we don’t simulate, like regulatory evolution giving a complementary phenotype. Maybe genetic constraints in natural populations prevent the mutation of the kind of pleiotropic mutations that drive this dynamic. The populations still survive, though they are parameterized by relative fitness. What would an absolute fitness population function be? Would it go extinct or not? It would be of interest to explore a wider range of models, but it is the purpose of this paper to establish that this is a viable model for the maintenance of sexually antagonistic polymorphism and association with inversions. We have added a paragraph motivated by this comment to the Discussion starting on line 765.

      l. 401-2: Z-like, W-like : please specify you are talking about patterns resembling sex chromosomes. 

      l. 738: "population calculates"?

      l. 743-4 and 746-7: is this the same thing said twice, or are there two components of noise?

      l. 357: there is no figure 5C.

      Each of the above has been addressed with text edits.

      L. 473-5: Yes, the offspring did not contain inversion homozygotes, but the sire pool did, didn't it? So homozygous inversions may have affected male reproductive success. Anyway, most of this paragraph (from line 473) seems to belong in Discussion rather than Results.

      We have revised this sentence to focus on offspring survival. 

      We can understand the reviewer’s suggestion about Results vs. Discussion text. While this can often be a challenging balance, we find that papers are often clearer if some initial interpretation is offered within the Results text. However, we moved the portion of this paragraph relating our findings to the published literature to the Discussion.

      l. 516: " In(3L)Ok favored male survival": this is misleading/confusing given the data, " In(3L)Ok reduced female survival more strongly than male survival..."

      Hopefully the phrasing is clearer now.

      l. 663ff: I did not have an impression that this section added anything new and could safely be cut.

      We have done some editing to make this more concise and emphasize what we think is essential, but we believe that the model of an autosomal, sexually antagonistic inversion differentiating before contributing to the origin of a sex chromosome is novel and interesting, and that this additional emphasis is worthwhile to encourage consideration of this idea in future research.

      l. 751: "flat probability per locus": do the authors mean a constant probability?

      Edited.

      Reviewer #2 (Public Review):

      The manuscript lacks clarity of writing. It is impossible to fully grasp what the authors did in this study and how they reached their conclusions. Therefore, I will highlight some cases that I found problematic.

      Hopefully the revised manuscript improves writing clarity. 

      Although this is an interesting idea, it clearly cannot explain the apparent influence of seasonal and clinal variation on inversion frequencies.

      We do not believe that our model predicts a non-existence of temporal and spatial dependence of the fitness of inverted haplotypes, nor do we seek to identify the manner in which seasonal and clinal differences affect fitness of inverted haplotypes. Rather, we argued that the influence of seasonal and clinal selection on inversions does not on its own predict the observed maintenance of inversions at low to intermediate frequencies across such a diverse geographic range, along with the higher frequencies of many derived inversions in more ancestral environments. 

      We might imagine that trade-offs between life history traits such as mate competition and survival should be universal across the range of an organism. But in practice, the fitness benefits and costs of a pleiotropic variant (or haplotype) may be heavily dependent on the environment. A harsh environment such as a temperate winter may both reduce the number of females that a male encounters (decreasing the benefit of display-enhancing variants) and also increase the likelihood that survival-costly variants lead to mortality (thus increasing their survival penalty). In light of such dynamics, our model would predict that equilibrium inversion frequencies should be spatially and temporally variable, in agreement with a number of empirical observations regarding D. melanogaster inversions.

      We have edited the introduction to emphasize that inversion frequencies vary temporally as well as seasonally, on lines 144ff. We also note relevant discussion of the potential interplay between the environment and trade-offs such as those we investigate, on lines 153-155.

      The simulations are highly specific and make very strong assumptions, which are not well-justified.

      We respond to all specific concerns expressed in the Recommendations For The Authors section below. We also note that we have made further clarifications throughout the text regarding the assumptions made in our analysis and their justification.  

      Reviewer #2 (Recommendations For The Authors):

      I think that the manuscript would greatly benefit from a major rewrite and probably also a reanalysis of the empirical data.

      In particular, a genome-wide analysis of differences in SNP frequencies between sexes and developmental stages would help the reader to appreciate that inversions are special.

      [moved up within this section for clarity] We are lacking a genomic null model: how often do the authors see similar allele frequency differences when looking at the entire genome? This could be easily done with whole genome Pool-Seq and would tell us whether inversions are really different from the genomic background. I think that this information would be essential given the many uncertainties about the statistical tests performed.

      We expect that autosome-wide SNP frequencies will be heavily influenced by the frequencies of inversions, which occur on all four major autosomal chromosome arms. These inversions often show moderate disequilibrium with distant variants (e.g. Corbett-Detig & Hartl 2012).

      Furthermore, the limited number of haplotypes present, given that the paternal population was founded from 10 inbred lines, would further enhance associations between inversions and distant variants. Therefore, we do not expect that whole-genome Pool-Seq data would provide an appropriate empirical null distribution for frequency changes. Instead, we have generated appropriate null predictions by accounting for both sampling effects and experimental variance, and we have aimed to make this methodology clearer in the current draft. 

      Some basic questions:

      why start at a frequency of 50% (line 287)?

      Isn't it obvious that in this scenario strong alleles with sexually antagonistic effects can survive?

      The initial goal of the associated Figure 4 was not to show that a strongly antagonistic variant could persist. Instead, we wanted to test the linkage conditions in which a second, relatively weaker antagonistic variant survived – which did not occur in the absence of strong linkage. 

      We have now added simulations with relatively lower initial frequencies, in which the weaker variant and the inversion both start at 0.05 frequency, while the stronger variant is still initialized at 0.5 to reflect the initial presence of one balanced locus with a strongly antagonistic variant. Here, the weaker antagonistic variant is still usually maintained when it is close to the stronger variant. Inversion-mediated maintenance of the weaker variant at greater distance from the stronger variant becomes less frequent than in the originally investigated case, but it still happens often enough to hypothetically allow for such outcomes over evolutionary time-scales.

      Still, we should also emphasize that the goals of this proof-of-concept analysis are to establish and convey some basic elements of our model. Subsequently, analyses such as those presented in Figures 5 and 6 provide clearer evidence that the hypothesized dynamics of inversions facilitating the accumulation of sexual antagonism actually occur in our simulations.

      The experiments seem to be conducted in replicate (which is of course essential), but I could not find a clear statement of how many replicates were done for each maternal line cross.

      How did the authors arrive at 16 binomial trials (line 473)? 4 inversions, 4 maternal genotypes?

      How were replicates dealt with?

      In Figure 9, it would be important to visualize the variation among replicates.

      Unfortunately, we did not have the bandwidth to perform replicates of each maternal line. Instead, we use four maternal backgrounds to simultaneously establish consistency across independent experiments and genetic backgrounds (see our response to Reviewer 1, point 7). We’ve edited the draft to make this clearer and more clearly delineate what is supported and not supported by our data. Replicate variation for the control replicates of the extraction and sequencing process, and the exact read counts of the experiment, are available in Supplemental Tables S5, S6, and S7.

      The statistical analysis of trade-off is not clear: which null model was tested? No frequency change? In my opinion, two significances are needed: a significant difference between parental and embryo and then embryo and adult offspring. The issue with this is, however, that the embryo data are used twice and an error in estimating the frequency of the embryos could be easily mistaken as antagonistic selection.

      Hopefully the description of our null model is clearer in the text, now starting around line 967 in the Methods. We are aware of the positive dependence when performing tests comparing the paternal to embryo and then embryo to offspring frequencies, and this is accounted for by our analysis strategy - see lines 1009-1012.

      It was not clear how the authors adjusted their chi-squared test expectations. Were they reinventing the wheel? There is an improved version of the chi-squared test, which accounts for sampling variation.

      We did not actually perform chi-square tests. Instead, we used the chi statistic from the chi-squared test as a quantitative summary of the differences in read counts between samples. We compared an observed value of chi to values for this statistic obtained from simulated replicates of the experiment. Sampling from this simulation generated our ‘expected’ distribution of read counts, sampled to match sources of variance introduced in the experimental procedure, but without any effect of natural selection, per lines 825ff in the original submission. Hence, we are approximating the likelihood of observing an empirical chi statistic by generating random draws from a model of the experiment and comparing values calculated from each draw to the experimental value: a Monte Carlo method of approximating a p-value for our data. We have attempted to make the structure of these simulations and their use as a null-model clearer in this draft.
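      The general logic of such a Monte Carlo p-value can be sketched as follows. This is an illustrative simplification, not our analysis code: the null here is plain binomial resampling of reads from a pooled frequency, whereas the simulations in the manuscript include further sources of experimental variance. The function names and the read counts in the example call are hypothetical.

```python
import random

def pearson_statistic(inv_a, total_a, inv_b, total_b):
    """Pearson statistic for a 2x2 table of inverted vs. standard read counts,
    used only as a numerical summary of the difference between two samples."""
    observed = [[inv_a, total_a - inv_a], [inv_b, total_b - inv_b]]
    row_totals = [total_a, total_b]
    col_totals = [inv_a + inv_b, (total_a - inv_a) + (total_b - inv_b)]
    grand_total = total_a + total_b
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                stat += (observed[i][j] - expected) ** 2 / expected
    return stat

def binomial_draw(n, p, rng):
    """Number of successes in n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))

def monte_carlo_p_value(inv_a, total_a, inv_b, total_b, n_sims=2000, seed=1):
    """Fraction of null replicates whose statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    observed_stat = pearson_statistic(inv_a, total_a, inv_b, total_b)
    # Simplified null: both samples share one underlying inversion frequency.
    pooled_freq = (inv_a + inv_b) / (total_a + total_b)
    as_extreme = 0
    for _ in range(n_sims):
        sim_a = binomial_draw(total_a, pooled_freq, rng)
        sim_b = binomial_draw(total_b, pooled_freq, rng)
        if pearson_statistic(sim_a, total_a, sim_b, total_b) >= observed_stat:
            as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (as_extreme + 1) / (n_sims + 1)

# Hypothetical read counts: 120/400 inverted reads vs. 90/400.
print(monte_carlo_p_value(120, 400, 90, 400))
```

      The statistic is never referred to a chi-squared distribution; it serves only to rank the observed data against the simulated replicates.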

      It is not sufficiently motivated why the authors model differences in the extraction procedure with a binomial distribution.

      Adding a source of variance here seemed necessary as running control sequencing replicates revealed that there was residual variance not fully recapitulated by sample-size-dependent resampling. Given that we were still sampling a number of draws from a binomial outcome (the read being from the inverted or standard arrangement), a binomial distribution seemed a reasonable model, and we fit the level of this additional noise source to an experiment-wide constant, read-count or genome-count independent parameter that best fit the variance observed in the controls (lines 830ff in the original draft). Clarification is made in this manuscript draft, lines 979-989.

      How many reads were obtained from each amplicon? It looks like the authors tried to mimic differences between technical replicates by a binomial distribution, which matches the noise for a given sample size, but this depends on the sequence coverage of the technical replicates.

      We provide read counts in Supplemental Tables S6 and S7. The relevant paragraph in the methods has been edited for clarity, lines 972ff. Accounting for sampling differences between replicates used a hypergeometric distribution for paternal samples to account for paternal mortality before collection, and the rest were resampled with a binomial distribution. There were two additional binomial samplings, to account for resampling the read counts and to capture further residual variance in the library prep that did not seem to depend on either allele or read counts.
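      A single null draw with layered sampling of this kind might look like the sketch below. The ordering of the steps, the parameter names (including `prep_noise_size`, standing in for the fitted experiment-wide noise constant), and the numbers in the example call are assumptions made for illustration; the Methods give the actual procedure.

```python
import random

def hypergeometric_draw(pop_size, n_inverted, n_sampled, rng):
    """Inverted chromosomes among n_sampled drawn without replacement."""
    population = [1] * n_inverted + [0] * (pop_size - n_inverted)
    return sum(rng.sample(population, n_sampled))

def binomial_draw(n, p, rng):
    """Number of successes in n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))

def simulate_paternal_read_count(pop_size, n_inverted, n_collected,
                                 prep_noise_size, n_reads, rng):
    """One null draw of inverted-arrangement read counts for a paternal sample."""
    # Step 1: mortality before collection, modeled as sampling without replacement.
    inv_collected = hypergeometric_draw(pop_size, n_inverted, n_collected, rng)
    freq = inv_collected / n_collected
    # Step 2: residual library-prep variance, modeled as a binomial draw whose
    # size is a fixed constant independent of read or genome counts.
    freq = binomial_draw(prep_noise_size, freq, rng) / prep_noise_size
    # Step 3: sequencing, modeled as binomial sampling of reads.
    return binomial_draw(n_reads, freq, rng)

rng = random.Random(7)
# Hypothetical values: 1000 paternal chromosomes, 250 inverted, 800 collected.
print(simulate_paternal_read_count(1000, 250, 800, 500, 2000, rng))
```

      Repeating such draws many times, with no selection term anywhere in the process, yields the null distribution of read counts against which the observed counts are compared.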

      It would be good to see an estimate for the strength of selection: 10% difference in a single generation appears rather high to me.

      Estimates of selection strength based on solving for a Wright-Fisher selection coefficient for each tested comparison can now be found in Table S8, mentioned in text on lines 589-590. The mean magnitude of selection coefficients for all paternal to embryo comparisons was 0.322, and for embryo to all adult offspring it was 0.648. For In(3L)Ok the mean selection coefficients were 0.479 and -0.53, and for In(3R)K they were -0.189 and 1.28, respectively. Some are of quite large magnitude, but we emphasize that the coefficients for embryo to adult are based on survival to old age, rather than developmental viability. That factor, in addition to the laboratory environment, makes these estimates distinct from selection coefficients that might be experienced in natural populations.
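      For readers who want to see how a selection coefficient can be backed out of a single-generation frequency change, the sketch below uses one common haploid-style form, p' = p(1 + s) / (1 + sp), solved for s. This particular form is an assumption for illustration and is not necessarily the exact expression behind Table S8; the frequencies in the example are hypothetical.

```python
def next_frequency(p, s):
    """Expected frequency after one round of selection with relative fitnesses 1+s and 1."""
    return p * (1 + s) / (1 + s * p)

def selection_coefficient(p_before, p_after):
    """Invert next_frequency: the s that moves p_before to p_after in one generation."""
    if not (0 < p_before < 1 and 0 < p_after < 1):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    return (p_after - p_before) / (p_before * (1 - p_after))

# Hypothetical example: an arrangement rising from 0.20 to 0.30.
s = selection_coefficient(0.20, 0.30)
print(round(s, 3))
print(round(next_frequency(0.20, s), 3))
```

      Under this form s is bounded below by -1 but unbounded above, so large positive values are possible for a sharp frequency increase within one generation.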

      Reviewer #3 (Public Review):

      Strengths:

      (1) …the authors developed and used a new simulator (although it was not 100% clear as to why SLiM could not have been used as SLiM has been used to study inversions).

      Before SLiM 3.7 or so (and including when we did the bulk of our simulation work), we do not think it would have been feasible to use SLiM to model the mutation of inversions with random breakpoints, and recombination between them, without altering the SLiM internals. Separately, needing to script custom selection, mutation, and recombination functions in Eidos would have slowed SLiM down significantly. Given our greater familiarity with Python and NumPy, and the ability to implement a similarly efficient simulator more quickly than by learning C++ and Eidos, we chose to write our own.

      It should be a fair bit easier to implement comparable simulations in SLiM now, but it will still require scripting custom mutation, selection, and recombination functions and would still result in a similarly slow runtime. The current script recipe recommended by SLiM for simulating inversions uses constants to specify the breakpoints of a single inversion, without the ability to draw multiple inversions from a mutational distribution, or model recombination between more complicated karyotypes. Hence, our simulator still seems to be a more versatile and functional option for the purposes of this study.

      Weaknesses:

      [Comments 1 through 4 on Weaknesses included numerous citation suggestions, and some discussion recommendations as well. In our revised manuscript, we have substantially implemented these suggestions. In particular, we have deepened our introduction of mechanisms of balancing selection and prior work on inversion polymorphism, integrating many suggested references. While especially helpful, these suggestions are too extensive to completely quote and respond to in this already-copious document. Therefore, we focus our response on two select topics from these comments, and then proceed to comment 5 thereafter.]

      (2) The general reduction principle and inversion polymorphism. In Section 1.2., the authors state that "there has not been a proposed mechanism whereby alleles at multiple linked loci would directly benefit from linkage and thereby maintain an associated inversion polymorphism under indirect selection." Perhaps I am misunderstanding something, but in my reading, this statement is factually incorrect. In fact, the simplest version of Dobzhansky's epistatic coadaptation model (see Charlesworth 1974; also see Charlesworth and Charlesworth 1973 and discussion in Charlesworth & Flatt 2021; Berdan et al. 2023) seems to be an example of exactly what the authors seem to have in mind here: two loci experiencing overdominance, with the double heterozygote possessing the highest fitness (i.e., 2 loci under epistatic selection, inducing some degree of LD between these loci), with subsequent capture by an inversion; in such a situation, a new inversion might capture a haplotype that is present in excess of random expectation (and which is thus fitter than average)…

      We agree that the quoted statement could be misleading and have rewritten it. We intended to point out that we are presenting a model in which all loci contribute additively (with respect to display) or multiplicatively (with respect to survival probability), without any dominance relationships or genetic interaction terms. And yet, the model generates epistatic balancing selection in a panmictic population under a constant environment. This represents a novel mechanism by which (the life-history characteristics of) a population would generate epistatic balancing selection as an emergent property, instead of assuming a priori that there is some balancing mechanism and representing frequency dependence, dominance effects, or epistatic interactions directly using model parameters. We have therefore refined the scope of the statement in question (lines 155-158). 

      (4) Hearn et al. 2022 on Littorina saxatilis snails. 

      A good reference. There is considerable work on ecotype-associated inversions in L. saxatilis, but we previously cut some discussion of this and of other populations with high gene flow but identifiable spatial structure for inversion-associated phenotypes (e.g. butterfly mimicry polymorphisms, Mimulus, etc.). Due to the spatially discrete environmental preferences and sampled ranges of the inversions in these populations, we considered these examples to be somewhat distinct from explaining inversion polymorphism in a potentially homogeneous and panmictic environment.

      (4) cont. A very interesting paper that may be worth discussing is Connallon & Chenoweth (2019) about dominance reversals of antagonistically selected alleles (even though C&C do not discuss inversions): AP alleles (with dominance reversals) affecting two or more life-history traits provide one example of such antagonistically selected alleles (also see Rose 1982, 1985; Curtsinger et al. 1994) and sexually antagonistically selected alleles provide another. The two are of course not necessarily mutually exclusive, thus making a conceptual connection to what the authors model here.

      We had removed a previously drafted discussion of dominance reversal for brevity’s sake, but this topic is once again represented in the updated draft of the manuscript with a short reference in the introduction, lines 76-80. We also mention ‘segregation lift’ (Wittmann et al. 2017) involving a similar reversal of dominance for fitness between temporally fluctuating conditions, as opposed to between sexes or life history stages. 

      (5) The model. In general, the description of the model and of the simulation results was somewhat hard to follow and vague. There are several aspects that could be improved:  [5](1) it would help the reader if the terminology and distinction of inverted vs. standard arrangements and of the three karyotypes would be used throughout, wherever appropriate.

      We have attempted to do so, using the suggested heterokaryotypic/homokaryotypic terminology.

      [5](2) The mention of haploid populations/situations and haploid loci (e.g., legend to Figure 1) is somewhat confusing: the mechanism modelled here, of course, requires suppressed recombination in the inversion/standard heterokaryotype; and thus, while it may make sense to speak of haplotypes, we're dealing with an inherently diploid situation. 

      While eukaryotes with haploid-dominant life history may still experience similar dynamics, we do expect that most male display competition is in diploid animals, and we are only simulating diploid fitnesses and experimenting with diploid Drosophila. We have tried to minimize the discussion of haploids in this draft.

      [5](3) The authors have a situation in mind where the 2 karyotypes (INV vs. STD) in the heterokaryotype carry distinct sets of loci in LD with each other, with one karyotype/haplotype carrying antagonistic variants favoring high male display success and with the other karyotype/haplotype carrying non-antagonistic alternative alleles at these loci and which favor survival. Thus, at each of the linked loci, we have antagonistic alleles and non-antagonistic alleles - however, the authors don't mention or discuss the degree of dominance of these alleles. The degree of dominance of the alleles could be an important consideration, and I found it curious that this was not mentioned (or, for that matter, examined). 

      In this study, our goal was to show that the investigated model could produce balanced and increasing antagonism without the need to invoke dominance. We think there would be a strong case for a follow-up study that more thoroughly investigates how dominance and other variables impact the parameter space of balanced antagonism, but this goal is beyond our capacity to pursue in this initial study. We’ve added several lines clarifying the absence of dominance from our investigated models, and pointing out that dominance could modulate the predictions of these models (lines 211-213, 278-282).

      [5](4) In many cases, the authors do not provide sufficient detail (in the main text and the main figures) about which parameter values they used for simulations; the same is true for the Materials & Methods section that describes the simulations. Conversely, when the text does mention specific values (e.g., 20N generations, 0.22-0.25M, etc.), little or no clear context or justification is being provided. 

      We have sought to clarify in this draft that 20N was chosen as an ample time frame to establish equilibrium levels and frequencies of genetic variation under neutrality. We present a time sequence in Figure 5, and these results indicate that antagonism has stabilized in models without inversions or with higher recombination rates, whereas its rate of increase has slowed in a model with inversions and lower levels of crossing over.

      The inversion breakpoints and the position of the locus with stronger antagonistic effects in Figure 4 were chosen arbitrarily for this simple proof of concept demonstration, with the intent that this locus was close to one breakpoint. Hopefully these and other parameters are clearer in the revised manuscript.

      [5](5) The authors sometimes refer to "inversion mutation(s)" - the meaning of this terminology is rather ambiguous.

      Edited, hopefully the wording is clearer now. The quoted phrase had uniformly referred to the origin of new inversions by a mutagenic process. 

      (6) Throughout the manuscript, especially in the description and the discussion of the model and simulations, a clearer conceptual distinction between initial "capture" and subsequent accumulation / "gain" of variants by an inversion should be made. This distinction is important in terms of understanding the initial establishment of an inversion polymorphism and its subsequent short- as well as long-term fate. For example, it is clear from the model/simulations that an inversion accumulates (sexually) antagonistic variants over time - but barely anything is said about the initial capture of such loci by a new inversion.

      We do not have a good method of assessing a transition between these two phases for the simulations in which both antagonistic alleles and inversions arise stochastically by a mutagenic process. However, we have tried to be clearer on the distinction in this draft: we have included simulations in Figure 4 with variants starting at lower frequencies, and we have tried to better contextualize the temporal trajectories in Figure 5 as (in part) modeling the accumulation of variants after such an origin.

      Reviewer #3 (Recommendations For The Authors):

      - In general: the whole paper is quite long, and I felt that many parts could be written more clearly and succinctly - the whole manuscript would benefit from shortening, polishing, and making the wording maximally precise. Especially the Introduction (> 8 pages) and Discussion (7.5 pages) sections are quite long, and the description of the model and model results was quite hard to follow.

      We have attempted to condense some portions of the manuscript, but inevitably added to others based on important reviewer suggestions. Regarding the length of the Introduction and Discussion, we are covering a lot of intellectual territory in this study, and we aim to make it accessible to readers with less prior familiarity. At this point, we have well over 100 citations – far more than a typical primary research paper – in part thanks to the relevant sources provided by this reviewer. We are therefore optimistic that our text will provide a valuable reference point for future studies. We have also made significant efforts to clarify the Results and Methods text in this draft without notably expanding these sections.

      - In general: the conceptual parts of the paper (introduction, discussion) could be better connected to previous work - this concerns e.g. the theoretical mechanisms of balancing selection that might be involved in maintaining inversions; the general, theoretical role of antagonistic pleiotropy (AP) and trade-offs in maintaining polymorphisms; previously made empirical connections between inversions and AP/trade-offs; previously made empirical connections between inversions and sexual antagonism.

      In the revised manuscript, we have improved the connection of these topics to prior work.

      - L3: "accumulate". A clearer distinction could be made, throughout, between initial capture of alleles/haplotypes by an inversion vs. subsequent gain.

      Please see point 6 in the response to the Public Review, above.

      - L29: I basically agree about the enigma, however, there are quite many empirical examples in D. melanogaster / D. pseudoobscura and other species where we do know something about the nature of selection involved, e.g., cases of NFDS, spatially and temporally varying selection, fitness trade-offs, etc.

      At least for our focal species, we have emphasized that geographic (and now temporal) associations have been found for some inversions. For the sake of length and focus, we probably should not go down the road of documenting each phenotypic association that has been reported for these inversions, or say too much about specific inversions found in other species. As indicated in our response to reviewer 2, some previously documented inversion-associated trade-offs may be compatible with the model presented here. However, we did locate and add to our Discussion one report of frequency-dependent selection on a D. melanogaster inversion (Nassar et al. 1973).

      - L43: it is actually rather unlikely, though not impossible, that new inversions are ever completely neutral (see the review by Berdan et al. 2023).

      This line was intended to convey that, in line with Said et al. 2018’s results, the structural alterations involved in common segregating inversions are not expected to contribute significantly to the phenotype and fitness (as indicated by lack of strong regulatory effects), and that their phenotypic consequences are instead due to linked variation. We have rewritten this passage to better communicate this point, now lines 44-52. Interpreting Section 2 and Figure 1 of Berdan et al. 2023, the linked variation may be what is in mind when saying that inversions are almost never neutral. We have also added a line referencing the expected linked variation of a new inversion (lines 49-52).

      - L51-73: I felt this overview should be more comprehensive. The model by Kirkpatrick & Barton (2016) is in many ways less generic than the one of Charlesworth (1974) which essentially represents one way of modeling Dobzhansky's epistatic coadaptation. Also, the AOD mechanism is perhaps given too much weight here as this mechanism is very unlikely to be able to explain the establishment of a balanced inversion polymorphism (see Charlesworth 2023 preprint on bioRxiv). NFDS, spatially varying selection and temporally varying selection (for all of which there is quite good empirical evidence) should all be mentioned here, including the classical study of Wright and Dobzhansky (1946) which found evidence for NFDS (also see Chevin et al. 2021 in Evol. Lett.)

      On reflection, we agree that we put too much emphasis on AOD and have edited the section to be more representative.

      - L57. Two earlier Dobzhansky references, about epistatic coadaptation, would be: Dobzhansky, T. (1949). Observations and experiments on natural selection in Drosophila. Hereditas, 35(S1), 210-224. https://doi.org/10.1111/j.1601-5223.1949.tb03334.x; Dobzhansky, T. (1950). Genetics of natural populations. XIX. Origin of heterosis through natural selection in populations of Drosophila pseudoobscura. Genetics, 35, 288-302. https://doi.org/10.1093/genetics/35.3.288 - In general, in the introduction, the classical chapter by Lemeunier and Aulard (1992) should be cited as the primary reference and most comprehensive review of D. melanogaster inversion polymorphisms.

      - L101: this is of course true, though there are some exceptions, such as In(3R)Mo.

      - L110: the papers by Knibb, the chapter by Lemeunier and Aulard (1992), and the meta-analysis of INV frequencies by Kapun & Flatt (2019) could be cited here as well.

      Citation suggestions integrated.

      - L123 and elsewhere: the common D. melanogaster inversions are old but perhaps not THAT old - if we take the Corbett-Detig & Hartl (2012) estimates, then most of them do not really exceed an age of Ne generations, or at least not by much. I mean: yes, they are somewhat old but not super-old (cf. discussion in Andolfatto et al. 2001).

      Edited to curb any hyperbole. We agree that there are much more ancient polymorphisms in populations.

      - L133-135. This needs to be rewritten: this claim is incorrect, to my mind (Charlesworth 1974; also see Charlesworth and Charlesworth 1973; discussion in Charlesworth & Flatt 2021).

      Edited. See public review response (2).

      - L154: the example of inversion polymorphism is actually explicitly discussed in Altenberg's and Feldman's (1987) paper on the reduction principle.

      Edited to mention this. Inversions are also mentioned in Feldman et al. 1980, Feldman and Balkau 1973, Feldman 1972, and have been in discussion since the origins of the idea.

      - L162ff: see Connallon & Chenoweth (2019).

      Citation suggestion integrated, along with Cox & Calsbeek 2009 which seems more directly applicable, now line 185ff.

      - L169: why? There is much evidence for other important trade-offs in this system.

      Reworded.

      - L178-179: other studies have found that trade-offs/AP contribute to the maintenance of inversion polymorphisms, e.g. Mérot et al. 2020 and Betrán et al. 1998, etc.

      Added Betrán et al. 1998 - a good reference. Moved up mention of Mérot et al. 2020 from later in the text and directed readers to the Discussion, lines 202-205.

      - L198. "alternate inversion karyotypes" - you mean INV vs. STD? It would be good to adopt a maximally clear, uniform terminology throughout.

      Edited to communicate this better.

      - L215-217: this is a theoretically well-known result due to Hazel (1943); Dickerson (1955); Robertson (1955); e.g., see the discussion in the quantitative genetics book by Roff (1997) or in the review of Flatt (2020).

      Citations integrated, now lines 232ff.

      - L223 and L245: "haploid" - somewhat confusing (see public review). 

      - L259-260: This may need some explanation. 

      - L261-262: simply state that there is no recombination in D. melanogaster males.

      Edited for increased clarity.

      - L274 (and elsewhere): the meaning of "mutation...of new..inversion polymorphisms" is ambiguous - do you mean a polymorphic inversion and hence a new inversion polymorphism or do you mean polymorphisms/variants accumulating in an inversion?

      - L275: maybe better heterokaryotypic instead of heterozygous? (note that INV homokaryotypes or STD homokaryotypes can be homo- or heterozygous, so when referring to chromosomal heterozygotes instead of heterozygous chromosomes it may be best to refer to heterokaryotypes).

      Per [5](1) and [5](5) in the public review, we have edited our terminology.

      - L276: referral to M&M - I found the description of the model/simulation details there to be somewhat vague, e.g. in terms of parameter settings, etc.

      Further described.

      - L281-282: would SLiM not have worked?

      See public review response.

      - L286-287: why these parameters?

      Further described.

      - L296ff: it is not immediately clear that the loci under consideration are polymorphic for antagonistic alleles vs. non-antagonistic alternative alleles - maybe this could be made clear very explicitly.

      Edited to be explicit as suggested.

      - L341, 343: "inversion mutation" - meaning ambiguous.

      - L348, 352: "specified rate" - vague.

      - L354-357: initial capture and/or accumulation/gain? 

      - L401, 402, 404: Z-, W- and Y- are brought up here without sufficient context/explanation.

      The above have been addressed by edits in the text.

      - L523, 557, 639, 646, and elsewhere: not the first evidence - see the paper by Mérot et al. (2020) (and e.g. also by Yifan Pei et al. (2023)). 

      Citations integrated in the introduction and discussion. Mérot et al. (2020) was cited (L486 in original) but discussion was curtailed in the previous draft. 

      - L558-559. I agree but it is clear that there are many mechanisms of balancing selection that can achieve this, at least in principle; for some of them (NFDS, etc.) we have pretty good evidence. 

      - L576-577. This is correct but for In(3R)C that study did find a differential hot vs. cold selection response.

      Addressed with text edit. 

      - L584-L586: cf. Betrán et al. (1998), Mérot et al. (2020), Pei et al. (2023), etc.

      - L591. "other forms of balancing selection": yes! This should be stressed throughout. Multiple forms of balancing selection exist and they are not mutually exclusive. 

      - L593: consider adding Dobzhansky (1943), Machado et al. (2021) 

      - L596-597: this is rather unlikely, at least in terms of inversion establishment (see Charlesworth 2023; https://www.biorxiv.org/content/10.1101/2023.10.16.562579v1).

      - L608: consider adding Kapun & Flatt (2019).

      - L611-612: see studies by Mukai & Yamaguchi, 1974; and Watanabe et al., 1976. 

      - L639, 646: AP - see general literature on AP as a factor in maintaining polymorphism (Rose 1982, 1985; Curtsinger et al. 1994; Charlesworth & Hughes 2000 chapter in Lewontin Festschrift; Connallon & Chenoweth 2019 - this latter paper is particularly relevant in terms of AP effects in the context of sexual antagonism)

      Citation suggestions integrated.

      - L657: inversion polymorphism is explicitly discussed in Altenberg's and Feldman's (1987) paper on the reduction principle.

      Hopefully this is better communicated.

      - L724-755: I felt that this section generally lacks sufficient details, especially in terms of parameter choices and settings for the simulations.

      - L732L: why not state these rates?

      Parameter values are now given a fuller description in figure legends and in the methods.  

      - L746: but we know that mutational effect sizes are not uniformly distributed (?).

      We made this choice for simplicity and to avoid invoking a seemingly arbitrary distribution, but one could instead simulate trait effects with some gamma distribution. Display values would still have variable fitness effects that fluctuate with population composition, but we agree that a distribution shifted toward small effects would be more realistic.
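
      The contrast between uniform and gamma-distributed effect sizes can be illustrated with a minimal sketch. This is not the simulation code used in the study: the function names, the unit range for uniform effects, and the gamma shape (0.5) and scale (1.0) are all hypothetical values chosen only so that both distributions share a mean of 0.5.

```python
import random


def sample_effects(n, dist, rng):
    """Draw n hypothetical trait effect sizes from the named distribution."""
    if dist == "uniform":
        # Uniform on [0, 1]: small and large effects are equally likely.
        return [rng.uniform(0.0, 1.0) for _ in range(n)]
    if dist == "gamma":
        # Gamma with shape 0.5 and scale 1.0 (mean 0.5): most draws are small,
        # with a long tail of occasional large effects. Values are illustrative.
        return [rng.gammavariate(0.5, 1.0) for _ in range(n)]
    raise ValueError(f"unknown distribution: {dist}")


def fraction_below(values, threshold):
    """Share of sampled effects smaller than the threshold."""
    return sum(v < threshold for v in values) / len(values)


rng = random.Random(1)  # fixed seed so the comparison is repeatable
uniform_effects = sample_effects(10_000, "uniform", rng)
gamma_effects = sample_effects(10_000, "gamma", rng)

print("share of effects < 0.1 (uniform):", fraction_below(uniform_effects, 0.1))
print("share of effects < 0.1 (gamma):  ", fraction_below(gamma_effects, 0.1))
```

      Under these assumed parameters, a considerably larger share of gamma-distributed effects falls below 0.1 than under the uniform draw, which is the sense in which such a distribution is shifted toward small effects.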

      - L765: In(3R)P is not mentioned elsewhere - is this really correct?

      That was incorrect, fixed.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Malaria parasites detoxify free heme molecules released from digested host hemoglobins by biomineralizing them into inert hemozoin. Thus, why malaria parasites retain PfHO, a dead enzyme that loses the capacity of catabolizing heme, is an outstanding question that has puzzled researchers for more than a decade. In the current manuscript, the authors addressed this question by first solving the crystal structure of PfHO and aligning it with structures of other heme oxygenase (HO) proteins. They found that the N-terminal 95 residues of PfHO, which failed to crystalize due to their disordered nature, may serve as signal and transit peptides for PfHO subcellular localization. This was confirmed by subsequent microscopic analysis with episomally expressed PfHO-GFP and a GFP reporter fused to the first 83 residues of PfHO (PfHO N-term-GFP). To investigate the functional importance of PfHO, the authors generated an anhydrotetracycline (aTC) controlled PfHO knockdown strain. Strikingly, the parasites lacking PfHO failed to grow and lost their apicoplast. Finally, by chromatin immunoprecipitation (ChIP), quantitative PCR/RT-PCR, and growth assays, the authors showed that both the cognate N-terminus and HO-like domain were required for PfHO function as an apicoplast DNA interacting protein.

      The authors systematically performed multidisciplinary approaches to address this difficult question: what is the function of this enzymatically dead PfHO? I enjoyed reading this manuscript and its thoughtful discussion. This study is not only of clinical importance for antimalarial treatments but also deepens our understanding of protein function evolution. While I understand these experiments are challenging to conduct in malaria parasites, the data quality of some of the experiments could be improved. For example, most of the Western blots and Southern blots are not of high quality.

      We thank the reviewer for the positive comments but are a bit puzzled by the final statement about western and Southern blot quality. We agree that the two anti-PfHO western blots probed with custom antibody (Fig. 3- source data 2 and 8) have substantial background signal in the higher molecular mass region >75 kDa. However, we note that the critical region <50 kDa is clear in both cases and readily enables target band visualization. All other western blots probing GFP or HA epitopes are of high quality with minimal off-target background. We present two Southern blot images. We agree that the signal is somewhat faint for the Southern blot demonstrating on-target integration of the aptamer/TetR-DOZI plasmid (Fig. 3- fig. supplement 4), although we note that the correct band pattern for integration is visible. We also note that the accompanying genomic PCR data is unambiguous. The Southern blot for GFP-DHFRDD incorporation into the PfHO locus (Fig. 3- fig. supplement 1) has clear signal and strongly supports on-target integration. The minor background signal in the lower left region of the image does not extend into nor impact interpretation of correct clonal integration.

      Reviewer #2 (Public Review):

      Summary:

      Blackwell et al. investigated the structure, localization, and physiological function of Plasmodium falciparum (Pf) heme oxygenase (HO). Pf and other malaria parasites scavenge and digest large amounts of hemoglobin from red cells for sustenance. To counter the potentially cytotoxic effects of heme, it is biomineralized into hemozoin and stored in the food vacuole. Another mechanism to counteract heme toxicity is through its enzymatic degradation via heme oxygenases. However, it was previously found by the authors that PfHO lacks the ability to catalyze heme degradation, raising the intriguing question of what the physiological function of PfHO is. In the current contribution, the authors determine that PfHO localizes to the apicoplast, determine its targeting sequence, establish the essentiality of PfHO for parasite viability, and determine that PfHO is required for proper maintenance of apicoplasts and apicoplast gene expression. In sum, the authors establish an essential physiological function for PfHO, thereby providing new insights into the role of PfHO in plasmodium metabolism.

      Strengths:

      The studies are rigorously conducted and the results of the experiments unambiguously support a role for PfHO as being an apicoplast-targeted protein required for parasite viability and maintenance of apicoplasts.

      Weaknesses:

      While the studies conducted are rigorous and support the primary conclusions, the lack of experiments probing the molecular function of PfHO limits the impact of the work. Nevertheless, the knowledge that PfHO is required for parasite viability and plays a role in the maintenance of apicoplasts is still an important advance.

      We appreciate the positive assessment. We agree that further mechanistic understanding of PfHO function remains a key future challenge. Indeed, we made extensive efforts to unravel PfHO interactions that underpin its critical function. We elucidated key interactions with the apicoplast genome, reliance on the electropositive N-terminus, association with DNA-binding proteins, and a specific defect in apicoplast mRNA levels. The major limitation we faced in further defining PfHO function is the general lack of understanding of apicoplast transcription and broader gene expression. That limitation and the challenges to overcome it go well beyond our study and will require concerted efforts across several manuscripts (likely by multiple groups) to define the mechanistic features of apicoplast gene expression. We look forward to contributing further molecular understanding of PfHO function as broader understanding of apicoplast transcription emerges.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper investigates the effects of the explicit recognition of statistical structure and sleep consolidation on the transfer of learned structure to novel stimuli. The results show a striking dissociation in transfer ability between explicit and implicit learning of structure, finding that only explicit learners transfer structure immediately. Implicit learners, on the other hand, show an intriguing immediate structural interference effect (better learning of novel structure) followed by successful transfer only after a period of sleep.

      Strengths:

      This paper is very well written and motivated, and the data are presented clearly with a logical flow. There are several replications and control experiments and analyses that make the pattern of results very compelling. The results are novel and intriguing, providing important constraints on theories of consolidation. The discussion of relevant literature is thorough. In summary, this work makes an exciting and important contribution to the literature.

      Weaknesses:

      There have been several recent papers that have identified issues with alternative forced choice (AFC) tests as a method of assessing statistical learning (e.g. Isbilen et al. 2020, Cognitive Science). A key argument is that while statistical learning is typically implicit, AFC involves explicit deliberation and therefore does not match the learning process well. The use of AFC in this study thus leaves open the question of whether the AFC measure benefits the explicit learners in particular, given the congruence between knowledge and testing format, and whether, more generally, the results would have been different had the method of assessing generalization been implicit. Prior work has shown that explicit and implicit measures of statistical learning do not always produce the same results (eg. Kiai & Melloni, 2021, bioRxiv; Liu et al. 2023, Cognition).

      We agree that numerous papers in the Statistical Learning literature discuss how different test measures can lead to different results and, in principle, using a different measure could have led to varying results in our study. In addition, we believe there are numerous additional factors relevant to this issue, including the dichotomous vs. continuous nature of implicit vs. explicit learning and the complexity of the interactions between the (degree of) explicitness of the participants' knowledge and the applied test method. These factors transcend a simple labeling of tests as implicit or explicit and strongly constrain the type of variations the results of different tests would produce. Therefore, running the same experiments with different learning measures in future studies could provide additional interesting data with potentially different results.

      However, the most important aspect of our reply concerning the reviewer's comment is that although quantitative differences between the learning rate of explicit and implicit learners are reported in our study, they are not of central importance to our interpretations. What is central are the different qualitative patterns of performance shown by the explicit and the implicit learners, i.e., the opposite directions of learning differences for “novel” and “same” structure pairs, which are seen in comparisons within the explicit group vs. within the implicit group and in the reported interaction. Following the reviewer's concern, any advantage an explicit participant might have in responding to 2AFC trials using “novel” structure pairs should also be present in the replies of 2AFC trials using the “same” structure pairs and this effect, at best, could modulate the overall magnitude of the across groups (Expl/Impl.) effect but not the relative magnitudes within one group. Therefore, we see no parsimonious reason to believe that any additional interaction between the explicitness level of participants and the chosen test type would impede our results and their interpretation. We will make a note of this argument in the revised manuscript.

      Given that the explicit/implicit classification was based on an exit survey, it is unclear when participants who are labeled "explicit" gained that explicit knowledge. This might have occurred during or after either of the sessions, which could impact the interpretation of the effects.

      We agree that this is a shortcoming of the current design, and obtaining the information about participants’ learning immediately after Phase 1 would have been preferred. However, we made this choice deliberately as the disadvantage of assessing the level of learning at the end of the experiment is far less damaging than the alternative of exposing the participants to the exit survey question earlier and thereby letting them achieve explicitness or influence their mindset otherwise through contemplating the survey questions before Phase 2. Our Experiment 5 shows how realistic this danger of unwanted influence is: with a single sentence alluding to pairs in the instructions of Exp 5, we could completely change participants' quantitative performance and qualitative response pattern. Unfortunately, there is no implicit assessment of explicitness we could use in our experimental setup. We also note that given the cumulative nature of statistical learning, we expect that the effect of using an exit survey for this assessment only shifts absolute magnitudes (i.e. the fraction of people who would fall into the explicit vs. implicit groups) but not aspects of the results that would influence our conclusions.

      Reviewer #2 (Public Review):

      Summary:

      Sleep has not only been shown to support the strengthening of memory traces but also their transformation. A special form of such transformation is the abstraction of general rules from the presentation of individual exemplars. The current work used large online experiments with hundreds of participants to shed further light on this question. In the training phase, participants saw composite items (scenes) that were made up of pairs of spatially coupled (i.e., they were next to each other) abstract shapes. In the initial training, they saw scenes made up of six horizontally structured pairs, and in the second training phase, which took place after a retention phase (2 min awake, 12 h incl. sleep, 12 h only wake, 24 h incl. sleep), they saw pairs that were horizontally or vertically coupled. After the second training phase, a two-alternatives-forced-choice (2-AFC) paradigm, where participants had to identify true pairs versus randomly assembled foils, was used to measure the performance of all pairs. Finally, participants were asked five questions to identify whether they had insight into the pair structure, and post-hoc groups were assigned based on this. Mainly, the authors find that participants in the 2-minute retention experiment without explicit knowledge of the task structure were at chance level performance for the same structure in the second training phase, but had above chance performance for the vertical structure. The opposite was true for both sleep conditions. In the 12 h wake condition these participants showed no ability to discriminate the pairs from the second training phase at all.

      Strengths:

      All in all, the study was performed to a high standard and the sample size in the implicit condition was large enough to draw robust conclusions. The authors make several important statistical comparisons and also report an interesting resampling approach. There is also a lot of supplemental data regarding robustness.

      Weaknesses:

      My main concern regards the small sample size in the explicit group and the lack of experimental control.  

      The sample sizes of the explicit participants in our experiments are, indeed, much smaller than those of the implicit participants due to the process of how we obtain the members of the two groups. However, these sample sizes of the explicit groups are not small at all compared to typical experiments reported in Visual Statistical Learning studies; rather, they tend to be average to large sizes. It is the sizes of the implicit subgroups that are unusually high due to the aforementioned data collecting process. Moreover, the explicit subgroups have significantly larger effect sizes than the implicit subgroup, bolstering the achieved power that is also confirmed by the reported Bayes Factors that support the “effect” or the “no effect” conclusions in the various tests ranging in value from substantial to very strong. Based on these statistical measures, we think the sample sizes of the explicit participants in our studies are adequate.

      However, we do agree that the unbalanced nature of the sample and effect sizes can be problematic for the between-group comparisons. We aim to replace the Student’s t-tests that directly compare explicit and implicit participants with Welch’s t-tests, which are better suited for unequal sample sizes and variances.
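
      For readers less familiar with the distinction, Welch's test differs from Student's test in that it does not pool the two group variances and it adjusts the degrees of freedom accordingly. The following is a minimal sketch of the statistic using only the Python standard library; it is not the analysis code of the study, and the function name and the two small example groups are invented for illustration.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Per-group variance of the mean; the variances are NOT pooled.
    var_mean_a = statistics.variance(group_a) / n_a
    var_mean_b = statistics.variance(group_b) / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(
        var_mean_a + var_mean_b
    )
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (var_mean_a + var_mean_b) ** 2 / (
        var_mean_a ** 2 / (n_a - 1) + var_mean_b ** 2 / (n_b - 1)
    )
    return t, df


# Invented example: a small, low-variance group and a larger, high-variance one.
small_group = [1, 2, 3, 4, 5]
large_group = [2, 4, 6, 8, 10, 12]
t_value, dof = welch_t(small_group, large_group)
print(f"t = {t_value:.3f}, df = {dof:.2f}")
```

      The degrees of freedom come out lower than the n_a + n_b - 2 used by Student's test, which is how the procedure accounts for unequal variances and group sizes.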

      As for the lack of experimental control, indeed, we could not fully randomize consolidation condition assignment. Instead, the assignment was a product of when the study was made available on the online platform Prolific. This method could, in theory, lead to an unobserved covariate, such as morningness, being unbalanced between conditions. We do not have any reasons to believe that such a condition would critically alter the effects reported in our study, but as it follows from the nature of unobserved variables, we obviously cannot state this with certainty. Therefore, we will explicitly discuss these potential pitfalls in the revised version of the manuscript.  

      Reviewer #3 (Public Review):

      In this project, Garber and Fiser examined how the structure of incidentally learned regularities influences subsequent learning of regularities, that either have the same structure or a different one. Over a series of six online experiments, it was found that the structure (spatial arrangement) of the first set of regularities affected the learning of the second set, indicating that it has indeed been abstracted away from the specific items that have been learned. The effect was found to depend on the explicitness of the original learning: Participants who noticed regularities in the stimuli were better at learning subsequent regularities of the same structure than of a different one. On the other hand, participants whose learning was only implicit had an opposite pattern: they were better in learning regularities of a novel structure than of the same one. This opposite effect was reversed and came to match the pattern of the explicit group when an overnight sleep separated the first and second learning phases, suggesting that the abstraction and transfer in the implicit case were aided by memory consolidation.

      These results are interesting and can bridge several open gaps between different areas of study in learning and memory. However, I feel that a few issues in the manuscript need addressing for the results to be completely convincing:

      (1) The reported studies have a wonderful and complex design. The complexity is warranted, as it aims to address several questions at once, and the data is robust enough to support such an endeavor. However, this work would benefit from more statistical rigor. First, the authors base their results on multiple t-tests conducted on different variables in the data. Analysis of a complex design should begin with a large model incorporating all variables of interest. Only then, significant findings would warrant further follow-up investigation into simple effects (e.g., first find an interaction effect between group and novelty, and only then dive into what drives that interaction). Furthermore, regardless of the statistical strategy used, a correction for multiple comparisons is needed here. Otherwise, it is hard to be convinced that none of these effects are spurious. Last, there is considerable variation in sample size between experiments. As the authors have conducted a power analysis, it would be good to report that information per each experiment, so readers know what power to expect in each.

      Answering the questions we were interested in required us to investigate two related but separate types of effects within our data: general above-chance performance in learning, and within- and across-group differences.

      Above-chance performance: As is typical in SL studies, we needed to assess whether learning happened at all and which types of items were learned. For this, a comparison to the chance level is crucial and, therefore, a one-sample t-test is the statistical test of choice. Note that all our t-tests were subject to experiment-wise correction for multiple comparisons using the Holm-Bonferroni procedure, as reported in the Supplementary Materials.
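
      As an illustration of the two steps named above, the following is a minimal sketch of a one-sample t statistic against a chance level, followed by a Holm-Bonferroni step-down correction. It is not the authors' analysis code: the function names, the chance level of 0.5, and the example p-values are assumptions made only for this example, and the p-values are taken as given rather than computed.

```python
import math
import statistics


def one_sample_t(scores, chance=0.5):
    """t statistic for H0: mean(scores) == chance (chance=0.5 is an assumed value)."""
    n = len(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    return (statistics.mean(scores) - chance) / standard_error


def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # Visit the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared to alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            # Step-down rule: once one test fails, all larger p-values fail too.
            break
    return reject


if __name__ == "__main__":
    print(one_sample_t([0.6, 0.7, 0.5, 0.8, 0.65]))
    print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
```

      The step-down rule is what distinguishes this procedure from a plain Bonferroni correction: only the smallest p-value is held to the strictest threshold, and the threshold relaxes for each subsequent test.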

      Within- and across-group differences: To obtain our results regarding group and pair-type differences and their interactions, we used mixed ANOVAs and appropriate post-hoc tests, as the reviewer suggested. These results are reported in the Methods section.

      Concerning power analysis, we will add the requested information on achieved power by experiment to the revised version of the manuscript.  

      (2) Some methodological details in this manuscript I found murky, which makes it hard to interpret results. For example, the secondary results section of Exp1 (under Methods) states that phase 2 foils for one structure were made of items of the other structure. This is an important detail, as it may make testing in phase 2 easier, and tie learning of one structure to the other. As a result, the authors infer a "consistency effect", and only 8 test trials are said to be used in all subsequent analyses of all experiments. I found the details, interpretation, and decision in this paragraph to lack sufficient detail, justification, and visibility. I could not find either of these important design and analysis decisions reflected in the main text of the manuscript or in the design figure. I would also expect to see a report of results when using all the data as originally planned.  

      We thank the reviewer for pointing out these critical open questions in our manuscript that need further clarification. The inferred “consistency effect” is based on patterns found in the data, which show an increase in negative correlation between test types during the test phase. As this is apparently an effect of the design of the test phase and not an effect of the training phase, which we were interested in, we decided to minimize this effect as far as possible by focusing on the early test trials. For the revised version of the manuscript, we will revamp and expand how this issue was handled and also add a short comment in the main text, mentioning the use of only a subset of test trials and pointing the interested reader to the details.

      Similarly, the matched sample analysis is a great addition, but details are missing. Most importantly, it was not clear to me why the same matching method should be used for all experiments instead of choosing the best matching subgroup (regardless of how it was arrived at), and why the nearest-neighbor method with replacement was chosen, as it is not evident from the numbers in Supplementary Table 1 that it was indeed the best-performing method overall. Such omissions hinder interpreting the work.

      Since our approach provided four different balanced metrics (see Supp. Tables 1-4) for each matching method, it is not completely straightforward to make a principled decision across the methods. In addition, selecting the best method for each experiment separately carries the suspicion of cherry-picking the most suitable results for our purposes. For the revised version, we will expand on our description of the matching and decision process and add additional descriptive plots showing what our data looks like under each matching method for each experiment. These plots highlight that the matching techniques produce qualitatively similar results and that picking one of them over the other does not alter the conclusions of the test. The plots will give the interested reader all the necessary information to assess the extent to which our design decisions influence our results.
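
      For readers unfamiliar with the matching method under discussion, the following is a minimal sketch of nearest-neighbor matching with replacement on a single covariate, plus one simple balance check. It is not the authors' procedure: the single covariate, the function names, and the example values are hypothetical, and the balance check shown here is not one of the four metrics in the Supplementary Tables.

```python
import statistics


def nearest_neighbor_match(treated, controls):
    """For each treated value, return the index of the closest control value.

    Matching is done with replacement, so the same control can be chosen
    for several treated units. Ties go to the control with the lowest index.
    """
    return [
        min(range(len(controls)), key=lambda j: abs(controls[j] - t))
        for t in treated
    ]


def mean_gap_after_matching(treated, controls, matches):
    """Difference between the treated mean and the mean of the matched controls."""
    matched_values = [controls[j] for j in matches]
    return statistics.mean(treated) - statistics.mean(matched_values)


if __name__ == "__main__":
    treated = [1.0, 5.0, 5.2]      # hypothetical covariate values, group A
    controls = [0.0, 4.9, 10.0]    # hypothetical covariate values, group B
    matches = nearest_neighbor_match(treated, controls)
    print(matches)
    print(mean_gap_after_matching(treated, controls, matches))
```

      Because controls can be reused, every treated unit gets its closest available match, at the cost of some controls counting more than once in the matched sample.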

      (3) To me, the most surprising result in this work relates to the performance of implicit participants when phase 2 followed phase 1 almost immediately (Experiment 1 and Supplementary Experiment 1). These participants had a deficit in learning the same structure but a benefit in learning the novel one. The first part is easier to reconcile, as primacy effects have been reported in statistical learning literature, and so new learning in this second phase could be expected to be worse. However, a simultaneous benefit in learning pairs of a new structure ("structural novelty effect") is harder to explain, and I could not find a satisfactory explanation in the manuscript.  

      Although we might not have worded it clearly, we do not claim that our "structural novelty effect" comes from a “benefit” in learning pairs of the novel structure. Rather, we used the term “interference” and lack of this interference. In other words, we believe that one possible explanation is that there is no actual benefit for learning pairs of the novel structure but simply unhindered learning for pairs of the novel structure and simultaneous interference for learning pairs of the same structure. Stronger interference for the same compared to the novel structure items seems a reasonable interpretation, as similarity-based interference is well established in the general (not SL-specific) literature under the label of proactive interference. We will clarify these ideas in the revised manuscript.

      After possible design and statistical confounds (my previous comments) are ruled out, a deeper treatment of this finding would be warranted, both empirically (e.g., do explicit participants collapsed across Experiments 1 and Supplementary Experiment 1 show the same effect?) and theoretically (e.g., why would this phenomenon be unique only to implicit learning, and why would it dissipate after a long awake break?).

      Across all experiments, the explicit participants showed the same pattern of results but no significant difference between pair types, probably because the available sample sizes were insufficient. We already included in the main text the collapsed explicit results across Experiments 1-4 and Supplementary Experiment 1 (p. 16). This analysis confirmed that, indeed, there was a significant generalization for explicit participants across the two learning phases. We could re-run the same analysis for only Experiment 1 and Supplementary Experiment 1, but due to the small sample of N=12 in Suppl. Exp. 1, this test would likely be completely underpowered. Obtaining a sufficient sample size for this one test would require an excessive number (several hundred) of new participants.

      In terms of theoretical treatment, we already presented our interpretation of our results in the discussion section, which we can expand on in the revised manuscript.

    1. Author response:

      eLife assessment

      This study presents valuable findings on the role of a well-studied signal transduction pathway, the Slit/Robo system, in the context of the assembly of the hematopoietic niche in the Drosophila embryo. The evidence supporting the claims of the authors is solid. However, one aspect that needs attention is whether the cells are migrating and not being pushed to a more dorsal position through dorsal closure and/or other similar large-scale embryo movement. This does not detract from the very interesting analysis of PSC morphogenesis and will interest developmental biologists working on molecular mechanisms of tissue morphogenesis.

      We appreciate the thoughtful and quite useful comments provided by each of the referees. Our responses are noted below each referee’s comment.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study by Nelson et al. is focused on the formation of the Drosophila Posterior Signaling Center (PSC) which ultimately acts as a niche to support hematopoietic stem cells of the lymph gland (LG). Using a combination of genetics and live imaging, the authors show that PSC cells migrate as a tight collective and associate with multiple tissues during a trajectory that positions them at the posterior of the LG.

      This is an important study that identifies Slit-Robo signaling as a regulator of PSC morphogenesis, and highlights the complex relationship of interacting cell types - PSC, visceral mesoderm (VM), and cardioblasts (CBs) - in the coordinated development of these three tissues during organ development. However, one point requiring clarification is the idea that PSC cells exhibit a collective cell migration; it is not clear that the cells are migrating rather than being pushed to a more dorsal position through dorsal closure and/or other similar large-scale embryo movement. This does not detract from the very interesting analysis of PSC morphogenesis as presented.

      Since each referee asked for clarification concerning collective cell migration, we present a combined response further below, placed after the comments from Reviewer #3.

      Strengths:

      (1) Using the expression of Hid or Grim to ablate associated tissues, they find evidence that the VM and CB of the dorsal vessel affect PSC migration/morphology whereas the alary muscles do not. Slit is expressed by both VM and CBs, and therefore Slit-Robo signaling was investigated as PSCs express Robo.

      (2) Using a combination of approaches, the authors convincingly demonstrate that Slit expression in the CBs and VM acts to support PSC positioning. A strength is the ability to knockdown slit levels in particular tissue types using the Gal4 system and RNAi.

      (3) Although in the analysis of robo mutants, the PSC positioning phenotype is weaker in the individual mutants (robo1 and robo2) with only the double mutant (robo1,robo2) exhibiting a phenotype comparable to the slit RNAi. The authors make a reasonable argument that Slit-Robo signaling has an intrinsic effect, likely acting within PSCs because PSCs show a phenotype even when CBs do not (Figure 4G).

      (4) New insight into dorsal vessel formation by VM is presented in Figure 4A, B, as loss of the VM can affect dorsal vessel morphogenesis. This result additionally points to the VM as important.

      Weaknesses:

      (1) The authors are cautioned to temper the result that Slit-Robo signaling is intrinsic to PSC since the loss of robo may affect other cell types (besides CBs and PSCs) to indirectly affect PSC migration/morphogenesis. In fact, in the robo2, robo1 mutant, the VM appears to be incorrectly positioned (Figure 4G).

      We have reexamined our wording in the relevant Results section and, given that this referee agrees that we, “make a reasonable argument that Slit-Robo signaling has an intrinsic effect, likely acting within PSCs because PSCs show a phenotype even when CBs do not (Figure 4G)”, it was not clear how we might temper our conclusions more. Given that PSC cells express Robo1 and Robo2, and that the Vm does not contact the PSC, our ‘reasonable argument’ appears fair and parsimonious. Since we agree with the referee that a reader should be made as aware as possible of alternatives, we will add a comment to the Discussion, reminding the reader of the possibility of a secondary defect.

      (2) If possible, the authors should use RNAi to knockdown Robo1 and Robo2 levels specifically in the PSCs if a Gal4 is available; might Antp.Gal4 (Fig 1K) be useful? Even if knockdown is achieved in PSCs+CBs, this would be a better/complementary experiment to support the approach outlined in Figure 4D.

      While we agree that PSC-specific knockdown of Robo1 and Robo2 simultaneously would be ideal, this is not possible. First, the most-effective UAS-RNAi transgenes (that is, those in a Valium 20 backbone) are both integrated at the same chromosomal position; these cannot be simultaneously crossed with a GAL4 transgenic line to attempt double knock down. Additionally, as with all RNAi approaches that must rely on efficient knockdown over the rapid embryonic period, even having facile access to the above does not ensure the RNAi approach will cause as effective depletion as the genetic null condition that we use. Second, as the referee concedes, there is no embryonic PSC-specific GAL4. The proposed use of Antp-GAL4 would cause knockdown in many tissues (PSC, CB, Vm, epidermis and amnioserosa). This would lead to a reservation similar to that caused by our use of the straight genetic double mutant, as regards potential indirect requirement for Robo function.

      (3) Movies are hard to interpret, as it seems unclear that the PSCs actively migrate rather than being pushed/moved indirectly due to association with VM and CBs/dorsal vessel.

      First, the Vm does not directly contact the PSC, so it cannot be pushing the PSC dorsally. We will re-examine our text to be certain to make this clear. Second, in our analysis of bin mutants, which lack Vm, LGs and PSCs are able to reach the dorsal midline region in the absence of Vm. Finally, please see our response to Reviewer #3, point 2, for why we maintain that PSC cells are “migrating” even though some PSC cells are attached to CBs.

      Reviewer #2 (Public Review):

      The paper by Nelson KA, et al. explored the collective migration, coalescence, and positioning of the posterior signaling center (PSC) cells in Drosophila embryo. With live imaging, the authors observed the dynamic progress of PSC migration. Throughout this process, visceral mesoderm (VM), alary muscles (AMs), and cardioblasts (CBs) are in proximity to PSC. Genetic ablation of these tissues reveals the requirement for VM and CBs, but not AMs in this process. Genetic manipulations further demonstrated that Slit-Robo signaling was critical during PSC migration and positioning. While the genetic mechanisms of positioning the PSC were explored in much detail, including using live imaging, the functional consequence of mispositioning or (partial) absence of PSC cells has not been addressed, but would much increase the relevance of their findings. A few additional issues need to be addressed as well in this otherwise well-done study.

      Major points:

      (1) The only readout in their experiments is the relative correctness of PSC positioning. Importantly, what is the functional consequence if PSC is not properly positioned? This would be particularly important with robo-slit manipulations, where the PSC is present but some cells are misplaced. What is the consequence? Are the LGs affected, like the specification of their cell types, structure, and function? To address this for at least the robo-slit requirement in the PSC, it may be important to manipulate them directly in the PSC with a split Gal4 system, using Antp and Odd promoters.

      We agree that the functional consequence of PSC mis-positioning is important and a relevant question to eventually address. However, virtually all markers and reagents used to assess the effect of the PSC on progenitor cells and their differentiated descendants are restricted to analyses carried out on the third larval instar - some three days after the experiments reported here. Most of the manipulated conditions in our work are no longer viable at this phase and, thus, addressing the functional consequences of a malformed PSC will require the field to develop new tools. 

      As we noted in the Introduction, the consistency with which the wildtype PSC forms as a coalesced collective at the posterior of the LG strongly suggests importance of its specific positioning and shape, as has now been found for other niches (citations in manuscript). Additionally, in the Discussion we mention the existence of a gap junction-dependent calcium signaling network in the PSC that is important for progenitor maintenance. Without continuity of this network amongst all PSC cells (under conditions of PSC mis-positioning), we strongly anticipate that the balance of progenitors to differentiated hemocytes will be mis-managed, either constitutively, and / or under immune challenge conditions. 

      Finally, to our knowledge, the tools do not exist to build a “split Gal4 system using Antp and Odd promoters”. The expression pattern observed using the genomic Antp-GAL4 line must be driven by endogenous enhancers–none of which have been defined by the field, and thus cannot be used in constructing second order drivers. Similarly, for odd skipped, in the embryo the extant Odd-GAL4 driver expresses only in the epidermis, with no expression in the embryonic LG. Thus, the cis regulatory element controlling Odd expression in the embryonic LG is unknown. In the future, the discovery of an embryonic PSC-specific driver will aid in addressing the specific functional consequences of PSC mis-positioning.

      (2) The densely, parallel aligned fibers in the part of Figure 1J seemed to be visceral mesoderm, but further up (dorsally) that may be epidermis. Is it possible that the PSC migrates together with the epidermis? This should be addressed.

      See response to Reviewer #3.

      (3) Although the authors described the standards of assessing PSC positioning as "normal" or "abnormal", it is rather subtle at times and variable in the mutant or KD/OE examples. The criteria should be more clearly delineated and analyzed double-blind, also since this is the only readout. Further examples of abnormal positioning in supplementary figures would also help.

      We appreciate the Reviewer’s concern and acknowledge that the phenotypes we observed were indeed variable, and, at times subtle. As we show and discuss in the paper, our results revealed that the signaling requirements for proper PSC positioning are complex; this was favorably commented upon by Reviewer #1 (“...highlights the complex relationship of interacting cell types - PSC, visceral mesoderm (VM), and cardioblasts (CBs) - in the coordinated development of these three tissues during organ development.…”). We suspect the phenotypic variability is attributable to any number of biological differences such as heterogeneity of PSC cells and an accompanying difference in the timing of their competence to receive and respond to Slit-Robo signaling, the timing of release of Slit from CBs and Vm, number of cells in a given PSC, which PSC cells in the cluster respond to too little or too much signaling, and/or typical variability between organisms. Furthermore, PSC positioning analyses were conducted by two of the authors, who independently came to the same conclusions. For many of the manipulations double blinding was not possible since the genotype of the embryo was discernible due to the obvious phenotype of the manipulated tissue.

      (4) The Discussion is very lengthy and should [be] shortened.

      We will re-examine the prose and emphasize more conciseness, while maintaining clarity for the reader.

      Reviewer #3 (Public Review):

      Summary:

      This work is a detailed and thorough analysis of the morphogenesis of the posterior signaling center (PSC), a hematopoietic niche in the Drosophila larva. Live imaging is performed from the stage of PSC determination until the appearance of a compact lymph gland and PSC in the stage 16 embryo. This analysis is combined with genetic studies that clarify the involvement of adjacent tissue, including the visceral mesoderm, alary muscle, and cardioblasts/dorsal vessels. Lastly, the Slit/Robo signaling system is clearly implicated in the normal formation of the PSC.

      Strengths:

      The data are clearly presented, well documented, and fully support the conclusions drawn from the different experiments. The manuscript differs in character from the mainstay of "big data" papers (for example, no sets of single-cell RNAseq data of, for instance, PSC cells with more or less Slit input, are offered), but what it lacks in this regard, it makes up in carefully planned and executed visualizations and genetic manipulations.

      Weaknesses:

      A few suggestions concerning improvement of the way the story is told and contextualized.

      (1) The minute cluster of PSC progenitors (5 or so cells per side) is embedded (as known before and shown nicely in this study) in other "migrating" cell pools, like the cardioblasts, pericardial cells, lymph gland progenitors, alary muscle progenitors. These all appear to move more or less synchronously. What should also be mentioned is another tissue, the dorsal epidermis, which also "moves" (better: stretches?) towards the dorsal midline during dorsal closure. Would it be reasonable to speculate (based on previously published data) that without the force of dorsal closure, operating in the epidermis, at least the lateral>medial component of the "migration" of the PSC (and neighboring tissues) would be missing? If dorsal closure is blocked, do essential components of PSC and lymph gland morphogenesis (except for the coming-together of the left and right halves) still occur? Are there any published data on this?

      Each of the Reviewers is interested in our response to this very relevant question, and, thus, we will address the issue en bloc here. First, we will add a Supplementary Figure showing that LG and CBs are still able to progress medially towards the dorsal midline when dorsal closure stalls. This rules out any major effect for the most prominent “large-scale embryo cell sheet movement” in positioning the PSC. Second, published work by Haack et al. and Balaghi et al. shows that CBs and leading edge epidermal cells are independently migratory, and we will add this context to the manuscript for the reader.

      (2) Along similar lines: the process of PSC formation is characterized as "migration". To be fair: the authors bring up the possibility that some of the phenotypes they observe could be "passive"/secondary: "Thus, it became important to test whether all PSC phenotypes might be 'passive', explained by PSC attachment to a malforming dorsal vessel. Alternatively, the PSC defects could reflect a requirement for Robo activation directly in PSC cells." And the issue is resolved satisfactorily. But more generally, "cell migration" implies active displacement (by cytoskeletal forces) of cells relative to a substrate or to their neighbors (like for example migration of hemocytes). This to me doesn't seem really clearly to happen here for the dorsal mesodermal structures. Couldn't one rather characterize the assembly of PSC, lymph gland, pericardial cells, and dorsal vessel in terms of differential adhesion, on top of a more general adhesion of cells to each other and the epidermis, and then dorsal closure as a driving force for cell displacement? The authors should bring in the published literature to provide a background that does (or does not) justify the term "migration".

      Before addressing this specifically, we remind readers of our response above that states the rationale ruling out large, embryo-scale movements, such as epidermal dorsal closure, in driving PSC positioning. So, how are PSC cells arriving at their reproducible position? This manuscript reports the first live-imaging of the PSC as it comes to be positioned in the embryo. We interpret these movies to suggest strongly that these cells are a ‘collective’ that migrates. Neither the data, nor we, are asserting that each PSC cell is ‘individually’ migrating to its final position. Rather, our data suggest that the PSC migrates as a collective. The most paradigmatic example of directed, collective cell migration is that of Drosophila ovarian border cells. That cell cluster is surrounded at all times by other cells (nurse cells, in that case), and for the collective to traverse through the tissue, the process requires constant remodeling of associations amongst the migrating cells in the collective (the border cells), as well as between cells in the collective and those outside of it (the nurse cells). In fact, the nurse cells are considered the substrate upon which border cells migrate. Note also that in collective border cell migration cells within the collective can switch neighbors, suggesting dynamic changes to cell associations and adhesions.

      In our analysis, the PSC cells exhibit qualities reminiscent of the border cells, and thus we infer that the PSC constitutes a migratory cell collective.  We also show in Figure 1H that PSC cells exhibit cellular extensions, and thus have a very active, intrinsic actin-based cytoskeleton. In fact, in Figure 1I, we point out that PSC cells shift position within the collective, which is not only a direct feature of migration, but also occurs within the border cell collective as that collective migrates. Additionally, the fact that the lateral-most PSC cells shift position in the collective while remaining a part of the collective–and they do this while executing net directional movement–makes a strong argument that the PSC is migratory, as no cell types other than PSCs are contacting the surfaces of those shifting PSC cells. Lastly, the Reviewer’s supposition that, rather than migration, dorsal mesoderm structures form via “differential adhesion, on top of a more general adhesion of cells to each other” is, actually, precisely an inherent aspect of collective cell migration as summarized above for the ovarian border collective.

      In our resubmission we will adjust text citing the existing literature to better put into context the reasoning for why PSC formation based on our data is an example of collective cell migration.

      (3) That brings up the mechanistic centerpiece of this story, the Slit/Robo system. First: I suggest adding more detailed data from the study by Morin-Poulard et al 2016, in the Introduction, since these authors had already implicated Slit-Robo in PSC function and offered a concrete molecular mechanism: "vascular cells produce Slit that activates Robo receptors in the PSC. Robo activation controls proliferation and clustering of PSC cells by regulating Myc, and small GTPase and DE-cadherin activity, respectively". As stated in the Discussion: the mechanism of Slit/Robo action on the PSC in the embryo is likely different, since DE-cadherin is not expressed in the embryonic PSC; however, it may not be THAT different: it could also act on adhesion between PSC cells themselves and their neighbors. What are other adhesion proteins that appear in the late lateral mesodermal structures? Could DN-cadherin or Fasciclins be involved?

      We agree with the Reviewer that Slit-Robo signaling likely acts in part on the PSC by affecting PSC cell adhesion to each other and/or to CBs (lines 428-435). As stated in the Discussion, we do not observe Fasciclin III expression in the PSC until late stages when the PSC has already been positioned, suggesting that Fasciclin III is not an active player in PSC formation. Assessing whether the PSC expresses any other of the suite of potential cell adhesion molecules such as DN-Cadherin or other Fasciclins, and then studying their potential involvement in the Slit-Robo pathway in PSC cells, would be part of a follow-up study.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors develop a self-returning self-avoiding polymer model of chromosome organization and show that their framework can recapitulate at the same time local density and large-scale contact structural properties observed experimentally by various technologies. The presented theoretical framework and the results are valuable for the community of modelers working on 3D genomics. The work provides solid evidence that such a framework can be used, is reliable in describing chromatin organization at multiple scales, and could represent an interesting alternative to standard molecular dynamics simulations of chromatin polymer models.

      We thank the editor for an accurate description of the scope of the paper.

      Public Reviews:

      Reviewer #1 (Public Review):

      Carignano et al propose an extension of the self-returning random walk (SRRW) model for chromatin to include excluded volume aspects and use it to investigate generic local and global properties of the chromosome 3D organization inside eukaryotic nuclei. In particular, they focus on chromatin volumic density, contact probability, and domain size and suggest that their framework can recapitulate several experimental observations and predict the effect of some perturbations.

      We thank the reviewer for the attention paid to the manuscript and all the relevant comments.

      Strengths:

      - The developed methodology is convincing and may offer an alternative - less computationally demanding - framework to investigate the single-cell and population structural properties of 3D genome organization at multiple scales.

      - Compared to the previous SRRW model, it allows for investigation of the role of excluded volume locally.

      Excluded volume is accounted for everywhere, not locally. We emphasized this on page 3, line 182:

      “The method that we employ to remove overlaps is a low-temperature-controlled molecular dynamics simulation using a soft repulsive interaction potential between initially overlapping beads, that is terminated as soon as all overlaps have been resolved, as described in the Appendix 3.”
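
      To make the idea of global overlap removal concrete, the following is a minimal sketch that pushes apart every overlapping pair of beads, regardless of their separation along the chain, and stops as soon as no overlaps remain. It is a zero-temperature, overdamped relaxation standing in for the low-temperature molecular dynamics quoted above, not the procedure of Appendix 3. The bead diameter of 10 nm, the step factor, and the function names are assumptions made only for this example.

```python
import math


def resolve_overlaps(positions, diameter=10.0, step=0.5, max_iters=10000, tol=1e-9):
    """Push overlapping beads apart until every pair is at least `diameter` apart.

    positions: list of (x, y, z) tuples. diameter, step and tol are assumed values.
    Every pair is checked, so beads far apart along the chain are treated
    exactly like sequence neighbors.
    """
    pos = [list(p) for p in positions]
    n = len(pos)
    for _ in range(max_iters):
        any_overlap = False
        for i in range(n):
            for j in range(i + 1, n):
                d = math.dist(pos[i], pos[j])
                overlap = diameter - d
                if overlap > tol:
                    any_overlap = True
                    if d == 0.0:
                        # Coincident beads: pick an arbitrary push direction.
                        direction = [1.0, 0.0, 0.0]
                    else:
                        direction = [(a - b) / d for a, b in zip(pos[i], pos[j])]
                    # Soft repulsion: the displacement grows with the overlap depth.
                    shift = step * overlap
                    for k in range(3):
                        pos[i][k] += shift * direction[k]
                        pos[j][k] -= shift * direction[k]
        if not any_overlap:
            # Terminate as soon as all overlaps have been resolved.
            break
    return [tuple(p) for p in pos]


def min_pair_distance(positions):
    """Smallest distance between any two beads."""
    return min(
        math.dist(positions[i], positions[j])
        for i in range(len(positions))
        for j in range(i + 1, len(positions))
    )


if __name__ == "__main__":
    relaxed = resolve_overlaps([(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)])
    print(min_pair_distance(relaxed))
```

      Stopping at the first overlap-free configuration keeps the perturbation of the initial structure small, which is the stated purpose of terminating the simulation early.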


      - They perform some experiments to compare with model predictions and show consistency between the two.

      Weaknesses:

      - The model is a homopolymer model and currently cannot fully account for specific mechanisms that may shape the heterogeneous, complex organization of chromosomes (TAD at specific positions, A/B compartmentalization, promoter-enhancer loops, etc.).

      The SR-EV model is definitely not a homo-polymer, as it is not a regular concatenation of a single monomeric unit.

      The model includes loops, which may happen in two ways: 1) As in the SRRW, branching structures emerging from the configuration backbone can be interpreted as nested loops and 2) A relatively long forward step followed by a return is a single loop. The model induces the formation of packing domains, which are not TADs, and are quantitatively in agreement with ChromSTEM experiments.

      We consider it useful to add a new figure that will further clarify the structures obtained with the SR-EV model. The following paragraph and figure have been added on page 5:

      “The density heterogeneity displayed by the SR-EV configurations can be analyzed in terms of accessibility. One way to reveal this accessibility is by calculating the coordination number (CN) for each nucleosome, using a coordination radius of 11.5 nm, along the SR-EV configuration. CN values range from 0 for an isolated nucleosome to 12 for a nucleosome immersed in a packing domain. In Figure 3 we show the SR-EV configuration shown in Figure 2, but colored according to CN. CN can also be considered a measure to discriminate heterochromatin (red) from euchromatin (blue). Figure 3-A shows how the density inhomogeneity is coupled to different CN, with high CN represented in red and low CN represented in blue. Figure 3-B shows a 50 nm thick slab obtained from the same configuration, which clearly shows that the nucleosomes at the center of each packing domain are almost completely inaccessible, while those outside are open and accessible. It is also clear that the surfaces of the packing domains are characterized by nearly white nucleosomes, i.e. coordinated towards the center of the domain and open in the opposite direction.”
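
      The coordination number described in the quoted paragraph can be sketched as a count of neighboring beads within the coordination radius. The following is a minimal brute-force illustration, not the authors' implementation: the function name, the use of an inclusive distance cutoff, and the example coordinates are assumptions, and only the 11.5 nm radius comes from the text.

```python
import math


def coordination_numbers(positions, radius=11.5):
    """Count, for each bead, how many other beads lie within `radius` (in nm).

    positions: list of (x, y, z) tuples in nm. The inclusive cutoff (<=) is an
    assumption. This O(N^2) loop is fine for a sketch; a cell list or k-d tree
    would be preferable for a full chromosome-scale configuration.
    """
    n = len(positions)
    counts = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                # A neighbor pair contributes to the count of both beads.
                counts[i] += 1
                counts[j] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical coordinates: three beads in a row plus one isolated bead.
    beads = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
    print(coordination_numbers(beads))
```

      In this toy input the isolated bead gets a count of 0, matching the lower end of the range given in the quoted paragraph, while beads inside a dense cluster would accumulate higher counts.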

      - By construction of their framework, the effect of excluded volume is only local and larger-scale properties for which excluded volume could be a main actor (formation of chromosome territories [Rosa & Everaers, PLoS CB 2009], bottle-brush effects due to loop extrusion [Polovnikov et al, PRX 2023], etc.) cannot be captured.

      Excluded volume is considered for all nucleosomes, including overlapping beads distant along the polymer chain. Chromosome territories can be treated, but they are not in this case because we consider a single model chromosome.

      - Apart from being a computationally interesting approach to generating realistic 3D chromosome organization, the method offers fewer possibilities than standard polymer models (eg, MD simulations) of chromatin (no dynamics, no specific mechanisms, etc.) with likely the same predictive power under the same hypotheses. In particular, authors often claim the superiority of their approach to describing the local chromatin compaction compared to previous polymer models without showing it or citing any relevant references that would show it.

      We apologize if the text transmits an idea of superiority over other methods that was not intended. SR-EV is an alternative tool that may give a different, even complementary, point of view to standard polymer models.

      - Comparisons with experiments are solid but are not quantified.

      The comparisons that we have presented are quantitative. So far, we do not have a way to determine alpha or phi a priori for a particular system.

      Impact:

      Building on the presented framework in the future to incorporate TAD and compartments may offer an interesting model to study the single-cell heterogeneity of chromatin organization. But currently, in this reviewer's opinion, standard polymer modeling frameworks may offer more possibilities.

      We thank the reviewer for the positive opinion on the potential of the presented method. The incorporation of TADs and compartments is left for a future evolution of the model as its complexity will make this work extremely long.

      Reviewer #2 (Public Review):

      Summary:

      The authors introduce a simple Self Returning Excluded Volume (SR-EV) model to investigate the 3D organization of chromatin. This is a random walk with a probability to self-return accounting for the excluded volume effects. The authors use this method to study the statistical properties of chromatin organization in 3D. They compute contact probabilities, 3D distances, and packing properties of chromatin and compare them with a set of experimental data.

      We thank the reviewer for the attention paid to our manuscript.

      Strengths:

      (1) Typically, to generate a polymer with excluded volume interactions, one needs to run long simulations with computationally expensive repulsive potentials like the Weeks-Chandler-Andersen potential. However, here, instead of performing long simulations, the authors have devised a method where they can grow the polymer, enabling quick generation of configurations.

      (2) Authors show that the chromatin configurations generated from their models do satisfy many of the experimentally known statistical properties of chromatin. Contact probability scalings and packing properties are comparable with Chromatin Scanning Transmission Electron Microscopy (ChromSTEM) experimental data from some of the cell types.

      Weaknesses:

      This can only generate broad statistical distributions. This method cannot generate sequence-dependent effects, specific TAD structures, or compartments without a prior model for the folding parameter alpha. It cannot generate a 3D distance between specific sets of genes. This is an interesting soft-matter physics study. However, the output is only as good as the alpha value one provides as input.

      We proposed a model that creates realistic chromatin configurations, which we have compared with specific single-cell experiments and which also reproduce ensemble-average properties. 3D distances between genes can be calculated after mapping the genome to the SR-EV configuration. The future incorporation of the genome sequence will also allow us to describe TADs and A/B compartments. See the added paragraph in the Discussion section:

      “The incorporation of genomic character to the SR-EV model will allow us to study all individual single-chromosome properties, and also topologically associating domains and A/B compartmentalization from ensembles of configurations, as in Hi-C experiments.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major:

      - In the introduction and along the text, the authors are often making strong criticisms of previous works (mostly polymer simulation-based) to emphasize the need for an alternative approach or to emphasize the outcomes of their model. Most of these statements (see below) are incomplete if not wrong. I would suggest toning down or completely removing them unless they are explicitly demonstrated (eg, by explicit quantitative comparisons). There is no need to claim any - fake - superiority over other approaches to demonstrate the usefulness of an approach. Complementarity or redundancy in the approaches could also be beneficial.

      We regret if we unintentionally transmitted a claim of superiority. We have made several small edits to change that.

      - Line 42-43: at least there exist many works towards that direction (including polymer modeling, but also statistical modeling). For eg, see the recent review of Franck Alber.

      Line removed. Citation to Franck Alber included below in the text.

      - Line 54-57: Point 1 is correct but is it a fair limitation? These models can predict TADs & compartments while SR-EV no. Point 2 is wrong, it depends on the resolution of the model and computer capacity but it is not an intrinsic limitation. Point 3 is wrong, such models can predict very well single-cell properties, and again it is not an intrinsic limitation of the model. Point 4 is incorrect. The space-filling/fractal organization was an (unfortunate) picture to emphasize the typical organization of chromosomes in the early times (2009), but crumpled polymers which are a more realistic description are not space-filling (see Halverson et al, 2013).

      Text involving points 1 to 4 removed. It was unnecessary and does not change the line of the paper.

      - L400-402 + 409-411: in such a model, the biphasic structure may emerge from loop extrusion but also naturally from the crumpled polymer organization. Simple crumpled polymer without loop extrusion and phase separation would also produce biphasic structures.

      Yes, we agree. Also SR-EV leads to biphasic structures.

      - L 448-449: any data to show that existing polymer modeling would predict a strong dependency of C_p(n) on the volumic fraction (in the range studied here)?

      No, we are not aware of any work predicting that.

      - Fig. 4:

      - Large-scale structural properties (R^2(n) and C_p(n)) are not dependent on phi. Is it surprising that by construction, SR-EV only relaxes the system locally after SRRW application?

      Excluded volume is considered at all length scales. However, as the decreasing C_p curves observed in theories and experiments imply, the fraction of overlap (or contacts) is more important at small separations (local) than at large separations. Yet it was a surprise for us to observe a negligible effect of phi.

      - Why not make a quantitative comparison between predicted and measured C_p(n)? Or at least plotting them on the same panel.

      Panels B and C are on the same scale and show good agreement between SR-EV and experiments. However, the agreement is not perfectly quantitative. SR-EV represents the generic structure of chromatin, and perfect agreement should not be expected.

      - Comparison with an average C_p(n) over all the chromosomes would be better.

      Possibly, but we don’t think it adds anything to the paper.
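
      To make the contact probability C_p(n) discussed in the Fig. 4 comments above concrete, here is a minimal sketch that estimates it from a single configuration as the fraction of bead pairs at sequence separation n whose 3D distance falls below a contact cutoff. The function name, the cutoff value, and the toy chain are hypothetical, and a real analysis would presumably average over an ensemble of configurations.

```python
def contact_probability(positions, cutoff, max_sep):
    """Estimate C_p(n) for n = 1..max_sep from one configuration.

    `positions` is an ordered list of (x, y, z) bead coordinates; two beads are
    "in contact" when their distance is <= cutoff (an assumed definition).
    """
    c2 = cutoff * cutoff
    probs = {}
    for n in range(1, max_sep + 1):
        pairs = len(positions) - n
        if pairs <= 0:
            break  # no pairs exist at this separation
        hits = 0
        for i in range(pairs):
            a, b = positions[i], positions[i + n]
            d2 = sum((p - q) ** 2 for p, q in zip(a, b))
            if d2 <= c2:
                hits += 1
        probs[n] = hits / pairs
    return probs


# Hypothetical toy chain: six beads on a straight line, 10 nm apart.
chain = [(10.0 * i, 0.0, 0.0) for i in range(6)]
cp = contact_probability(chain, cutoff=25.0, max_sep=4)
print(cp)
```

      For the straight toy chain only separations n = 1 and n = 2 fall within the 25 nm cutoff, so C_p drops from 1 to 0 at n = 3; a folded configuration would give a smoother decay.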

      - In Figure 5,6,7 (and related text): authors often describe some parameter values that are 'closest to experiment findings'. Can the authors quantify/justify this? The various 'closest' parameters are different. Can the authors comment?

      The folding parameter and average volume fraction are chosen so that the agreement is best with the displayed experimental system, with a different cell for each case.

      - Figure 5: why not show the experimental distribution from Ou et al?

      - Figure 6 & 7: experimental results. Can the authors show images from their own experiments? Can they show that cohesion/RAD21 is really depleted after auxin treatment?

      It is currently under review in a different journal.

      - In the Discussion, a fair discussion on the limitations of the methods (dynamics, etc) is missing.

      Minor

      - Line 34-36: the logical relationship between this sentence and the ones before and after is very unclear.

      - Along the text, authors use the term 'connectivity' to describe 3D (Hi-C) contacts between different regions of the same chromosome/polymer. This is misleading as connectivity in polymer physics describes the connection along the polymer and not in the 3D space.

      No, we do not think we used connectivity in that sense. We agree with your statement on the use of connectivity in polymer physics, and that is what we always had in mind for this model.

      - Line 92: typo.

      - On the SR-EV method: does the relaxation process create local knots in the structure?

      We have not checked for knots.

      - Table 1: the good correspondence with linker length is remarkable but likely 'fortunate', other chosen resolutions would have led to other results. Moreover, the model cannot account for the fine structure of chromatin fiber. Can the authors comment on that?

      It is fortunate only to the extent that we sample the model parameters so as to capture the overall structure of chromatin.

      - Line 211: 'without the need of imposing any parameter': alpha is a parameter, no?

      Correct. Phrase deleted.

      - L267-269 & 450-451: actually in Liu & Dekker, they do observe an effect on Hi-C map (C_p(n)), weak but significant and not negligible.

      Our statements read ‘minimal’ and ‘relatively insensitive’. It is observed, but very small.

      - L283-286: This is a perspective statement that should be in the discussion.

      Moved to the Discussion, as suggested.

      - L239-241: The authors seem to emphasize some contradictions with recent results on phase separation. This is unclear and should be relocated to discussion.

      We just pointed out recent experiments, as stated. No intention to generate a discussion with any of them.

      - L311-313: Unclear statement.

      - L316-325: This is not results but discussion/speculation.

      Moved to Discussion

      - Along the text: 'promotor'-> 'promoter'.

      Corrected.

      - L364: explain more in detail PWS microscopy.

      Reviewer #2 (Recommendations For The Authors):

      Even though there are claims about nucleosome-resolution chromatin polymer, it is not clear that this work can generate structures with known nucleosome-resolution features. Nucleosome-level structure is much beyond a random walk with excluded volume and is driven by specific interactions. The authors should clarify this.

    1. Author response:

      We thank the editor and reviewers for their supportive comments about our modeling approach and conclusions, and for raising several valid concerns; we address them briefly below.

      Concerns about model’s biological realism and impact on interpretations

      The goal of this paper was to use an interpretable and modular model to investigate the impact of varying sensorimotor delays. Aspects of the model (e.g. layered architecture, modularity) are inspired by biology; at the same time, necessary abstractions and simplifications (e.g. using an optimal controller) are made for interpretability and generalizability, and they reflect common approaches from past work. The hypothesized effects of certain simplifying assumptions are discussed in detail in Section 3.5. Furthermore, the modularity of our model allows us to readily incorporate additional biological realism (e.g. biomechanics, connectomics, and neural dynamics) in future work. In the revision, we will add citations and edits to the text to clarify these points.

      Concerns that the model is overly complex

      To investigate the impact of sensorimotor delays on locomotion, we built a closed-loop model that recapitulates the complex joint trajectories of fly walking. We agree that locomotion models face a tradeoff between simplicity/interpretability and realism — therefore, we developed a model that was as simple and interpretable as possible, while still reasonably recapitulating joint trajectories and generalizing to novel simulation scenarios. Along these lines, we also did not select a model that primarily recreates empirical data, as this would hinder generalizability and add unnecessary complexity to the model. We do not think these design choices are significant weaknesses of this model; in fact, few comparable models account for all joints involved in locomotion, and fewer explicitly compare model kinematics with kinematics from data. We will add citations and edits to the text to clarify these points in the revision.

      Concerns about the validity of the Kinematic Similarity (KS) metric to evaluate walking

      We chose to incorporate only the first two PCA modes in the KS metric because the kernel density estimator performs poorly for high-dimensional data. Our primary use of this metric was to indicate whether the simulated fly continues walking in the presence of perturbations. For technical reasons, it is not feasible to perform equivalent experiments on real walking flies, which is one of the reasons we explore this phenomenon with the model. We note the dramatic shift from walking to non-walking as delay increases (Figure 5). To be thorough, in the revision, we will investigate the effect of incorporating additional PCA modes, and whether this affects the interpretation of our results. We will additionally edit the discussion and presentation of the KS metric to clarify its purpose in this study. We agree with the reviewers that the KS metric is too coarse to reflect fine details of joint kinematics; indeed, in the unperturbed case, we evaluate our model’s performance using other metrics based on comparisons with empirical data (Figures 2, 7, 8).
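
      The idea of scoring poses with a kernel density estimate over two PCA modes can be illustrated with a small sketch. It assumes the poses have already been projected onto the first two PCA modes and uses an isotropic Gaussian kernel with a hand-picked bandwidth; the reference points, bandwidth, and function name are hypothetical, and the exact definition of the KS metric in the manuscript may differ.

```python
import math


def gaussian_kde_2d(reference, bandwidth):
    """Build a density function from 2D reference points (isotropic Gaussian kernel)."""
    norm = 1.0 / (len(reference) * 2.0 * math.pi * bandwidth ** 2)

    def density(point):
        x, y = point
        total = 0.0
        for rx, ry in reference:
            d2 = (x - rx) ** 2 + (y - ry) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density


# Hypothetical projections of real walking poses onto two PCA modes.
reference = [(0.0, 0.0), (0.5, 0.1), (-0.4, 0.2), (0.1, -0.3)]
density = gaussian_kde_2d(reference, bandwidth=0.5)

near = density((0.0, 0.0))  # pose close to the reference walking data
far = density((5.0, 5.0))   # pose far from anything seen during walking
print(near > far)
```

      A pose that lands far from the reference cloud receives a density close to zero, which is the sense in which such a score can flag a transition from walking to non-walking. The number of reference samples needed for a reliable estimate grows quickly with dimension, which is consistent with the stated reason for keeping only two modes.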

    1. Author response:

      We thank the reviewers for their engagement and constructive comments. This provisional response aims to clarify key misconceptions, address major criticisms, and outline our revision plans.

      A primary concern of the reviewers appears to be our model's limitations in addressing a broad range of empirical findings. This, however, misinterprets our core contribution. Our work centers on a cautionary tale: before advocating for newly discovered cell types and their purported special roles in spatial cognition (an approach prevalent in the field), such claims must be tested against alternative (null) hypotheses that may contradict intuitive expectations. We present such an alternative hypothesis regarding spatial cells and their assumed privileged roles. We show that key findings in the field, namely spatial “cell types”, arise in a set of null models without spatial grounding (including untrained variants), even though these models are not models of spatial processing, and we also found that these cell types had no privileged role in representing spatial information.

      Our proposal is not a new model attempting to explain the brain, and therefore we do not aim to capture every empirical finding. Indeed, we would not expect an object recognition model (and its untrained variant) with no explicit spatial grounding to account for all phenomena in spatial cognition. This underscores our key point: if there exists a basic, spatially agnostic model that can explain certain degrees of empirical findings using criteria from the literature (i.e. place, head-direction and border cells), what implications does this have for the more complex theories and models proposed as underlying mechanisms of special cell types?

      Regarding concerns about the limited scope and generalizability of our setting, we will clarify that we considered multiple DNN architectures, both trained and untrained, on multiple decoding tasks (position, head direction, and nearest-wall distance). We plan to extend our experiments further as detailed in the revision plan below.

      Further, there was a methodological concern about using a linear decoder on a fixed DNN for spatial decoding tasks being a form of "hacking". However, linear readout is standard practice in neuroscience to characterize information available in a neural population. Moreover, our tests on untrained networks also showed spatial decoding capabilities, suggesting it's not solely due to the linear readout.

      For our full revision plan:

      (1) We will revise the manuscript to better reflect these above points, clarifying our paper's stance and improving the writing to reduce misconceptions.

      (2) We will address individual public reviews in more detail.

      (3) We intend to address key reviewer recommendations, focusing on better situating our work within the broader context of the existing literature whilst emphasizing the null hypothesis perspective.

      (4) In general, we will consider additional aspects of the literature and conduct new experiments to strengthen the relevance of our work to existing work. We highlight a number of potential experiments which we believe can address reviewer concerns:

      a. Blurring the visual inputs to DNNs to match rodent perception.

      b. Vary environmental settings to verify whether our findings are more generalizable (which we predict to be the case).

      c. Vary the environment to assess remapping effects, which will strengthen the connection of our work to the literature.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews

      Reviewer #1 (Public Review):

      Summary:

      Federer et al. tested AAVs designed to target GABAergic cells and parvalbumin-expressing cells in marmoset V1. Several new results were obtained. First, AAV-h56D targeted GABAergic cells with >90% specificity, and this varied with serotype and layer. Second, AAV-PHP.eB.S5E2 targeted parvalbumin-expressing neurons with up to 98% specificity. Third, the immunohistochemical detection of GABA and PV was attenuated near viral injection sites.

      Strengths:

      Vormstein-Schneider et al. (2020) tested their AAV-S5E2 vector in marmosets by intravenous injection. The data presented in this manuscript are valuable in part because they show the transduction pattern produced by intraparenchymal injections, which are more conventional and efficient.

      Our manuscript additionally provides detailed information on the laminar specificity and coverage of these viral vectors, which was not investigated in the original studies.

      Weaknesses:

      The conclusions regarding the effects of serotype are based on data from single injection tracks in a single animal. I understand that ethical and financial constraints preclude high throughput testing, but these limitations do not change what can be inferred from the measurements. The text asserts that "...serotype 9 is a better choice when high specificity and coverage across all layers are required". The data presented are consistent with this idea but do not make a strong case for it.

      We are aware of the limitations of our results on the AAV-h56D. We agree with the Reviewer that a single injection per serotype does not allow us to make strong statements about differences between the 3 serotypes. Therefore, in the revised version of the manuscript we have tempered our claims about such differences and use more caution in the interpretation of these data (Results p. 6 and Discussion p.10). Despite this weakness, we feel that these data still demonstrate high efficiency and specificity across cortical layers of transgene expression in GABA cells using the h56D promoter, at least with two of the 3 AAV serotypes we tested. We feel that in itself this is sufficiently useful information for the primate community, worthy of being reported. Due to cost, time and ethical considerations related to the use of primates, we chose not to perform additional experiments to determine precise differences among serotypes. Thus, for example, while it is possible that had we replicated these experiments, serotype 7 could have proven equally efficient and specific as the other two serotypes, we felt answering this question did not warrant additional experiments in this precious species.

      A related criticism extends to the analysis of injection volume on viral specificity. Some replication was performed here, but reliability across injections was not reported. My understanding is that individual ROIs were treated as independent observations. These are not biological replicates (arguably, neither are multiple injection tracks in a single animal, but they are certainly closer). Idiosyncrasies between animals or injections (e.g., if one injection happened to hit one layer more than another) could have substantial impacts on the measurements. It remains unclear which results regarding injection volume or serotype would hold up had a large number of injections been made into a large number of marmosets.

      For the AAV-S5E2, we made a total of 7 injections (at least 2 at each volume), all of which, irrespective of volume, resulted in high specificity and efficiency for PV interneurons. Our conclusion is that larger volumes are slightly less specific, but the differences are minimal and do not warrant additional injections. Additionally, we kept all the other parameters across animals constant (see new Supplementary Table 1), all of our injections involved all cortical layers, and the ROIs we selected for counts encompassed reporter protein expression across all layers. To provide a better sense of the reliability of the results across injections, in the revised version of the manuscript we now provide results for each AAV-S5E2 injection case separately in a new Supplementary Table 2. This table indicates that the results are indeed rather consistent across cases, with slightly greater specificity for injection volumes in the range of 105-180 nl.

      Reviewer #2 (Public Review):

      This is a straightforward manuscript assessing the specificity and efficiency of transgene expression in marmoset primary visual cortex (V1), for 4 different AAV vectors known to target transgene expression to either inhibitory cortical neurons (3 serotypes of AAV-h56D-tdTomato) or parvalbumin (PV)+ inhibitory cortical neurons in mice. Vectors are injected into the marmoset cortex and then postmortem tissue is analyzed following antibody labeling against GABA and PV. It is reported that: "in marmoset V1 AAV-h56D induces transgene expression in GABAergic cells with up to 91-94% specificity and 80% efficiency, depending on viral serotype and cortical layer. AAV-PHP.eB-S5E2 induces transgene expression in PV cells across all cortical layers with up to 98% specificity and 86-90% efficiency."

      These claims are largely supported but slightly exaggerated relative to the actual values in the results presented. In particular, the overall efficiency for the best h56D vectors described in the results is: "Overall, across all layers, AAV9 and AAV1 showed significantly higher coverage (66.1±3.9 and 64.9%±3.7)". The highest coverage observed is just in middle layers and is also less than 80%: "(AAV9: 78.5%±9.1; AAV1: 76.9%±7.4)".

      In the abstract, we indeed summarize the overall data and round up the decimals, and state that these percentages are upper bound but that they vary by serotype and layer while in the Results we report the detailed counts with decimals. To clarify this, in the revised version of the Abstract we have changed 80% to 79% and emphasize even more clearly the dependence on serotype and layer. We have amended this sentence of the Abstract as follows: “We show that in marmoset V1 AAV-h56D induces transgene expression in GABAergic cells with up to 91-94% specificity and 79% efficiency, but this depends on viral serotype and cortical layer.”

      For the AAV-PHP.eB-S5E2 the efficiency reported in the abstract (“86-90%”) is also slightly exaggerated relative to the results: “Overall, across all layers coverage ranged from 78%±1.9 for injection volumes >300nl to 81.6%±1.8 for injection volumes of 100nl.”

      Indeed, the numbers in the Abstract are upper bounds; for example, efficiency in L4A/B with S5E2 reaches 90%. To further clarify this important point, in the revised abstract we now state “AAV-PHP.eB-S5E2 induces transgene expression in PV cells across all cortical layers with up to 98% specificity and 86-90% efficiency, depending on layer”.
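
      For readers comparing the percentages quoted above, the following sketch shows how specificity and coverage (efficiency) could be computed from cell counts in an ROI. It assumes that specificity is the percentage of reporter-expressing cells that are also marker-positive (GABA+ or PV+) and that coverage is the percentage of marker-positive cells that express the reporter. These definitions, the function name, and the counts are illustrative assumptions, not values from the manuscript.

```python
def specificity_and_coverage(n_reporter, n_marker, n_double):
    """Return (specificity %, coverage %) from cell counts in one ROI.

    n_reporter: cells expressing the viral reporter (e.g. tdTomato+)
    n_marker:   cells positive for the marker (e.g. PV+ or GABA+)
    n_double:   cells positive for both
    """
    if n_reporter <= 0 or n_marker <= 0:
        raise ValueError("counts of reporter+ and marker+ cells must be positive")
    if n_double > min(n_reporter, n_marker):
        raise ValueError("double-labeled count cannot exceed either single count")
    specificity = 100.0 * n_double / n_reporter
    coverage = 100.0 * n_double / n_marker
    return specificity, coverage


# Hypothetical counts for a single ROI.
spec, cov = specificity_and_coverage(n_reporter=50, n_marker=60, n_double=48)
print(spec, cov)
```

      Because both measures are ratios computed per ROI or per layer, the maximum across layers can sit well above the value pooled across all layers, which is the distinction between the upper bounds in the Abstract and the overall figures in the Results.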

      These data will be useful to others who might be interested in targeting transgene expression in these cell types in monkeys. Suggestions for improvement are to include more details about the vectors injected and to delete some comments about results that are not documented based on vectors that are not described (see below).

      Major comments:

      Details provided about the AAV vectors used with the h56D enhancer are not sufficient to allow assessment of their potential utility relative to the results presented. All that is provided is: "The fourth animal received 3 injections, each of a different AAV serotype (1, 7, and 9) of the AAV-h56D-tdTomato (Mehta et al., 2019), obtained from the Zemelman laboratory (UT Austin)." At a minimum, it is necessary to provide the titers of each of the vectors. It would also be helpful to provide more information about viral preparation for both these vectors and the AAVPHP.eB-S5E2.tdTomato. Notably, what purification methods were used, and what specific methods were used to measure the titers?

      We thank the Reviewer for this comment. In the revised version of the manuscript, we now provide a new Supplementary Table 1 with titers and other information for each viral vector injection. We also provide information regarding viral preparation in a new section of the Methods entitled “Viral Preparation” (p. 12).

      The first paragraph of the results includes brief anecdotal claims without any data to support them and without any details about the relevant vectors that would allow any data that might have been collected to be critically assessed. These statements should be deleted. Specifically, delete: “as well as 3 different kinds of PV-specific AAVs, specifically a mixture of AAV1-PaqR4-Flp and AAV1-h56D-mCherry-FRT (Mehta et al., 2019), an AAV1-PV1-ChR2-eYFP (donated by G. Horwitz, University of Washington),” and delete “Here we report results only from those vectors that were deemed to be most promising for use in primate cortex, based on infectivity and specificity. These were the 3 serotypes of the GABA-specific pAAV-h56D-tdTomato, and the PV-specific AAVPHP.eB-S5E2.tdTomato.” These tools might in fact be just as useful or even better than what is actually tested and reported here, but maybe the viral titer was too low to expect any expression.

      These data are indeed anecdotal, but we felt this could be useful information, potentially preventing other primate labs from wasting resources, animals and time, particularly, as some of these vectors have been reported to be selective and efficient in primate cortex, which we have not been able to confirm. We made several injections in several animals of those vectors that failed either to infect a sufficient number of cells or turned out to be poorly specific. Therefore, the negative results have been consistent in our hands. But we agree with the Reviewer that our negative results could have depended on factors such as titer. In the revised version of the manuscript, following the reviewer’s suggestion, we have deleted this information.

      Based on the description in the Methods it seems that no antibody labeling against TdTomato was used to amplify the detection of the transgenes expressed from the AAV vectors. It should be verified that this is the case - a statement could be added to the Methods.

      That is indeed the case. We used no immunohistochemistry to enhance the reporter proteins as this was unnecessary. The native, non-amplified tdT signal was strong. This is now stated in the Methods (p. 12).

      Reviewer #3 (Public Review):

      Summary:

      Federer et al. describe the laminar profiles of GABA+ and of PV+ neurons in marmoset V1. They also report on the selectivity and efficiency of expression of a PV-selective enhancer (S5E2). Three further viruses were tested, with a view to characterizing the expression profiles of a GABA-selective enhancer (h56d), but these results are preliminary.

      Strengths:

      The derivation of cell-type specific enhancers is key for translating the types of circuit analyses that can be performed in mice - which rely on germline modifications for access to cell-type specific manipulation - in higher-order mammals. Federer et al. further validate the utility of S5E2 as a PV-selective enhancer in NHPs.

      Additionally, the authors characterize the laminar distribution pattern of GABA+ and PV+ cells in V1. This survey may prove valuable to researchers seeking to understand and manipulate the microcircuitry mediating the excitation-inhibition balance in this region of the marmoset brain.

      Weaknesses:

      Enhancer/promoter specificity and efficiency cannot be directly compared, because they were packaged in different serotypes of AAV.

      The three different serotypes of AAV expressing reporter under the h56D promoter were only tested once each, and all in the same animal. There are many variables that can contribute to the success (or failure) of a viral injection, so observations with an n=1 cannot be considered reliable.

      This is an important point that was also brought up by Reviewer 1, which we have addressed in our reply to Reviewer 1. For clarity and convenience, below we copy our response to Reviewer 1.

      “We are aware of the limitations of our results on the AAV-h56D. We agree with the Reviewer that a single injection per serotype does not allow us to make strong statements about differences between the 3 serotypes. Therefore, in the revised version of the manuscript we will temper our claims about such differences and use more caution in the interpretation of these data. Despite this weakness, we feel that these data still demonstrate high efficiency and specificity across cortical layers of transgene expression in GABA cells using the h56D promoter, at least with two of the 3 AAV serotypes we tested. We feel that in itself this is sufficiently useful information for the primate community, worthy of being reported. Due to cost, time and ethical considerations related to the use of primates, we chose not to perform additional experiments to determine precise differences among serotypes. Thus, for example, while it is possible that had we replicated these experiments, serotype 7 would have proven equally efficient and specific as the other two serotypes, we felt answering this question did not warrant additional experiments in this precious species.”

      The language used throughout conflates the cell-type specificity conferred by the regulatory elements with that conferred by the serotype of the virus.

      Authors’ reply. In the revised version of the manuscript, we have corrected ambiguous language throughout.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      My Public Review comments can be addressed by dialing down the interpretation of the data or providing appropriate caveats in the presentation of the relevant results and their discussion.

      We have done so. See text additions on p. 6 of the Results and p.10 of the Discussion.

      Minor comments:

      92% of PV+ neurons in the marmoset cortex were GABAergic. Can the authors speculate on the identity of the 8% PV+/GABA- neurons (e.g., on the basis of morphology)? Are they likely excitatory? Are they more likely to represent failures of GABA staining?

      We do not know what the other 8% of PV+/GABA- neurons are because we did not perform any other kind of IHC staining. Our best guess is that at least to some extent these represent failures of GABA staining, which is always challenging to perform in primate cortex. However, in mouse PV expression has been demonstrated in a minority of excitatory neurons.

      "Coverage of the PV-AAV was high, did not depend on injection volume.." The fact that the coverage did not depend on injection volume presumably depends, at least in part, on how ROIs were selected. Surely different volumes of injection transduce different numbers of neurons at different distances from the injection track. This should be clarified.

      The ROIs were selected at the center of the injected site/expression core from sections in which the expression region encompassed all cortical layers. Of course, larger volumes of injection resulted in larger transduced regions and therefore an overall larger number of transduced neurons, but we counted cells only within 100 µm wide ROIs at the center of the injection, and the percent of transduced PV cells in this core region did not vary significantly across volumes. We have clarified the methods of ROI selection (see Methods p. 13).

      Figure 2. What is meant by “absolute” in the legend for Figure 2? (How does “mean absolute density” differ from “mean density?”)

      We meant not relative, but this is obvious from the units, so we have removed the word “absolute” in the legend.

      Some non-significant p-values are indicated by "p>0.05" whereas others are given precisely (e.g., p = 1). Please provide precise p-values throughout. Also, the p-value from a surprisingly large number of comparisons in the first section of the results is "1". Is this due to rounding? Is it possible to get significance in a Bonferroni-corrected Kruskal-Wallis test with only 6 observations per condition?

      We now report exact p values throughout the manuscript, with a couple of exceptions where, to avoid interrupting the flow of the manuscript with a large number of p values, we provide the upper-bound value and state that all those comparisons were below that value. The minimum sample size for the Kruskal-Wallis test is 5 for each group being compared, and our sample is 6 per group.
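
      As an illustration of the reviewer's two questions, the sketch below uses only hypothetical numbers, not data from the manuscript. It assumes a plain Bonferroni correction (each raw p value multiplied by the number of comparisons and capped at 1, which is one common way adjusted values of exactly 1 arise) and the chi-square approximation to the Kruskal-Wallis statistic, which is only rough at n = 6 per group. The function names and the choice of three groups and ten comparisons are assumptions made for this example.

```python
import math


def bonferroni(p_values):
    """Multiply each raw p value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, without tie correction (assumes no tied values)."""
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n = len(pooled)
    s = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * s - 3 * (n + 1)


# Hypothetical, fully separated data: 3 groups of 6 observations each.
groups = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14, 15, 16, 17, 18]]
h = kruskal_wallis_h(groups)

# With 3 groups, H is compared to a chi-square distribution with 2 degrees of
# freedom, whose upper-tail probability has the closed form exp(-H / 2).
p_raw = math.exp(-h / 2)

# Ten comparisons in total; the other nine raw p values (0.4) are hypothetical.
p_adj = bonferroni([p_raw] + [0.4] * 9)
print(h, p_raw, p_adj[0], p_adj[1])
```

      Under these assumptions the well-separated comparison stays below 0.05 after correction, while each of the weaker comparisons is reported as exactly 1 because of the cap rather than because of rounding.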

      Figure 3: The density of tdTomato-expressing cells appears to be greater at the AAV9 injection site than at the AAV1 injection site in the example sections shown. Might some of the differences between serotypes be due to this difference? I would imagine that resolving individual cells with certainty becomes more difficult as the amount of tdTomato expression increases.

      There was an error in the scale bar of Fig. 3C, so that the AAV1 injection site was shown at higher magnification than indicated by the wrong scale bar. Hence the density of tdTomato appeared lower than it is. Moreover, the tdT expression region shown in Fig. 3A is a merge of two sections, while it is only from a single section in panels B and C, leading to the impression of higher density of infected cells in panel A. The pipette used for the injection in panel A was not inserted perfectly vertical to the cortical surface, resulting in an injection site that did not span all layers in a single section; thus, to demonstrate that the injection indeed encompassed all layers (and that the virus infected cells in all layers), we collapsed label from two sections. We have now corrected the magnification of panel C so that it matches the scale bar in panel A, and specify in the figure legend that panel A label is from two sections.

      Text regarding Figure 3: The term “injection sizes” is confusing. I think it is intended to mean “the area over which tdTomato-expressing cells were found” but this should be clarified.

      Throughout the manuscript, we have changed the term injection site to “viral-expression region”.

      Figure 3: What were the titers of the three AAV-h56D vectors?

      Titers are now reported in the new Supplementary Table 1.

      Figure 3: The yellow box in Figure 3C is slightly larger than the yellow boxes in 3A and 3B. Is this an error or should the inset of Figure 3 have a scale bar that differs from the 50 µm scale bar in 3A?

      There were indeed errors in scale bars in this figure, which we have now corrected. Now all boxes have the same scale bar.

      Was MM423 one of the animals that received the AAV-h56D injections or one of the three that received AAV-S5E2 injection?

      This is an animal that received a 315nl injection of AAV-PHP.eB-S5E2.tdTomato. This is now specified in the Methods (see p. 12) and in the new Supplementary Table 1.

      Please provide raw cell counts and post-injection survival times for each animal.

      We now provide this information in Supplementary Tables 1 and 2.

      How were the different injection volumes of the AAV-S5E2 virus arranged by animal? Which volume of the AAV-S5E2 virus was injected into the two animals who received single injections?

      We now provide this information in Supplementary Table 1.

      Figure 6A: the point is made in the text that "[the distribution of tdT+ and PV+ neurons] did not differ significantly... peaking in L2/3 and 4C " Is the fact that the number of tdT+ and PV+ peak in layers 2/3 and 4C a consequence of these layers being thicker than the others? If so, this statement seems trivial.

      No, and this is the reason why we measured density in addition to percent of cells across layers in Figure 2. Figure 2B shows that even when measuring density, therefore normalizing by area, GABA+ and PV+ cell density still peaks in L2/3 and 4. Thus, these peaks do not simply reflect the greater thickness of these layers.

      Do the authors have permission to use data from Xu et al. 2010?

      Yes, we do.

      Reviewer #2 (Recommendations For The Authors):

      Minor comments:

      "Viral strategies to restrict gene expression to PV neurons have also been recently developed (Mehta et al., 2019; Vormstein-Schneider et al., 2020)." Mich et al. should also be cited here. Cell Rep. 2021;34(13):108754.

      We thank the reviewer for pointing out this missing reference. This is now cited.

      “GABA density in L4C did not differ from any other layers, but the percent of GABA+ cells in L4C was significantly higher than in L1 (p=0.009) and 4A/B (p=<0.0001).” This and other similar observations depend on calculating the percentage of cells relative to the total number of DAPI-labeled cells in each layer. Since it is apparent that there must be considerable variability between layers, it would be helpful to add a histogram showing the densities of all DAPI-labeled cells for each layer.

      This is not how we calculated density. Density, as now clarified in the Results on p. 4, was defined as the number of cells per unit area. Counts in each layer were divided by each layers’ counting area. This corrects for differences in number of total labeled cells per layer. Therefore, reporting DAPI density is not necessary (we did not count DAPI cell density per layer).
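
      The definition above (cells per unit area, computed layer by layer) can be sketched as follows. The counts, areas, layer names and function name are hypothetical and chosen only to show that a thicker layer can hold more cells and still have a lower density.

```python
def density_per_layer(counts, areas_mm2):
    """Return cells per mm^2 for each layer: count divided by that layer's counting area."""
    return {layer: counts[layer] / areas_mm2[layer] for layer in counts}


# Hypothetical cell counts and counting areas (mm^2) for two layers.
counts = {"L2/3": 120, "L4C": 90}
areas_mm2 = {"L2/3": 0.0625, "L4C": 0.03125}

densities = density_per_layer(counts, areas_mm2)
for layer, value in densities.items():
    print(layer, value)
```

      In this made-up example the layer with the larger raw count has the lower density, which is why normalizing by counting area removes the effect of layer thickness.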

      "Identical injection volumes of each serotype, delivered at 3 different cortical depths (see Methods), resulted in different injection sizes, suggesting the different serotypes have different capacity of infecting cortical neurons. AAV7 produced the smallest injection site, which additionally was biased to the superficial and deep layers, with only few cells expressing tdT in the middle layers (Fig. 3B). AAV9 (Fig. 3A) and AAV1 (Fig. 3C) resulted in larger injection sites and infected all cortical layers." Differences noted here might reflect either differences related to the AAV serotype or to differences in titers. Please add details about titers for each vector and add comments as appropriate. Another interpretation would be that there are differences in viral spread within the tissue.

      We have now added Supplementary Table 1, which reports titers in addition to other information about injections. The titers and volumes used for AAV9 and AAV7 were identical, while the titer for AAV1 was higher. Therefore, the differences in infectivity, particularly the much smaller expression region obtained with AAV7, cannot be attributed to titer. Likely this is due to differences in tropism and/or viral spread among serotypes. This is now discussed (see Results, p. 5 bottom and p. 6 top).

      “Recently, several viral vectors have been identified that selectively and efficiently restrict gene expression to GABAergic neurons and their subtypes across several species, but a thorough validation and characterization of these vectors in primate cortex has lacked.” Is this really a fair statement, or is the characterization presented here also lacking? Methods used by others for quantifying specificity and efficiency are essentially the same as used here. See for example Mich et al. (which is not cited).

      The original validation in primates of the vectors examined in our study was based on small tissue samples and did not examine the laminar expression profile of transgene expression induced by these enhancer-AAVs. For example, the validation of the h56D-AAV in marmoset cortex in the original paper by Mehta et al. (2019) was performed on a tissue biopsy with no knowledge of which cortical layers were included in the tissue sample. The only study that shows laminar expression in primate cortex (Mich et al., which is now cited) only shows qualitative images of viral expression across layers, reporting total specificity and coverage pooled across samples; moreover, the study by Mich et al. deals with different PV-specific enhancers than the ones characterized in our study. Unlike any of the previous studies, here we have quantified specificity and coverage across layers.

      "Specifically, we have shown that the GABA-specific AAV9-h56D (Mehta et al., 2019) induces transgene expression in GABAergic cells with up to 91-94% specificity and 80% coverage, and the PV-specific AAV-PHP.eB-S5E2 (Vormstein-Schneider et al., 2020) induces transgene expression in PV cells with up to 98% specificity and 86-90% coverage." These statements in the discussion repeat the somewhat exaggerated coverage numbers noted above for the Abstract.

      The averages across all layers are reported in the Results. The Abstract and Discussion report upper limits, and this is made clear by stating “up to”, and now we have also added “depending on layer”.
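
      The difference between an “up to” value and an across-layer value can be made concrete with a small sketch. It assumes the usual definitions, which are not spelled out in this reply: specificity is the percent of reporter-expressing (tdT+) cells that are also marker-positive, and coverage is the percent of marker-positive cells that also express the reporter. All counts and names below are hypothetical.

```python
def specificity_and_coverage(n_reporter, n_marker, n_double):
    """Return (specificity %, coverage %) from reporter+, marker+ and double-labeled counts."""
    specificity = 100.0 * n_double / n_reporter
    coverage = 100.0 * n_double / n_marker
    return specificity, coverage


# Hypothetical counts per layer: (tdT+ cells, marker+ cells, double-labeled cells).
layers = {
    "L2/3": (50, 60, 48),
    "L4C": (40, 44, 39),
    "L6": (20, 30, 17),
}

per_layer = {name: specificity_and_coverage(*c) for name, c in layers.items()}

# "Up to" value: the best single layer.
max_specificity = max(s for s, _ in per_layer.values())

# Across-layer value: pool the counts from all layers before dividing.
totals = [sum(c[i] for c in layers.values()) for i in range(3)]
pooled_specificity, pooled_coverage = specificity_and_coverage(*totals)

print(per_layer)
print(max_specificity, pooled_specificity, pooled_coverage)
```

      With these made-up counts the best layer exceeds the pooled value by a few percentage points, which is the kind of gap the qualifier “depending on layer” is meant to flag.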

      Reviewer #3 (Recommendations For The Authors):

      Abstract:

      • Ln 2: Can you be more specific about what you mean by the 'various functions of inhibition'? e.g. do you mean 'the various inhibitory influences on the local microcircuit' or similar?

      These are listed in the introduction to the paper but there is no space in the abstract to do so. Now the sentence reads: “various computational functions of…”.

      • Ln 5: 'has' to 'is'/'has been'.

      The grammar here is correct “has derived”.

      • Ln 6: humans are primates! Maybe change this to 'nonhuman primates'?

      We have added “non-human”

      • Ln n-1: 'viral vectors represent' -> 'viral vectors are'.

      We have changed it to “are”

      Intro:

      • Many readers may expect 'VIP' to be listed as the third major sub-class of interneurons. Could you note that the 5HT3a receptor-expressing group includes VIP cells?

      Done (p.3).

      • "Understanding cortical inhibitory neuron function in the primate is critical for understanding cortical function and dysfunction in the model system closest to humans" - this seems close to being circular logic (not quite, but close). Could you modify this sentence to reflect why understanding cortical function and dysfunction in NHP may be of interest?

      This sentence now reads (p. 3): “Understanding cortical inhibitory neuron function in the primate is critical for understanding cortical function and dysfunction in the model system closest to humans, where cortical inhibitory neuron dysfunction has been implicated in many neurological and psychiatric disorders, such as epilepsy, schizophrenia and Alzheimer’s disease (Cheah et al., 2012; Verret et al., 2012; Mukherjee et al., 2019)”. We also note that this was already stated in the previous version of the paper, but in the Discussion section, which read (and still reads on p. 9, 2nd paragraph): “It is important to study inhibitory neuron function in the primate, because it is unclear whether findings in mice apply to higher species, and inhibitory neuron dysfunction in humans has been implicated in several neurological and psychiatric disorders (Marin, 2012; Goldberg and Coulter, 2013; Lewis, 2014).”

      • "In particular, two recent studies have developed recombinant adeno-associated viral vectors (AAV) that restrict gene expression to GABAergic neurons". This sentence places the emphasis on the wrong component of the technology. The fact that AAV was used is irrelevant; these constructs could equally have been packaged in a lenti, CAV, HSV, rabies, etc. The emphasis should be on the recently developed regulatory elements (the enhancers/promoters).

      Same problem with the following excerpts; this text implies that the serotype/vector confers cell-type selectivity, but the results presented do not support this assertion (the promoter/enhancer is what confers the selectivity).

      • "specifically, three serotypes of an AAV that restricts gene expression to GABAergic neurons".

      • "one serotype of an AAV that restricts gene expression to PV cells".

      • "GABA- and PV-specific AAVs".

      • "GABA-specific AAV" (in results).

      • "PV-specific AAVs".

      • "In this study, we have characterized several AAV vectors designed to restrict expression to GABAergic cells" (in discussion).

      • "GABA-virus". GABA is a NT, not a virus.

      We have modified the language in all these sections and throughout the manuscript.

      Results:

      • Enhancer specificity and efficiency cannot be directly compared, because they were packaged in different serotypes of AAV.

      We agree, and in fact we are not making comparisons between different enhancers (i.e., S5E2 and h56D).

      The three different serotypes of AAV expressing reporter under the h56D promoter were only tested once each, and all in the same animal. There are many variables that can contribute to the success (or failure) of a viral injection, so observations with an n=1 cannot be considered reliable.

      The authors need to either: (1) replicate the h56D virus injections in (at least) a second animal, or (2) rewrite the paper to focus on the AAV.PhP mDlx virus alone - for which they have adequate data - and mention the h56D data as an anecdotal result, with clear warnings about the preliminary nature of the observations due to lack of replication.

      We agree about the lack of sufficient data to make strong statements about the differences between serotypes for the h56D-AAV. In the revised version of the manuscript, following the Reviewers’ suggestion, we have chosen to temper our claims about differences between serotypes for the h56D enhancer and use more caution in the interpretation of these data. We feel that these data still demonstrate sufficiently high efficiency and specificity across cortical layers of transgene expression in GABA cells using the h56D promoter, at least with two of the 3 AAV serotypes we tested, to warrant their use in primates. Due to cost, time and ethical considerations related to the use of primates, we chose not to perform additional experiments to determine precise differences among serotypes. Thus, for example, while it is possible that had we replicated these experiments, serotype 7 could have proven equally efficient and specific as the other two serotypes, we felt answering this question did not warrant additional experiments in this precious species. Our edits in regard to this point can be found in the Results on p. 6 and Discussion on p. 10.

      • Did the authors compare h56D vs mDlx? This would be a useful and interesting comparison.

      We did not.

      • 3 tissue sections were used for analysis. How were these selected? Did the authors use a stereological approach?

      For the analysis in Fig. 2, the 3 sections were randomly selected, and for the positioning of the ROIs we selected a region in dorsal V1 anterior to the posterior pole (to avoid laminar distortions due to the curvature of the brain). This is now specified (see p. 4).

      • "both GABA+ and PV+ cells peak in layers" revise for clarity (e.g., the counts peak).

      It now reads “GABA+ and PV+ cell percent and density” (see p. 4).

      • "we refer to this virus as GABA-AAV" these are 3 different viruses!

      The idea here was to use an abbreviation instead of using the full viral name every single time. Clearly the reviewer does not like this, so we have removed this convention throughout the paper and now specify the entire viral name each time.

      • "Identical injection volumes of each serotype, delivered at 3 different cortical depths (see Methods), resulted in different injection sizes". Do you mean 'resulted in different volumes of expression'?

      Yes. We have now rephrased this as follows: “…resulted in viral expression regions that differed in both size as well as laminar distribution” (p.5).

      • “suggesting the different serotypes have different capacity of infecting cortical neurons”. You can’t draw any firm conclusions from a single injection. The rest of this section of the results, along with the whole of Figure 4, and Figure 7a-d, is in danger of being misleading. Please remove. The best you can do here is to say ‘we injected 3 different viruses that express reporter under the h56D promoter. The results are shown in Figure 3, but these are anecdotal, as only a single injection of each virus was performed’. You could then note in the discussion to what extent these results are consistent with the existing literature (e.g., AAV9 often produces good coverage in NHP – anterograde and retrograde, AAV1 also works well in the CNS, although generally doesn’t infect as aggressively as AAV9. I’m not familiar with any attempts to use AAV7).

      With respect to Fig. 4, our approach in the revised version is detailed above. For convenience, we copy it below. With respect to Fig. 7A-D, we feel the results are more robust because the data from the 3 serotypes were pooled together, as the 3 serotypes similarly downregulated GABA and PV expression at the injection site, and we do not make any statement about differences among serotypes for the data shown in Fig. 7A-D.

      “In the revised version of the manuscript, following the Reviewer’s suggestion, we have chosen to temper our claims about differences between serotypes for the h56D enhancer and use more caution in the interpretation of these data (see revised text in the Results on p. 6 and in the Discussion on p. 10). We feel that these data still demonstrate sufficiently high efficiency and specificity across cortical layers of transgene expression in GABA cells using the h56D promoter, at least with two of the 3 AAV serotypes we tested, to warrant their use in primates. Due to cost, time and ethical considerations related to the use of primates, we chose not to perform additional experiments to determine precise differences among serotypes. Thus, for example, while it is possible that had we replicated these experiments, serotype 7 could have proven equally efficient and specific as the other two serotypes, we felt answering this question did not warrant additional experiments in this precious species.”

      • Figure 3: why the large variation in tissue quality? Are the 3 upper images taken at the same magnification? If not, they need different scale bars. The cells in A (upper row) look much smaller than those in B and C, and the size of the 'inset' box varies.

      We thank the reviewer for noticing this. We discovered an error in the scale bar of Fig. 3C, so that the AAV1 injection site was shown at higher magnification than indicated by the wrong scale bar. We have now corrected the error in scale bars. We have also fixed the different box sizes.

      • "Overall, across all layers coverage ranged from 78%±1.9 for injection volumes >300nl to 81.6%±1.8 for injection volumes of 100nl." Coverage didn't differ between layers, so revise this to: "Overall, across all layers coverage ranged from 78% to 81.6%." or give an overall mean (~80%).

      We have corrected the sentence as suggested by the Reviewer (see p. 8 first paragraph).

      • "extending farther from the borders" -> "extending beyond the borders".

      We have corrected the sentence as suggested by the Reviewer (see p. 8).

      • "The reduced GABA and PV immunoreactivity caused by the viruses implies that the specificity of the viruses we have validated in this study is likely higher than estimated". Yes, but for balance you should also note that they may harm the physiology of the cell.

      We have added a sentence acknowledging this to the Discussion. Specifically, on p. 10, we now state: “However, this reduced immunoreactivity raises concerns about the virus or high levels of reporter protein possibly harming the cell physiology.”

      Discussion:

      • "but a thorough validation and characterization of these vectors in primate cortex has lacked" better to say "has been limited", because Dimidschstein 2016 (marmoset V1) and Vormstein-schneider 2020 (macaque S1 and PFC) both reported expression in NHP.

      We have added the following sentence to this paragraph of the Discussion. “In particular, previous studies have not characterized the specificity and coverage of these vectors across cortical layers.”(see p. 8).

      • "whether finding in mice" -> 'whether findings in mice'.

      Corrected, thanks.

      • The discussion re: species differences is missing reference to Krienen 2020 (10.1038/s41586-020-2781-z).

      This reference has been added. Thanks.

      • “Injections of about 200nl volume resulted in higher specificity (95% across layers) and coverage” – this is misleading. The coverage was not statistically different among injection volumes.

      We have added the following sentence: “although coverage did not differ significantly across volumes.” (see p. 10).

      • "it is possible that subtle alteration of the cortical circuit upon parenchymal injection of viruses (including AAVs) leads to alteration of activity-dependent expression of PV and GABA." Or (and I would argue, more likely) the expression of large quantities of your big reporter protein compromised the function of the cell, leading to reduced expression of native proteins. You don't mention any IHC to amplify the RFP signal, so I'm assuming that your images are of direct expression. If so, you are expressing A LOT of reporter protein.

      We have added a sentence acknowledging this to the Discussion. Specifically, on p. 10, we now state: “However, this reduced immunoreactivity raises concerns about the virus or high levels of reporter protein possibly harming the cell physiology.”

      Methods:

      • It's difficult to piece together which viruses were injected in which monkeys, at what volumes, and at what titer. Please compile this info into a table for ease of reference (including any other relevant parameters).

      We now provide a Supplementary Table 1.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors of this manuscript characterize a new anion-conducting channelrhodopsin, called MsACR1, that is more red-shifted in its spectrum than prior variants. An additional mutant variant of MsACR1, renamed raACR, has a 20 nm red-shifted spectral response with faster kinetics. Due to the spectral shift of these variants, the authors proposed that it is possible to inhibit neurons expressing MsACR1 and raACR with light at 635 nm in vivo and in vitro. The authors were able to demonstrate some inhibition in vitro and in vivo with 635 nm light. Overall, the new variants with unique properties should be able to suppress neuronal activities with red-shifted light stimulation.

      Strengths:

      The authors were able to identify a new class of anion-conducting channelrhodopsin and have variants that respond strongly to light with wavelengths >550 nm. The authors were able to demonstrate that this variant, MsACR1, can alter behavior in vivo with 635 nm light. The second major strength of the study is the development of a mutant of MsACR1 that has faster kinetics and is red-shifted by 20 nm as a result of a single mutation.

      Weaknesses:

      The red-shifted raACR appears to work much less efficiently than MsACR1 even with 635 nm light illumination both in vivo (Figure 4) and in vitro (Figure 3E) despite the 20 nm red-shift. This is inconsistent with the benefits and effects of red-shifting the spectrum in raACR. This usually would suggest raACR either has a lower conductance than MsACR1 or that the membrane/overall expression of raACR is much weaker than MsACR1. Neither of these is measured in the current manuscript.

      Thank you for addressing this crucial issue. We posit that the diminished efficiency of raACR in comparison to MsACR1 WT can be attributed to the tenfold acceleration of its photocycle. As noted by Reviewer 1, the anticipated advantages associated with a red-shifted opsin, particularly in in vivo preparations, are offset by its accelerated off-kinetics. Consequently, the shorter dwell time of the open state leads to a reduced number of conducted ions per photon. Nevertheless, the operational light sensitivity is not drastically altered compared to MsACR1 WT (Fig. 3C). We believe that the rapid kinetics offer interesting applications, such as the precise inhibition of single action potentials through holography.

      There are limited comparisons to existing variants of ACRs under the same conditions in the manuscript overall. There should be more parallel comparison with gtACR1, ZipACR, and RubyACR in identical conditions in cultured cell lines, cultured neurons, and in vivo. This should be in terms of overall performance, efficiency, and expression in identical conditions. Without this information, it is unclear whether the effects at 635 nm are due to the expression level which can compensate for the spectral shift.

      We compared MsACR1 and raACR with GtACR1 in ND cells in supplemental figure 4. We concur that further comparisons could be useful to emphasise both the strengths of MsACRs and applications where they may not be as suitable. We are currently in the process of outlining a separate article. We firmly believe that each ACR variant occupies a distinct application niche, which necessitates a more comprehensive electrophysiological comparison to provide valuable insights to the scientific community.

      There should be more raw traces from the recordings of the different variants in response to short pulse stimulation and long pulse stimulation to different wavelengths. It is difficult to judge what the response would be like when these types of information are missing.

      We appreciate Reviewer 1's feedback and have compiled a collection of raw photoresponses, encompassing various pulse widths and wavelengths, which can be found in the Supplementary materials (Supplementary Figures 4 and 5).

      Despite being able to activate the channelrhodopsin with 635 nm light, the main utility of the variant should be transcranial stimulation which was not demonstrated here.

      We concur with Reviewer 1's assessment that the prime application of MsACR1 is indeed transcranial stimulation. However, it's worth emphasising that the full advantages of transcranial optical stimulation become most apparent when animals are truly freely moving without any tethered patch cords. Our ongoing research in the laboratory is dedicated to the development of a wireless LED system that can be securely affixed to the animal's skull. We aim to demonstrate the potential of these novel optogenetic approaches in the field of behavioural neuroscience in the coming year.

      Figure 3B is not clearly annotated and is difficult to match the explanation in the figure legend to the figure. The action potential spikings of neurons expressing raACR in this panel are inhibited as strongly as MsACR1.

      We have enhanced the figure caption and annotations for clarity. The traces presented in Figure 3B are intended to demonstrate the overall effectiveness of each variant. However, it is in the population data analysis, as depicted in Figure 3E, where the meaningful insights are revealed.

      For many characterizations, the number of 'n's are quite low (3-7).

      We acknowledge Reviewer 1's suggestion regarding the in vivo data and agree with the importance of including more animals, as well as control animals. However, we are committed to adhering to the principles of the 3Rs (Replacement, Reduction, Refinement) in animal research, and given the robustness of our observed effects, we will add animals to reach the minimal number of animals per condition (n = 2) to minimise unnecessary animal usage while ensuring statistical power.

      We will continue to adhere to the established standards in the field, aiming for a range of 3 to 7 cells per condition, sourced from at least two independent preparations, to ensure the robustness and reliability of our in vitro data.

      Reviewer #2 (Public Review):

      Summary:

      The authors identified a new chloride-conducting Channelrhodopsin (MsACR1) that can be activated at low light intensities and within the red part of the visible spectrum. Additional engineering of MsACR1 yielded a variant (raACR1) with increased current amplitudes, accelerated kinetics, and a 20nm red-shifted peak excitation wavelength. Stimulation of MsACR1 and raACR1 expressing neurons with 635nm in mice's primary motor cortices inhibited the animals' locomotion.

      Strengths:

      The in vitro characterization of the newly identified ACRs is very detailed and confirms the biophysical properties as described by the authors. Notably, the ACRs are very light sensitive and allow for efficient in vitro inhibition of neurons in the nano Watt/mm^2 range. These new ACRs give neuroscientists and cell biologists a new tool to control chloride flux over biological membranes with high temporal and spatial precision. The red-shifted excitation peaks of these ACRs could allow for multiplexed application with blue-light excited optogenetic tools such as cation-conducting channelrhodopsins or green-fluorescent calcium indicators such as GCaMP.

      Weaknesses:

      The in-vivo characterization of MsACR1 and raACR1 lacks critical control experiments and is, therefore, too preliminary. The experimental conditions differ fundamentally between in vitro and in vivo characterizations. For example, chloride gradients differ within neurons which can weaken inhibition or even cause excitation at synapses, as pointed out by the authors. Notably, the patch pipettes for the in vitro characterization contained low chloride concentrations that might not reflect possible conditions found in the in vivo preparations, i.e., increasing chloride gradients from dendrites to synapses.

      We appreciate Reviewer 2’s feedback regarding missing control experiments. We will respond to these concerns in another section of our manuscript, as suggested.

      Regarding the chloride gradient, we understand the concerns of Reviewer 2, yet we chose these ionic conditions, particularly as they were used in the initial electrical characterization of GtACR1 in a neuronal context (Mahn et al., 2016). We will make sure to provide this context in our manuscript to justify our choice of ionic conditions.

      Interestingly, the authors used soma-targeted (st) MsACR1 and raACR1 for some of their in vitro characterization yielding more efficient inhibition and reduction of co-incidental "on-set" spiking. Still, the authors do not seem to have utilized st-variants in vivo.

      At the time of submission, we were unable to produce purified viruses because of the long-term absence of our lab technician, and we therefore decided to proceed with the submission. We have since had the virus produced externally and will provide these experiments.

      Most importantly, critical in vivo control experiments, such as negative controls like GFP or positive controls like NpHR, are missing. These controls would exclude potential behavioral effects due to experimental artifacts. Moreover, in vivo electrophysiology could have confirmed whether targeted neurons were inhibited under optogenetic stimulations.

      We have several non-injected control animals that we used to calibrate this particular paradigm and never saw similar responses. However, we acknowledge the suggestion of Reviewer 2 and will include the GFP-injected control as recommended.

      Some of these concerns stem from the fact that the pulsed raACR stimulation at 635 nm at 10Hz (Fig. 3E) was far less efficient compared to MsACR1, yet the in vivo comparison yielded very similar results (Fig. 4D).

      As outlined previously, the accelerated photocycle of raACR results in a reduction in photocurrent amplitude, consequently diminishing the potency of inhibition per photon. In the context of in vitro stimulation, where single action potentials are recorded, this reduction in inhibition efficiency is resolved. However, in the realm of in vivo behavioural analysis, the observed effect is not contingent on single action potentials but rather stems from the disruption of the entire M1 motor network. In this context, despite the reduced efficiency of the fast-cycling raACR, it still manages to interrupt the M1 network, leading to similar behavioural outcomes.

      Also, the cortex is highly heterogeneous and comprises excitatory and inhibitory neurons. Using the synapsin promoter, the viral expression paradigm could target both types and cause differential effects, which has not been investigated further, for example, by immunohistochemistry. An alternative expression system, for example, under VGLUT1 control, could have mitigated some of these concerns.

      Indeed, we acknowledge the limitations of our current experimental approach. We are in the process of planning and conducting additional experiments involving Cre-dependent expression of st-MsACR and st-raACR in PV-Cre mice.

      Furthermore, the authors applied different light intensities, wavelengths, and stimulation frequencies during the in vitro characterization, causing varying spike inhibition efficiencies. The in vivo characterization is notably lacking this type of control. Thus, it is unclear why the 635nm, 2s at 20Hz every 5s stimulation protocol, which has no equivalent in the in vitro characterization, was chosen.

      We appreciate the valuable comment from the reviewer. The objective of our in vitro characterization is to elucidate the general effects of specific stimulation parameters on the efficiency of neuronal inhibition. For instance, we aim to demonstrate that lower light intensities result in less efficient inhibition, or that pulse stimulation may lead to a less complete inhibition, albeit significantly reducing the energy input into the system.

      In the in vivo characterization, we face constraints such as animal welfare considerations and limitations in available laser lines, which prevent us from exploring the entire parameter space as comprehensively as in the in vitro preparation. Additionally, it is important to note that membrane capacitance tends to be higher in vivo compared to dissociated hippocampal neurons. Consequently, we opted to double the stimulation frequency from 10 Hz to 20 Hz and to use a stimulation pattern of 2 seconds “on” and 5 seconds “off”. This approach allows the animals to spend less time in an arrested state while still demonstrating the effect of MsACR and variants.
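
      The timing of this protocol can be made explicit with a minimal sketch. It assumes the pattern is 2 s of pulses at 20 Hz followed by 5 s without light, giving a 7 s cycle. The pulse width is not stated in this response, so only pulse onset times are computed. The function name and its parameters are hypothetical and chosen for illustration only.

```python
def pulse_onsets(n_cycles, freq_hz=20, on_s=2, off_s=5):
    """Return pulse onset times (in seconds) for a pulsed on/off protocol.

    Assumed pattern: `on_s` seconds of pulses at `freq_hz`, followed by
    `off_s` seconds without light. Pulse width is not modelled.
    Arguments are expected to be integers.
    """
    pulses_per_train = freq_hz * on_s   # 20 Hz * 2 s = 40 pulses per train
    cycle_s = on_s + off_s              # 2 s + 5 s = 7 s per cycle
    return [
        cycle * cycle_s + k / freq_hz
        for cycle in range(n_cycles)
        for k in range(pulses_per_train)
    ]


onsets = pulse_onsets(n_cycles=2)
# Share of each cycle that falls inside the pulsed window (not the light duty cycle).
fraction_on = 2 / (2 + 5)
```

      Under these assumptions each train contains 40 pulses, and the pulsed window covers 2/7 of every cycle. The actual light exposure is lower still and depends on the pulse width.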

      In summary, the in vivo experiments did not confirm whether the observed inhibition of mouse locomotion occurred due to the inhibition of neurons or experimental artifacts.

      In addition, the authors' main claim of more efficient neuronal inhibition would require them to threshold MsACR1 and raACR1 against alternative methods such as the red-shifted NpHR variant Jaws or other ACRs to give readers meaningful guidance when choosing an inhibitory tool.

      The light sensitivity of MsACR1 and raACR1 is impressive and well characterized in vitro. However, the authors only reported the overall light output at the fiber tip for the in vivo experiments: 0.5 mW. Without context, it is difficult to evaluate this value. Calculating the light power density at certain distances from the light fiber or thresholding against alternative tools such as NpHR, Jaws, or other ACRs would allow for a more meaningful evaluation.

      We thank the reviewers for their comments.

      Reviewer #1 (Recommendations For The Authors):

      The study would be much strengthened if the authors can perform more experiments and characterization to support their claims, in addition to showing more raw electrophysiological traces/results and not just summary charts and graphs.

      As outlined above, further experiments are planned. We appreciate the suggestion to include more raw electrophysiological traces. Photocurrent traces of all included mutants of MsACR1 measured in ND cells and traces of hippocampal neuronal measurements of non- and soma-targeted MsACR1 and raACR will be included as supplemental figures.

      Reviewer #2 (Recommendations For The Authors):

      Major concern:

      It is unclear if the optogenetic light stimulation in Fig. 4 caused direct inhibition of neuronal activity in M1, which cell types were targeted, and how MsACR1 and raACR1 compare to other optogenetic inhibitors.

      Also, the rationale for the light stimulation (635 nm, 2s, 20Hz, every 5s) is not clear.

      I would suggest the following to address these concerns:

      (1) M1 expression and stimulation of a negative control such as GFP to exclude that experimental artifacts cause the observed behavioral outcomes.

      We are now preparing the required GFP control, and will add it to the new version of the manuscript.

      (2) Expression and stimulation of NpHR as a positive control.

      We will use st-GtACR1 as a positive control.

      (3) Electrophysiological measurements of neuronal activity under optogenetic stimulation to confirm the effectiveness of neuronal inhibition, i.e. suppression of spontaneous firing under light etc.

      We concur with Reviewer 2 regarding the potential value of incorporating such in vivo optrode recordings into our manuscript to enable readers to assess the effectiveness of MsACR. As part of our plan for the next version of the manuscript, we intend to conduct these experiments.

      (4) ChR2 or other cation-conducting channelrhodopsins with the same expression paradigm could be used to observe diametrically opposite effects.

      As Reviewer 2 has already pointed out, the complex interactions that can occur in our viral strategy when an inhibitory opsin is expressed in both excitatory and inhibitory neurons make us sceptical about the possibility of an excitatory opsin leading to opposing effects.

      Considering the non-linear input-output function of cortical circuits, optogenetic activation of neurons, even when expressed in either inhibitory or excitatory neurons, is likely to result in the perturbation of the cortical network, which will likely also lead to locomotor arrest.

      (5) The authors should confirm whether the expression under synapsin preferentially targeted excitatory and inhibitory cells because inhibiting inhibitory cells could lead to the disinhibition of the principal cells. Synapsin promoters can drive expression in glutamatergic and GABAergic neurons. An alternative expression system under VGLUT1 promoter could yield better targeting.

      We concur with Reviewer 2 and will conduct the next set of experiments using the PV-Cre mouse line. Additionally, we will employ in vivo electrophysiology to further confirm the inhibition of the motor cortex network.

      (6) Titrating of optogenetic stimulation: The author should test whether increasing or decreasing light intensities and stimulation frequencies as well as different wavelengths (550 nm vs 635 nm) cause differences in inhibiting locomotion in vivo as it did for inhibiting the neuronal firing in vitro (Fig. 3B-E).

      The non-linear input-output function within cortical networks, coupled with our sole reliance on behaviour as a readout, will pose challenges in resolving subtle effects on locomotion arrest across various stimulation parameters.

      For our planned in vivo electrophysiology recordings, we will measure cortical firing rates as a proxy rather than relying solely on behavioural observations. This approach will allow us to map the fundamental axes of our parameter space in vivo, considering factors such as wavelength, light intensity, and frequency.

      (7) Explanation of why the 20Hz/2s light stimulation protocol was chosen.

      As outlined above, considering animal welfare and increased membrane capacitance in vivo, we opted for the outlined stimulation protocol. This approach allows the animals to spend less time in an arrested state while still demonstrating the effect of MsACR and variants.

      (8) In vivo thresholding against other inhibitory tools, such as RubyACRs, Jaws, etc would provide critical guidance for the audience and potential users. It would be particularly important to compare the necessary light intensities for reaching similar behavioral outcomes.

      We concur with Reviewer 2 and will prepare data using GtACR1 as a reference.

      (9) The author should calculate or reasonably estimate the in vivo light intensity during optogenetic stimulation to provide a meaningful comparison to their in vitro characterization. Ideally, they can provide an estimated volume for efficient stimulation of MsACR1 and raACR1 and compare it to other optogenetic tools.

      We will conduct a Monte Carlo simulation and offer a comparison of the effective activation volume across various classes of optogenetic tools.
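
      Before such a simulation, a first-order estimate already puts the reported 0.5 mW into context. The sketch below is not the planned Monte Carlo simulation: it computes the irradiance at the fiber tip and the dilution caused by conical beam spread alone, ignoring scattering and absorption, so values at depth are upper bounds. Only the 0.5 mW output comes from the manuscript. The 200 µm core diameter, the numerical aperture of 0.22, and the tissue refractive index of 1.36 are assumptions made purely for illustration, as are the function names.

```python
import math


def tip_irradiance(power_mw, core_diameter_um):
    """Irradiance (mW/mm^2) at the fiber tip: power divided by core area."""
    radius_mm = core_diameter_um / 2 / 1000
    return power_mw / (math.pi * radius_mm ** 2)


def geometric_irradiance(power_mw, core_diameter_um, depth_mm, na, n_tissue):
    """Irradiance (mW/mm^2) at a given depth from conical beam spread only.

    Scattering and absorption are ignored, so this overestimates the
    irradiance that actually reaches tissue at that depth.
    """
    radius_mm = core_diameter_um / 2 / 1000
    half_angle = math.asin(na / n_tissue)   # divergence half-angle in tissue
    beam_radius_mm = radius_mm + depth_mm * math.tan(half_angle)
    return power_mw / (math.pi * beam_radius_mm ** 2)


# 0.5 mW is the reported output; core diameter, NA and refractive index are assumed.
at_tip = tip_irradiance(0.5, 200)
at_depth = geometric_irradiance(0.5, 200, depth_mm=0.5, na=0.22, n_tissue=1.36)
```

      With the assumed 200 µm core, the tip irradiance comes out at roughly 16 mW/mm². A different core diameter changes this value with the inverse square of the diameter, which is why the fiber geometry needs to be reported alongside the output power.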

      Minor concerns:

      (1) Why were st- MsACR1 and raACR1 used in vitro but not in vivo? The viral constructs were described as AAV/DJ-hSyn1-MsACR-mCerulean and AAV/DJ-hSyn1-raACR-mCerulean.

      As mentioned earlier, we were unable to produce purified soma-targeted MsACR variants before the manuscript submission. We will now provide these measurements.

      (2) Light intensities for the spectral measurements are missing.

      During action spectra measurements, a motorised neutral density filter wheel is used to have equal photon flux for all tested wavelengths. Additionally, the light intensity is further reduced by using additional neutral density filters to ensure sufficiently low photocurrents to determine the spectral maximum. Therefore, the light intensity varied between constructs and sometimes measurements. We added the following line to the respective methods section to further clarify this: “(typically in the low µW-range at 𝜆max)”.

      (3) MsACR1 is slower and probably more light-sensitive than raACR1, which is faster but has larger photocurrents. These are complementary tradeoffs, and the audience might wonder how MsACR1 and raACR1 photocurrents compare under similar conditions. Therefore, I suggest an alternative representation in Fig. 2C. That is, the presentation of the excitation spectra under similar light intensities and with absolute photocurrent values.

      Unfortunately, due to the reasons stated above, MsACR1 and raACR action spectra were not recorded with the same light intensity. However, MsACR1 and raACR are compared under the same conditions for Fig. 2B, E, and F (560 nm light at ~3.2 mW/mm²) as well as in Supp. Fig. 4C.

      (4) Figure legends for figures 3F and G are missing details for describing the stimulation paradigm.

      We added more details about the stimulation paradigm.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      This work sets out to elucidate mechanistic intricacies in inflammatory responses in pneumonia in the context of the aging process (Terc deficiency - telomerase functionality).

      Strengths:

      Very interesting, conceptually speaking, approach that is by all means worth pursuing. An overall proper approach to the posited aim.

      We want to thank the reviewer for taking the time to review our manuscript and for providing positive feedback regarding our research question.

      Weaknesses:

      The work is heavily underpowered and may have statistical deficits. This precludes it in its current state from drawing unequivocal conclusions.

      Thank you for this essential and valuable comment. We fully accept that the small sample size of the Tercko/ko mice is a major limitation of our study and transparently discuss this in our manuscript.

      However, due to Animal Welfare regulations, only a reduced number of mice were approved because of the strong burden of disease. Consequently, only three non-infected and five infected mice were available to us. This reduced number of mice presents a clear limitation to our study. However, due to ethical considerations related to animal welfare and sustainability, as well as compliance with German animal welfare regulations, it is not possible to obtain additional Tercko/ko mice to increase the dataset. The animal studies are an important aspect of our study; however, our hypothesis was also investigated at multiple levels, including in an in vitro co-culture model (Figure 5), to ensure comprehensive analysis.

      Thus, we clearly demonstrated that S. aureus pneumonia in Tercko/ko mice leads to a more severe phenotype, orchestrated by the dysregulation of both innate and adaptive immune response.

      Reviewer #2 (Public Review):

      Summary:

      The authors demonstrate heightened susceptibility of Terc-KO mice to S. aureus-induced pneumonia, perform gene expression analysis from the infected lungs, find an elevated inflammatory (NLRP3) signature in some Terc-KO but not control mice, and some reduction in T cell signatures. Based on that, they conclude that dysregulated inflammation and T-cell dysfunction play a major role in these phenomena.

      Strengths:

      The strengths of the work include a problem not previously addressed (the role of the Terc component of the telomerase complex) in certain aspects of resistance to bacterial infection and innate (and maybe adaptive) immune function.

      We would like to thank the reviewer for the positive feedback regarding our aim to investigate the impact of Terc deletion on the pulmonary immune response to S. aureus.

      Weaknesses:

      The weaknesses outweigh the strengths, dominantly because conclusions are plagued by flaws in experimental design, by lack of rigorous controls, and by incomplete and inadequate approaches to testing immune function. These weaknesses are as follows:

      (1)  Terc-KO mice are a genomic knockout model, and therefore the authors need to carefully consider the impact of this KO on a wide range of tissues. This, however, is not the case. There are no attempts to perform cell transfers or use irradiation chimera or crosses that would be informative.

      We thank the reviewer for bringing up this important point. The aim of our study, however, was to investigate the impact of Terc deletion in the lung and on the response to bacterial pneumonia, rather than to provide a comprehensive characterization of the Tercko/ko model itself. Different tissues and cell types have already been characterized in previous studies, including studies of the general phenotype of the model (Herrera et al., 1999; Lee et al., 1998; Rudolph et al., 1999) and investigations of the impact of Terc deletion on specific cell types such as microglia (Khan et al., 2015) or T cells (Matthe et al., 2022). The impact of Terc deletion on T cells is also discussed in our manuscript in lines 89 to 105. Furthermore, a section about the general phenotype of the Terc deletion model is included in the introduction in lines 126 to 138. Thus, we discussed the relevant literature regarding Tercko/ko mice in our manuscript and attempted to provide a more in-depth characterization of the lung by investigating the inflammatory response to infection as well as changes in gene expression (Figures 2-4).

      (2)  Throughout the manuscript the authors invoke the role of telomere shortening in aging, and according to them, their Terc-KO mice should be one potential model for aging. Yet the authors consistently describe major differences between young Terc-KO and naturally aging old mice, with no discussion of the implications. This further confuses the biological significance of this work as presented.

      Thank you for raising this relevant point, and we apologize for the confusion. While Tercko/ko mice are a well-established model for premature aging, the accelerated aging phenotype becomes more apparent with increasing generations (G) and is therefore most pronounced in later-generation (G5 and G6) or aged Tercko/ko mice (Lee et al., 1998; Wong et al., 2008).

      Since the aim of this study was to analyze the impact of Terc deletion on the lung and its immune response to bacterial infection, rather than the impact of telomere shortening and telomerase dysfunction, we used young G3 Tercko/ko mice (8 weeks old). This is also mentioned in lines 131-134. In this study, Tercko/ko mice served not as a model of aging but as a model specifically for Terc deletion. The old WT mice function as a control cohort to reveal effects that are shared between aging and Terc deletion as well as effects that differ. In our sequencing data, uninfected young WT mice are very similar to uninfected Tercko/ko mice. Other studies have also reported this lack of major differences between uninfected WT and G3 Tercko/ko mice (Kang et al., 2018). Conversely, uninfected young WT and Tercko/ko mice exhibited great differences, for instance, regarding the numbers of differentially expressed genes (Supplemental Figure 1H). Thus, differences between naturally aged mice and young G3 Tercko/ko mice are not surprising. To clarify this aspect, we revised the paragraph discussing the Tercko/ko mice (lines 126-134). Additionally, we added a paragraph explaining the purpose of the naturally aged mice in lines 134 to 138:

      “As control cohort age-matched young WT mice were utilized. To investigate whether Terc deletion, beyond critical telomere shortening, impacts the pulmonary immune response, we used young Tercko/ko mice. Additionally, naturally aged mice (2 years old) were infected to explore the potential link to a fully developed aging phenotype.”

      (3)  Related to #2, group design for comparisons lacks a clear rationale. The authors stipulate that Terc-KO will mimic natural aging, but in fact, the only significant differences seen between groups in susceptibility to S. aureus are, contrary to the authors' expectation, between young Terc-KO and naturally old mice (Figures 1A and B, no difference between young Terc-KO and young wt); or there are no significant differences at all between groups (Figures 1C, D).

      We thank the reviewer for this essential comment. As mentioned above, the Tercko/ko mice in this study were not selected to model natural aging. To model telomerase dysfunction and accelerated aging, later-generation or aged Tercko/ko mice would have been more suitable.

      The lack of statistical significance in some figures is likely due to the heterogeneity of the disease phenotype of S. aureus infection in mice, which is a limitation of our study that we address in the discussion section in lines 577-583. The phenotype of S. aureus infection can vary greatly within a mouse population, highlighting the limitations of mice as a model for S. aureus infections. To account for this heterogeneity, we divided the infected Tercko/ko mice cohort into different degrees of severity based on the clinical score and the presence of bacteria in organs other than the lung (mice with systemic infection).

      Despite the heterogeneity, especially within the Tercko/ko mice cohort, the differences between the knockout mice and both young and old WT mice were striking. Including the fatal infections, 80% of the Tercko/ko mice had a severe course of disease, while none of the WT mice displayed a severe course (Figure 1A, B and Supplemental Figure 1A, B). This hints towards a clear role of Terc in the response to S. aureus infection in mice. Thus, while the differences are not significant in some figures, we observed strong trends towards a more severe phenotype of S. aureus infection in the Tercko/ko mice regarding bacterial load, clinical score, and inflammatory response.

      Another example of inadequate group design is when the authors begin dividing their Terc-KO groups by clinical score into animals with or without "systemic infection" (the condition where a bacterium spreads uncontrollably across the many organs and via blood, which should be properly called sepsis), and then compare this sepsis group to other groups (Supplementary Figures 1G; Figure 2; lines 374-376 and 389-391). This gives them significant differences in several figures, but because they did not clearly indicate where they applied this stratification in the figure legends, the data are somewhat confusing. Most importantly, methodologically it is highly inappropriate to compare one mouse with sepsis to another one without. If Terc-KO mice with sepsis are a comparator group, then their controls have to be wild-type mice with sepsis, who are dealing with the same high bacterial load across the body and are presumably forced to deploy the same set of immune defenses.

      We sincerely appreciate the significant time and effort you have invested in reviewing our manuscript. However, with all due respect, we must point out that the definition of sepsis you have referenced is considered outdated. According to the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3), sepsis is defined as "a life-threatening organ dysfunction caused by a dysregulated host response to infection" (Singer et al., 2016). Given this fundamental misunderstanding of our findings, we find the comment regarding the inadequacy of our groups to be both dismissive and lacking in scientific merit. We would like to emphasize that the group size used in our study is consistent with accepted standards in infection research. We strongly reject any insinuations of inadequacy that have been repeatedly mentioned throughout the review.

      In order to provide a nuanced investigation of disease severity in Tercko/ko mice, we added the term “systemic infection” to the figures whenever the mice were divided into groups of mice with and without systemic infection. This is the case for Figure 2A and Supplemental Figure 1C-E. The division into mice with and without systemic infection is also mentioned in the figure legend of Figure 2A in lines 933 to 936 and for Supplemental Figure 1 in lines 1053-1054. We agree that Supplemental Figure 1G is somewhat confusing as the mice with systemic infection are highlighted in this graph but not included as a separate group within our sequencing analysis. We added a sentence to the figure legend clarifying this (lines 1042-1045):

      “Nevertheless, the infected Tercko/ko mice were considered one group for the expression analysis and not split into separate groups for the subsequent analysis.”

      Additionally, we revised the section regarding this grouping in different degrees of severity in our Material and Methods section to clarify that this division was only performed for specific analysis (line 191):

      “…for the indicated analysis.”

      Furthermore, as mentioned above, the mice classified as systemically infected were not septic. We classified them as systemically infected based on their clinical score and the presence of bacteria in organs other than the lung, as stated in lines 188-191 and 377-382.

      Bacteremia is a symptom of very severe cases of hospital-acquired pneumonia with a very high mortality (De la Calle et al., 2016).

      Therefore, the systemically infected mice or rather mice with bacteremia display an especially severe pneumonia phenotype, which is distinct from sepsis. The presence of this symptom in our Tercko/ko mice further highlights the clinical relevance of our study. This aspect was added to the manuscript in the lines 569-571.

      “The detection of bacteria in extra pulmonary organs is of particular interest, as bacteremia is a symptom of severe pneumonia and is associated with high mortality (De la Calle et al., 2016).”

      (4)  The authors conclude that dysregulated inflammation and T-cell dysfunction play a major role in S. aureus susceptibility. This may or may not be an important observation, because many KO mice are abnormal for a variety of reasons, and until such reasons are mechanistically dissected, the physiological importance of the observation will remain unclear.

      Two points are important here. First, there is no natural counterpart to a Terc-KO, which is a complete loss of a key non-enzymatic component of the telomerase complex starting in utero.

      Second, the authors truly did not examine the key basic features of their model, including the features of basic and induced inflammatory and immune responses. This analysis could be done either using model antigens in adjuvants, defined innate immune stimuli (e.g. TLR, RLR, or NLR agonists), or microbial challenge. The only data provided along these lines are the baseline frequencies of total T cells in the spleen of the three groups of mice examined (not statistically significant, Figure 4B). We do not know if the composition of naïve to memory T cell subsets may have been different, and more importantly, we have no data to evaluate whether recruitment of the immune response (including T cells) to the lung upon microbial challenge is similar or different. So, what are the numbers and percentages of T cells and alveolar macrophages in the lung following S. aureus challenge and are they even comparable or are there issues in mobilizing the T cell response to the site of infection? If, for example, Terc-KO mice do not mobilize enough T cells to the lung during infection, that would explain the paucity in many T-cell-associated genes in their transcriptomic set that the authors report. That in turn may not mean dysfunction of T cells but potentially a whole different set of defects in coordinating the response in Terc-KO mice.

      We thank the reviewer for highlighting these important aspects. Regarding the first point, indeed there is no naturally occurring deletion of Terc in humans. However, studies have reported reduced expression of Terc and Tert in the tissues of aged mice and rats (Tarry-Adkins et al., 2021; Zhang et al., 2018). Terc itself has been found to have several important immunomodulatory functions, such as the activation of the NF-κB or PI3-kinase pathway (Liu et al., 2019; Wu et al., 2022). As these pathways are relevant for the immune response to S. aureus infections, we were interested in exploring the impact of Terc deletion on the pulmonary immune response. The potential immunomodulatory functions of Terc are discussed in lines 106-121. To further clarify our rationale, we added the following sentences to the introduction in lines 121-125:

      “Interestingly, downregulation of Terc and Tert expression in tissues of aged mice and rats has been found (Tarry-Adkins, Aiken, Dearden, Fernandez-Twinn, & Ozanne, 2021; Zhang et al., 2018).

      Therefore, as a potential immunomodulatory factor reduced Terc expression could be connected to age-related pathologies.”

      Regarding the second point, as we focused on the effect of Terc deletion in the lung and its role in S. aureus infection, we investigated inflammatory and immune response parameters relevant to this setting. For instance, inflammation parameters in the lungs of all three mice cohorts were measured to investigate differences in the inflammatory response in the non-infected and infected mice (Figure 2A). Those measurements showed no baseline difference in key inflammatory parameters between young WT and Tercko/ko mice, which is consistent with previous findings (Kang et al., 2018). The inflammatory response to infection with S. aureus in the Tercko/ko mice cohort differed significantly from the other cohorts (Figure 2A), hinting towards a dysregulated inflammatory response due to Terc deletion. Furthermore, we investigated the frequencies of general immune cell populations such as dendritic cells, macrophages, and B cells in the spleen of all three mice cohorts to gather a baseline understanding of these populations. In our manuscript, only total T cell frequencies were included because of their relevance to our T cell data (Figure 4B). These data showed no difference in total T cell frequency in the spleen among the three mice cohorts. For a more detailed insight into our analysis, we added the frequencies of the other immune cell populations analyzed in the spleen as Supplemental Figure 3B-F. Additionally, a figure legend for the graphs was added.

      Therefore, while we did not analyze baseline frequencies of specific populations of T cells, we analyzed and characterized the inflammatory and immune response of our model in a way relevant to our research question.

      The differences observed in T cell marker and TCR gene expression were also partly present between the uninfected and infected Tercko/ko mice, such as the complete absence of CD247 expression in infected Tercko/ko mice, although it is expressed in uninfected mice of this cohort (Figure 4A, C and D). Thus, this effect cannot be solely attributed to an inadequate mobilization of T cells to the lung after infectious challenge. However, we agree that a more detailed insight into the immune cells recruited to the lung or the frequencies of different T cell populations could contribute to a better understanding of the proposed mechanism and would be an interesting experiment to conduct in further studies. We accept this as a limitation of our study and included it in our discussion section in lines 720-724:

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      (5)  Related to that, immunological analysis is also inadequate. First, the authors pull signatures from the total lung tissue, which is both imprecise and potentially skewed by differences, not in gene expression but in types of cells present and/or their abundance, a feature known to be affected by aging and perhaps by Terc deficiency during infection. Second, to draw any conclusions about immune responses, the authors would have to track antigen-specific T cells, which is possible for a wide range of microbial pathogens using peptide-MHC multimers. This would allow highly precise analysis of phenomena the authors are trying to conclude about. Moreover, it would allow them to confirm their gene expression data in populations of physiological interest

      We thank the reviewer for highlighting this important and relevant point. In our study, we aimed to investigate the role of Terc expression in modulating inflammation and the immune response to S. aureus infection in the lung. To address this, we examined the overall impact of age, genotype, and infection on lung inflammation and gene expression. Therefore, sequencing of total lung tissue was essential for addressing the research question posed. Our findings demonstrate that Tercko/ko mice exhibit a more severe phenotype following S. aureus infection, characterized by an increased bacterial load and heightened lung inflammation (Figures 1 and 2). Furthermore, our data suggest that Terc plays a role in regulating inflammation through activation of the NLRP3 inflammasome, along with the dysregulation of several T cell marker genes (Figures 2, 4, and 5). However, this study lacks a detailed analysis of distinct T cell populations, including antigen-specific T cells, as noted earlier. Investigating these aspects in future studies would be valuable to validate and expand upon our findings. We have incorporated these suggestions into the discussion section (lines 720-724)

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      Nevertheless, our study provides first evidence of a potential connection between T cell functionality and Terc expression.

      Third, the authors co-incubate AM and T cells with S. aureus. There is no information here about the phenotype of T cells used. Were they naïve, and how many S. aureus-specific T cells did they contain? Or were they a mix of different cell types, which we know will change with aging (fewer naïve and many more memory cells of different flavors), and maybe even with a Terc-KO? Naïve T cells do not interact with AM; only effector and memory cells would be able to do so, once they have been primed by contact with dendritic cells bringing antigen into the lymphoid tissues, so it is unclear what the authors are modeling here. Mature primed effector T cells would go to the lung and would interact with AM, but it is almost certain that the authors did not generate these cells for their experiment (or at least nothing like that was described in the methods or the text).

      Thank you for bringing up this important question. For the co-cultivation experiment of T cells and alveolar macrophages, total CD4+ T cells of both young WT and Tercko/ko were used. We did not select for a specific population of T cells. Our sequencing data indicated the complete downregulation of CD247 expression, which is an important part of the T cell receptor, in the lungs of infected Tercko/ko mice (Figure 4A, C and D). Given that this factor is downregulated under chronic inflammatory conditions, we investigated the impact of the inflammatory response in alveolar macrophages on the expression of various T cell-derived cytokines, as well as CD247 expression (Figure 5D, E) (Dexiu et al., 2022). This aspect is also highlighted in the discussion in lines 623-637. Therefore, a co-cultivation model of T cells and alveolar macrophages was established and confronted with heat-killed S. aureus to elicit an inflammatory response of the macrophages. To emphasize this purpose, we have revised our statement about the model setup in lines 517-519 of the manuscript:

      “An overactive inflammatory response could be a potential explanation for the dysregulated TCR signaling.”

      The authors hope this will clarify the intent behind the model setup.

      (6)  Overall, the authors began to address the role of Terc in bacterial susceptibility, but to what extent that specifically involves inflammation and macrophages, T cell immunity, or aging remains unclear at present.

      We thank the reviewer for the helpful and relevant comments. The authors accept the limitations of the presented study, such as the reduced number of Tercko/ko mice and the limitations of murine models for S. aureus infection itself, and discuss them in the discussion section in lines 559-561, 577-583, 690-692 and 720-726. However, we hope that our responses have provided sufficient evidence to convince the reviewer that our data support a clear role for Terc expression in regulating the immune response to bacterial infections, particularly with respect to inflammation and its potential connection to T cell functionality.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      I really enjoyed this manuscript from Torsekar et al on "Contrasting responses to aridity by different-sized decomposers cause similar decomposition rates across a precipitation gradient". The authors aimed to examine how climate interacts with decomposers of different size categories to influence litter decomposition. They proposed a new hypothesis: "The opposing climatic dependencies of macrofauna and that of microorganisms and mesofauna should lead to similar overall decomposition rates across precipitation gradients".

      This study emphasizes the importance as well as the contribution of different groups of organisms (micro, meso, macro, and whole community) across different seasons (summer with the following characteristics: hot with no precipitation, and winter with the following characteristics: cooler and wetter winter) along a precipitation gradient. The authors made use of 1050 litter baskets with different mesh sizes to capture decomposers contribution. They proposed a new hypothesis that was aiming to understand the "dryland decomposition conundrum". They combined their decomposition experiment with the sampling of decomposers by using pitfall traps across both experiment seasons. This study was carried out in Israel and based on a single litter species that is native to all seven sites. The authors found that microorganism contribution dominated in winter while macrofauna decomposition dominated the overall decomposition in summer. These seasonality differences combined with the differences in different decomposers groups fluctuation along precipitation resulted in similar overall decomposition rates across sites.

      I believe this manuscript has the potential to advance our knowledge on litter decomposition.

      Strengths:

      Well-designed study with a combination of different approaches (methods) and consideration of seasonality to generalize patterns.

      The study expands the current understanding of litter decomposition and of the interaction between factors affecting the process (here climate and decomposers).

      Weaknesses:

      The study was only based on a single litter species.

      We now discuss the advantages and limitations of this approach in the methods and devote a completely new paragraph to this important point in the discussion (lines 394-401).

      Reviewer #2 (Public Review):

      Summary: Torsekar et al. use a leaf litter decomposition experiment across seasons, and in an aridity gradient, to provide a careful test of the role of different-sized soil invertebrates in shaping the rates of leaf litter decomposition. The authors found that large-sized invertebrates are more active in the summer and small-sized invertebrates in the winter. The summed effects of all invertebrates then translated into similar levels of decomposition across seasons. The system breaks down in hyper-arid sites.

      Strengths: This is a well-written manuscript that provides a complete statistical analysis of a nice dataset. The authors provide a complete discussion of their results in the current literature.

      Weaknesses:

      I have only three minor comments. Please standardize the color across ALL figures (use the same color always for the same thing, and be friendly to color-blind people).

      Thank you for this important suggestion. We have now changed all figures to standardize all colors and have chosen a more color-blind-friendly palette.

      Fig 1 may benefit from separating the orange line (micro and meso) into two lines that reflect your experimental setup and results. I would mention the dryland decomposition conundrum earlier in the Introduction.

      We based our novel hypotheses on a thorough literature search. Accordingly, decomposition is expected to be positively associated with moisture, regardless of the decomposer body size. Our contribution to theory was to suggest that macro-detritivores may respond very differently to climatic conditions and dominate litter decomposition in warm arid-lands (we listed the reasons in the text). Consequently, we did not distinguish between microorganisms and mesofauna. We assumed that both groups inhabit the litter substrate and have limited adaptation to dry conditions. Our results provide strong evidence that this presumption is likely wrong and that mesofauna respond to climate very differently from micro-decomposers. Yet, we cannot use hindsight understanding to improve our original hypothesis. We now emphasize this point in the discussion as an important future direction.

      Although we very much appreciate the reviewer's enthusiasm in highlighting the importance of our work as a possible solution to the longstanding dryland decomposition conundrum (DDC), we decided not to move it to the introduction. This is because we think that our work is not centred on resolving the DDC but provides more general principles that may lead to a paradigm shift in the way ecologists study nutrient cycling across ecosystems.

      And the manuscript is full of minor grammatical errors. Some careful reading and fixing of all these minor mistakes here and there would be needed.

      We apologize and have done our best to find and fix those mistakes.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I really enjoyed this manuscript from Torsekar et al on "Contrasting responses to aridity by different-sized decomposers cause similar decomposition rates across a precipitation gradient". The authors aimed to examine how climate interacts with decomposers of different size categories to influence litter decomposition. They proposed a new hypothesis: "The opposing climatic dependencies of macrofauna and that of microorganisms and mesofauna should lead to similar overall decomposition rates across precipitation gradients".

      This study emphasizes the importance as well as the contribution of different groups of organisms (micro, meso, macro, and whole community) across different seasons (summer with the following characteristics: hot with no precipitation, and winter with the following characteristics: cooler and wetter winter) along a precipitation gradient. The authors made use of 1050 litter baskets with different mesh sizes to capture decomposers contribution. They proposed a new hypothesis that was aiming to understand the "dryland decomposition conundrum". They combined their decomposition experiment with the sampling of decomposers by using pitfall traps across both experiment seasons. This study was carried out in Israel and based on a single litter species that is native to all seven sites. The authors found that microorganism contribution dominated in winter while macrofauna decomposition dominated the overall decomposition in summer. These seasonality differences combined with the differences in different decomposers groups fluctuation along precipitation resulted in similar overall decomposition rates across sites.

      I believe this manuscript has the potential to advance our knowledge on litter decomposition. Below I provide my general and specific comments.

      General comments:

      (1) Study in general is well designed and well thought beforehand,

      (2) Study aims to expand the current understanding of the dryland decomposition conundrum

      (3) The authors should add a caveat about the fact that they use only one litter species and call for examining litter mixtures along the same gradient.

      (4) Please check the way you reduce the random effects from your initial model, I have provided a better way to do so in my specific comments

      (5) For Figure 1, authors can check my comment on this and see if they could revise the figure.

      Thank you for the positive feedback and your valuable comments. We have tried to best address all comments and suggestions for improvement and clarification

      Specific comments

      Line # 57 Please write "Theory suggests" instead of "Theory suggest"

      We changed the text as suggested

      Line # 70, please write "Indeed, handful evidence shows" instead of "Indeed, handful evidence show"

      We changed the text as suggested

      Figure 1: I like this conceptual framework. I have a silly question: why is the slope of the whole community at the beginning (between Hyperarid and Arid) the same as that of the macrofauna? I would think the slope should be higher, as this is adding up, right? The same goes for the decomposition of the whole community later on. To me, the figure should reflect this adding or summing up (if I am right), so the authors should think about how this could be shown in the figure.

      We agree with your interpretation that the whole community decomposition reflects the addition by constituent decomposers. The slope of the whole community decomposition between hyper-arid and arid is slightly higher than that of macro decomposition to reflect the additive effect of macro with meso+micro decomposition. We have now changed the figure slightly to make this point more visible (Line 106).

      Line # 111 Please make "Methods" bold as well to be consistent with others headings.

      We changed the formatting as suggested

      Line #125 and in other lines as well please replace "X" by "x" to denote multiplication.

      We changed the formatting as suggested

      Table 1 Please add "*" to climate like this "Climate*" so that the end note of the table could make sense

      Thank you for this suggestion. We have now added the asterisk referring to the note below the Table.

      Figure 2, please consider putting at line #133, mean annual precipitation (MAP), so that at line # 135 you can directly say "The precipitation map ....".

      We made both changes as suggested.

      Line # 138 I would not use different units for the same values. I do understand that you want to emphasize the accuracy, but I would write 3 ± 0.001 g instead.

      We changed the units as suggested.

      Line # 145, how is the litter basket customized to rest at 1 cm above ground level?

      We have now clarified that we cut open windows one centimeter above the cage floor. The cages were positioned on the soil (line 144).

      Lines # 181-183, I like the approach of checking the necessity of having the random effects. However, it has been reported that likelihood ratio tests (LRTs) are not really reliable for testing random effects. I suggest that you use permutations instead. I think the function is confint(MODEL); you need to specify the number of permutations (the higher the better), but you should start with 99 and see how the results look; if they are promising, you can even go to 9999. This will, however, need computational power and time.

      Thank you for the suggestion. We have now used a simulation-based exact test, instead of an LRT, to examine the random effect, as recommended by the authors of the “lme4” package. As recommended, we used 9999 simulations. The simulation test yielded results similar to those originally reported (see lines 181-183).
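
      The general idea of judging a grouping factor by resampling rather than by a likelihood ratio test can be illustrated with a minimal permutation sketch. This is not the authors' analysis and not the lme4 workflow: the statistic (size-weighted variance of group means), the function names, and the default of 9999 resamples are assumptions chosen only for illustration.

```python
import random
from statistics import mean


def between_group_variance(values, groups):
    """Size-weighted variance of the group means around the grand mean."""
    grand = mean(values)
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    weighted = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in by_group.values())
    return weighted / len(values)


def permutation_p_value(values, groups, n_sim=9999, seed=0):
    """Share of label shuffles whose statistic is at least the observed one.

    Hypothetical helper: shuffling the group labels mimics the null
    hypothesis that the grouping factor explains no variance.
    """
    rng = random.Random(seed)
    observed = between_group_variance(values, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if between_group_variance(values, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_sim + 1)
```

      A small p-value from such a procedure would indicate that the grouping factor accounts for more variation than expected by chance; a dedicated simulation-based test for mixed models would be the appropriate tool for the actual analysis.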

      Line # 187, 188, 188, please do not use capital letter to start mesofauna, macrofauna and whole-community

      We changed the formatting as suggested

      Line # 205 Please add the version number of R in the text.

      We now included the version number as suggested.

      Line # 209-211, could you please check whether "then" is the word you want to use or "than"

      Our bad; we indeed meant “than” and have made the appropriate changes.

      Line # 227 and in other places as well please provide the second degree of freedom of the F test.

      Thank you for this important comment. We have now added the second degree of freedom to the relevant results (lines 229, 232).

      Figure 3 and Figure 4 show some results that are negative, can you please explain what might be the reasons behind this?

      We now explain this important point in the figures’ captions.

      Figure 5 Please add label to the x-axis.

      Thank you; we have now included a label.

      Line # 357, the sentence "... meso-decomposition, like microbial decomposition,...", I don't understand which criteria authors used to classify microbial decomposition as "meso-decomposition"?

      We have now removed this potential cause of confusion by using the term ‘meso-decomposition’ to distinguish it from microbial decomposition (Line 366).

      Line # 380 Kindly put "per se" in italic.

      We changed the formatting as suggested

      References

      The reference formats are not consistent. For example, for the same journal (say Trends in Ecology and Evolution) the authors sometimes wrote the full name, as at line # 36 (note also that "vol" should not be written as such), but wrote the abbreviation at line #42.

      Our bad; we apologize and have carefully checked all references to make sure the style is consistent.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Strengths: 

      Overall the work is novel and moves the field of Alzheimer's disease forward in a significant way. The manuscript reports a novel concept of aberrant activity in VIP interneurons during the early stages of AD thus contributing to dysfunctions of the CA1 microcircuit. This results in the enhancement of the inhibitory tone on the primary cells of CA1. Thus, the disinhibition by VIP interneurons of Principal Cells is dampened. The manuscript was skillfully composed, and the study was of strong scientific rigor featuring well-designed experiments. Necessary controls were present. Both sexes were included.

      We express our gratitude to the reviewer for their keen appreciation of our efforts and their enthusiasm for the outcomes of this research.

      Limitations:

      (1) The authors attributed aberrant circuit activity to the accumulation of "Abeta intracellularly" inside IS-3 cells. That is problematic. 6E10 antibody recognizes amyloid plaques in addition to Amyloid Precursor Protein (APP) as well as the C99 fragment. There are no plaques at the ages 3xTg mice were examined. Thus, the staining shown in Figure 1a is of APP/C99 inside neurons, not abeta accumulations in neurons. At the ages of 3-6 months, 3xTg starts producing abeta oligomers and potentially tau oligomers as well (Takeda et al., 2013 PMID: 23640054; Takeda et al., 2015 PMID: 26458742 and others). Emerging literature suggests that abeta and tau oligomers disrupt circuit function. Thus, a more likely explanation of abeta and tau oligomers disrupting the activity of VIP neurons is plausible.

      The Reviewer correctly points out that 3xTg-AD mice typically do not exhibit plaques before 6 months of age, with limited amounts even up to 12 months, particularly in the hippocampus. To the best of our knowledge, the 6E10 antibody binds to an epitope in APP (682-687) that is also present in the Abeta (3-8) peptide. Consequently, 6E10 detects full-length APP, α-APP (soluble alpha-secretase-cleaved APP), and Abeta (LaFerla et al., 2007). Nonetheless, we concur with the Reviewer's observation that the detected signal includes Abeta oligomers and the C99 fragment, which is currently considered an early marker of AD pathology (Takasugi et al., 2023; Tanuma et al., 2023). Studies have demonstrated intracellular accumulation of C99 in 3-month-old 3xTg mice (Lauritzen et al., 2012) and its binding to the Kv7 potassium channel family, which inhibits channel activity (Manville and Abbott, 2021). If a similar mechanism operates in I-S3 cells, it could explain the changes in their firing properties observed in our study. Consequently, we have revised the manuscript to include this crucial information in both the Results and Discussion sections.

      (2) Authors suggest that their animals do not exhibit loss of synaptic connections and show Figure 3d in support of that suggestion. However, imaging with confocal microscopy of 70micron thick sections would not allow the resolution of pre- and post-synaptic terminals. More sensitive measures such as electron microscopy or array tomography are the appropriate techniques to pursue. It is important for the authors to either remove that data from the manuscript or address the limitations of their technique in the discussion section. There is a possibility of loss of synaptic connections in their mouse model at the ages examined.

      We appreciate the Reviewer’s perspective on the techniques used for imaging synaptic connections. While we acknowledge the limitations of confocal microscopy for resolving pre- and post-synaptic structures in thick sections, we respectfully disagree regarding the exclusive suitability of electron microscopy (EM). Our approach involved confocal 3D image acquisition using a 63x objective at 0.2 µm lateral resolution and 0.25 Z-step, providing valuable quantitative insights into synaptic bouton density. Despite the challenges posed by thick sections, this method together with automatic analysis allows for careful quantification. Although EM offers unparalleled resolution, it presents challenges in quantification. We have included the important details regarding image acquisition and analysis in the revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      The submitted manuscript by Michaud and Francavilla et al., is a very interesting study describing early disruptions in the disinhibitory modulation exerted by VIP+ interneurons in CA1, in a triple transgenic model of Alzheimer's disease. They provide a comprehensive analysis at the cellular, synaptic, network, and behavioral level on how these changes correlate and might be related to behavioral impairments during these early stages of the disease.

      Main findings:

      - 3xTg mice show early Aß accumulation in VIP-positive interneurons.

      - 3xTg mice show deficits in a spatially modified version of the novel object recognition test.

      - 3xTg mice VIP cells present slower action potentials and diminished firing frequency upon current injection.

      - 3xTg mice show diminished spontaneous IPSC frequency with slower kinetics in Oriens / Alveus interneurons.

      - 3xTg mice show increased O/A interneuron activity during specific behavioral conditions.

      - 3xTg mice show decreased pyramidal cell activity during specific behavioral conditions.

      Strengths:

      This study is very important for understanding the pathophysiology of Alzheimer's disease and the crucial role of interneurons in the hippocampus in healthy and pathological conditions.

      We are thankful to the reviewer for their insightful recognition of our efforts and their enthusiasm for the results of this research.

      Weaknesses:

      Although results nicely suggest that deficits in VIP physiological properties are related to the differences in network activity, there is no demonstration of causality.

      We completely agree with the reviewer's observation regarding the lack of demonstration of causality in our results. Investigating causality in the relationship between deficits in VIP physiological properties and differences in network activity is indeed a crucial aspect of this project. However, achieving this goal will require a significant amount of time and dedicated manipulations in a new mouse model (VIP-Cre-3xTg). We appreciate the importance of this line of investigation and consider it as a priority for our future research endeavors.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Limitations:

      (1) The authors should describe their model and state the age at which these mice start depositing amyloid plaques and neurofibrillary tangles. Readers might not be familiar with this model. It is also important to mention that circuit disruptions are assessed prior to plaque and tangle formation.

      We have included a detailed description of the 3xTg-AD mouse model in the Introduction section, including information on the age at which amyloid plaques and neurofibrillary tangles begin to appear. Additionally, we have clarified that circuit disruptions were assessed before the formation of plaques and tangles. These details have been added to both the Introduction and the Results sections to ensure clarity for readers unfamiliar with the model.

      (2) Ns are presented in Supplemental Table 1. Units are presented in a note to Supplementary Table 1. It would be advisable to specify Ns and units as the data is being presented in the results section or figure legends for easy access.

      We have now included the Ns (sample sizes), specifying the number of cells or sections and the number of experimental animals, directly within the Results section and in the figure legends. This ensures that readers have immediate access to this information without needing to refer to the supplementary materials.

      (3) Several typos require correction:

      a. "mamory" - Line 22, page 5.

      b. The term "Interneurons" is abbreviated as both "INs" and "IN" throughout the manuscript. The author should consistently choose one abbreviation.

      We have corrected the typo "mamory" to "memory" on line 22, page 5. Additionally, we have standardized the abbreviation for "Interneurons" to "INs" throughout the manuscript for consistency.

      (4) Note 2 in Supplementary Table 1 states that animals of both sexes with equal distribution were used throughout the study. It would be best for the reader to assess the data distribution based on sex. Thus, it is advisable for the authors to depict male and female data points as distinct symbols throughout the figures.

      Unfortunately, we do not have detailed sex-disaggregated data for all datasets, which limits our ability to depict male and female data points separately across all figures. Therefore, we have opted to pool data from both sexes for a more comprehensive analysis. We believe this approach maintains the robustness of our findings.

      Reviewer #2 (Recommendations for the authors):

      Major Points:

      - To keep the logical line of reasoning and to be able to interpret the results, it would be important to use the same metrics when comparing the population activity of O/A interneurons and principal cells in the different behavioral conditions.

      We have revised Figures 4 and 5 to enhance the coherence in data presentation. This includes using consistent metrics for comparing the population activity of both O/A interneurons and principal cells across different behavioral conditions. These changes ensure a clearer and more logical interpretation of the results.

      - Although results nicely suggest that deficits in VIP physiological properties are related to the differences in network activity, there is no demonstration of causality. Would it be possible to test if manipulating VIP neurons one could obtain such specific results? Alternatively, it could be discussed more in detail how the decrease in disinhibition could lead to the changes in network activity demonstrated here.

      We agree with the reviewer that establishing causality between VIP neuron deficits and changes in network activity would be very important. However, demonstrating causality would require a new line of investigation, involving the use of specific mouse models to selectively manipulate VIP neurons. This is an exciting direction that we plan to prioritize in our future research. For this study, we have included a discussion on the potential mechanisms by which decreased disinhibition might lead to the observed changes in network activity. Specifically, we propose that in young adult 3xTg-AD mice, the altered firing of I-S3 cells may lead to enhanced inhibition of principal cells. This could shift the excitation/inhibition balance, input integration, and firing output of principal cells, thereby impacting overall network activity. These points are discussed in detail in the revised Discussion section.

      - Along the same lines, the correlations shown in the manuscript would be more robust if there was an in vivo demonstration that 3xTg mice indeed show decreased activity in vivo. The same experiments could also clarify if VIP cells in control animals are more active at the time of decision-making and during object exploration as suggested in the manuscript.

      Thank you for your comment. In response to the point raised, we would like to highlight that we have recently documented the increased activity of VIP-INs in the D-zone of the T-maze and during object exploration in a study published in Cell Reports (Tamboli et al., 2024). This publication is now referenced in our manuscript to support our findings. Regarding the in vivo activity of 3xTg mice, our observations indicated no significant differences in major behavioral patterns such as locomotion, rearing, and exploration of the T-maze when comparing Tg and non-Tg mice. These findings are presented in detail in Figure 4c and Supplementary Fig. 5. We believe these data support the robustness of our correlations by demonstrating that the overall behavioral activity of 3xTg mice is comparable to that of non-transgenic controls, thus focusing attention on the specific roles of VIP-INs in the early prodromal state of AD pathology.

      Minor Points:

      - Figure 1c: Heading of VIP-Tg should have capital letters.

      Thank you for pointing that out. We have corrected the heading to "VIP-Tg" with capital letters in Figure 1c.

      - Figure 1d: The finding that no change was observed in the percentage of VIP+/CR+ is based on three animals and 3-4 slices per mouse. However, the result of VIP+CR+ in tg-mice has an outlier that might bias the results. I would suggest increasing the number of animals to confirm these results.

      Thank you for your insightful suggestion. We addressed the potential impact of the outlier in the VIP+/CR+ cell density analysis by recalculating the results after removing the outlier using the interquartile range method. This reanalysis revealed a statistically significant difference in the VIP+/CR+ cell density between non-Tg and Tg mice, which we have now detailed in the Results section. Despite this, we have chosen to retain the outlier in our final presentation to accurately represent the biological variability observed in our sample. We agree that increasing the number of animals would further validate these findings and will consider this in future studies.
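
      The interquartile range rule mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the conventional fence multiplier of 1.5, the "inclusive" quartile method, and the function name are all assumptions, since the response does not specify them.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper; k=1.5 is the conventional choice, assumed here.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]
```

      Because quartile estimates from very small samples depend strongly on the quartile method, results of this rule for three or four animals should be read with caution, which is consistent with the authors' decision to retain the outlier in the presented data.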

      - Figure 3d: Would it be possible to identify the recorded interneurons? Is it expected that most of those are OLM cells?

      Thank you for your question. We were unable to fully recover all recorded cells using biocytin staining. However, for those cells with preserved axonal structures, we identified both OLM and bistratified cells, which are the primary targets of I-S3 cells. We have now included this information in the Results section to clarify the types of interneurons identified.

      - Figure 3: Why quantify VGaT terminals instead of VIP-GFP terminals? Combined with the calretinin labeling, it would be more useful to indicate that no changes were observed at the morphological bouton level specifically in disinhibitory interneurons. Please also describe which ImageJ plugin was used for the quantification.

      Thank you for your question. Our primary objective was to quantify the synaptic terminals of CR+ INs in the CA1 O/A region, which are predominantly formed by I-S3 cells. Therefore, VGaT and CR co-localization was used to guide this analysis. GFP expression in axonal boutons can sometimes be inconsistent and less reliable for precise quantification. For this analysis, we utilized the “Analyze Particles” function in ImageJ, combined with watershed segmentation, which is now specified in the Methods section.

      - Figure 4g: How was the statistical test performed? If data was averaged across mice, please add error bars and data points in the figure.

      Thank you for your question. To compare the alternation percentage between non-Tg and Tg mice, we used Fisher’s Exact test as detailed in Supplementary Table 1. In this analysis, we considered each animal's choice individually, comparing the preference for correct versus incorrect choices between the two groups. Since Fisher’s Exact test is designed for analyzing qualitative data rather than quantitative data, averaging across mice was not applicable, and therefore, we did not include error bars or data points in the figure.
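
      To make the logic of this test concrete, the following is a minimal sketch of a two-sided Fisher's Exact test on a 2×2 table of per-animal choices (rows: genotype; columns: correct vs. incorrect). It is not the authors' analysis code: the function name `fisher_exact_two_sided` and the counts in the usage example are hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of every table that is no more likely than the
    # observed one (small tolerance guards against float rounding).
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts: (correct, incorrect) choices for each genotype.
non_tg = (6, 4)
tg = (8, 2)
p_value = fisher_exact_two_sided(*non_tg, *tg)
print(f"two-sided p = {p_value:.3f}")
```

      Because each animal contributes a single categorical outcome, the test operates on counts rather than on per-animal means, which is why error bars do not apply.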

      - Figure 4h: To conclude that the increase in activity is larger in the 3xTg mice, there should be a statistical comparison for the magnitude of change between the decision and the stem zone for control and 3xTg mice. To show that there is no significant difference in this measurement in the control mice is insufficient.

      Thank you for your suggestion. We performed a statistical comparison of the magnitude of change in activity between the stem zone and the D-zone for non-Tg and 3xTg mice, as recommended. Our analysis showed no significant difference in this magnitude of change between the two genotypes. These results have now been included in the Results section. However, we would like to highlight an important finding regarding the nature of these changes. In the 3xTg mice, there was a consistent increase in the activity of O/A INs when entering the D-zone. In contrast, non-Tg mice displayed a range of responses, including both increases and decreases in activity. This indicates a higher reliability in the firing of O/A INs in the D-zone of 3xTg mice. Our recent study suggests that VIP-INs are particularly active in the D-zone (Tamboli et al., 2024). Therefore, absent or reduced input from VIP-INs in 3xTg mice may lead to the observed higher engagement of O/A INs in this zone. We believe this observation is crucial for understanding the differential yet nuanced changes in neural dynamics in these mice.

      - In the methods, it is stated that there was a pre-selection of animals depending on learning performance. Would it be possible to also show the data from animals that did not properly learn? Alternatively, it would be useful to plot the correlation between performance in this test and the difference between activity in the stem and the decision-making zone. The reason to ask for this is that there is a trend for control animals to show reduced alternations (50 vs 80%, although not significant, it is a big difference). Considering that there is also a trend in control animals to show increased activity in the decision-making zone, it would be important to confirm that this is not only due to differences in performance. The current statistical procedure does not allow discarding this.

      In this study, we excluded from the analysis the animals that refused to explore the T-maze and spent all their time in the stem corner, or refused to explore the objects and stayed in the open field maze (OFM) corner. These exclusions applied to both non-Tg (n = 6) and Tg (n = 5) groups, indicating that low exploratory activity is not necessarily linked to AD-related mutations. During the T-maze test, we also observed several animals that made incorrect choices (4 out of 9 non-Tg and 1 out of 6 Tg mice). However, due to the low number of animals making incorrect choices, we were unable to form a separate group for analysis based on incorrect choices. These details are now provided in the Methods section.

      - Figure 4i. It is not clear when exactly cell activity was measured. If it was during the entire recording time, I think it would be interesting to see if the activity of O/A interneurons is different specifically during interaction with the object in 3xTg mice.

      Cell activity was indeed measured throughout the entire recording session and analyzed in relation to animal behavior (immobility to walking; Fig. 4d,e), and periods specifically related to interaction with objects were extracted for analysis (Figure 4i).

      - Why was the object modulation measured during a different task in which both objects were the same? The figure is misleading in that sense, as it suggests the experiment was the same as for the other panels with two different objects. It would be important to correct this if the authors want to correlate the deficits in NOR in 3xTg mice and changes in IN activity.

      The study specifically investigated object-modulated neural activity during the Sampling phase. Therefore, two identical objects were placed in the arena for animal exploration. As mentioned above, several animals failed to explore the OFM and objects on the second day and were excluded from the analysis, which prevented us from conducting the novel-object exploration Test Trial. Both non-Tg and Tg mice showed a lack of exploration in the OFM and T-maze, for reasons that remain unclear. Consequently, we opted to present robust data on neural activity during the initial sampling of two identical objects. However, further investigation is needed to understand how this activity relates to deficits observed in the classical NOR test.

      - Figure. 5c-f. I would strongly suggest performing the same quantification and displaying similar figures for the fiber photometry experiments in interneurons and principal cells. It would help to interpret the data.

      We have taken the reviewer's suggestion into account and standardized the data analysis and presentation. Figures 4d, e and 5c, d now depict the walk-induced activity in INs and PCs, respectively. Figures 4h and 5f compare activity between the stem and D-zone in the T-maze. Additionally, Figures 4j and 5h illustrate the object modulation of INs and PCs, respectively.

      - Although velocity and mobility were quantified, it would be important to show also that they are not different during those times when activity was dissimilar, as in the decision zone.

      We have analyzed these data and found no significant differences between the two genotypes in terms of velocity and mobility during these periods. This analysis is now presented in Supplementary Figure 5e, f and detailed in the Results section.

      - Figure 5g-h. Similarly, I would suggest using the same metrics in order to correlate the results from interneuron and principal cell activity photometry.

      We have updated this figure to align with the presentation of interneurons (Figure 4j) and included RMS analysis to emphasize lower variance in object modulation of PCs as an indicator of increased network inhibition.

      - Was object modulation variance also different for INs depending on the mouse phenotype?

      We conducted this additional analysis but did not find any significant difference.

      - Figure S4: would it be possible to identify the postsynaptic partners?

      As mentioned above, for those cells with preserved axonal structures, we identified both OLM and bistratified cells. We have now included this information in the Results section to clarify the types of interneurons identified.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, the authors address a fundamental unresolved question in cerebellar physiology: do synapses between granule cells (GCs) and Purkinje cells (PCs) made by the ascending part of the axon (AA) have different synaptic properties from those made by parallel fibers? This is an important question, as GCs integrate sensorimotor information from numerous brain areas with a precise and complex topography.

      Summary:

      The authors argue that GCs located close to PCs essentially contact PC dendrites via the ascending part of their axons. They demonstrate that joint high-frequency (100 Hz) stimulation of distant parallel fibers and local GCs potentiates AA-PC synapses, while parallel fiber-PC synapses are depressed. On the basis of paired-pulse ratio analysis, they concluded that evoked plasticity was postsynaptic. When individual pathways were stimulated alone, no LTP was observed. This associative plasticity appears to be sensitive to timing, as stimulation of parallel fibers first results in depression, while stimulation of the AA pathway has no effect. NMDA, mGluR1 and GABAA receptors are involved in this plasticity.

      Strengths:

      Overall, the associative modulation of synaptic transmission is convincing, and the experiments carried out support this conclusion. However, weaknesses limit the scope of the results.

      Weaknesses:

      One of the main weaknesses of this study is the suggestion that high-frequency parallel-fiber stimulation cannot induce long term potentiation unless combined with AA stimulation. Although we acknowledge that the stimulation and recording conditions were different from those of other studies, according to the literature (e.g. Bouvier et al 2016, Piochon et al 2016, Binda et al, 2016, Schonewille et al 2021 and others), high-frequency stimulation of parallel fibers leads to long-term postsynaptic potentiation under many different experimental conditions (blocked or unblocked inhibition, stimulation protocols, internal solution composition). Furthermore, in vivo experiments have confirmed that high-frequency parallel fibers are likely to induce long-term potentiation (Jorntell and Ekerot, 2002; Wang et al, 2009).

      This article provides further evidence that long-term plasticity (LTP and LTD) at this connection is a complex and subtle mechanism underpinned by many different transduction pathways. It would therefore have been interesting to test different protocols or conditions to explain the discrepancies observed in this dataset.

      Even though this is not the main result of this study, we acknowledge that the control experiments done on PF stimulation add a puzzling result to an already contradictory literature. High frequency parallel fibre stimulation (in isolation) has been shown to induce long term potentiation in vitro, but not always, and most importantly, this has been shown in vivo. This was the reason for choosing that particular stimulation protocol. Examination of in vitro studies, however, shows that the results are variable and even contradictory. Most were done in the presence of GABAA receptor antagonists, including the SK channel blocker Bicuculline, whereas in the study by Binda (2016), LTP was blocked by GABAA receptor inhibition. In some studies, LTP was under the control of NMDAR activation only, whereas in Binda (2016), it was under the control of mGluR activation. Moreover, most experiments were done in mice, whereas our study was done in rats. Our results reveal multiple mechanisms working together to produce plasticity, which are highly sensitive to in vitro conditions. We designed our experiments to be close to the physiological conditions, with inhibition preserved and a physiological chloride gradient. It is likely that experimental differences have given rise to the variability of the results and our inability to reproduce PF-LTP, but it was not the aim of this study to dissect the subtleties of the different experimental protocols and models.

      We have modified the Discussion to cover that point fully.

      Another important weakness is the lack of evidence that the AAs were stimulated. Indeed, without filling the PC with fluorescent dye or biocytin during the experiment, and without reconstructing the anatomical organization, it is difficult to assess whether the stimulating pipette is positioned in the GC cluster that is potentially in contact with the PC with the AAs. According to EM microscopy, AAs account for 3% of the total number of synapses in a PC, which could represent a significant number of synapses. Although the idea that AAs repeatedly contact the same Purkinje cell has been propagated, to the best of the review author's knowledge, no direct demonstration of this hypothesis has yet been published. In fact, what has been demonstrated (Walter et al 2009; Spaeth et al 2022) is that GCs have a higher probability of being connected to nearby PCs, but are not necessarily associated with AAs.

      We fully agree with the reviewer that we have not identified morphologically ascending axon synapses, and we stress this fact both in the first paragraph of the Results section, and again at the beginning of Discussion. Our point is mainly topographical, given the well documented geometrical organisation of the cerebellar cortex. Strictly speaking, inputs are local (including AAs) or distal (PFs). Similarly, the studies by Isope and Barbour (2002) and Walter et al. (2009), just like Sims and Hartell (2005 and 2006), have coined the term ‘ascending axon’ when drawing conclusions about locally stimulated inputs. Moreover, our results do not rely on or assume multiple contacts, stronger connections, or higher probability of connections between ascending axons and Purkinje cells. Our results only demonstrate a different plasticity outcome for the two types of inputs. Therefore, our manuscript could be rephrased with the terms ‘local’ and ‘distal’ granule cell inputs, but this would have no more implication for the results or the computation performed in Purkinje cells. However, in our experience, these terms are more confusing, and consistent with the literature, we do not wish to make this modification. However, we have modified the abstract of the manuscript to clarify this point.

      Reviewer #2 (Public Review):

      Summary:

      The authors describe a form of synaptic plasticity at synapses from granule cells onto Purkinje cells in the mouse cerebellum, which is specific to synapses proximal to the cell body but not to distal ones. This plasticity is induced by the paired or associative stimulation of the two types of synapses because it is not observed with stimulation of one type of synapse alone. In addition, this form of plasticity is dependent on the order in which the stimuli are presented, and is dependent on NMDA receptors, metabotropic glutamate receptors and to some degree on GABAA receptors. However, under all experimental conditions described, there is a progressive weakening or run-down of synaptic strength. Therefore, plasticity is not relative to a stable baseline, but relative to a process of continuous decline that occurs whether or not there is any plasticity-inducing stimulus.

      As highlighted by the reviewer, we observed a postsynaptic rundown of the EPSC amplitude for both input pathways. Rundown could be mistaken for a depression of synaptic currents, not for a potentiation, and the progressive decrease of the EPSC amplitude during the course of an experiment leads to an underestimate of the absolute potentiation. We have taken the view to provide a strong set of control data rather than selecting experiments based on subjective criteria or applying a cosmetic compensation procedure. We have conducted control experiments with no induction (n = 17), which give a good indication of the speed and amplitude of the rundown. Comparison shows a highly significant potentiation of the ascending axon EPSC. Depression of the parallel fibre EPSC, on the other hand, was not significantly different from rundown, and we have not spoken of parallel fibre long term depression. The data show thus very clearly that ascending axon and parallel fibre synapses behave differently following the costimulation protocol.

      Strengths:

      The focus of the authors on the properties of two different synapse-types on cerebellar Purkinje cells is interesting and relevant, given previous results that ascending and parallel fiber synapses might be functionally different and undergo different forms of plasticity. In addition, the interaction between these two synapse types during plasticity is important for understanding cerebellar function. The demonstration of timing and order-dependent potentiation of only one pathway, and not another, after associative stimulation of both pathways, changes our understanding of potential plasticity mechanisms. In addition, this observation opens up many new questions on underlying intracellular mechanisms as well as on its relevance for cerebellar learning and adaptation.

      Weaknesses and suggested improvements:

      A concern with this study is that all recordings demonstrate "rundown", a progressive decrease in the amplitude of the EPSC, starting during the baseline period and continuing after the plasticity-induction stimulus. In the absence of a stable baseline, it is hard to know what changes in strength actually occur at any set of synapses. Moreover, the issues that are causing rundown are not known and may or may not be related to the cellular processes involved in synaptic plasticity. This concern applies in particular to all the experiments where there is a decrease in synaptic strength.

      We have provided an answer to that point directly below the summary paragraph. We will just add here that if the phenomenon causing rundown was involved in plasticity, it should affect plasticity of both inputs, which was not the case, clearly distinguishing the ascending axon and parallel fibre inputs.

      The authors should consider changes in the shape of the EPSC after plasticity induction, as in Fig 1 (orange trace) as this could change the interpretation.

      Figure 1 shows an average response composed of evoked excitatory and inhibitory synaptic currents. The third section of Supplementary material (supplementary figure 3) shows that this complex shape is given by an EPSC followed by a delayed disynaptic IPSC. We would like to point out that while separating EPSC from IPSC might appear difficult from average traces due to the averaged jitter in the onset of the synaptic currents, boundaries are much clearer when analysing individual traces. In the same section we discuss the results of experiments in which transient applications of SR 95531 before and after the induction protocol allowed us to measure the EPSC, while maintaining the same experimental conditions during induction. Analysis of the kinetics of the EPSCs during SR application at the beginning and end of experiments showed that there is no change in the time to peak of either the AA or the PF response. The decay times of AA- and PF-EPSCs are slightly longer at the end of the experiment, even if the difference is not significant for AA inputs. This analysis has been added to the Supplementary material. Our analysis, which uses as a template the EPSC kinetics measured at the beginning and at the end of the experiments, takes these changes directly into account. The results show clearly that the presence of disynaptic inhibition doesn’t significantly affect the measure of the peak EPSC after the induction protocol nor the estimate of plasticity.

      In addition, the inconsistency with previous results is surprising and is not explained; specifically, that no PF-LTP was induced by PF-alone repeated stimulation.

      In our experimental conditions, PF-LTP was not induced when stimulating PF only, the condition that reproduces experiments in the literature. As discussed in our response to reviewer 1, a close look at the literature, however, reveals variabilities and contradictions behind seemingly similar results. They reveal intricate mechanisms working together to produce plasticity, which are sensitive to in vitro conditions. We designed our experiments to be close to physiological conditions, with inhibition preserved and a physiological chloride gradient. It is likely that experimental differences have given rise to the variability of the results and our inability to observe PF-LTP. We have modified the Discussion section to cover that point thoroughly in the context of past results. 

      The authors test the role of NMDARs, GABAARs and mGluRs in the phenotype they describe. The data suggest that the form of plasticity described here is dependent on any one of the three receptors. However, the location of these receptors varies between the Purkinje cells, granule cells and interneurons. The authors do not describe a convincing hypothetical model in which this dependence can be explained. They suggest that there is crosstalk between AA and PF synapses via endocannabinoids downstream of mGluR or NO downstream of NMDARs. However, it is not clear how this could lead to the long-term potentiation that they describe. Also, there is no long-lasting change in paired-pulse ratio, suggesting an absence of changes in presynaptic release.

      We suggest in the result section that the transient change in paired pulse ratio (PPR) is linked to a transient presynaptic effect, but there was no significant long term change of the PPR, suggesting that the long term effects observed are linked to postsynaptic changes. We now stress this point in the Results and Discussion sections.

      Concerning the involvement of multiple molecular pathways, investigators often tested for the involvement of NMDAR or mGluRs in cerebellar plasticity, rarely both. Here we showed that both pathways are involved. The conjunctive requirement for NMDAR and mGluR activation could easily be explained based on the dependence of cerebellar LTP and LTD on the concentrations of both NO and postsynaptic calcium (Coesmans et al., 2004; Safo and Regehr, 2005; Bouvier et al., 2016; Piochon et al., 2016).

      We also observed an effect of GABAergic inhibition. GABAergic inhibition was elegantly shown by Binda (2016) to regulate calcium entry together with mGluRs, and control plasticity induction. A similar mechanism could contribute to our results, although inhibition might have additional effects. We have modified the Discussion of the manuscript to clarify the pathways involved in plasticity and added a diagram to highlight the links between the different molecular pathways, potential cross talk mechanisms, and the location of receptors.

      Is the synapse that undergoes plasticity correctly identified? In this study, since GABAergic inhibition is not blocked for most experiments, PF stimulation can result in both a direct EPSC onto the Purkinje cell and a disynaptic feedforward IPSC. The authors do address this issue with Supplementary Fig 3, where the impact of the IPSC on the EPSC within the EPSC/IPSC sequence is calculated. However, a change in waveform would complicate this analysis. An experiment with pharmacological blockade will make the interpretation more robust. The observed dependence of the plasticity on GABAA receptors is an added point in favor of the suggested additional experiments.

      We did consider that due to long recording times there might be kinetic changes, and that’s the reason why the experiments of Supplementary figure 3 were done with pharmacological blockade of GABAAR with SR, both before and again after LTP induction. The estimate of the amplitude of the EPSC is based on the actual kinetics of the response at both times.

      A primary hypothesis of this study is that proximal, or AA, and distal, or PF, synapses are different and that their association is specifically what drives plasticity. The alternative hypothesis is that the two synapse-types are the same. Therefore, a good control for pairing AA with PF would be to pair AA with AA and PF with PF, thereby demonstrating that pairing with each other is different from pairing with self.

      Pairing AA with AA would be difficult because stimulation of AA can only be made from a narrow band below the PC and we would likely end up stimulating overlapping sets of synapses. However, Figure 5 shows the effect of stimulating PF and PF, while also mimicking the sparse and dense configuration of the control experiment. It shows that sparse PF do not behave like AA. Sims and Hartell (2006) also made an experiment with sparse PF inputs and observed clear differences between sparse local (AA) and sparse distal (PF) synapses.

      It is hypothesized that the association of a PF input with an AA input is similar to the association of a PF input with a CF input. However, the two are very different in terms of cellular location, with the CF input being in a position to directly interact with PF-driven inputs. Therefore, there are two major issues with this hypothesis: 1) how can subthreshold activity at one set of synapses affect another located hundreds of micrometers away on the same dendritic tree? 2) There is evidence that the CF encodes teaching/error or reward information, which is functionally meaningful as a driver of plasticity at PF synapses. The AA synapse on one set of Purkinje cells is carrying exactly the same information as the PF synapses on another set of Purkinje cells further up and down the parallel fiber beam. It is suggested that the two inputs carry sensory vs. motor information, which is why this form of plasticity was tested. However, the granule cells that lead to both the AA and PF synapses are receiving the same modalities of mossy fiber information. Therefore, one needs to presuppose different populations of granule cells for sensory and motor inputs or receptive field and contextual information. As a consequence, which granule cells lead to AA synapses and which to PF synapses will change depending on which Purkinje cell you're recording from. And that's inconsistent with there being a timing dependence of AA-PF pairing in only one direction. Overall, it would be helpful to discuss the functional implications of this form of plasticity.

      We do not hypothesise that association of the AA and PF inputs is similar to the association of PF and climbing fibre inputs. We compare them because it is the other known configuration triggering associative plasticity in Purkinje cells. It is indeed interesting to observe that even if the inputs are very small compared to the powerful climbing fibre input, they can be effective at inducing plasticity. Physiologically, the climbing fibre signal has been linked to error and reward signals, but reward signals are also encoded by granule cell inputs (Wagner et al., 2017). We have modified the discussion to make sure that we do not suggest equivalence with CF induced LTD.

      Moreover, we fully agree that AA and PF synapses made up by a given granule cell carry the same information, and cannot encode sensory and motor information at the same time. AA synapses from a local granule cell deliver information about the local receptive field, but PF synapses from the same granule cell will deliver contextual information about that receptive field to distant Purkinje cells. In the context of sensorimotor learning, movement is learnt with respect to a global context, not in isolation, therefore learning a particular association must be relevant. The associative plasticity we describe here could help explain this functional association. We have clarified the discussion.

      Reviewer #3 (Public Review):

      Granule cells' axons bifurcate to form parallel fibers (PFs) and ascending axons (AAs). While the significance of PFs on cerebellar plasticity is widely acknowledged, the importance of AAs remains unclear. In the current paper, Conti and Auger conducted electrophysiological experiments in rat cerebellar slices and identified a new form of synaptic plasticity in the AA-Purkinje cell (PC) synapses. Upon simultaneous stimulation of AAs and PFs, AA-PC EPSCs increased, while PFs-EPSCs decreased. This suggests that synaptic responses to AAs and PFs in PCs are jointly regulated, working as an additional mechanism to integrate motor/sensory input. This finding may offer new perspectives in studying and modeling cerebellum-dependent behavior. Overall, the experiments are performed well. However, there are two weaknesses. First, the baseline of electrophysiological recordings is influenced significantly by run-down, making it difficult to interpret the data quantitatively. The amplitude of AA-EPSCs is relatively small and the run-down may mask the change. The authors should carefully reexamine the data with appropriate controls and statistics. Second, while the authors show AA-LTP depends on mGluR, NMDA receptors, and GABA-A receptors, which cell types express these receptors and how they contribute to plasticity is not clarified. The recommended experiments may help to improve the quality of the manuscript.

      As highlighted by the reviewer and developed above in response to reviewer 2, we observed a postsynaptic rundown of the EPSC amplitude. Rundown could be mistaken for a depression of synaptic currents, not for a potentiation. Moreover, we have conducted control experiments with no induction (n = 17), which give a good indication of the speed and amplitude of the rundown, and provide a baseline. Comparison shows a highly significant potentiation of the ascending axon EPSC, relative to baseline and relative to these control experiments. Depression of the parallel fibre EPSC on the other hand was not significantly different from rundown. For that reason we have not spoken of parallel fibre long term depression. The data, however, show that ascending axon and parallel fibre synapses behave very differently following the costimulation protocol.

      We have discussed above in our response to reviewer 2 the potential involvement of mGluRs, NMDARs and GABAARs. We have clarified the discussion of the pathways involved in plasticity and added a diagram to highlight the links between the different molecular pathways, potential cross talk mechanisms, and the location of receptors.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      - If Chloride concentration cannot be modified, recordings should be performed at the Chloride reversal potential to avoid strong bias in amplitude measurements (e.g. in Figures 3 and 5 outward current was observed while not visible in Figures 1 and 4).

      The balance between excitation and inhibition dictates whether there is a visible outward component, and this varies with the connections tested. Careful control experiments with SR application presented in supplementary figure 3 show that the delay of the IPSC does not significantly affect measurement of the peak amplitude of the EPSC. The reversal potential for Cl⁻ in our study (-85 mV), chosen to reproduce the physiological gradient in Purkinje cells, is too low to record from Purkinje cells at this potential in good conditions as it activates the hyperpolarisation activated cation current Ih, generating huge inward currents.
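
      For readers who wish to relate the stated reversal potential to a chloride gradient, the following is a minimal sketch based on the Nernst equation. The recording temperature (32 °C) and the function names are assumptions made for illustration; neither the temperature nor the actual solution concentrations are given in this response.

```python
from math import exp, log

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_mV(conc_out, conc_in, z, temp_c):
    """Nernst equilibrium potential in mV for an ion of valence z."""
    t = temp_c + 273.15
    return 1000.0 * (R * t) / (z * F) * log(conc_out / conc_in)


def out_in_ratio(e_rev_mV, z, temp_c):
    """Outside/inside concentration ratio implied by a reversal potential."""
    t = temp_c + 273.15
    return exp(e_rev_mV / 1000.0 * z * F / (R * t))


# Chloride has valence z = -1; 32 degrees C is an assumed temperature.
ratio = out_in_ratio(-85.0, z=-1, temp_c=32.0)
print(f"implied [Cl]out/[Cl]in ratio: {ratio:.1f}")
print(f"check: E_Cl = {nernst_mV(ratio, 1.0, z=-1, temp_c=32.0):.1f} mV")
```

      The sketch only converts between a gradient and its equilibrium potential; it says nothing about Ih activation, which is the practical obstacle to holding the cell at that potential.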

      - It is not clear whether, during the current clamp, the potential was maintained at -65 mV throughout the induction protocol.

      The potential was set and maintained around -65 mV during the induction protocol. The method section has been amended to specify that point.

      - Experiments using GABAB or endocannabinoid antagonists would have been interesting to assess the role of presynaptic plasticity occluding postsynaptic plasticity.

      We are not sure why the reviewer suggested these particular experiments to test for the role of presynaptic plasticity. GABAB and endocannabinoid receptor activation both have presynaptic effects at granule cell to Purkinje cell synapses. They decrease release probability, and as a result increase the paired pulse ratio (Dittman and Regehr, 1997; Safo and Regehr, 2005). Here we only observed a transient decrease of the paired pulse ratio. Additionally, presynaptic endocannabinoid receptor activation, linked to postsynaptic mGluR1 activation and release of endocannabinoids, was shown to be required for induction of postsynaptic PF-LTD (Safo and Regehr, 2005). This effect required climbing fibre stimulation and mGluR activation. Here we show that mGluR1 inhibition did not inhibit the PF depression nor affect the transient change in PPR. Therefore there is no indication that activation of these receptors could induce a pre-synaptic depression occluding postsynaptic plasticity.

      - To give credit to this new plasticity in contradiction with many previous studies, induction pathways should be addressed more deeply.

      As developed earlier in response to the public review, this study does not contradict previous studies, except maybe that by Binda et al. (2016), conducted on mice. From our point of view, our study in fact reconciles past results which have alternatively involved the mGluR or NMDAR pathways, whereas the molecular downstream pathways they recruit can easily cooperate. We aim to describe a new phenomenon and we cannot cover the mechanistic dissection which has been performed to date on plasticity in the cerebellar cortex.

      - The quality of the figures could be enhanced by modifying the dashed line.

      We have made the dashed line more discreet.

      Reviewer #2 (Recommendations For The Authors):

      - Is there cross-talk between the two synaptic pathways?

      In order to explain the associative nature of AA-LTP we suggest that a signal is generated at the AA input during the induction protocol only when the PF input is also stimulated, i.e. a form of cross-talk takes place between the two synaptic territories. We have not tested for cross-talk during control conditions but we discuss the fact that given the size of the Purkinje cell dendritic tree, the size of the inputs and their geometrical configuration, it is highly unlikely. We discuss possible cross-talk mechanisms.

      - Clarification question: "While the peak amplitude of the first response in the pair of stimulations showed a progressive decline, the peak amplitude of the second response of both AA and PF underwent either LTP or LTD respectively..." Does this mean that all LTP/LTD figures show the amplitude of the second EPSC in the paired pulse stimulation, and that the first EPSC has a different response? If so, this should be mentioned in the Methods section and implications discussed.

      All figures show both the amplitude of the first and second EPSCs in the pair of stimulations. In Figures 1A, 3A, 4A and 5B the paired stimulation protocol is depicted with colours and symbols used in the associated graphs, with closed symbols for the first and open symbols for the second EPSC. Figure legends have been amended to clarify this point. The average values given in the Results section and figure legends relate to the first EPSC only for clarity. As can be seen from the figures, long term plasticity affected the first and second EPSC in a very similar manner. However, individual symbols show that during a transient period, the first and second EPSCs are differentially affected by the induction protocol, resulting in a transient change of the PPR.

      Minor suggestions:

      - It would be helpful to have a reference for the statement that 1-2% of stimulated fibers come from nearby GCs when stimulation is distal.

      We have modified the text to explain our calculation based on the data of Pichitpornchai et al., 1994 (p4, Results section).

      - Does the shading over the plasticity time course traces come from the standard error of the mean?

      Shading over the plasticity time course plots shows the standard error of the mean. This is now clearly stated in figure legends.

      Reviewer #3 (Recommendations For The Authors):

      Major points:

      (1) Whether the plasticity between AAs and PCs is regulated by the post-synaptic or pre-synaptic mechanisms should be addressed or discussed. Based on the results of PPR (mostly unchanged after induction), the post-synaptic mechanism may be more significant. Supplemental Figure 2C shows a trend toward a positive correlation between AALTP and the number of spikes, suggesting intracellular calcium levels in the post-synaptic Purkinje cells may be important. Whether this is true or not can be directly tested by the addition of BAPTA in the recording pipettes.

      The absence of a long lasting effect on the paired pulse ratio (PPR) indicates that postsynaptic mechanisms are involved in long term changes. This is in line with the dependence of plasticity induced with similar protocols on the concentrations of NO and postsynaptic calcium, both affecting postsynaptic targets, as developed in our response to reviewer 2. BAPTA interferes with calcium and mGluR signalling, and could be used to further confirm the involvement of a postsynaptic mechanism; however, we did not wish to pursue further the dissection of the signalling cascade. We have modified the Results and Discussion sections to include a discussion of pre and postsynaptic mechanisms.

      (2) Most results from the plasticity experiments are shown as average/sem and do not include individual data, making it hard to appreciate the magnitude of the changes. The authors could show the individual data at some time points (e.g. 5 min before and 30 min after induction), plot bar-graphs (Figure 2C with individual data), or boxplots to compare different conditions and perform statistics.

      Individual data points are now visible for plasticity induction in Figure 2C and Supplementary Figure 2 for a number of conditions. Statistics have been performed as detailed in the text and legend of Fig 2.

      (3) In addressing point #2, it is strongly recommended that the authors include the values for controls without induction because AA/PF-EPSCs undergo significant run-down. In most experiments, the authors compare the magnitude of plasticity with baseline changes in Supplemental Figure 1. This should not be appropriate for some experiments, such as Figures 3 & 4, where pharmacological treatments are performed. The authors should carefully consider including the appropriate controls from baseline recording to rule out significant confound by the run-down.

      We agree that control experiments without stimulation (no Stim) are only appropriate controls for the initial synchronous stimulation and AA and PF only experiments (Fig 1). All the other experiments were compared to the synchronous stimulation experiments, not to control No Stim. The synchronous stimulation protocol is strictly the same as that applied in experiments with pharmacological treatments and the appropriate control to test whether treatments affected plasticity. This is now systematically specified in the Results section.

      (4) The authors recorded mixed EPSC/IPSCs and used a fitting approach to extract EPSCs. Applying AMPA-receptor blockers to check that extracted IPSCs are correctly predicted may solidify the reliability of the approach. An additional concern is that this approach can only be used if the waveform of EPSC/IPSC does not change with plasticity. The authors should compare the waveforms between conditions to address this point.

      Fits were not used to extract EPSCs. EPSCs were isolated by blocking IPSCs with SR95531, and the IPSCs were then extracted by subtraction from the mixed EPSC/IPSC. Fits were then done of the isolated EPSC and the extracted IPSC. This procedure was applied both at the start of the experiment and at the end to avoid changes in kinetics that would influence measurements. A section of supplementary material is devoted to this analysis. Isolating IPSCs using AMPAR blockers is not possible as IPSCs are disynaptic. AMPAR blockers would fully suppress inhibition.

      (5) While the AA-LTP depends on NMDA-Rs, which cell type is responsible is not clear. Recording NMDA components in AA/PF-EPSCs should be informative in addressing this point. Cesana et al suggested that AA induces significant activation of NMDA-Rs in Golgi cells (PMID: 23884948). Whether AA stimuli could significantly evoke NMDA current in the experimental condition used in this paper could provide essential information.

      The granule cell to Purkinje cell EPSCs are devoid of an NMDAR component (Llano et al., 1991), and there are no postsynaptic NMDARs at granule cell to PC synapses, but a proportion of presynaptic boutons show the presence of NMDARs (Bidoret et al., 2009). This is now stated clearly on p8. Presynaptic NMDARs have been involved in LTP and LTD of parallel fibre synapses (Casado et al., 2002; Bouvier et al., 2016; Schonewille et al., 2021), and linked to the activation of NOS in granule cell axons. However, we do not know whether presynaptic NMDARs are also present at AA synapses. NMDAR and NOS are also expressed by molecular layer interneurons, and have sometimes been involved in LTD induction (Kono et al., 2019), although this is disputed. In the paper by Cesana (2013), white matter stimulation activated mossy fibre inputs to granule cells, and as a consequence, granule cell to Golgi cell disynaptic EPSCs. The authors identified AA synapses on the basolateral dendrites of Golgi cells, and showed NMDAR activation associated with the mossy fibre to granule cell EPSC. Granule cell to Golgi cell synapses were shown to activate both postsynaptic AMPA and NMDA receptors (Dieudonné, 1999). But to our knowledge, Golgi cells do not express NOS. Therefore it is unlikely that activation of NMDARs in Golgi cells is linked to synaptic plasticity in Purkinje cells.

      (6) Pharmacological experiments in Figure 3 show that AA-LTP is dependent on mGluR. The authors mentioned that it could be explained by the presence and absence of mGluRs in PFs and AAs, respectively. This is an important and reasonable possibility and should be tested. The authors could simply check whether slow EPSCs can be recorded by the AA activation.

      Activation of the mGluR slow EPSC by AA stimulation would reveal the presence of mGluRs at AA inputs. We know, however, that sparse PF stimulation does not activate the mGluR slow EPSC nor endocannabinoid release unless glutamate transporters are blocked (Marcaggi and Attwell, 2005). This is thought to reflect insufficient glutamate buildup in the sparse configuration to activate mGluR1s. AA inputs are sparsely distributed and are not expected to activate the slow EPSC either, and this is confirmed by our own experiments (CA personal communication). However, mGluR1 mediated Ca2+ release from stores shows a higher sensitivity to glutamate than the slow EPSC (Canepari and Ogden, 2006) and might take place with sparse inputs, but Ca2+ signals have not been investigated in this configuration. Therefore the absence of the slow EPSC is not sufficient proof that mGluR1s are not activated and not present at AA synapses. This is now further discussed p12.

      Minor points:

      (1) The authors should describe how they adjusted the stimulation strength for both AAs and PFs.

      Adjustment of the stimulation intensity is now described in the Methods section.

      (2) A rationale explaining why the authors chose the current induction protocol (synchronous stimulation of both inputs) should be included. This will help the readers to understand the background of the study.

      Papers by Sims and Hartell (2005, 2006) and experimental evidence indicated that AA and PF inputs may have different properties, and as a result may play different roles. Moreover, based on the morphology of the cerebellar granule cell and Purkinje cell, AA and PF inputs can carry different information to a given Purkinje cell. We reasoned that co-presentation of the inputs might represent an important piece of information for the circuit, signalling functional association, and lead to plasticity, as seen for motor command and sensory feedback in cerebellar-like structures, or for PF and climbing fibre. We have tried to convey that rationale in the abstract and introduction.

      (3) Supplemental Figure 2B: the x-axis may be labeled incorrectly. Is the x-axis of the top graph for PF PF-EPSC? The x-axis for the bottom graphs should be the summation of AA- and PF-EPSCs.

      This has been corrected.

      (4) "mglur1" on page 10 should be mGluR1.

      This has been corrected.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Please reorder the supplementary figures in the order they are referred to in the Results section for ease of reading. Supp Fig 5 b - should read 'Mean normalized fluorescence of LC ROIs (n = 87) during immobile periods aligned to the switch from familiar to novel environment.’

      We thank the reviewer for highlighting these issues and have reordered the supplementary figures and edited the figure legends appropriately.

      Reviewer #2 (Recommendations For The Authors):

      The authors should include sample size justifications (e.g. based on previous studies, considerations of statistical power, practical considerations, or a combination of these factors).

      In response to this concern, we have added a statement to the “Imaging Sessions” section of the methods. Here we highlight that sample sizes were largely based on previous studies and/or limited by the difficulty of recordings and the limited number of visible axons per imaging session.

      Reviewer #3 (Recommendations For The Authors):

      The addition of Supp. Fig 5 partially addresses my previous point 3. However, the claim of dissociation between VTA-CA1 and LC-CA1 would be strengthened by showing that VTA-CA1 axons do not respond to the darkness -> familiar environment in Supp Fig 5. This is particularly important given that (1) the additional 2 VTA-CA1 axons in the revision were not recorded during transitions to novel environments and (2) the overall concern of the reviewers that the low n and heterogeneity of the VTA-CA1 dataset may lead to a false negative. Providing VTA-CA1 data for the darkness -> familiar environment would provide a within-manuscript replication that these axons are not responding to environment changes; a major claim of this manuscript.

      While we agree that data of VTA-CA1 axons during the switch from darkness to the familiar environment would provide additional evidence that these axons are not responding to environment changes, unfortunately, VTA axons were not recorded during the switch from familiar to novel.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      The authors present 16 new well-preserved specimens from the early Cambrian Chengjiang biota. These specimens potentially represent a new taxon which could be useful in sorting out the problematic topology of artiopodan arthropods - a topic of interest to specialists in Cambrian arthropods. Because the anatomic features in the new specimens were neither properly revealed nor correctly interpreted, the evidence for several conclusions is inadequate. 

      We thank the Senior Editor, Reviewing Editor and three reviewers for their work, and for their comments aimed at improving this project and manuscript. We have engaged with all the comments in detail, in order to strengthen our work. This includes adding additional data to support that all Acanthomeridion specimens belong to a single species, running further phylogenetic analyses including more trilobite terminals to test the specific hypothesis and interpretation raised by Reviewer 2, and visualising our results in treespace in order to determine support for the different interpretations of the ventral structures and their implications for the evolution of Artiopoda. We have also greatly expanded the introduction, which we feel adds clarity to areas misunderstood by some reviewers in the previous version of the manuscript.

      Our point-by-point response to the public reviews of the reviewers are outlined below. We have also made changes resulting from the additional suggestions which are not public, which we have not reproduced below. We submit a new version of the main text, and can provide a tracked changes version if required. The new main text includes 9 figures and is 8624 words including captions and reference list.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Du et al. report 16 new well-preserved specimens of artiopodan arthropods from the Chengjiang biota, which demonstrate both dorsal and ventral anatomies of a potential new taxon of artiopodans that are closely related to trilobites. Authors assigned their specimens to Acanthomeridion serratum and proposed A. anacanthus as a junior subjective synonym of Acanthomeridion serratum. Critically, the presence of ventral plates (interpreted as cephalic librigenae), together with phylogenic results, led the authors to conclude that the cephalic sutures originated multiple times within the Artiopoda. 

      We thank Reviewer 1 for their comments on the strengths and weaknesses of the previous version of the manuscript. We hope that the revised version strengthens our conclusions that Acanthomeridion anacanthus is a junior synonym of A. serratum.

      Strengths: 

      New specimens are highly qualified and informative. The morphology of the dorsal exoskeleton, except for the supposed free cheek, was well illustrated and described in detail, which provides a wealth of information for taxonomic and phylogenic analyses. 

      Weaknesses: 

      The weaknesses of this work are obvious in a number of aspects. Technically, ventral morphology is less well revealed and is poorly illustrated. Additional diagrams are necessary to show the trunk appendages and suture lines. Taxonomically, I am not convinced by the authors' placement. The specimens are markedly different from either Acanthomeridion serratum Hou et al. 1989 or A. anacanthus Hou et al. 2017. The ontogenetic description is extremely weak and the morphological continuity is not established. Geometric and morphometric analyses might be helpful to resolve the taxonomic and ontogenic uncertainties. 

      We appreciate that the reviewer was not convinced by our synonymisation in the first version of the manuscript. The recommendation of the reviewer to provide linear morphometric support for our synonymisation was much appreciated. We have provided measurements of the length and width of the thorax (Figure 6 in the new version), visualising the position of specimens previously assigned to A. anacanthus, to show this morphological continuity. These act as a complement to Figure 5, which shows the fossils in an ontogenetic trend.

      I am confused by the authors' description of the free cheek (librigena) and ventral plate. Are they the same object? How do they connect with other parts of the cephalic shield, e.g. hypostome, and fixigena? Critically, the homology of cephalic slits (eye slits, eye notch, dorsal suture, facial suture) is not extensively discussed either morphologically or functionally.

      We appreciate that the brevity of the introduction in the previous version led to some misunderstandings and some confusion. We have provided a greatly expanded introduction, including a new Figure 1, which outlines the possible homologies of the ventral plates and the three hypotheses considered in this study. The function of the cephalic and dorsal suture are now discussed in more detail both in introduction and discussion.

      Finally, the authors claimed that phylogenic results support two separate origins rather than a deep origin. However, the results in Figure 4 can explain a deep homology of the cephalic suture at molecular level and multiple co-options within the Artiopoda. 

      A deep molecular origin is difficult to demonstrate using solely fossil material from an extinct group such as Artiopoda. Thus our study focuses on morphological origins. The number of losses required for a deep morphological origin means that we favour multiple independent morphological origins.

      Reviewer #2 (Public Review): 

      Overall: This paper describes new material of Acanthomeridion serratum that the authors claim supports its synonymy with Acanthomeridion anacanthus. The material is important and the description is acceptable after some modification. In addition, the paper offers thoughts and some exploration of the possibility of multiple origins of the dorsal facial suture among artiopods, at least once within Trilobita and also among other non-trilobite artiopods. Although this possibility is real and apparently correct, the suggestions presented in this paper are both surprising and, in my opinion, unlikely to be true because the potential homologies proposed with regard to Acanthomeridion and trilobite-free cheeks are unconventional and poorly supported. 

      What to do? I can see two possibilities. One, which I recommend, is to concentrate on improving the descriptive part of the paper and omit discussion and phylogenetic analysis of dorsal facial suture distribution, leaving that for more comprehensive consideration elsewhere. The other is to seek to improve both simultaneously. That may be possible but will require extensive effort. 

      We thank the reviewer for their detailed comments and suggestions for multiple ways in which we might revise the manuscript. We have taken the option that is more effort, but we hope more reward, in interrogating the larger question alongside improving the descriptive part of the paper. This has taken a long time and incorporation of new techniques, but has in our opinion greatly strengthened the work.

      Major concerns 

      Concern 1 - Ventral sclerites as free cheek homolog, marginal sutures, and the trilobite doublure 

      Firstly, a couple of observations that bear on the arguments presented - the eyes of A. serratum are almost marginal and it is not clear whether a) there is a circumocular suture in this animal and b) if there was, whether it merged with the marginal suture. These observations are important because this animal is not one in which an impressive dorsal facial suture has been demonstrated - with eyes that near marginal it simply cannot do so. Accordingly, the key argument of this paper is not quite what one would expect. That expectation would be that a non-trilobite artiopod, such as A. serratum, shows a clear dorsal facial suture. But that is not the case, at least with A. serratum, because of its marginal eyes. Rather, the argument made is that the ventral doublure of A. serratum is the homolog of the dorsal free cheeks of trilobites. This opens up a series of issues. 

      We appreciate that the reviewer disagrees with both interpretations we offered for the ventral plates, and has offered a third interpretation for the homology of this feature with the doublure of trilobites. Support for our original interpretation comes from the position of the eye stalks in Acanthomeridion, which fall very close to the suture between the ventral plate and the rest of the cephalon. However, we appreciate that the reviewer has a valid interpretation, that the ventral plates might be homologues of the doublure alone.

      To clarify the (two, now three) hypotheses of homology for the ventral plates considered in this study, we provide a new summary figure (Figure 1). In addition, the introduction has been greatly lengthened with further discussion of the different suture types in trilobites, their importance for trilobite classification schemes, and extensive references to older literature are now included. Further, we add background to the hypotheses around the origins of dorsal ecdysial sutures. 

      We add that the interpretation of A. serratum as having features homologous to the dorsal sutures of trilobites is already present in the literature, and so while the reviewer may disagree with it, it is certainly a hypothesis that requires testing.

      The paper's chief claim in this regard is that the "teardrop" shaped ventral, lateral cephalic plates in Acanthomeridion serratum are potential homologs of the "free cheeks" of those trilobites with a dorsal facial suture. There is no mention of the possibility that these ventral plates in A. serratum could be homologs of the lateral cephalic doublure of olenelloid trilobites, which is bound by an operative marginal suture or, in those trilobites with a dorsal facial suture, that it is a homolog of only the doublure portions of the free cheeks and not with their dorsal components. 

      We include this third possibility in our revised analyses and manuscript. To test this properly required adding in an olenelloid trilobite to our matrix, as we needed a terminal that had both a marginal and circumocular suture, but not fused. We chose Olenellus getzi for this purpose, as it is the only Olenellus with some appendages known (the antennae). We also added further characters to the morphological matrix, and additional trilobites from which soft tissues are known, in order to better resolve this part of the tree. Trilobites in the final analyses were: Anacheirurus adserai, Cryptolithus tesselatus, Eoredlichia intermedia, Olenoides serratus, Olenellus getzi, Triarthrus eatoni.

      However, addition of these trilobites added a further complication. Under unconstrained analysis, Olenellus getzi was resolved with Eoredlichia intermedia as a clade sister to all other trilobites.

      Thus the topology of Paterson et al. 2019 (PNAS) was not recovered, and so the hypothesis of Reviewer 2 could not be robustly tested. In order to achieve a topology comparable to Paterson et al., we ran a further three analyses, where we constrained a clade of all trilobites except for O. getzi. This recovered a topology where the earliest diverging trilobites had unfused sutures, and thus one suitable for considering the role of Acanthomeridion serratum ventral plates as homologues of the doublure of trilobites.

      Unfortunately, for these analyses (both constrained and unconstrained), Acanthomeridion was not resolved as sister to trilobites, but instead elsewhere in the tree (see Table 1 in main text, Fig. 9, and SFig 9). Thus our analyses do not find support for the reviewer’s hypothesis, as multiple origins of this feature are still required.

      It was still an excellent point that we should consider this hypothesis, and we have retained it, and discussion surrounding it, in our manuscript.

      The introduction to the paper does not inform the reader that all olenelloids had a marginal suture - a circumcephalic suture that was operative in their molting and that this is quite different from the situation in, say, "Cedaria" woosteri in which the only operative cephalic exoskeletal suture was circumocular. The conservative position would be that the olenelloid marginal suture is the homolog of the marginal suture in A. serratum: the ventral plates thus being homolog of the trilobite cephalic doublure, not only potential homolog to the entire or dorsal only part of the free cheeks of trilobites with a dorsal facial suture. As the authors of this paper decline to discuss the doublure of trilobites (there is a sole mention of the word in the MS, in a figure caption) and do not mention the olenelloid marginal suture, they give the reader no opportunity to assess support for this alternative. 

      At times the paper reads as if the authors are suggesting that olenelloids, which had a marginal cephalic suture broadly akin to that in Limulus, actually lacked a suture that permitted anterior egression during molting. The authors are right to stress the origin of the dorsal cephalic suture in more derived trilobites as a character seemingly of taxonomic significance but lines such as 56 and 67 may be taken by the non-specialist to imply that olenelloids lacked a forward egression-permitting suture. There is a notable difference between not knowing whether sutures existed (a condition apparently quite common among soft-bodied artiopods) and the well-known marginal suture of olenelloids, but as the MS currently reads most readers will not understand this because it remains unexplained in the MS. 

      As noted in response to a previous point (above) we now have a greatly expanded introduction which should give the reader an opportunity to assess support for this alternative hypothesis. We now include Olenellus getzi in our analyses, and have added characters to the morphological matrix to make this clear.

      A reference to the case of ‘Cedaria’ woosteri is made in the introduction to highlight further the variability of trilobites, as is a reference to Foote’s analysis of cranidial shapes and the support this provides for a single origin of the dorsal suture.

      With that in mind, it is also worth further stressing that the primary function of the dorsal sutures in those which have them is essentially similar to the olenelloid/limulid marginal suture mentioned above. It is notable that the course of this suture migrated dorsally up from the margin onto the dorsal shield and merged with the circumocular suture, but this innovation does not seem to have had an impact on its primary function - to permit molting by forward egression. Other trilobites completely surrendered the ability to molt by forward egression, and there are even examples of this occurring ontogenetically within species, suggesting a significant intraspecific shift in suture functionality and molting pattern. The authors mention some of this when questioning the unique origin of the dorsal facial suture of trilobites, although I don't understand their argument: why should the history of subsequent evolutionary modification of a character bear on whether its origin was unique in the group? 

      We include reference to evolutionary modification and loss of this character as it is important to stress that if a character is known to have been lost multiple times it is possible that it had a deeper root (in an earlier diverging member of Artiopoda than Trilobita) and was lost in olenelloids. This is the question that we seek to address in our manuscript.

      The bottom line here is that for the ventral plates of A. serratum to be strict homologs of only the dorsal portion of the dorsal free cheeks, there would be no homolog of the trilobite doublure in A. serratum. The conventional view, in contrast, would be that the ventral plates are a homolog of the ventral doublure in all trilobites and ventral plates in artiopods. I do not think that this paper provides a convincing basis for preferring their interpretation, nor do I feel that it does an adequate job of explaining issues that are central to the subject. 

      We stress that our interpretations – that the ventral plates are not homologous to any artiopodan feature or that they are homologous to the free cheeks of trilobites – have both been raised in the literature before. By contrast, we could not find mention of the reviewer’s ‘conventional view’ relating to Acanthomeridion. We appreciate that this view is still valid and worth investigating, which we have done in the further analyses conducted. However, we did not find support for it. Instead we find some support for both ventral plates as homologues of free cheeks, and as unique structures within Artiopoda.

      Concern 2. Varieties of dorsal sutures and the coexistence of dorsal and marginal sutures 

      The authors do not clarify or discuss connections between the circumocular sutures (a form of dorsal suture that separates the visual surface from the rest of the dorsal shield) and the marginal suture that facilitates forward egression upon molting. Both structures can exist independently in the same animal - in olenelloids for example. Olenelloids had both a suture that facilitated forward egression in molting (their marginal suture) and a dorsal suture (their circumocular suture). The condition in trilobites with a dorsal facial suture is that these two independent sutures merged - the formerly marginal suture migrating up the dorsal pleural surface to become confluent with the circumocular suture. (There are also interesting examples of the expansion of the circumocular suture across the pleural fixigena.) The form of the dorsal facial suture has long figured in attempts at higher-level trilobite taxonomy, with a number of character states that commonly relate to the proximity of the eye to the margin of the cephalic shield. The form of the dorsal facial suture that they illustrate in Xanderella, which is barely a strip crossing the dorsal pleural surface linking marginal and circumocular suture, is comparable to that in the trilobites Loganopeltoides and Entomapsis but that is a rare condition in that clade as a whole. The paper would benefit from a clear discussion of these issues at the beginning - the dorsal facial suture that they are referring to is a merged circumcephalic suture and circumocular suture - it is not simply the presence of a molt-related suture on the dorsal side of the cephalon. 

      We have added in an expanded introduction where these points are covered in detail. We appreciate that this was not clear in the earlier version, and this suggestion has greatly improved our work.

      Concern 3. Phylogenetics 

      While I appreciate that the phylogenetic database is a little modified from those of other recent authors, still I was surprised not to find a character matrix in the supplementary information (unless it was included in some way I overlooked), which I would consider a basic requirement of any paper presenting phylogenetic trees - after all, there's no space limit. It is not possible for a reviewer to understand the details of their arguments without seeing the character states and the matrix of state assignments. 

      A link to a morphobank project was included in the first submission. This project has been updated for the current submission, including an additional matrix to treat the reviewer’s hypothesis for the ventral plates. Morphobank Project #P4290. Email address: P4290, reviewer password: Acanthomeridion2023, accessible at morphobank.org. We have added in additional details for the reviewer and others to help them access the project:

      The project can be accessed at morphobank.org, using the below credentials to log in:  Email address: P4290, Password: Acanthomeridion 2023.

      The section "phylogenetic analyses" provides a description of how tree topology changes depending on whether sutures are considered homologous or not using the now standard application of both parsimony and maximum likelihood approaches but, considering that the broader implications of this paper rest on the phylogenetic interpretation, I also found the absence of detailed discussion of the meaning and implications of these trees to be surprising, because I anticipated that this was the main reason for conducting these analyses. The trees are presented and briefly described but not considered in detail. I am troubled by "Circles indicate presence of cephalic ecdysial sutures" because it seems that in "independent origin of sutures" trilobites are considered to have two origins (brown color dot) of cephalic ecdysial sutures - this may be further evidence that the team does not appreciate that olenelloids have cephalic ecdysial sutures, as the basal condition in all trilobites. Perhaps I'm misunderstanding their views, but from what's presented it's not possible to know that. Similarly, in the "sutures homologous" analyses why would there be two independent green dots for both Acanthomeridion and Trilobita, rather than at the base of the clade containing them both, as cephalic ecdysial sutures are basal to both of them? Here again, we appear to see evidence that the team considers dorsal facial sutures and cephalic ecdysial sutures to be synonymous - which is incorrect.

      We appreciate that the reviewer misunderstood the meaning of the dots, leading to confusion. The dots indicated how features were coded in the phylogenetic analysis. In our revised version of this figure (Figure 8 in the new version), these dots are now clearly labelled as indicating ‘coding in phylogenetic matrix’. Further, with the revised character list, we now can provide additional detail for the types of sutures (relevant as we now include more trilobite terminals).

      This point aside, and at a minimum, the team needs to do a more thorough job of characterizing and considering the variety of conditions of dorsal sutures among artiopods, their relationships to the marginal suture and to the circumocular suture, the number, and form of their branches, etc. 

      We thank the reviewer for this summary, and appreciate their concerns and thorough review. Our revised version takes into account all these points raised, and they have greatly improved the clarity, scope and thoroughness of the work.

      Reviewer #3 (Public Review): 

      Summary:

      Well-illustrated new material is documented for Acanthomeridion, a formerly incompletely known Cambrian arthropod. The formerly known facial sutures are shown to be associated with ventral plates that the authors very reasonably homologise with the free cheeks of trilobites. A slight update of a phylogenetic dataset developed by Du et al, then refined slightly by Chen et al, then by Schmidt et al, and again here, permits another attempt to optimise the number of origins of dorsal ecdysial sutures in trilobites and their relatives. 

      Strengths:

      Documentation of an ontogenetic series makes a sound case that the proposed diagnostic characters of a second species of Acanthomeridion are variations within a single species. New microtomographic data shed some light on appendage morphology that was not formerly known. The new data on ventral plates and their association with the ecdysial sutures are valuable in underpinning homologies with trilobites. 

      We thank Reviewer 3 for their positive comments about the manuscript. We appreciate the constructive comments for improvements, and detailed corrections, which we have incorporated into our revised work.

      Weaknesses:

      The main conclusion remains clouded in ambiguity because of a poorly resolved Bayesian consensus and is consistent with work led by the lead author in 2019 (thus compromising the novelty of the findings). The Bayesian trees being majority rules consensus trees, optimising characters onto them (Figure 7b, d) is problematic. Optimising on a consensus tree can produce spurious optimisations that inflate tree length or distort other metrics of fit. Line 264 refers to at least three independent origins of cephalic sutures in artiopodans but the fully resolved Figure 7c requires only two origins. 

      We thank the reviewer for pointing this out. However, now that the analyses have been re-run, we have new results to consider. The results still support multiple origins of sutures. We also note that the dots indicated how terminals were coded. This is now clearer in the revised version of this figure (Figure 8 in the new version).

      We have extended our interrogation of the trees by incorporating treespace analyses. These add support for the nodes of interest (around the base of trilobites), showing that the coding of Acanthomeridion ventral plate homologies impacts its position in the tree, and thus has implications for our understanding of the evolution of sutures in trilobites.

      The question of how many times dorsal ecdysial sutures evolved in Artiopoda was addressed by Hou et al (2017), who first documented the facial sutures of Acanthomeridion and optimised them onto a phylogeny to infer multiple origins, as well as in a paper led by the lead author in Cladistics in 2019. Du et al. (2019) presented a phylogeny based on an earlier version of the current dataset wherein they discussed how many times sutures evolved or were lost based on their presence in Zhiwenia/Protosutura, Acanthomeridion, and Trilobita. To their credit, the authors acknowledge this (lines 62-65). The answer here is slightly different (because some topologies unite Acanthomeridion and trilobites). 

      The following points are not meant to be "Weaknesses" but rather are refinements: 

      I recommend changing the title of the paper from "cephalic sutures" to "dorsal ecdysial sutures" to be more precise about the character that is being tracked evolutionarily. Lots of arthropods have cephalic sutures (e.g., the ventral marginal suture of xiphosurans; the Y-shaped dorsomedian ecdysial line in insects). The text might also be updated to change other instances of "cephalic sutures" to a more precise wording. 

      We appreciate this point and have changed the title as suggested. 

      The authors have provided (but not explicitly identified) support values for nodes in their Bayesian trees but not in their parsimony ones. Please do the jackknife or bootstrap for the parsimony analyses and make it clear that the Bayesian values are posterior probabilities. 

      With the addition of further trilobite terminals to our parsimony analyses, the results became poor. Specifically, the internal relationships of trilobites did not conform to any previous study, and Olenellus getzi was not resolved as an early diverging member of the group. This meant that these analyses could not be used for addressing the hypothesis of reviewer two. We decided to exclude reporting parsimony analysis results from this version to avoid confusion.

      We have added a note that the values reported at the nodes are posterior probabilities to figures S8, S9 and S10 where we show the full Bayesian results.

      In line 65 or somewhere else, it might be noted that a single origin of the dorsal facial sutures in trilobites has itself been called into question. Jell (2003) proposed that separate lineages of Eutrilobita evolved their facial sutures independently from separate sister groups within Olenellina. 

      We have added this to the introduction (Line 98). Thank you for raising this point.

      I have provided minor typographic or terminological corrections to the authors in a list of recommendations that may not be publicly available. 

      We appreciate the points made by the reviewer and their detailed corrections, which we have incorporated in the revised version.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this paper the authors provide a characterisation of auditory responses (tones, noise, and amplitude modulated sounds) and bimodal (somatosensory-auditory) responses and interactions in the higher order lateral cortex (LC) of the inferior colliculus (IC) and compare these characteristics with the higher order dorsal cortex (DC) of the IC - in awake and anaesthetised mice. Dan Llano's group have previously identified GABAergic patches (modules) in the LC distinctly receiving inputs from somatosensory structures, surrounded by matrix regions receiving inputs from auditory cortex. They here use 2P calcium imaging combined with an implanted prism to - for the first time - get functional optical access to these subregions (modules and matrix) in the lateral cortex of IC in vivo, in order to also characterise the functional difference in these subparts of LC. They find that both DC and LC, in both awake and anaesthetised mice, appear to be more responsive to more complex sounds (amplitude modulated noise) compared to pure tones and that under anesthesia the matrix of LC is more modulated by specific frequency and temporal content compared to the GABAergic modules in LC. However, while both LC and DC appear to have low frequency preferences, this preference for low frequencies is more pronounced in DC. Furthermore, in both awake and anesthetized mice somatosensory inputs are capable of driving responses on their own in the modules of LC, but very little in the matrix. The authors now compare bimodal interactions under anaesthesia and awake states and find that effects are different in some cases under awake and anesthesia - particularly related to bimodal suppression and enhancement in the modules.

      The paper provides new information about how subregions with different inputs and neurochemical profiles in the higher order auditory midbrain process auditory and multisensory information, and is useful for the auditory and multisensory circuits neuroscience community.

      The manuscript is improved by the response to reviewers. The authors have addressed my comments by adding new figures and panels, streamlining the analysis between awake and anaesthetised data (which has led to a more nuanced, and better supported conclusion), and adding more examples to better understand the underlying data. In streamlining the analyses between anaesthetised and awake data I would probably have opted for bringing these results into merged figures to avoid repetitiveness and aid comparison, but I acknowledge that that may be a matter of style. The added discussions of differences between awake and anaesthesia in the findings and the discussion of possible reasons why these differences are present help broaden the understanding of what the data looks like and how anaesthesia can affect these circuits.

      As mentioned in my previous review, the strength of this study is in its demonstration of using prism 2p imaging to image the lateral shell of IC to gain access to its neurochemically defined subdivisions, and they use this method to provide a basic description of the auditory and multisensory properties of lateral cortex IC subdivisions (and compare it to dorsal cortex of IC). The added analysis, information and figures provide a more convincing foundation for the descriptions and conclusions stated in the paper. The description of the basic functionality of the lateral cortex of the IC is useful for researchers interested in basic multisensory interactions and auditory processing and circuits. The paper provides a technical foundation for future studies (as the authors also mention), exploring how these neurochemically defined subdivisions receiving distinct descending projections from cortex contribute to auditory and multisensory based behaviour.

      Minor comment:

      - The authors have now added statistics and figures to support their claims about tonotopy in DC and LC. This is what I asked for, and I think it allows readers to better understand the tonotopical organisation in these areas. One of the conclusions by the authors is that the quadratic fit is a better fit than a linear fit in DCIC. Given the new plots shown and previous studies this is likely true, though it is worth highlighting that adding parameters to a fitting procedure (as in the case when moving from linear to quadratic fit) will likely lead to a better fit due to the increased flexibility of the fitting procedure.

      Thank you for the suggestion. We have highlighted that the quadratic function allowed the regression model to include the cells tuned to higher frequencies at the rostromedial part of the DC, resulting in a better fit that is consistent with the previously described tonotopic organization, as shown in the text (lines 208-211).
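
      The reviewer's caution about added parameters can be illustrated with a minimal sketch. It is not the authors' analysis: the data points below are hypothetical, and the Akaike information criterion (AIC) is used here only as one common way of penalising extra parameters. Because the linear model is nested inside the quadratic one, the quadratic residual sum of squares can never exceed the linear one, so a lower residual alone does not show that the quadratic model is more appropriate.

```python
import math


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients c such that y is approximated by
    c[0] + c[1]*x + ... + c[degree]*x**degree.
    """
    n_coef = degree + 1
    # Normal equations (X^T X) c = X^T y, with X[i][j] = x[i]**j
    a = [[sum(xi ** (r + c) for xi in x) for c in range(n_coef)]
         for r in range(n_coef)]
    b = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(n_coef)]
    # Gaussian elimination with partial pivoting
    for col in range(n_coef):
        pivot = max(range(col, n_coef), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n_coef):
            factor = a[r][col] / a[col][col]
            for c in range(col, n_coef):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution
    coef = [0.0] * n_coef
    for r in range(n_coef - 1, -1, -1):
        s = sum(a[r][c] * coef[c] for c in range(r + 1, n_coef))
        coef[r] = (b[r] - s) / a[r][r]
    return coef


def residual_sum_of_squares(x, y, coef):
    """Sum of squared differences between y and the fitted polynomial."""
    return sum(
        (yi - sum(c * xi ** k for k, c in enumerate(coef))) ** 2
        for xi, yi in zip(x, y)
    )


def aic(rss_value, n_points, n_params):
    """AIC for a least-squares fit with Gaussian errors (up to a constant)."""
    return n_points * math.log(rss_value / n_points) + 2 * n_params


# Hypothetical data: a linear trend plus fixed, made-up perturbations.
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.25, -0.1, 0.35, -0.3, 0.05, -0.15]
y = [2.0 + 0.5 * xi + ei for xi, ei in zip(x, noise)]

rss_lin = residual_sum_of_squares(x, y, fit_polynomial(x, y, 1))
rss_quad = residual_sum_of_squares(x, y, fit_polynomial(x, y, 2))

print("RSS linear:   ", rss_lin, " AIC:", aic(rss_lin, len(x), 2))
print("RSS quadratic:", rss_quad, " AIC:", aic(rss_quad, len(x), 3))
```

      In this sketch the quadratic residual is never larger than the linear one even though the underlying trend is linear; a penalised criterion, or a nested-model F-test, is one way to ask whether the improvement justifies the extra parameter.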

      Reviewer #2 (Public Review):

      Summary:

      The study describes differences in responses to sounds and whisker deflections as well as combinations of these stimuli in different neurochemically defined subsections of the lateral and dorsal cortex of the inferior colliculus in anesthetised and awake mice.

      Strengths:

      A major achievement of the work lies in obtaining the data in the first place as this required establishing and refining a challenging surgical procedure to insert a prism that enabled the authors to visualise the lateral surface of the inferior colliculus. Using this approach, the authors were then able to provide the first functional comparison of neural responses inside and outside of the GABA-rich modules of the lateral cortex. The strongest and most interesting aspects of the results, in my opinion, concern the interactions of auditory and somatosensory stimulation. For instance, the authors find that a) somatosensory-responses are strongest inside the modules and b) somatosensory-auditory suppression is stronger in the matrix than in the modules. This suggests that, while somatosensory inputs preferentially target the GABA-rich modules, they do not exclusively target GABAergic neurons within the modules (given that the authors record exclusively from excitatory neurons we wouldn't expect to see somatosensory responses if they targeted exclusively GABAergic neurons) and that the GABAergic neurons of the modules (consistent with previous work) preferentially impact neurons outside the modules, i.e. via long-range connections.

      Weaknesses:

      While the findings are of interest to the subfield they have only rather limited implications beyond it and the writing is not quite as precise as it could be.

      Reviewer #3 (Public Review):

      The lateral cortex of the inferior colliculus (LC) is a region of the auditory midbrain noted for receiving both auditory and somatosensory input. Anatomical studies have established that somatosensory input primarily impinges on "modular" regions of the LC, which are characterized by high densities of GABAergic neurons, while auditory input is more prominent in the "matrix" regions that surround the modules. However, how auditory and somatosensory stimuli shape activity, both individually and when combined, in the modular and matrix regions of the LC has remained unknown.

      The major obstacle to progress has been the location of the LC on the lateral edge of the inferior colliculus where it cannot be accessed in vivo using conventional imaging approaches. The authors overcame this obstacle by developing methods to implant a microprism adjacent to the LC. By redirecting light from the lateral surface of the LC to the dorsal surface of the microprism, the microprism enabled two-photon imaging of the LC via a dorsal approach in anesthetized and awake mice. Then, by crossing GAD-67-GFP mice with Thy1-jRGECO1a mice, the authors showed that they could identify LC modules in vivo using GFP fluorescence while assessing neural responses to auditory, somatosensory, and multimodal stimuli using Ca2+ imaging. Critically, the authors also validated the accuracy of the microprism technique by directly comparing results obtained with a microprism to data collected using conventional imaging of the dorsal-most LC modules, which are directly visible on the dorsal IC surface, finding good correlations between the approaches.

      Through this innovative combination of techniques, the authors found that matrix neurons were more sensitive to auditory stimuli than modular neurons, modular neurons were more sensitive to somatosensory stimuli than matrix neurons, and bimodal, auditory-somatosensory stimuli were more likely to suppress activity in matrix neurons and enhance activity in modular neurons. Interestingly, despite their higher sensitivity to somatosensory stimuli than matrix neurons, modular neurons in the anesthetized prep were overall more responsive to auditory stimuli than somatosensory stimuli (albeit with a tendency to have offset responses to sounds). This suggests that modular neurons should not be thought of as primarily representing somatosensory input, but rather as being more prone to having their auditory responses modified by somatosensory input. However, this trend was different in the awake prep, where modular neurons became more responsive to somatosensory stimuli. Thus, to this reviewer, one of the most intriguing results of the present study is the extent to which neural responses in the LC changed in the awake preparation. While this is not entirely unexpected, the magnitude and stimulus specificity of the changes caused by anesthesia highlight the extent to which higher-level sensory processing is affected by anesthesia and strongly suggests that future studies of LC function should be conducted in awake animals.

      Together, the results of this study expand our understanding of the functional roles of matrix and module neurons by showing that responses in LC subregions are more complicated than might have been expected based on anatomy alone. The development of the microprism technique for imaging the LC will be a boon to the field, finally enabling much-needed studies of LC function in vivo. The experiments were well-designed and well-controlled, the limitations of two-photon imaging for tracking neural activity are acknowledged, and appropriate statistical tests were used.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      - Increase font size of scale bars on figure 6.

      Thank you for the suggestion. We have increased the font size of the scale bar.

      Reviewer #2 (Recommendations For The Authors):

      Line 505: typo: 'didtinction'

      Thank you for the suggestion and we do apologize for the typo. We have fixed the word as shown in the text (line 506).

      No further comments.

      Reviewer #3 (Recommendations For The Authors):

      Line 543: Change "contripute" to "contribute"

      Thank you for the suggestion and we do apologize for the typo. We have fixed the word as shown in the text (line 544).

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #2 (Public Review):

      The authors indicated that the adherence of ETEC is to intestinal epithelial cells. However, it is also possible that the majority of ETEC may reside in the intestinal mucus, particularly under in vivo infection condition. The colonization of ETEC in the jejunum and colon of piglets (Fig 2C) and in the intestines of mice (Fig S2A) does not necessarily reflect the adherence of ETEC to epithelial cells. Please verify these observations with other methods, such as immunostaining. Also, while Salmonella enterica serovar Typhimurium or Listeria monocytogenes can invade organoids within 1 hour, it is unknown if ETEC invade into organoids in this study. Clarifying this will help resolve if A. muciniphila block the adherence and/or invasion of ETEC. Please also address if A. muciniphila metabolites could prevent ETEC infection in the organoid models.

      In the original manuscript, the sentence “ETEC K88 adheres to intestinal epithelial cells and induces gut inflammation (Yu et al., 2018)” in line 447 was a cited statement used to link the preceding and following text; it is not our result. We have deleted this sentence on line 457. Previous studies have shown that ETEC enter intestinal epithelial cells after only one hour of infection (Xiao et al., 2022; Qian et al., 2023). Whether A. muciniphila metabolites prevent ETEC infection in the organoid models is not the focus of this manuscript; it may be further explored by other members of the research group in the future.

      References:

      Xiao K, Yang Y, Zhang Y, Lv QQ, Huang FF, Wang D, Zhao JC, Liu YL. 2022. Long-chain PUFA ameliorate enterotoxigenic Escherichia coli-induced intestinal inflammation and cell injury by modulating pyroptosis and necroptosis signaling pathways in porcine intestinal epithelial cells. Br. J. Nutr. 128(5):835-850.

      Qian MQ, Zhou XC, Xu TT, Li M, Yang ZR, Han XY. 2023. Evaluation of Potential Probiotic Properties of Limosilactobacillus fermentum Derived from Piglet Feces and Influence on the Healthy and E. coli-Challenged Porcine Intestine. Microorganisms. 11(4).

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Ma et al. describes a multi-model (pig, mouse, organoid) investigation into how fecal transplants protect against E. coli infection. The authors identify A. muciniphila and B. fragilis as two important strains and characterize how these organisms impact the epithelium by modulating host signaling pathways, namely the Wnt pathway in lgr5 intestinal stem cells.

      Strengths:

      The strengths of this manuscript include the use of multiple model systems and follow up mechanistic investigations to understand how A. muciniphila and B. fragilis interacted with the host to impact epithelial physiology.

      Weaknesses:

      After revision, the bioinformatics section of the methods is still jumbled and may indicate issues in the pipeline. Important parameters are not included to replicate analyses. Merging the forward and reverse reads may represent a problem for denoising. Chimera detection was performed prior to denoising.

      Potential denoising issues for NovaSeq data were not addressed in the response. The authors did not clarify whether multiple testing correction was applied; as written, it may be assumed that it was not. The raw sequencing data made available through the SRA accession (if for the correct project) indicate that a MiSeq platform was used; however, the sample names do not appear to link up to this experimental design, and the metadata are not sufficient to replicate the analyses.

      We have redescribed the method for microbiome sequencing analysis on lines 298-327.
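
      For readers unfamiliar with the multiple testing correction raised by the reviewer, the following is a minimal sketch of one widely used procedure, the Benjamini-Hochberg false discovery rate adjustment. It is an illustration only: the response does not state which correction, if any, was applied, and the p-values below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # stay monotone (a smaller raw p never gets a larger adjusted p)
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, e.g. one per taxon tested
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
for raw, adj in zip(raw_p, adjusted_p):
    print(f"raw p = {raw:.3f}  adjusted p = {adj:.3f}")
```

      Reporting adjusted values alongside raw ones, and naming the procedure in the methods, would make it straightforward for readers to see which comparisons survive correction.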

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      SRA accession must be confirmed and metadata made available.

      We updated the SRA data.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) In the first paragraph of the Results section it is not clear why the authors introduce the function of p53ΔAS/ΔAS in thymocytes and then mention fibroblasts. The authors should clarify this point. The authors should also explain the rationale for using doxorubicin and nutlin to analyze p53 activity (Figure 1 and Figure S1). 

      We thank the reviewer for this comment. In the revised manuscript, we corrected this by mentioning, at the beginning of the Results section: “We analyzed cellular stress responses in thymocytes, known to undergo a p53-dependent apoptosis upon irradiation (Lowe et al., 1993), and in primary fibroblasts, known to undergo a p53-dependent cell cycle arrest in response to various stresses - e.g. DNA damage caused by irradiation or doxorubicin (Kastan et al., 1992), and the Nutlin-mediated inhibition of Mdm2, a negative regulator of p53 (Vassilev et al., 2004).”

      (2) The authors should provide quantification for the western blot in figure 2D because the reduction of p53 protein level in mutant vs wt tumors is not striking. 

      In the previous version of the manuscript, the quantification of p53 bands had been included, but quantification results were mentioned below the actin bands, rather than the p53 bands, and this was probably confusing. We have corrected this in the revised version of the manuscript. The quantification results are now provided just below the p53 bands in Figs. 1B and 2D, which should clarify this point. For Figure 2D, the quantifications show a strong decrease in p53 levels for 3 out of 4 analyzed mutant tumors. For consistency purposes, in the revised manuscript the quantification results also appear below Myc bands in Fig. 2C.

      (3) In the discussion section, the authors propose that a difference in Ackr4 expression may have prognostic value and that measuring ACKR4 gene expression in male patients with Burkitt lymphoma could be useful to identify the patients at higher risk. However, the authors perform a lot of correlative analyses, both in mice and in patients, but the manuscript lacks functional experiments that could help to functionally characterize Ackr4 and Mt2 in the etiology of B-cell lymphomas in males (both in mouse and in human models).

      In the previous version of the manuscript, we proposed that Ackr4 might act as a suppressor of B-cell lymphomagenesis by attenuating Myc signaling. This hypothesis relied on studies showing that Ackr4 impairs the Ccr7 signaling cascade, which may lead to decreased Myc activity (Ulvmar et al., 2014; Shi et al., 2015; Bastow et al., 2021) and that the loss of Ccr7 may delay Myc-driven lymphomagenesis (Rehm et al., 2011). Furthermore, we proposed that the increased expression of Mt2 in p53ΔAS/ΔAS Eμ-Myc male splenic cells reflected an increase in Myc activity, because Mt2 is known to be regulated by Myc (Qin et al., 2021) and because the Mt2 promoter is bound by Myc in B cells according to experiments reported in the ChIP-Atlas database. However, in the first version of the manuscript this hypothesis might have appeared only partially supported by our data because an increase in Myc activity could be expected to have a more general impact, i.e. an impact not only on the expression of Mt2, but also on the expression of many canonical Myc target genes. In the revised manuscript, we show that this is indeed the case. We performed a gene set enrichment analysis (GSEA) comparing the RNAseq data from p53ΔAS/ΔAS Eμ-Myc and p53+/+ Eμ-Myc male splenic cells and found an enrichment of hallmark Myc targets in p53ΔAS/ΔAS Eμ-Myc cells. These new data, which strengthen our hypothesis of differences in Myc signaling intensity, are presented in Fig. 3K and Table S2.

      Importantly, we now go beyond correlative analyses by providing direct experimental evidence that ACKR4 impacts on the behavior of Burkitt lymphoma cells. We used a CRISPR-Cas9 approach to knock-out ACKR4 in Raji Burkitt lymphoma cells and found that ACKR4 KO cells exhibited a 4-fold increase in chemokine-guided cell migration. These new data are presented in Figure 4F and the supplemental Figures S5-S7.  

      Finally, following a suggestion of Reviewer#2, we now also point out that “Ackr4 regulates B cell differentiation (Kara et al., 2018), which raises the possibility that an altered p53-Ackr4 pathway in p53ΔAS/ΔAS Eμ-Myc male splenic cells might contribute to increase the pools of pre-B and immature B cells that may be prone to lymphomagenesis.”

      In sum, we now mention in the Discussion that a decrease in Ackr4 expression might promote B-cell lymphomagenesis through three non-exclusive mechanisms.

      Reviewer #2 (Recommendations For The Authors): 

      (1) A great addition would be to demonstrate how p53AS specifically contributes to the regulation of Ackr4. In particular, is there evidence that p53AS might be preferentially recruited on p53 RE within that gene as compared to WT? The availability of specific antibodies that distinguish between AS and WT p53 might help to address this (experimentally complex) question. As a note, usage of such antibodies would also strengthen Fig 1B, in which the AS isoform appears as a mere faint shadow under p53, thus making its "disappearance" in trp53ΔAS/ΔAS difficult to evaluate. 

      We agree with the referee that efficient antibodies against p53-AS isoforms would have been useful. In fact, we tried a non-commercial antibody developed for that purpose, but it produced many nonspecific bands in western blots and did not appear reliable. Importantly however, our luciferase assays clearly show that both p53-a and p53-AS can transactivate Ackr4, a result that might be expected because these isoforms share the same DNA binding domain. Furthermore, because p53-a isoforms appear more abundant than p53-AS isoforms at the protein and RNA levels (Figs. 1B and S1A), and because the loss of p53-AS isoforms leads to a significant decrease in p53-a protein levels (Figs. 1B and 2D), we think that in p53ΔAS/ΔAS cells the reduction in p53-a levels might be the main reason for a decreased transactivation of Ackr4. This is now more clearly discussed in the revised manuscript.

      (2) A most interesting observation is in Fig 3A and Fig S3, showing that spleen cells of p53ΔAS Eμ-Myc males (but not females) were enriched in pre-B and immature B cells as compared to WT counterparts. This observation points to a possible defect in the B cell maturation process. It would be most interesting to determine whether this particular defect is directly mediated by a p53AS-Ackr4 axis. The hypothesis raised by the authors in the Discussion section is that increased Ackr4 expression may delay lymphomagenesis, but data in Fig 3A and S3 actually suggest that ΔAS increases the pool of immature B cells that may be prone to lymphomagenesis.

      We thank the reviewer for this useful comment, which we integrated in the Discussion of the revised manuscript. Ackr4 was shown to regulate B cell differentiation (Kara et al. (2018) J Exp Med 215, 801–813), so this is indeed one of the possible mechanisms by which a deregulation of the p53-Ackr4 axis might promote lymphomagenesis. We now mention: “Ackr4 regulates B cell differentiation (Kara et al., 2018), which raises the possibility that an altered p53-Ackr4 pathway in p53ΔAS/ΔAS Eμ-Myc male splenic cells might contribute to increase the pools of pre-B and immature B cells that may be prone to lymphomagenesis.” This is presented as one of three possible mechanisms by which decreased Ackr4 levels may promote tumorigenesis, the two others being the impact of Ackr4 on the chemokine-guided migration of lymphoma cells and its apparent effect on Myc signalling.

      (3) The concordance with a male-specific prognostic effect of Ackr4 is most interesting in itself but is only of correlative evidence with respect to the study. Is there any information on whether p53AS expression is also a prognostic factor in BL? And is there evidence that Ackr4 may also be a male-specific prognostic factor in other B-cell malignancies, e.g. Multiple Myeloma?

      We have now performed the CRISPR-mediated knock-out of ACKR4 in Burkitt lymphoma cells and found that it leads to a dramatic increase in chemokine-guided cell migration, which goes beyond correlation. This significant new result is mentioned in the revised abstract and presented in detail in Figures 4F and S5-S7.

      Regarding p53-AS isoforms, they are murine-specific isoforms (Marcel et al. (2011) Cell Death Diff 18, 1815-1824), so there is no information on p53-AS expression in Burkitt lymphoma. Human p53 isoforms with alternative C-terminal domains are p53b and p53g isoforms, but the datasets we analyzed did not provide any information on the relative levels of p53a (the canonical isoform), p53b or p53g isoforms. We agree with the referee that this is an interesting question, but that cannot be answered with currently available datasets.

      Regarding the different types of B-cell malignancies, we had already shown that Ackr4 is a male-specific prognostic factor in Burkitt lymphomas but not in Diffuse Large B cell lymphomas, which indicated that it is not a prognostic factor in all types of B cell lymphomas. For this revision, we also searched for its potential prognostic value in multiple myeloma, and found that, as for DLBCL, it is not a prognostic factor in this cancer type. This new analysis is presented in Figure S4C.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: This article explores the role of Ecdysone in regulating female sexual receptivity in Drosophila. The researchers found that PTTH, through its role as a positive regulator of ecdysone production, negatively affects the receptivity of adult virgin females. Indeed, loss of larval PTTH before metamorphosis significantly increases female receptivity right after adult eclosion and also later. However, during metamorphic neurodevelopment, Ecdysone, primarily through its receptor EcR-A, is required to properly develop the P1 neurons, since its silencing led to morphological changes associated with a reduction in adult female receptivity. Nonetheless, the results shown in this manuscript shed light on how Ecdysone plays a dual role in female adult receptivity, inhibiting it during larval development and enhancing it during metamorphic development. Unfortunately, this dual and opposite effect in two temporally different developmental stages has not been highlighted or explained.

      Strengths: This paper exhibits multiple strengths in its approach, employing a well-structured experimental methodology that combines genetic manipulations, behavioral assays, and molecular analysis to explore the impact of Ecdysone on regulating virgin female receptivity in Drosophila. The study provides clear and substantial findings, highlighting that removing PTTH, a positive Ecdysone regulator, increases virgin female receptivity. Additionally, the research expands into the temporal necessity of PTTH and Ecdysone function during development. 

      Weaknesses: 

      There are two important caveats with the data that are reflecting a weakness: 

      (1) Contradictory Effects of Ecdysone and PTTH: One notable weakness in the data is the contrasting effects observed between Ecdysone and its positive regulator PTTH. PTTH loss of function increases female receptivity, while ecdysone loss of function reduces it. Given that PTTH positively regulates Ecdysone, one would expect that the loss of function of both would result in a similar phenotype or at least a consistent directional change. 

      A1. As newly formed prepupae, ptth-Gal4>UAS-Grim flies display changes in gene expression similar to those of genetic control flies in response to a high-titer ecdysone pulse. These include the repression of EcR (McBrayer et al., 2007). We tested whether there is a similar feedforward relationship between PTTH and EcR-A by quantifying the EcR-A mRNA level in the whole body of newly formed PTTH -/- and PTTH -/+ prepupae. Indeed, EcR-A expression was higher in PTTH -/- than in PTTH -/+ prepupae. Given the role of EcR-A in regulating gene expression, this suggests that PTTH -/- disturbs the regulation of a series of genes during metamorphosis. However, we do not know whether EcR-A expression in pC1 neurons is increased relative to genetic controls when PTTH is deleted. Furthermore, PTTH -/- must affect the development of other neurons, not only pC1 neurons. Thus, the feedforward relationship between PTTH and EcR-A at the start of the prepupal stage is one possible cause of the contradictory effects of PTTH -/- and EcR-A RNAi in pC1 neurons.

      (2) Discordant Temporal Requirements for Ecdysone and PTTH: Another weakness lies in the different temporal requirements for Ecdysone and PTTH. The data from the manuscript suggest that PTTH is necessary during the larval stage, as shown in Figure 2 E-G, while Ecdysone is required during the pupal stage, as indicated in Figure 5 I-K. Ecdysone is a crucial developmental hormone with precisely regulated expression throughout development, exhibiting several peaks during both larval and pupal stages. PTTH is known to regulate Ecdysone during the larval stage, specifically by stimulating the kinetics of Ecdysone peaking at the wandering stage. However, it remains unclear whether pupal PTTH, expressed at higher levels during metamorphosis, can stimulate Ecdysone production during the pupal stage. Additionally, given the transient nature of the Ecdysone peak produced at wandering time, which disappears shortly before the end of the prepupal stage, it is challenging to infer that larval PTTH will regulate Ecdysone production during the pupal stage based on the current state of knowledge in the neuroendocrine field.  

      Considering these two caveats, the results suggest that the authors are witnessing distinct temporal and directional effects of Ecdysone on virgin female receptivity.  

      A2. First, we need to clarify the timing of the manipulations of the Ptth gene and PTTH neurons. In Figure 3, activation of PTTH neurons during stage 2 inhibited female receptivity. “Stage 2” runs from six hours before the 3rd-instar larval stage to the end of the wandering larval stage (the start of the prepupal stage). In Figure 5, the “pupal stage” runs from the start of the prepupal stage to the end of the pupal stage. This “pupal stage” therefore includes prepupa formation, when the ecdysone peak has not yet disappeared. The periods in which Ptth and EcR-A in pC1 neurons were manipulated are thus continuous. In addition, pC1-Gal4-expressing neurons also appear at the start of the prepupal stage. It is therefore possible that PTTH regulates female receptivity through the function of EcR-A in pC1 neurons.

      Reviewer #1 (Recommendations For The Authors): 

      In light of the significant caveat previously discussed, I will just make a few general suggestions: 

      (1) The paper primarily focuses on robust phenotypes, particularly in PTTH mutants, with a well-detailed execution of several experiments, resulting in thorough and robust outcomes. However, due to the caveat previously presented (opposite effect in larva and pupa), consider splitting the paper into two parts: Figures 1 to 4 deal with the negative effect of PTTH-Ecdysone on early virgin female receptivity, while Figures 5 to 7 focus on the positive metamorphic effect of Ecdysone in P1 metamorphic neurodevelopment. However, in this scenario, the mechanism by which PTTH loss of function increases female receptivity should be addressed.

      A3. Splitting the paper into two parts, dealing separately with PTTH function and with EcR function in pC1 neurons, would be a good suggestion if PTTH could not act on female receptivity through EcR-A in pC1 neurons. However, because of the feedforward relationship between PTTH and EcR-A in newly formed prepupae, and because the periods in which Ptth and EcR-A in pC1 neurons were manipulated are continuous, these two functions may not be independent of each other. We have therefore kept the original structure.

      (2) Validate the PTTH mutants by examining homozygous mutant phenotypes and the dose-dependent heterozygous mutant phenotype using existing PTTH mutants. This could also be achieved using RNAi techniques.

      A4. We did not obtain other existing PTTH mutants. Instead, we decreased PTTH expression in PTTH neurons and in dsx+ neurons, but did not detect a phenotype similar to that of PTTH -/-. Similarly, overexpression through PTTH-Gal4>UAS-PTTH was not sufficient to change female receptivity. It is possible that neither decreasing nor increasing PTTH expression in this way is sufficient to change female receptivity.

      (3) Clarify if elav-Gal4 is not expressed in PTTH neurons and discuss how the rescue mechanisms work (hormonal, paracrine, etc.) in the text.

      A5. We tested for overlap between the elav-Gal4>GFP signal and PTTH stained with a PTTH antibody, and did not detect any. This suggests that elav-Gal4 is not expressed in PTTH neurons. However, we detected PTTH (with the PTTH antibody) in the CNS when PTTH was overexpressed using elav-Gal4>UAS-PTTH in the PTTH -/- background, and this rescued the female receptivity phenotype of PTTH -/-. Insect PTTH isoforms share a similar probable signal peptide for secretion. Indeed, in addition to acting through axonal projections to the PG, PTTH also has an endocrine function, acting on its receptor Torso in light sensors to regulate larval light avoidance. PTTH overexpressed in other neurons through elav-Gal4>UAS-PTTH may act on the PG through this endocrine route and induce ecdysone synthesis and release. Thus, although elav-Gal4 is not expressed in PTTH neurons, ecdysone synthesis triggered by PTTH from the hemolymph may explain the rescue of the PTTH -/- phenotype in female receptivity.

      (4) Consider renaming the new PTTH mutant to avoid confusion with the existing PTTHDelta allele. 

      A6. We have renamed our new PTTH mutant as PtthDelete.

      (5) Include the age of virgin females in each figure legend, especially for Figures 2 to 7, to aid in interpretation. This is essential information since wild-type early virgins (day 1) show no receptivity. In contrast, they reach a typical 80% receptivity later, and the mechanism regulating the first phase might differ from the one occurring later.

      A7. We have included the age of virgin females in each figure legend. 

      (6) Explain the relevance of observing that PTTH adult neurons are dsx-positive, as it's unclear why this observation is significant, considering that these neurons are not responsible for the observed receptivity effect in virgin females. Alternatively, address this in the context of the third instar larva or clarify its relevance.  

      A8. We decreased DsxF expression in PTTH neurons and did not detect a significant change in female receptivity. Almost all neurons that regulate female receptivity, including pC1 neurons, express DsxF. We suppose that PTTH neurons are connected in some way with other DsxF-positive neurons that regulate female receptivity. Indeed, we detected overlap of dsx-LexA>LexAop-RFP and torso-Gal4>UAS-GFP during the larval stage. Furthermore, decreasing Torso expression in pC1 neurons significantly inhibited female receptivity.

      These results suggest that PTTH regulates female receptivity not only through ecdysone but possibly also by directly regulating other neurons associated with female receptivity, especially DsxF-positive neurons.

      Reviewer #2 (Public Review): 

      Summary: The authors tried to identify novel adult functions of the classical Drosophila juvenile-adult transition axis (i.e. ptth-ecdysone). Surprisingly, larval ptth-expressing neurons expressed the sex-specific doublesex gene, thus belonging to the sexual dimorphic circuit. Lack of ptth during late larval development caused enhanced female sexual receptivity, an effect rescued by supplying ecdysone in the food. Among many other cellular players, pC1 neurons control receptivity by encoding the mating status of females. Interestingly, during metamorphosis, a subtype of pC1 neurons required Ecdysone Receptor A in order to regulate such female receptivity. A transcriptomic analysis using pC1-specific Ecdysone signaling down-regulation gives some hints of possible downstream mechanisms.

      Strengths: the manuscript showed solid genetic evidence that lack of ptth during development caused enhanced copulation rate in female flies, which includes ptth mutant rescue experiments by overexpressing ptth as well as by adding ecdysone-supplemented food. They also present elegant data dissecting the temporal requirements of ptth-expressing neurons by shifting animals from non-permissive to permissive temperatures, in order to inactivate neuronal function (although not exclusively ptth function). By combining different drivers together with a EcR-A RNAi line authors also identified the Ecdysone receptor requirements of a particular subtype of pC1 neurons during metamorphosis. Convincing live calcium imaging showed no apparent effect of EcR-A in neural activity, although some effect on morphology is uncovered. Finally, bulk RNAseq shows differential gene expression after EcR-A down-regulation. 

      Weaknesses: the paper has three main weaknesses. The first one refers to temporal requirements of ptth and ecdysone signaling. Whereas ptth is necessary during larval development, the ecdysone effect appears during pupal development. ptth induces ecdysone synthesis during larval development but there is no published evidence about a similar role for ptth during pupal stages. Furthermore, larval and pupal ecdysone functions are different (triggering metamorphosis vs tissue remodeling). The second caveat is the fact that ptth and ecdysone loss-of-function experiments render opposite effects (enhancing and decreasing copulation rates, respectively). The most plausible explanation is that both functions are independent of each other, also suggested by differential temporal requirements. Finally, in order to identify the effect in the transcriptional response of down-regulating EcR-A in a very small population of neurons, a scRNAseq study should have been performed instead of bulk RNAseq. 

      In summary, despite the authors providing convincing evidence that ptth and ecdysone signaling pathways are involved in female receptivity, the main claim that ptth regulates this process through ecdysone is not supported by results. More likely, they'd rather be independent processes. 

      B1. Clarification: in Figure 3, activation of PTTH neurons during stage 2 inhibited female receptivity. “Stage 2” runs from six hours before the 3rd-instar larval stage to the end of the wandering larval stage (the start of the prepupal stage). In Figure 5, the “pupal stage” runs from the start of the prepupal stage to the end of the pupal stage. This “pupal stage” therefore includes prepupa formation, when the ecdysone peak has not yet disappeared. The periods in which Ptth and EcR-A in pC1 neurons were manipulated are thus continuous. In addition, pC1-Gal4-expressing neurons also appear at the start of the prepupal stage. It is therefore possible that PTTH regulates female receptivity through the function of EcR-A in pC1 neurons.

      B2. During prepupa formation, ptth-Gal4>UAS-Grim flies display changes in gene expression similar to those of genetic control flies in response to a high-titer ecdysone pulse. These include the repression of EcR (McBrayer et al., 2007). We tested whether there is a similar feedforward relationship between PTTH and EcR-A by quantifying the EcR-A mRNA level in the whole body of newly formed PTTH -/- and PTTH -/+ prepupae. Indeed, EcR-A expression was higher in PTTH -/- than in PTTH -/+ flies. Given the role of EcR-A in regulating gene expression, this suggests that PTTH -/- disturbs the regulation of a series of genes during metamorphosis. However, we do not know whether EcR-A expression in pC1 neurons is increased relative to genetic controls when PTTH is deleted. Furthermore, PTTH -/- must affect the development of other neurons, not only pC1 neurons. Thus, the feedforward relationship between PTTH and EcR-A at the start of the prepupal stage is one possible cause of the contradictory effects of PTTH -/- and EcR-A RNAi in pC1 neurons.

      B3. In the future, we will perform single-cell sequencing of pC1 neurons to explore the detailed molecular mechanism of female receptivity.

      Reviewer #2 (Recommendations For The Authors): 

      Additional experiments and suggestions: 

      - torso LOF in the PG to determine whether or not the ecdysone peak regulated by ptth (there is a 1-day delay in pupation) is responsible for the ptth effect in L3. In the same line, what happens if torso is downregulated in the pC1 neurons? Is there any effect on copulation rates? 

      B4. Because of the loss of phm-Gal4, we could not test female receptivity when Torso expression was decreased in the PG. However, decreasing Torso expression in pC1 neurons significantly inhibited female receptivity. This suggests that PTTH regulates female receptivity not only through ecdysone but also by directly regulating dsx+ pC1 neurons.

      - What is the effect of down-regulating ptth in the dsx+ neurons? No ptth RNAi experiments are shown in the paper. 

      B5. We decreased PTTH expression in dsx+ neurons but did not detect a change in female receptivity. We also decreased PTTH expression in PTTH neurons using PTTH-Gal4, and again did not detect a change in female receptivity. Similarly, overexpression through PTTH-Gal4>UAS-PTTH was not sufficient to change female receptivity. It is possible that neither decreasing nor increasing PTTH expression in this way is sufficient to change female receptivity.

      - Why are most copulation rate experiments performed between 4-6 days after eclosion? ptth LOF effect only lasts until day 3 after eclosion (but very weak-fig 1). Again, this supports the idea that ptth and ecdysone effects are unrelated.

      B6. Most behavioral experiments were performed 4-6 days after eclosion, as in most other studies in flies, because female receptivity peaks at that time. Ptth LOF enhanced female receptivity from the first day after eclosion, which resembles precocious puberty. Wild-type females reach high receptivity 2 days after eclosion (about 75% within 10 min). We suppose that the Ptth LOF effect is detectable only until day 3 after eclosion because the receptivity of control flies is by then too high to exceed.

      It is therefore uncertain whether the effect of PTTH -/- on female receptivity disappears after the 3rd day of adulthood, and hence whether the effects of PTTH and of EcR-A in pC1 neurons are unrelated.

      - The fact that pC1d neuronal morphology changes (and not pC1b) does not explain the effect of EcR-A LOF. Despite it is highlighted in the discussion, data do not support the hypothesis. How do these pC1 neurons look like in a ptth mutant animal regarding Calcium imaging and/or morphology? 

      B7. We examined the pattern of pC1 neurons when PTTH is deleted. Consistent with the feedforward relationship between PTTH and EcR-A expression in newly formed prepupae, PTTH deletion resulted in less established pC1-d neurons, the opposite of the effect of EcR-A reduction in pC1 neurons. However, we do not know whether EcR-A expression in pC1 neurons is increased when PTTH is deleted. Furthermore, manipulation of PTTH has a general effect on neurodevelopment and does not only affect pC1 neurons. In addition, the detailed pattern of pC1-b neurons, the key subtype regulating female receptivity, could not be seen clearly either when EcR-A was decreased in pC1 neurons or when PTTH was deleted. Thus, abnormal development of pC1-b neurons, if it occurs, is just one possible reason for the effect of PTTH deletion on female receptivity.

      - The discussion is incomplete, especially the link between ptth and ecdysone; discuss why the phenotype is the opposite (ptth as a negative regulator of ecdysone in the pupa, for instance); the difference in size due to ptth LOF might be related to differential copulation rates.  

      B8. We have revised the Discussion. We cannot exclude an effect of body size on female receptivity when PTTH was deleted or PTTH neurons were manipulated, although there is not enough evidence for such an effect.

      - scheme of pC neurons may help. 

      B9. We tried to label pC1 neurons with GFP and sort them by flow cytometry, but did not succeed. This may be because the number of pC1 neurons in one brain is too low. We will try single-cell sequencing in the future.

      - Immunofluorescence images are too small.

      B10. We have resized the small images.

      Reviewer #3 (Public Review): 

      Summary: 

      This manuscript shows that mutations that disable the gene encoding PTTH cause an increase in female receptivity (they mate more quickly), a phenotype that can be reversed by feeding these mutants the molting hormone, 20-hydroxyecdysone (20E). The use of an inducible system reveals that inhibition or activation of PTTH neurons during the larval stages increases and decreases female receptivity, respectively, suggesting that PTTH is required during the larval stages to affect the receptivity of the (adult) female fly. Showing that these neurons express the sex-determining gene dsx leads the authors to show that interfering with 20E actions in pC1 neurons, which are dsx-positive neurons known to regulate female receptivity, reduces female receptivity and increases the arborization pattern of pC1 neurons. The work concludes by showing that targeted knockdown of EcR-A in pC1 neurons causes 527 genes to be differentially expressed in the brains of female flies, of which 123 passed a false discovery rate cutoff of 0.01; interestingly, the gene showing the greatest down-regulation was the gene encoding dopamine beta-monooxygenase.

      Strengths 

      This is an interesting piece of work, which may shed light on the basis for the observation noted previously that flies lacking PTTH neurons show reproductive defects ("... females show reduced fecundity"; McBrayer, 2007; DOI 10.1016/j.devcel.2007.11.003). 

      Weaknesses: 

      There are some results whose interpretation seems ambiguous and findings whose causal relationship is implied but not demonstrated.

      (1) At some level, the findings reported here are not at all surprising. Since 20E regulates the profound changes that occur in the central nervous system (CNS) during metamorphosis, it is not surprising that PTTH would play a role in this process. Although animals lacking PTTH (rather paradoxically) live to adulthood, they do show greatly extended larval instars and a corresponding great delay in the 20E rise that signals the start of metamorphosis. For this reason, concluding that PTTH plays a SPECIFIC role in regulating female receptivity seems a little misleading, since the metamorphic remodeling of the entire CNS is likely altered in PTTH mutants. Since these mutants produce overall normal (albeit larger--due to their prolonged larval stages) adults, these alterations are likely to be subtle. Courtship has been reported as one defect expressed by animals lacking PTTH neurons, but this behavior may stand out because reduced fertility and increased male-male courtship (McBrayer, 2007) would be noticeable defects to researchers handling these flies. By contrast, detecting defects in other behaviors (e.g., optomotor responses, learning and memory, sleep, etc) would require closer examination. For this reason, I would ask the authors to temper their statement that PTTH is SPECIFICALLY involved in regulating female receptivity.  

      C1. We agree that it is not surprising that PTTH, through ecdysone, regulates the profound changes that occur in the CNS during metamorphosis. Moreover, the behavioral changes in PTTH mutants are not limited to female receptivity. We will temper the statement about the function of PTTH in female receptivity.

      We think our study makes two new points, although more evidence is needed in the future. First, PTTH deletion and the reduction of EcR-A in pC1 neurons during metamorphosis have opposite effects on female receptivity. Second, the development of pC1-b neurons, regulated by EcR-A during metamorphosis, is important for female receptivity.

      (2) The link between PTTH and the role of pC1 neurons in regulating female receptivity is not clear. Again, since 20E controls the metamorphic changes that occur in the CNS, it is not surprising that 20E would regulate the arborization of pC1 neurons. And since these neurons have been implicated in female receptivity, it would therefore be expected that altering 20E signaling in pC1 neurons would affect this phenotype. However, this does not mean that the defects in female receptivity expressed by PTTH mutants are due to defects in pC1 arborization. For this, the authors would at least have to show that PTTH mutants show the changes in pC1 arborization shown in Fig. 6. And even then the most that could be said is that the changes observed in these neurons "may contribute" to the observed behavioral changes. Indeed, the changes observed in female receptivity may be caused by PTTH/20E actions on different neurons.

      C2. As newly formed prepupae, ptth-Gal4>UAS-Grim flies display changes in gene expression similar to those of genetic control flies in response to a high-titer ecdysone pulse. These include the repression of EcR (McBrayer et al., 2007). We tested whether there is a similar feedforward relationship between PTTH and EcR-A by quantifying the EcR-A mRNA level in the whole body of newly formed PTTH -/- and PTTH -/+ prepupae. Indeed, EcR-A was upregulated in PTTH -/- compared with PTTH -/+ prepupae. We also examined the pattern of pC1 neurons when PTTH is deleted. Consistent with the feedforward relationship between PTTH and EcR-A expression in newly formed prepupae, PTTH deletion resulted in less established pC1-d neurons, the opposite of the effect of EcR-A reduction in pC1 neurons.

      However, we do not know whether EcR-A expression in pC1 neurons is increased relative to genetic controls when PTTH is deleted. Furthermore, manipulation of PTTH has a general effect on neurodevelopment. In addition, the detailed pattern of pC1-b neurons, the key subtype regulating female receptivity through EcR-A function in pC1 neurons, could not be seen clearly. Thus, abnormal development of pC1-b neurons, if it occurs, is just one possible reason for the effect of PTTH deletion on female receptivity.

      (3) Some of the results need commenting on, or refining, or revising:  a- For some assays PTTH behaves sometimes like a recessive gene and at other times like a semidominant, and yet at others like a dominant gene. For instance, in Fig. 1D-G, PTTH[-]/+ flies behave like wildtype (D), express an intermediate phenotype (E-F), or behave like the mutant (G). This may all be correct but merits some comment.

      C3. Female receptivity increases with age after eclosion, in PTTH mutants as well as in wild-type flies. On the first day after eclosion (Figure 1D), the loss of PTTH in PTTH[-]/+ flies may not be enough to cause the sexual precocity seen in PTTH -/-. From the second day after eclosion onward (Figure 1E-G), the loss of PTTH in PTTH[-]/+ flies is sufficient to enhance female receptivity compared with wild-type flies. However, after the 2nd day of adulthood, female receptivity increases sharply in all genotypes. From the 3rd day of adulthood onward, the receptivity of PTTH -/- flies reaches its peak, and the receptivity of PTTH[-]/+ flies approaches that of PTTH -/- as the flies get older.

      b - Some of the conclusions are overstated. i) Although Fig. 2E-G does show that silencing the PTTH neurons during the larval stages affects copulation rate (E) the strength of the conclusion is tempered by the behavior of one of the controls (tub-Gal80[ts]/+, UAS-Kir2.1/+) in panels F and G, where it behaves essentially the same as the experimental group (and quite differently from the PTTH-Gal4/+ control; blue line).(Incidentally, the corresponding copulation latency should also be shown for these data.). ii) For Fig. 5I-K, the conclusion stated is that "Knock-down of EcR-A during pupal stage significantly decreased the copulation rate." Although strictly correct, the problem is that panel J is the only one for which the behavior of the control lacking the RNAi is not the same as that of the experimental group. Thus, it could just be that when the experiment was done at the pupal stage is the only situation when the controls were both different from the experimental. Again, the results shown in J are strictly speaking correct but the statement is too definitive given the behavior of one of the controls in panels I and K. Note also that panel F shows that the UAS-RNAi control causes a massive decrease in female fertility, yet no mention is made of this fact.

      C4. i) For all figures, we describe the experimental group as significantly different only when it differs significantly from all control groups. In Figure 2E-G, both control groups differed from the experimental group only at the larval stage. The difference between the two control groups may be due to genetic background. We have described the statistical analysis in more detail in the legend. In addition, the corresponding copulation latency is now shown. ii) For Figure 5, we have revised the conclusion in the text to “when the experiment was done at the pupal stage is the only situation when the controls were both different from the experimental.” In addition, we now mention that the UAS-RNAi control causes a massive decrease in female fertility in panel F.
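
      The decision rule stated in C4 i) can be sketched in a few lines. This is only an illustration and not the analysis code used in the study: the function name, the control labels, the 0.05 threshold and the p-values are all assumptions made for the example.

```python
def differs_from_all_controls(p_values, alpha=0.05):
    """Apply the rule from C4 i): the experimental group counts as
    significantly different only if it differs from EVERY control group.

    p_values: dict mapping a control label to the p-value of its
              comparison with the experimental group (hypothetical values).
    alpha:    significance threshold (0.05 is an assumption here).
    """
    if not p_values:
        raise ValueError("at least one control comparison is required")
    return all(p < alpha for p in p_values.values())


# Hypothetical p-values for one developmental stage, for illustration only.
larval_stage = {"Gal4 control": 0.003, "UAS control": 0.012}
pupal_stage = {"Gal4 control": 0.004, "UAS control": 0.410}

larval_call = differs_from_all_controls(larval_stage)  # both controls differ
pupal_call = differs_from_all_controls(pupal_stage)    # one control does not
```

      Under this rule, a stage in which only one of the two controls differs from the experimental group is not reported as significant, which is the situation described above for the non-larval panels of Figure 2.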

      Reviewer #3 (Recommendations For The Authors): 

      (1) I am not sure that PTTH neurons should be referred to as "PG neurons". I am aware that this name has been used before but the PG is a gland that does not have neurons; it is not even innervated in all insects. 

      C5. Agreed. “PG neurons” has been changed to “PTTH neurons”.

      (2) Fig. 1A warrants some explanation. One can easily imagine what it shows but a description is warranted. 

      C6. Explanation has been added.

      (3) When more than one genotype is compared it would be more useful to use letters to mark the genotypes that are not statistically different from each other rather than simply using asterisks. For instance, in the case of copulation latencies shown in Fig. 1E-G, which result does the comparison refer to? For example, since the comparisons are the result of ANOVAs, which comparison receives "*" in Fig. 1F? Is it PTTH[-]/+ vs PTTH[-]/PTTH[-] or vs. +/+? 

      C7. The genotypes and conditions being compared are now indicated in all figure legends.

      (4) Fig. 1H: Why is copulation latency of PTTH[-]/PTTH[-]+elav-GAL4 significantly different from that of PTTH[-]/PTTH[-]? This merits a comment. Also, why was elav-GAL4 used to effect the rescue and not the PTTH-GAL4 driver? 

      C8. We cannot explain this phenomenon. It may be due to the different genetic backgrounds of the controls. We have mentioned this in the figure legend.

      (5) Fig. 2C, the genotype is written in a confusing order, GAL4+UAS should go together as should LexA+LexAop. 

      C9. We have revised the genotype notation to avoid confusion.

      (6) In Fig. 2, is "larval stage" the same period that is shown in Fig. 3A? Please clarify.

      C10. We have clarified this in the text and legends.

      (7) Fig. 6. The fact that pC1 neurons can be labeled using the pC1-ss2-Gal4 at the start of the pupal stage does not mean that this is when these neurons appear (are born), only when they start expressing this GAL4. Other types of evidence would be needed to make a statement about the birthdate of these neurons. 

      C11. We have revised the description of the appearance of pC1-ss2-Gal4>GFP. The precise birth time of pC1 neurons will be tested in the future.

      (8) The results shown in Fig. 7 are not pursued further and thus appear like a prelude to the next manuscript. Unless the authors have more to add regarding the role of one of the differentially expressed genes (e.g., dopamine beta-monooxygenase, which they single out) I would suggest leaving this result out. 

      C12. We have left this out.

      (9) Female flies lacking PTTH neurons were reported to show lower fecundity by McBrayer et al. (2007) and should be cited. 

      C13. This important study was cited in the original manuscript. In this revision, we cite it again when mentioning the lower fecundity of female flies lacking PTTH neurons.

      (10) Line 230: when were PTTH neurons activated? Since they are dead by 10h post-eclosion it isn't clear if this experiment even makes sense. 

      C14. Yes, we did this to confirm once more that PTTH neurons do not affect female receptivity at the adult stage.

      (11) Line 338: the statements in the figures say that PTTH function is required during the larval stages, not during metamorphosis 

      C15. This has been revised as “The result suggested that EcR-A in pC1 neurons plays a role in virgin female receptivity during metamorphosis. This is consistent with that PTTH regulates virgin female receptivity before the start of metamorphosis.”

      (12) Did the authors notice any abnormal behavior in males? McBrayer et al. (2007) mention that males lacking PTTH neurons show male-male courtship. This may remit to the impact of 20E on other dsx[+] neurons. 

      C16. Yes, we have noticed that males lacking PTTH show male-male courtship. It is possible that PTTH deletion induces male-male courtship through the impact of 20E on other dsx+ or fru+ neurons. We have added the corresponding discussion.

      (13) Line 145: please define CCT at first use 

      C17. CCT has been defined.

      (14) Overall the manuscript is well written; however, it would still benefit from editing by a native English speaker. I have marked a few corrections that are needed, but I probably missed some. 

      + Line 77: "If female is not willing..." should say "If THE female is not willing..." 

      + Line 78 "...she may kick the legs, flick the wings," should say "...she may kick HER legs, flick HER wings," 

      + Lines 93-94 this sentence is unclear: "...while the neurons in that fru P1 promoter or dsx is expressed regulate some aspects..." 

      + Line 108 "...similar as the function of hypothalamic-pituitary-gonadal (HPG).." should say "...similar TO the function of hypothalamic-pituitary-gonadal (HPG).." 

      + Line 152 "Due to that 20E functions through its receptor EcR.." should say "BECAUSE 20E ACTS through its receptor EcR.." 

      + Lines 155, 354 "unnormal" is not commonly used (although it is an English word); "abnormal" is usually used instead. 

      + Line 273: "....we then asked that whether ecdysone regulates" delete "that" 

      + Sentences lines 306-309 need to be revised.

      C18. Thank you for your suggestions. We have revised the text as advised.

    1. Author response:

      The following is the authors’ response to the original reviews.

      The manuscript lacks a conclusion section to summarize the findings. The rebuttal is too simple to state where and in which way the authors have made their revisions. In this case, please return this revision to the authors and ask them to revise their contribution carefully.

      We now indicate in detail where and how we have made revisions. Specific revisions to sentences and words are marked in blue in the main text where necessary. A conclusion is now provided at the end of the main text (lines 264-275). Other major revisions include:

      (1) We have added Fig. 5 as a new figure to reconstruct the ovule structure of Alasemenia and to compare three- and four-winged ovules. This is followed by Fig. 6, which relates to the mathematical analysis.

      (2) We have reorganized the Discussion (including the order of some paragraphs), revised its sentences, and divided it into three parts: “Late Devonian acupulate ovules and their functions” (lines 124-150), “Late Devonian winged ovules and evolution of ovular wings” (lines 151-179), and “Mathematical analysis of wind dispersal of ovules with 1-4 wings” (lines 180-262).

      (3) We have moved the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section from the supplementary information to the main text as the third part of the Discussion (lines 180-262). The original paragraph headed Mathematical analysis in Results has been modified and inserted into the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 250-256). The last paragraph of the original Supplementary information has been greatly modified and is presented at the end of the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 256-262).

      (4) With the move of the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section from the supplementary information to the main text, five references have accordingly been added to the list (lines 278-282, 296-300, 329-330).

      (5) We have changed the citation format in the main text.

      We have therefore returned your manuscript to you to allow you to make the updates necessary to address the editors' comments. Please ensure that you also update your preprint with the newly revised version once complete.

      Many thanks for this opportunity. We have now made the necessary updates to address the editors’ and reviewers’ comments. The new version is also provided as a preprint.

      Reviewer #1 (Public Review):

      Summary:

      Winged seeds or ovules from the Devonian are crucial to understanding the origin and early evolutionary history of wind dispersal strategy. Based on exceptionally well-preserved fossil specimens, the present manuscript documented a new fossil plant taxon (new genus and new species) from the Famennian Series of Upper Devonian in eastern China and demonstrated that three-winged seeds are more adapted to wind dispersal than one-, two- and four-winged seeds by using mathematical analysis.

      Many thanks for these positive comments by the reviewer.

      Strengths:

      The manuscript is well organised and well presented, with superb illustrations. The methods used in the manuscript are appropriate.

      Many thanks for the reviewer’s positive comments.

      Weaknesses:

      I would only like to suggest moving the "Mathematical analysis of wind dispersal of ovules with 1-4 wings" section from the supplementary information to the main text, leaving the supplementary figures as supplementary materials.

      OK, following the suggestion, we have moved the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section to the main text (lines 180-262). It now forms the third part of the Discussion. The original paragraph headed Mathematical analysis in Results has been modified and inserted into the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 250-256). The last paragraph of the original Supplementary information has been greatly modified and is presented at the end of the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 256-262).

      Reviewer #2 (Public Review):

      Summary:

      This manuscript described the second earliest known winged ovule without a cupule in the Famennian of Late Devonian. Using mathematical analysis, the authors suggest that the integuments of the earliest ovules without a cupule, as in the new taxon and Guazia, evolved functions in wind dispersal.

      Yes, these include our description, mathematical analysis and suggestion.

      Strengths:

      The new ovule taxon's morphological part is convincing. It provides additional evidence for the earliest winged ovules, and the mathematical analysis helps to understand their function.

      Many thanks for these positive comments of the reviewer.

      Weaknesses:

      The discussion should be enhanced to clarify the significance of this finding. What is the new advance compared with the Guazia finding? The authors can illustrate the character transformations using a simplified cladogram. The present version of the main text looks flat.

      To clarify the significance of this finding, we have enhanced the Discussion in the following respects. We have reorganized its contents and divided it into three parts, entitled “Late Devonian acupulate ovules and their functions” (lines 124-150), “Late Devonian winged ovules and evolution of ovular wings” (lines 151-179), and “Mathematical analysis of wind dispersal of ovules with 1-4 wings” (lines 180-262). The third part is adapted from the original Supplementary information.

      Regarding the new advance of Alasemenia compared with Guazia, and the illustration of character transformations:

      (1) we now provide a new figure (Fig. 5) to reconstruct the ovule of Alasemenia and to compare the structure of these two ovules.

      (2) in the second part of Discussion, we now say “As in Alasemenia (Fig. 5a), the integumentary wings of acupulate ovule of Guazia are broad, thin and fold inwards along the abaxial side, but their numbers are four in each ovule and their free portions usually arch centripetally (Fig. 5c; Wang et al., 2022, Figure 5).”

      (3) also in the second part of Discussion, we now say “Compared to Warsteinia with short and straight wings and Guazia with long but distally inwards curving wings, Alasemenia with longer and outwards extending wings would efficiently reduce the rate of descent and be more capably moved by wind. Furthermore, the quantitative analysis in mathematics indicates that three-winged ovules such as Alasemenia are more adapted to wind dispersal than four-winged ovules including Warsteinia and Guazia (see following).”

      (4) in the third part of Discussion, we now say “Significantly, the maximum windward area of each wing of Alasemenia is greater than that of Guazia and Warsteinia with four wings. All these factors suggest that Alasemenia is well adapted for anemochory.”

      (5) in Conclusion, we now say “Compared to Famennian four-winged ovules of Warsteinia and Guazia, Alasemenia with three distally outwards extending wings shows advantage in anemochory.”

      Recommendations for the authors:

      OK, we have made some revisions and retained some of the original content.

      Reviewer #1 (Recommendations For The Authors):

      I would only like to suggest moving the "Mathematical analysis of wind dispersal of ovules with 1-4 wings" section from the supplementary information to the main text, leaving the supplementary figures as supplementary materials.

      OK, following the suggestion, we have moved the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section to the main text (lines 180-262). It now forms the third part of the Discussion.

      Reviewer #2 (Recommendations For The Authors):

      (1) The mathematical part as the supplement can be incorporated into the text.

      OK, following the suggestion, we have moved the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section to the main text (lines 180-262). It now forms the third part of the Discussion. The original paragraph headed Mathematical analysis in Results has been modified and inserted into the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 250-256). The last paragraph of the original Supplementary information has been greatly modified and is presented at the end of the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 256-262).

      (2) The comparisons between three- or four-winged ovules are not addressed enough.

      We have added Fig. 5 as a new figure. Based on this figure and the revisions, the comparisons between three- and four-winged ovules now include:

      a) “Their integumentary wings illustrate diversity in number (three or four per ovule), length, folding or flattening, and being straight or curving distally. As in Alasemenia (Fig. 5a), the integumentary wings of acupulate ovule of Guazia are broad, thin and fold inwards along the abaxial side, but their numbers are four in each ovule and their free portions usually arch centripetally (Fig. 5c; Wang et al., 2022, Figure 5). In contrast to Alasemenia, Warsteinia has four integumentary wings without folding and their free portions are short and straight (Rowe, 1997, TEXT-FIG. 4).” (lines 154-160).

      b) “Furthermore, the quantitative analysis in mathematics indicates that three-winged ovules such as Alasemenia are more adapted to wind dispersal than four-winged ovules including Warsteinia and Guazia (see following).” (lines 166-168).

      c) “The relative wind dispersal efficiency of three-winged seeds is obviously better than that of single- and two- winged seeds, and is close to that of four-winged seeds (Fig. 6). In addition, three-winged seeds have the most stable area of windward, which also ensures the motion stability in wind dispersal. Significantly, the maximum windward area of each wing of Alasemenia is greater than that of Guazia and Warsteinia with four wings.” (lines 256-261).

      d) “Compared to Famennian four-winged ovules of Warsteinia and Guazia, Alasemenia with three distally outwards extending wings shows advantage in anemochory.” (lines 272-274).

      (3) The significance of this finding should be well summarized with solid evidence.

      The significance is summarized in the Abstract (lines 19-28) and is now further summarized in the newly provided Conclusion (lines 264-275).

    1. Author response:

      Reviewer #1

      - The entire study is based on only 2 adult animals, that were used for both the single cell dataset and the HCR. Additionally, the animals were caught from the ocean preventing information about their age or their life history. This makes the n extremely small and reduces the confidence of the conclusions. 

      This statement is incorrect. While the scRNAseq was indeed performed in two animals (n=2), the HCR-FISH was performed in 3-5 animals (depending on the probe used). These were different animals from those used for the scRNAseq. We are partly responsible for this confusion, since we did not state the number of animals used for the HCR-FISH in the manuscript. 

      - All the fluorescent pictures present in this manuscript present red nuclei and green signals being not color-blind friendly. Additionally, many of the images lack sufficient quality to determine if the signal is real. Additional images of a control animal (not eviscerated) and of a negative control would help data interpretation. Finally, in many occasions a zoomed out image would help the reader to provide context and have a better understanding of where the signal is localized. 

      Fluorescent photos will be changed to color-blind friendly colors. 

      Diagrams, arrows and new photos will be included to guide readers to the signal or labeling in cells. In the original manuscript, 6 out of 7 cluster validations included a photo of a normal, non-eviscerated control. We will make certain that this is highlighted in the resubmission and that ALL figures with HCR-FISH labeling include data from control animals.

      - The Authors frequently report the percentage of cells with a specific feature (either labelled or expressing a certain gene or belonging to a certain cluster). This number can be misleading since that is calculated after cell dissociation and additional procedures (such as staining or sequencing and dataset cleanup) that can heavily bias the ratio between cell types. Similarly, the Authors cannot compare cell percentage between anlage and mesentery samples since that can be affected by technical aspects related to cell dissociation, tissue composition and sequencing depth. 

      The Reviewer has correctly identified the limitations of using cell percentages in scRNA-seq analyses. However, these percentages do offer a general overview of the sequenced cell populations and highlight potential differences between samples. In addition, as the Reviewer notes, these percentages not only expose the shortcomings of the dissociation methods but also provide some explanation for the absence of particular cell populations, as we describe in the manuscript. In our resubmission, we will acknowledge these limitations and inform readers of any potential biases introduced by relying on these numbers.

      - The Authors decided to validate only a few clusters and in many cases there are no positive controls (such as specific localization, specific function, changes between control and regenerating animals, co-stain) that could actually validate the cluster identity and the specificity of the selected marker. There is no validation of the trajectory analysis and there is no validation of the proliferating cluster with H3P or BrdU stainings. 

      We validated the seven clusters that were important for reaching our conclusions. Six of these had controls of normal (uneviscerated) intestine. Nonetheless, we will increase the number of cluster validations and include validation of the dividing cell cluster using BrdU.

      - It is not clear what is already known about holothurian intestine regeneration and what are the new findings in this manuscript. The Authors reference several papers throughout the whole result sectioning mentioning how the steps of regeneration, the proliferating cells, some of the markers and some of the cell composition of mesenteries and anlages was already known. 

      The manuscript presents several novel findings on holothurian intestine regeneration, including:

      - The integration of multiple cellular processes, reported for the first time within a single species, along with the identification of the specific mRNAs expressed by each involved cell population.

      - A comparative analysis of the sea cucumber anlage structure, highlighting its similarities to previously described blastemal structures.

      - The identification of the potential dedifferentiated cell populations that form the foundation of the anlage, serving as the epicenter for proliferating and differentiating cells.

      We will ensure that these and other significant findings are prominently emphasized in the resubmitted manuscript.

      Reviewer #2

      - The spatial context of the RNA localization images is not well represented, making it difficult to understand how the schematic model was generated from the data. In addition, multiple strong statements in the conclusion should be better justified and connected to the data provided.

      As explained above we will make an effort to provide a better understanding of the cellular/tissue localization of the labeled cells. Similarly, we will revise the conclusions so that the statements made are well justified.

      Reviewer #3

      - Possible theoretical advances regarding lineage trajectories of cells during sea cucumber gut regeneration, but the claims that can be made with this data alone are still predictive.

      We are aware that the results from these lineage trajectories are still predictive and will emphasize this in the text. Nonetheless, they are an important part of our analyses and provide the theoretical basis for future experiments.

      - Better microscopy is needed for many figures to be convincing. Some minor additions to the figures will help readers understand the data more clearly.

      As explained above we will make an effort to provide a better understanding of the cellular/tissue localization of the labeled cells. Similarly, we will revise the conclusions so that the statements made are well justified.

    1. Author response:

      We sincerely appreciate the reviewers' time, effort, and thoughtful feedback, which have significantly contributed to our research.

      A key concern raised was the potential overinterpretation of our data. While the reviewers acknowledged our identification of a possible synchronization mechanism among active mitral and tufted cells (MTCs) that is distance-independent, they correctly pointed out that we did not provide direct evidence showing how ensemble MTCs synchronize. We concur with their assessment and will address this in our forthcoming response to ensure a precise interpretation of our findings.

      Another concern raised involves the interpretation of results obtained under ketamine anesthesia. Since ketamine is an antagonist of NMDA receptors, which play a crucial role in MTC-GC reciprocal synapses, this might impact our conclusions. To address this, we will include analyses demonstrating that optogenetic activation of granule cells (GCs) in an anesthetized state inhibits recorded MTCs during baseline but does not affect odor-evoked MTC firing rates. Additionally, we will thoroughly discuss the potential influence of ketamine anesthesia on GC-MTC synapses and its implications for our findings.

      Lastly, in our detailed response to the reviewers' comments, we will discuss several recent studies that are particularly relevant to our research. We will also expand on our hypothesis that parvalbumin-positive cells in the olfactory bulb may serve as key mediators of the activity- and distance-dependent lateral inhibition observed in our findings.

    1. Author response:

      General comments, factual mistakes:

      Reviewer 1 - Summary: “This study builds on the observation that the kynurenine pathway is required in the conceptus, as HOO null embryos are sensitive to maternal deficiency of NAD precursors (vitamin B3) and tryptophan, and narrows the window of sensitivity to a 3-day period.”

      Correction:

      Vitamin B3 should not be in parentheses, because vitamin B3 and tryptophan are both NAD precursors. We also suggest that the second half of this sentence is changed to “…and narrows the window of sensitivity to a 3-day period from embryonic day 7.5 to E10.5.” Currently, it reads as if Haao-null embryos are sensitive to any 3-day period of maternal NAD precursor restriction.

      Reviewer 1 – Strengths: “Abnormalities develop under conditions of maternal vitamin B3 deficiency, indicating…”

      Correction:

      We suggest replacing “vitamin B3 deficiency” with “NAD deficiency”, as this is more accurate.

      Reviewer 2 – Strengths: “…and then re-analysis of RNA-seq datasets suggested the endoderm was the cell source of NAD synthesis.”

      Correction:

      We suggest re-phrasing this sentence to “…and then re-analysis of RNA-seq datasets suggested the yolk sac endoderm cells are the source of NAD de novo synthesis.”

      Reviewer 1 (Public Review):

      However, without analysis of embryos at later stages in this experiment it is not known how long is needed for NAD synthesis to be recovered - and therefore until when the period of exposure to insufficient NAD lasts. This information would inform the understanding of the developmental origin of the observed defects.

      We are currently seeking funds to investigate the developmental origin of the observed defects. This study includes assessing how the timing of maternal NAD precursor restriction corresponds to the timing of NAD deficiency in the embryo.

      More importantly, there is still a question of whether in addition to the yolk sac, there is HAAO activity within the embryo itself prior to E12.5 (when it has first been assayed in the liver - Figure 1C).

      We have additional data showing that at E11.5 the embryo has no HAAO activity. We also tested E14.5 embryos with their livers removed, and these also do not have HAAO activity. We are planning to include these data sets in the revised version of this manuscript.

      Reviewer 2 (Public Review):

      Page 4 and Table S4. The descriptors for malformations of organs such as the kidney and vertebrae are quite vague and uninformative. More specific details are required to convey the type and range of anomalies observed as a consequence of NAD deficiency.

      Kidney defects were classified as described in Cuny et al. 2020 PNAS (PMID:32015132). In brief, kidneys with a tip-to-tip length of ≤ 1.5 mm were counted as hypoplastic, because the average length of a normal kidney at E18.5 is 2.98 mm (2.75-3.375 mm). The one dysmorphic kidney we observed in our dataset had a cyst. We plan to include this information plus more details of the observed vertebral defects in the revised version of this manuscript.
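
      The length criterion above can be written as a simple threshold check. This is a minimal sketch: the function and constant names are hypothetical, only the 1.5 mm cutoff and the E18.5 reference lengths come from the response, and dysmorphic features such as cysts are not captured by a length measurement.

```python
# Cutoff quoted in the response: tip-to-tip length of <= 1.5 mm at E18.5.
HYPOPLASTIC_MAX_LENGTH_MM = 1.5


def is_hypoplastic(length_mm):
    """Return True if an E18.5 kidney of this tip-to-tip length counts as hypoplastic."""
    if length_mm <= 0:
        raise ValueError("kidney length must be a positive number of millimetres")
    return length_mm <= HYPOPLASTIC_MAX_LENGTH_MM


# The reported normal lengths (2.75-3.375 mm, mean 2.98 mm) lie well above the cutoff.
normal_mean_is_hypoplastic = is_hypoplastic(2.98)
boundary_is_hypoplastic = is_hypoplastic(1.5)
```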

      Can the authors define whether the role of the NAD pathway in a couple of tissue or organ systems is the same? By this I mean, is the molecular or cellular effect of NAD deficiency the same in the vertebrae and in organs such as the kidney? What unifies the effects on these specific tissues and organs and are all tissues and organs affected? If some are not, can the authors explain why they escape the need for the NAD pathway?

      We agree that this is a very important question, but consider it beyond the scope of this manuscript. Elucidating the underlying cellular and molecular mechanisms in individual organs will require a multiomic approach because NAD is involved in hundreds of molecular and cellular processes affecting gene expression, protein levels, metabolism, etc. For details of NAD functions that have relevance to embryogenesis see Dunwoodie et al. 2023 https://doi.org/10.1089/ars.2023.0349. Furthermore, organs develop at different times during embryogenesis through molecular and cellular processes that are distinct but in some cases shared. Relating these to specific NAD functions is the challenge. We are currently seeking funds to investigate how NAD deficiency disrupts organogenesis.

      Page 5 and Figure 6C. The expectation and conclusion for whether specific genes are expressed in particular cell types in scRNA-seq datasets depend on the number of cells sequenced, the technology (methodology) used, the depth of sequencing, and also the resolution of the analysis. It is therefore essential to perform secondary validation of the analysis of scRNA-seq data. At a minimum, the authors should perform in situ hybridization or immunostaining for Tdo2, Afmid, Kmo, Kynu, Haao, Qprt, and Nadsyn1 or some combination thereof at multiple time points during early mouse embryogenesis to truly understand the spatiotemporal dynamics of expression and NAD synthesis.

      We have tested antibodies against HAAO, KYNU, and QPRT in adult mouse liver samples (the liver being the main site of NAD de novo synthesis), and they produced non-specific bands on western blots. Therefore, in situ immunostaining studies on embryonic tissues are not feasible. We will investigate the possibility of effectively localizing transcripts of NAD de novo synthesis enzymes using in situ hybridization.

      Absolute functional proof of the yolk sac endoderm as being essential and required for NAD synthesis in the context of CNDD might require conditional deletion of Haao in the yolk sac versus embryo using appropriate Cre driver lines or in the absence of a conditional allele, could be performed by tetraploid embryo-ES cell complementation approaches. But temporal dietary intervention can also approximate the same thing by perturbing NAD synthesis when the yolk sac is the primary source versus when the liver becomes the primary source in the embryo.

      Reviewer 1 has a related comment. We have additional data showing that at E11.5 the embryo, like the placenta, has no HAAO activity. Similarly, E14.5 embryos with their livers removed do not have HAAO activity either. We believe this provides sufficient proof that the yolk sac endoderm is the only site of NAD de novo synthesis activity in the conceptus until the liver has formed and takes over this function.

    1. Author response:

      We are grateful to the reviewers for recognizing the importance of our work and for their helpful suggestions, which we will address in the revised version of our manuscript. In the meantime, we would like to provide provisional responses to the key questions and comments from the reviewers.

      (1) Both reviewers asked why we chose 24-120 hpf to measure the apoptotic rates. We chose this time window for the following two reasons: 1) Previous studies showed that although the motor neuron death time windows vary in chick (E5-E10), mouse (E11.5-E15.5), rat (E15-E18) and human (11-25 weeks of gestation), the common feature of these time windows is that they are all developmental periods during which motor neurons make contact with muscle cells. The contact between zebrafish motor neurons and muscle cells occurs before 72 hpf, which is included in our observation time window. 2) Zebrafish complete hatching during 48-72 hpf, and most organs form before 72 hpf. More importantly, zebrafish start swimming around 72 hpf, indicating that motor neurons are fully functional.

      Thus, we are confident that this 24-120 hpf window covers the period during which motor neurons undergo programmed cell death in early zebrafish development. We frequently used “early development” in this manuscript to describe our observations. However, we omitted “early” from our title and will add it in the revised version.

      (2) Both reviewers also asked about the neurogenesis of motor neurons. Previous studies have shown that the production of spinal cord motor neurons largely ceases before 48 hpf and then the motor neurons remain largely constant until adulthood. Our observation time window covers the major motor neuron production process. Therefore, we believe that neurogenesis will not affect our data and conclusions.

      (3) Both reviewers questioned the specificity of using the mnx1 promoter to label motor neurons. The mnx1 promoter has been widely used to label motor neurons in transgenic zebrafish. Previous studies have shown that most of the cells labeled in the mnx1 transgenic zebrafish are motor neurons. In this study, we observed that the neuronal cells in our sensor zebrafish formed green cell bodies inside of the spinal cord and extended to the muscle region, which is an important morphological feature of the motor neurons. Furthermore, a few of those green cell bodies turned into blue apoptotic bodies inside the spinal cord and changed to blue axons in the muscle regions at the same time, which strongly suggests that those apoptotic neurons are not interneurons. Although the mnx1 promoter might have labeled some interneurons, this will not affect our major finding that only a small portion of motor neurons died during zebrafish early development.

      (4) Reviewer 2 is concerned that the estimated 50% of motor neuron death was in limb-innervating motor neurons but not in body wall-innervating motor neurons. The death of limb-innervating motor neurons has been extensively studied in chicks and rodents, as limbs are amenable to operations such as amputation. However, previous studies have shown that this dramatic motor neuron death occurs not only in limb-innervating motor neurons but also in other spinal cord motor neurons. In our manuscript, we studied the naturally occurring motor neuron death in the whole spinal cord during the early stage of zebrafish development.

      (5) Reviewer 2 mentioned that we ignored the death of an identified motor neuron. Our study was to examine the overall motor neuron apoptosis rather than a specific type of motor neuron death, so we did not emphasize the death of VaP motor neurons. We agree that the dead motor neurons observed in our manuscript contain VaP motor neurons. However, there were also other types of dead motor neurons observed in our study. The reasons are as follows: 1) VaP primary motor neurons die before 36 hpf, but our study found motor neuron cells died after 36 hpf and even at 84 hpf. 2) The position of the VaP motor neuron is together with that of the CaP motor neuron, that is, at the caudal region of the motor neuron cluster. Although it’s rare, we did observe the death of motor neurons in the rostral region of the motor neuron cluster. 3) There is only one or zero VaP motor neuron in each hemisegment. Although our data showed that usually one motor neuron died in each hemisegment, we did observe that sometimes more than one motor neuron died in the motor neuron cluster. We will include this information in the revised manuscript.

      (6) For the morpholinos, we did not confirm the downregulation of the target genes. These morpholino-related data are a minor part of our manuscript and should not affect our major findings. Thus, we didn’t think we missed “important” controls. We will perform experiments to confirm the efficiency of the morpholinos or remove these morpholino-related data from the revised version.

    1. Author Response:

      We would like to thank the editors and reviewers for the careful consideration of our manuscript and their many helpful comments. We would like to provide provisional author responses to address the public reviews.

      Response to Reviewer 1:

      Weaknesses:

      While this study convincingly describes the phenotype seen upon Drp1 loss, my major concern is that the mechanism underlying these defects in zygotes remains unclear. The authors refer to mitochondrial fragmentation as the mechanism ensuring organelle positioning and partitioning into functional daughters during the first embryonic cleavage. However, could Drp1 have a role beyond mitochondrial fission in zygotes? I raise these concerns because, as opposed to other Drp1 KO models (including those in oocytes) which lead to hyperfused/tubular mitochondria, Drp1 loss in zygotes appears to generate enlarged yet not tubular mitochondria. Lastly, while the authors discard the role of mitochondrial transport in the clustering observed, more refined experiments should be performed to reach that conclusion.

      It would be difficult to answer from this study whether Drp1 has a role beyond mitochondrial fission in zygotes. However, there are several possible reasons why the Drp1 KO zygotes differ from the somatic cell Drp1 KO models.

      First, the reviewer mentions that the loss of Drp1 in oocytes leads to hyperfused/tubular mitochondria, but in fact, unlike in somatic cells, the EM images in Drp1 KO oocytes show enlarged mitochondria rather than tubular structures (Udagawa et al. Current Biology 2014, Fig. 2C and Fig. S1B-D), as in the case of zygotes in this study.

      These mitochondrial morphologies in Drp1-deficient oocytes/zygotes may be attributed to the unique mitochondrial architecture in these cells. Mitochondria in oocytes have the shape of a small sphere with irregular cristae located peripherally or transversely. These structural features might be the cause of insensitivity or resistance to inner membrane fusion. In addition, in our previous study (Wakai et al., Molecular Human Reproduction 2014, Fig. 2), overexpression of mitochondrial fusion factors in oocytes resulted in mitochondrial aggregation when the outer membrane fusion factor Mfn1/Mfn2 was overexpressed, while overexpression of Opa1 did not cause any morphological changes. Thus, while mitochondria in oocytes/zygotes divide actively, complete fusion, including the inner membrane, as seen in somatic cells, is unlikely to occur.

      As for mitochondrial transport, we do not entirely discard its role. Although mitochondrial intrinsic dynamics such as fission are of primary importance for mitochondrial distribution and partitioning in embryos, the regulation of these dynamics by the cytoskeleton may be important and thus needs further study, as the reviewer pointed out.

      Response to Reviewer 2:

      Weaknesses:

      The authors first describe the redistribution of mitochondria during normal development, followed by alterations induced by Drp1 depletion. It would be useful to indicate the time post-hCG for imaging of fertilised zygotes (first paragraph of the results/Figure 1) to compare with subsequent Drp1 depletion experiments.

      We will indicate the time after hCG as the reviewer pointed out. The only problem is that in this experiment, there may be a slight deviation from the actual mitochondrial distribution change (Fig. S1A) due to the manipulation time for Trim-Away (since it was performed outside of the incubator). Also, no significant delay in pronuclear formation or embryonic development was observed with Drp1 depleted zygotes.

      It is noted that Drp1 protein levels were undetectable 5h post-injection, suggesting earlier times were not examined, yet in Figure 3A it would seem that aggregation has occurred within 2 hours (relative to Figure 1).

      As the reviewer pointed out, the depletion of Drp1 is likely to have occurred at an earlier stage. In this study, because various RNAs were injected to visualize organelles such as mitochondria and chromosomes, observations were started after about 5 hours of incubation to allow their fluorescent proteins to be sufficiently expressed. Therefore, for the western blotting analysis, samples were collected to reflect their condition at the start of the observation.

      Mitochondria appear to be slightly more aggregated in Drp1 fl/fl embryos than in control, though comparison with untreated controls does not appear to have been undertaken. There also appears to be some variability in mitochondrial aggregation patterns following Drp1 depletion (Figure 2-suppl 1 B) which are not discussed.

      We would like to add quantitative data on mitochondrial aggregation in Drp1-depleted embryos.

      The authors use western blotting to validate the depletion of Drp1, however do not quantify band intensity. It is also unclear whether pooled embryo samples were used for western blot analysis.

      We would like to add quantitative results for the band intensities in the western blot analysis. The number of embryos analyzed is given in the figure legends; pooled samples of 20 (Fig. 4) to 30 (Fig. 2) embryos were used.

      Likewise, intracellular ROS levels are examined however quantification is not provided. It is therefore unclear whether 'highly accumulated levels' are of significance or related to Drp1 depletion.

      We will present quantitative results on the accumulation of ROS.

      In previous work, Drp1 was found to have a role as a spindle assembly checkpoint (SAC) protein. It is therefore unclear from the experiments performed whether aggregation of mitochondria separating the pronuclei physically (or other aspects of mitochondrial function) prevents appropriate chromosome segregation or whether Drp1 is acting directly on the SAC.

      It has been reported that Drp1 regulates the meiotic spindle through the spindle assembly checkpoint (SAC) (Zhou et al., Nature Communications 2022). We would like to mention the possibility raised by the reviewer in the Discussion.

      Response to Reviewer 3:

      Seemingly, there are few apparent shortcomings. Following are the specific comments to activate the further open discussion.

      - Line 246: Comments on cristae morphology of mitochondria in Drp1-depleted embryos would better be added.

      We would like to add a comment regarding cristae morphology.

      - Regarding Figure 2H: If possible, a representative picture of Ateam would better be included in the figure. As the authors discussed in line 458, Ateam may be able to detect whether any alterations of local energy demand occurred in the Drp1-depleted embryos.

      ATeam fluorescence is analyzed using a regular fluorescence microscope, not a confocal laser microscope, in order to analyze the intensity in the whole embryo (or the whole blastomere). Therefore, we are currently unable to obtain images of localized areas within the cell (e.g., around the spindle) as expected by the reviewer; as shown in the images in Figure 3-figure supplement 1C, there is a tendency to see high ATP levels at the cell periphery, but further analysis is needed for clear and definitive results.

      - Line 282: In Figure 3-Video 1, mitochondria were seemingly more aggregated around female pronucleus. Is it OK to understand that there is no gender preference of pronuclei being encircled by more aggregated mitochondria?

      Aggregated mitochondria are localized toward the cell center, but do not behave in such a way that they are preferentially concentrated near the female pronucleus.

      - Line 317: A little more explanation of the "variability" would be fine. Does that basically mean that the Ca2+ response in both Drp1-depleted blastomeres were lower than control and blastomere with more highly aggregated mitochondria show severer phenotype compared to the other blastomere with fewer mito?

      We assume that what the reviewer has pointed out is right. However, although we were able to show the bias in Ca2+ store levels between blastomeres of Drp1-depleted embryos, we did not stain mitochondria simultaneously, so we are unable to provide details such as whether there are more Ca2+ stores in the blastomere that inherited more mitochondria or fewer Ca2+ stores in the blastomere with more aggregated mitochondria.

      - Regarding Figure 5B (& Figure 1-figure supplement 1B): Do authors think that there would be less abnormalities in the embryos if Drp1 is trim-awayed after 2-cell or 4-cell, in which mitochondria are less involved in the spindle?

      The marked accumulation of mitochondria around the spindle is unique to the first cleavage and seems to be coincident with the migration of the pronuclei toward the center. Since the process of assembly of the male and female pronuclei is also an event unique to the first cleavage, abnormalities such as binucleation due to mitochondrial misplacement are thought to be a phenomenon seen only in the first cleavage. Therefore, if Drp1 is depleted at the 2-cell or 4-cell stage, chromosome segregation errors may be less frequent. However, since unequal partitioning of mitochondria is thought to occur, some abnormalities in embryonic development are likely to be observed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Strengths

      We thank the reviewer for recognizing the strengths of our in vivo Ca2+ measurements, super resolution microscopy and assessment of the secretory dysfunction in the Sjögren’s syndrome mouse model.

      Weaknesses

      Point 1: The less restricted Ca2+ signal to the apical region of the acinar cell is not really relevant to the reduced activation of TMEM16a by a local signal at the apical plasma membrane.

      We agree that the spatially averaged Ca2+ signal is not indicative of the local Ca2+ signal that activates TMEM16a. The description of the disordered Ca2+ signal in the disease model was intended to simply convey that the Ca2+ signal is altered in the model. Whether or indeed how the altered spatial characteristics of the signal are deleterious is not known but we speculate in the discussion that this contributes to the ultrastructural damage observed.

      Point 2. Secretion is decreased but the amplitude of the globally averaged Ca2+ signals are increased. No proof is offered that the greater distance between IP3R and TMEM16a is the reason for decreased secretion in the face of this increased peak signal.

      We have now added new data that indicate that the local Ca2+ signal is indeed disrupted in the disease model. We show that in control animals, activation of TMEM16a by application of agonist occurs when the pipette is buffered with the slower buffer EGTA but not with the fast buffer BAPTA. In contrast, in cells isolated from DMXAA-treated animals, both EGTA and BAPTA abolish the agonist-induced currents (new Figure 6). These data are consistent with our super resolution data showing that the distance between IP3R and TMEM16a is greater, and thus presumably sufficient to allow buffering of Ca2+ release from IP3R such that it does not effectively activate TMEM16a. These data also suggest that the increased amplitude of the spatially averaged Ca2+ signal is not sufficient to overcome this structural change.

      Point 3. Lack of evidence that the mitochondrial changes are associated with the defect in fluid secretion.

      We agree that a causal link between the decreased secretion and altered mitochondrial morphology and function is not established. Nevertheless, we feel it is reasonable to contend that profound changes in mitochondrial morphology observed at the light and EM level, together with changes in mitochondrial membrane potential and oxygen consumption are consistent with contributing to altered fluid secretion given that this is an energetically costly process. We have altered the discussion to reflect these caveats and ideas.

      Reviewer 2:

      We thank the reviewer for their assessment of our work and constructive comments.

      Reviewer 3:

      We thank the reviewer for their careful appraisal of our manuscript and insightful comments. 

      Point 1: Are all the effects of DMXAA mediated through the STING pathway?

      This is an important point because, as noted, DMXAA has been reported to inhibit NAD(P)H quinone oxidoreductase, which could contribute to the phenotype reported here. In future studies we intend to test other STING pathway agonists such as MSA-2 and perhaps antagonists of the STING pathway. We have added text to the discussion indicating that not all of the effects observed may be a result of activation of the STING pathway.

      Point 2: As noted, and clarified in the text, the driving force for ATP production is the electrochemical H+ gradient which establishes the mitochondrial membrane potential.

      Point 3: The reviewer suggested there was a decrease in mitochondrial membrane potential in the absence of a change in TMRE steady state.

      We apologize for the confusion generated by the presentation of the figure. We normalized TMRE fluorescence against MitoTracker Green fluorescence but, as shown, the figure does not reflect that the absolute TMRE fluorescence was indeed decreased. Supplemental figure 4 now shows the basal TMRE fluorescence.

      Point 4: Indications that the disruption to ER structure seen in Electron Micrographs contributes to the changes in Ca2+ signal and fluid secretion.

      We did not focus on the relative distance between ER and apical PM in the EMs primarily because the ER that projects towards the apical PM is a relatively minor component of the specialized ER expressing IP3R and is difficult to identify. We note that the disruption of the bulk ER as quantitated by altered ER-mitochondrial interfaces and fragmentation is consistent with our super resolution data and thus likely plays a role in the mechanism that results in dysregulated Ca2+ signals and reduced secretion.

      Recommendations to Authors:

      Reviewing Editor:

      (1) The Editor suggests that we should use the activity of TMEM16a to directly measure the [Ca2+] experienced by the channel.

      We now present additional new data. First, we show an extended range of pipette [Ca2+] demonstrating identical Ca2+ sensitivity in DMXAA vs vehicle treated cells (Figure 5). Second, importantly, we now present data evaluating the ability of muscarinic stimulation to activate TMEM16a in the presence of either EGTA (slow Ca2+ buffer) or BAPTA (fast Ca2+ buffer). Notably, currents can be stimulated in control cells when the pipette is buffered with EGTA, but not in DMXAA treated cells. BAPTA inhibits activation in both situations (new Figure 6). These data are consistent with TMEM16a being activated by Ca2+ in a microdomain and that this is disrupted in the disease model.

      (2) The Editor asks whether a decrease in IP3R3 in a subset of the samples could account for the decreased fluid secretion.

      We think this is unlikely given, as noted by the Editor, that a reduction only occurred in a subset of the samples and statistically there was no significant difference to vehicle-treated animals. Moreover, we would note that there is also no difference in the expression of IP3R2 between experimental groups and in studies of transgenic mice where either IP3R2 or IP3R3 were knocked out individually, there was no effect on salivary fluid secretion, indicating that expression of a single subtype can support stimulus-secretion coupling.

      (3) Absolute values for changes in fluorescence (over time) should be included together with SD images.

      These have been added in Figure 3.

      (4) DMXAA has additional effects to STING activation and thus other STING pathway modulators should be used.

      We agree that additional STING agonists should be explored in the future but believe that this is beyond the scope of the present studies. Additional text has been added to the discussion acknowledging the additional targets of DMXAA and that they could contribute to the phenotype.

      (5) No causal link between the observed Ca2+ changes and mitochondrial dysfunction.

      We agree that no experimental evidence is offered to directly support this contention. Nevertheless, dysregulated Ca2+ signals are well-documented to lead to altered mitochondrial structure and function and thus we feel it not unreasonable to speculate that this is a possibility.

      (6) The paper would be improved by directly assessing mechanistic connections between altered Ca2+ signaling and TMEM16a activation.

      We agree, please refer to point 1 and new figure 6.

      Reviewer 1:

      (1) Standard Deviation images should be explained and the location of ROI identified.

      We contend that Standard Deviation images provide an effective visualization (in a single image) of both the magnitude of the Ca2+ increase and the degree of recruitment of cells in the field of view during the entire period of stimulation.  We have added text to describe the utility of this technique. Nevertheless, we now show kinetic traces of the changes in fluorescence over time in both apical and basal regions in Figure 3. We also clarify that the traces shown in Figure 2 are averaged over the entire cell. 

      (2) The Authors should consider that reduced secretion is because cells are dying.

      We believe this is unlikely given the lack of morphological changes in glandular structure and the minor lymphocyte infiltration observed in this model. Nevertheless, we now add data showing that the mass of SMG is not altered in the DMXAA-treated animals compared with vehicle-treated (Figure 1E).

      (3) The role of mitochondria in the DMXAA phenotype is unclear. What is the effect of acutely de-energizing mitochondria on fluid secretion.

      Since fluid secretion is an energetically expensive undertaking, it is not unreasonable to suggest that compromised mitochondrial function may impact secretion. That being said, this could occur at multiple levels: production of ATP to fuel the Na/K pump to establish membrane gradients, or to provide energy to sequester Ca2+, among a multitude of targets. This will be a subject of ongoing experiments. We contend that experiments to acutely disrupt salivary mitochondria in vivo while assessing fluid secretion would be difficult to perform and interpret, given that local administration of agents to SMG would not affect the other major salivary glands and systemic administration would be predicted to have wide-ranging off-target effects.

      (4) Could a subset of cells with low IP3R numbers contribute to reduced fluid secretion?

      Please see the response to the Reviewing Editor’s point 2.

      (5) An attempt to estimate the effect of the spatial disruption of IP3R and TMEM16a localization should be made.

      Please see the response to the Reviewing Editor’s point 1.

      Minor Points

      We have amended the statement from “Highly expressed” to “increased”.

      Regions of the cell have been labelled for orientation in the line scans.

      The molecular weight markers have been added in Figure 4.

      Reviewer 2:

      (1) Whether mitochondrial dysfunction is the initiator of the phenotype or a result of the dysregulated Ca2+ signal is unclear.

      We agree that our data do not resolve a classic “Chicken vs Egg” conundrum. We plan further experiments to address this issue. Future plans include repeating the mitochondrial and Ca2+ signaling experiments at earlier time points where we know fluid secretion is not yet impacted. This may potentially reveal the temporal sequence of events. Similarly, we plan experiments to mechanistically address why the global Ca2+ signal is augmented: reduced Ca2+ clearance or enhanced Ca2+ release/influx are possibilities. We speculate that reduced Ca2+ clearance, either because mitochondrial Ca2+ uptake is reduced or as a secondary consequence of reduced ATP levels on SERCA and PMCA, is a likely possibility.

      (2) Measurement of ECAR and direct measurements of ATP and Seahorse methods.

      In a separate series of experiments, we monitored ECAR. These data were unfortunately very variable and difficult to interpret, although no obvious compensatory increase was observed. We plan in the future to directly monitor ATP levels in acinar cells using Mg-Green. To normalize for cell numbers in the Seahorse experiments, following centrifugation, cell pellets of equal volume were resuspended in equal volumes of buffer. Acinar cells were seeded onto Cell-Tak-coated dishes. This information has been added to the Methods section.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):  

      (1) When introducing the different antibody clones recognizing Pan, oxidized, or reduced forms, please clearly indicate which clone number belongs to which form.  

      - We see where the original language could be confusing. Please see our new introduction to the antibodies used.

      “we evaluated the redox state of La in fusing osteoclasts using recently validated monoclonal α-La antibodies that recognize oxidized La (clone 7B6) or reduced La (clone 312B), or do not distinguish between these La species (Pan, clone 5B9)”

      (2) "Finding that the surface La pool, which promotes multinucleation in osteoclasts, is an oxidized species..." I would suggest rewording as "...is enriched in oxidized species".  

      - Agreed. We have edited the sentence as follows.

      “Finding that the surface La pool, which promotes multinucleation in osteoclasts, is enriched in an oxidized species raised the question”

      (3) Although not necessary to support the conclusions of the manuscript, it would be interesting to know if the application of La194-408 to osteoclast progenitors following NAC treatment results in the rescue of La staining at the cell surface, or if this exogenous La is acting independently from cell surface association.  

      - We agree that this is an interesting idea. We previously demonstrated that we could add La 1-375 to osteoclast progenitors following RANKL addition and promote osteoclast fusion. We also demonstrated that La 1-375 under these conditions enriched La surface staining (PMID: 36739273).

      - Therefore, we hypothesize that La 194-408 would act similarly.

      (4) Is the conformation of La modified by the conversion of Cys 232 and 245 to alanine? What about the potential to form oligomers?

      - To directly answer the Reviewer’s question – we simply do not know and do not have a simple way to test this. To speculate, the differential recognition of La that is reduced vs oxidized by the antibodies used here (specifically clone 312b vs clone 7b6) suggests that some conformational change is taking place when redox signaling modifies La in osteoclasts. Moreover, in Supp. Fig. 4b, we show that recombinant La 194-408 does form a small amount of dimer under our conditions while La 194-408 Cys 232 and 245 to Ala does not. These data together weakly support that La, when converted from reduced to oxidized forms or when we artificially convert Cys 232 and 245 to Ala, undergoes some conformational and oligomeric change. However, we are not comfortable making such claims in the manuscript currently and prefer to investigate this with more rigor and comment on the biological significance of these potential changes in the future.

      (5) "In conclusion, in this study, we identified redox signaling as a molecular switch that redirects La protein away from the nucleus, where it protects precursor tRNAs from exonuclease digestion, and towards its osteoclast-specific function at the cell surface..." I would suggest rewording this sentence given that there is no evidence that the function of oxidized La at the cell surface is osteoclast-specific. This phenomenon could be applicable to other cell types and other biological processes.  

      - The Reviewer makes a good point here, that we very much appreciate. We hoped to communicate that this was a unique function of La that was different from the well-recognized role this protein plays in RNA metabolism, but somewhat overstated past our intention. Please see where we have modified this statement to read:

      “In conclusion, in this study, we identified redox signaling as a molecular switch that redirects La protein away from the nucleus, where it protects precursor tRNAs from exonuclease digestion, and towards its separable function at the osteoclast surface, where La regulates the multinucleation and resorptive functions of these managers of the skeleton.”

      (6) In methods, the definition of TCEP is missing a closed parenthesis sign.  

      - Thank you, corrected.

      (7) In methods under "Cells" there is a missing superscript in 1x106 cells/ml. Presumably, this is 1x10e6.   

      - Thank you, corrected.

      (8) Please provide the sequences of primers used for RT-PCR in this study.  

      - Understood. Please see where a table of all primer sequences used has been added to the Methods under the Transcript Analysis section.

      (9) In methods, "Bone resorption" should be relabeled given that the osteoclasts are plated on calcium phosphate plates and not on a bone surface.

      - Thank you. Please see where in the Methods both the title and all references to “bone resorption” in the method description have now been changed to “mineral resorption”.

      (10) In several figures, it would be more appropriate to correct for multiple comparisons in the statistical analyses.  

      - We appreciate this concern. Please see where Fig. 2b,c; Fig. 3b,c; Fig. 4d; Fig. 5b,d; and Fig. 6d have been reanalyzed using paired one-way ANOVAs corrected for multiple comparisons. Now t-tests are used to evaluate statistical significance only where differences between 2 values are evaluated, and all experiments considering 3+ values are compared using one-way ANOVAs corrected for multiple comparisons.

      (11) Figure 5: Panels D and E are flipped relative to the legend. Please also define the reagent used for ROS signal in the legend.  

      - Thank you. D and E are now corrected and we added “(Grey = CellRox Dye)” to the end of the legend for Fig. 5a.

      (12) Supplemental Figure 5c: in the control condition, why are some nuclei not staining with the reduced La antibody?  

      - Great question, direct answer – we simply do not know.  

      Longer answer, this image is in fact representative and not exclusive to the reduced La antibody (clone 312b). When we look at La staining in mature, multinucleated osteoclast nuclei at later timepoints post fusion using even pan antibodies, we find that its localization to the nuclei of syncytial osteoclasts is not uniform, but that nuclear La preferentially enriches in some mature osteoclast nuclei and seems to be excluded from others. This may suggest that – akin to myonuclei in skeletal muscle – osteoclast nuclei in a syncytium are not all equal. However, we are far, far away from being able to make any conclusions from the data we have.

      (13) Figure 7 legend: consider breaking this legend up into multiple sentences.  

      - Thank you for the suggestion. The legend for Figure 7 has been rewritten.

      Reviewer #2 (Recommendations For The Authors):  

      (1) Can the authors use the official name of La protein in NCBI GENE and PROTEIN?  

      - While some in the field refer to lupus La protein as La protein, we choose to refer to it simply as La, as is common throughout the Lupus La Protein literature. It is our opinion that continuously referring to a protein as a name + the word protein throughout the manuscript is unnecessary and alters the flow of our manuscript’s points.

      Thanks. We have included the official name of human La in NCBI GENE (SSB small RNA binding exonuclease protection factor La, Gene ID 6741) in the revised text.

      (2) The references 26 and 27 are not representative. The pioneering work from Mundy, Chambers, and Almeida (PMID 2312718, 15528306, and 24781012) should be cited.

      - Thanks. We have added these 3 references to better acknowledge these significant contributions.

      (3) It is hard to understand Figure 2. What are the white arrows in Figure 2a pointing to? In Figure 2b, what do the columns a-La (Red), a-La (Pan), and a-La (Ox) mean, treatment or staining? In Figure 2c, the legend "conditions where surface proteins are oxidized (TCEP)" seems to mean "deoxidized".

      - We agree. We now realize this legend was rather confusing. It has been edited to read:

      “(a) Representative fluorescence and DIC confocal micrographs of primary human osteoclasts following synchronized cell-cell fusion where hemifusion inhibitor was left (Inhibition), removed (Wash) or removed but the α-La antibodies indicated were simultaneously added.

      Cyan=Hoechst Arrows=Multinucleated Osteoclasts (b) Quantification of a.”

      - Thanks. 2c has now been corrected to “reduced” rather than the errant “oxidized”.

      (4) How do authors normalize bone resorption, % of total area?  

      - We normalized to a separate, paired well where monocytes are differentiated to precursors (MCSF), but no RANKL is added. We have added this omitted information to the methods sections for our mineral resorption assay.

      (5) Figure 5. There are two legends (b). In Figure 5c RT-qPCR, the DC-STAMP or OC-STAMP and mature osteoclast marker calcitonin receptor should be included.

      - Thank you. There were several problems with Figure legend 5 that both you and Reviewer #1 brought our attention to. We have now corrected these errors.

      - We understand the Reviewer’s interest in these markers. However, our point is that the steady-state transcript levels of two well-recognized osteoclast differentiation factors and the fusion regulator La, which our manuscript focuses on, are not significantly altered by NAC treatment at these later, fusion-associated timepoints. While DC-STAMP, OC-STAMP, and Calcitonin would be interesting, we believe they are outside the scope of this manuscript.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, the authors continue their investigations on the key role of glycosylation to modulate the function of a therapeutic antibody. As a follow-up to their previous demonstration on how ADCC was heavily affected by the glycans at the Fc gamma receptor (FcγR)IIIa, they now dissect the contributions of the different glycans that decorate the diverse glycosylation sites. Using a well-designed mutation strategy, accompanied by exhaustive biophysical measurements, with extensive use of NMR, using both standard and newly developed methodologies, they demonstrate that there is one specific locus, N162, which is heavily involved in the stabilization of (FcγR)IIIa and that the concomitant NK function is regulated by the glycan at this site.

      Strengths:

      The methodological aspects are carried out at the maximum level.

      Weaknesses:

      The exact (or the best possible assessment) of the glycan composition at the N162 site is not defined.

      We will revise the Introduction to include previous findings from our laboratory regarding processing on YTS cells:

      “YTS cells, a key cytotoxic human NK cell line used for these studies, express FcγRIIIa with extensive glycan processing, including the N162 site with predominantly hybrid and complex-type glycoforms {Patel 2021}.”  

      Reviewer #2 (Public Review):

      Summary:

      The authors set out to demonstrate a mechanistic link between Fcgamma receptor (IIIA) glycosylation and IgG binding affinity and signaling - resulting in antibody-dependent cellular cytotoxicity - ADCC. The work builds off prior findings from this group about the general impact of glycosylation on FcR (Fc receptor)-IgG binding.

      Strengths:

      The structural data (NMR) is highly compelling and very significant to the field. A demonstration of how IgG interacts with FcgRIIIA in a manner sensitive to glycosylation of both the IgG and the FcR fills a critical knowledge gap. The approach to demonstrate the selective impact of glycosylation at N162 is also excellent and convincing. The manuscript/study is, overall, very strong.

      Weaknesses:

      There are a number of minor weaknesses that should be addressed.

      (1) Since S164A is the only mutant in Figure 1 that seems to improve affinity, even if minimally, it would be a nice reference to highlight that residue in the structural model in panel B.

      We will revise Figure 1B to include the S164 site.

      (2) It is confusing why some of the mutants in the study are not represented in Figure 1 panel A. Those affinities and mutants should be incorporated into panel A so the reader can easily see where they all fall on the scale.

      We thank the reviewer for this comment. We will restructure the Results section to highlight that a primary outcome of the experiment referenced was to map the contribution of interface residues to antibody binding affinity. These data were not previously available, highlighting hotspots at the interface. Figure 1A and B report these results.

      We then used a subset of mutations from this experiment, as well as a subset of mutations from an additional library containing mutations proximal to the interface, to build a small library for evaluation using ADCC. The complete binding data for all variants, binding to two different IgG1 Fc glycoforms, is presented in Supplemental Table 1. 

      T167Y in particular needs to be shown, as it is one of few mutants that fall between what seems to be ADCC+ and ADCC- lines. Also, that mutant seems to have a stronger affinity compared to wt (judged by panel D), yet less ADCC than wt. This would imply that the relationship between affinity and activity is not as clean as stated, though it is clearly important. Comments about this would strengthen the overall manuscript.

      We thank the reviewer for this particular insight. We agree that the lack of a clean correlation between ADCC potency and affinity implies additional factors that could have affected these experimental results. We will add the following sentence to the discussion. 

      “Notably, the ADCC potency for those high-affinity variants does not fall cleanly on a line, indicating that other factors affect our observations, which may include organization at the cell surface, changes to glycan composition, or receptor trafficking.”

      (3) This statement feels out of place: "In summary, this result demonstrates that the sensitivity to antibody fucosylation may be eliminated through FcγRIIIa engineering while preserving antibody-binding affinity." In Figure 2, the authors do indeed show that mutations in FcgRIIIa can alter the impact of IgG core fucosylation, but implying that receptor engineering is somehow translatable or as impactful therapeutically as engineering the antibody itself deflates the real basic science/biochemical impact of understanding these interactions in molecular detail. Not everything has to be immediately translatable to be important. 

      We agree and will remove the highlighted sentence.   

      (4) The findings reported in Figure 2, panel C are exciting. Controls for the quality of digestion at each step should be shown (perhaps in supplementary data).

      We agree. We will add an example of the digestions as Figure S2.

      (5) Figure 3 is confusing (mislabeled?) and does not show what is described in the Results. First, there is an F158V variant in the graph but a V158F variant in the text. Please correct this.

      Thank you for identifying this typo. We will correct Figure 3.

      Second, this variant (V158F/F158V) does not show the 2-fold increase in ADCC with kifunensine as stated.

      Thank you for drawing our attention to this rounding error. We will revise the text to report a statistically significant 1.4-fold increase.

      Finally, there are no statistical evaluations between the groups (+/- kif; +/- fucose). 

      We provide the p values for +/-fuc and +/- Kifunensine for each YTS cell line in the figure. We did not provide a global comparison of p values that included all cell lines due to some cell lines experiencing a significant change and others not. However, we will add the raw data as Supplemental Table 2 should readers wish to perform these analyses.

      The differences stated are not clearly statistically significant given the wide spread of the data. This is true even for the wt variant.

      We agree that there are points that overlap in this figure between the different treatments. However, our use of Student's t-test (two-tailed) on three experiments collected on three different days (each with three technical replicates) provides enough resolution to determine whether the means for the different treatments differ significantly. This is, by our estimation, a highly rigorous manner to collect and analyze the data.
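      For readers who wish to repeat this kind of comparison on the raw data, the test described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the figure. It assumes that technical replicates are averaged within each experiment, so each treatment contributes three values, and that the equal-variance (pooled) form of Student's t-test applies. The function names and the numbers in the usage example are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value, integrating the density over [0, |t|] with Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def students_t_test(a, b):
    """Equal-variance, two-sample Student's t-test; returns (t, df, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance (statistics.variance uses the n - 1 denominator)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_tailed_p(t, df)


if __name__ == "__main__":
    # Hypothetical per-experiment means (one value per day), NOT data from the study
    untreated = [20.0, 24.0, 22.0]
    treated = [30.0, 33.0, 28.0]
    t, df, p = students_t_test(untreated, treated)
    print(f"t = {t:.3f}, df = {df}, two-tailed p = {p:.4f}")
```

      With three experiments per treatment the test has four degrees of freedom, so individual points can overlap between treatments while the difference between the means is still significant.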

      (6) The kifunensine impact is somewhat confusing. They report a major change in ADCC, yet similar large changes with trimming only occur once most of the glycan is nearly gone (Figure 2). Kifunensine will tend to generate high mannose and possibly a few hybrid glycans. It is difficult to understand what glycoforms are truly important outside of stating that multi-branched complex-type N-glycans decrease affinity.

      Note that Figure 2 does not evaluate the kifunensine-treated glycan, which is mostly Man8 and Man9 structures. In our previous work, these structures likewise provide increased binding affinity (see PubMed ID 30016589). We believe the most important message is that composition of the N162 glycan (removed with the S164A mutation) regulates NK cell ADCC. On cells, we are not able to modulate N162 glycan composition without affecting potentially every other N-glycan on the surface, so we do not have an ADCC experiment that is directly comparable to Figure 2. Thus, this increased ADCC resulting from kifunensine treatment is consistent with previously observed increases in binding affinity measurements.

      (7) This is outside of the immediate scope, but I feel that the impact would be increased if differences in NK cell (and thus FcgRIIIA) glycosylation are known to occur during disease, inflammation, age, or some other factor - and then to demonstrate those specific changes impact ADCC activity via this mechanism.

      We agree completely. As mentioned in the Introduction, we know that N162 glycan composition varies substantially from donor to donor based on previous work from our lab. Curiously, little variability appeared between donors at the other four N-glycosylation sites. Thus, there is the potential that different NK cell N162 glycan compositions are coincident with different indications. This is an area we are quite interested in pursuing.

    1. Author response:

      (1) Clarification and Detailed Explanation in the Methods Section:

      - Regarding Reviewer 1's comments about the unclear explanation of the update process for pseudotime, T, and the selection of important genes/features at bifurcation points in the methods, we will provide a detailed description of the update process for pseudotime T and how high-weight genes important to the bifurcation process are selected.

      - Regarding Reviewer 2's comments concerning the impact of the initial pseudotime prediction method and the insufficient description of various parameters, we will add information about the differences in the initially used pseudotime prediction methods and provide detailed information on the techniques and parameters used in each analysis.

      - Regarding Reviewer 2's comments on the choice of kernel functions, we will explain the rationale for selecting rbf and polynomial kernels and why other options were discarded.

      (2) Performance Comparison and Data Presentation:

      - Regarding Reviewer 1's comments about using a few trajectory plots of the real-world data to visualize the results, we will include 1-2 trajectory plots of real-world datasets in the benchmark analysis to better visualize the results and assess accuracy.

      - Regarding Reviewer 2's comments concerning the lack of comparison results and discussion related to trajectory prediction methods based on deep learning, we will include a comparison with deep learning methods such as scTour and Tigon in the revision. Additionally, we will discuss the latest deep learning methods for bifurcation analysis and alternative trajectory inference methods such as CellRank.

      - Regarding Reviewer 2's comments on the impact of MURP, we will include an analysis on whether the number of MURPs affects the performance of the method and compare it with the random subsampling approach.

      (3) Article Calibration and Refinement:

      - Regarding Reviewer 2's comments on the discussion section, we will simplify the first three paragraphs to succinctly convey the background and implications of our contributions. Additionally, we will explain why HVG is considered as the entire feature space in our comparisons and analyses.

      - Regarding Reviewer 2's comments concerning the regulons in the microglia analysis, we will review the correct explanations and revise the article accordingly.

      - In response to the issues raised by both reviewers regarding grammatical errors, spelling mistakes, and inconsistencies between text and figures, we will review and correct any errors in the article. This includes providing explanations for all abbreviations upon their first appearance, ensuring the accuracy of text and figure descriptions, correcting equation numbering, improving image quality, and revising descriptions such as "the current manifold learning methods face two major challenges."

      (4) Enhancing Descriptions and Readability:

      - Regarding Reviewer 1's comments about the synthetic data, we will add a brief description in the main text on how synthetic data were generated.

      - Regarding Reviewer 1's comments on the survival analysis, we will provide a more detailed description of the computational steps and clarify whether key confounding factors such as age, clinical stage, and tumor purity were controlled.

      - Regarding Reviewer 2's comments on evaluation metrics, we will add detailed descriptions of the evaluation metrics and provide intuitive explanations of how different methods perform across various metrics in the comparison results.

      - Regarding Reviewer 2's comments on CD8+ T cells, we plan to compare MGPfact with Monocle3, in addition to Monocle2. This will help clarify the added value of MGPfact and provide a more comprehensive evaluation of its performance.

      - Regarding Reviewer 2's comments about consensus trajectories, we will add detailed descriptions of the process of generating consensus trajectories.

      - Regarding Reviewer 2's comments on regulons, we will include additional information on the process of downstream trajectory analysis and clarify the roles of SCENIC, GENIE3, RCisTarget, and AUCell in the bifurcation analysis.

    1. Author response:

      We thank the reviewers for their insightful comments on our model and manuscript. In this provisional response, we would like to comment on some of the issues raised and how we plan to address them.

      First, the reviewers correctly pointed out that only a small part of the full model was openly available. We have now rectified this and the full model is available at: https://dataverse.harvard.edu/dataverse/sscx.

      Next, we would like to comment on the perceived lack of clarity of certain descriptions in the manuscript. We note that individual techniques and parts of the model have been developed, justified, and validated in previous publications. This left us with the question of how much of the contents of those papers we should re-describe. Too much, and the manuscript becomes overly long; too little, and the reader cannot gain a sufficient understanding of the model building process. The reviewers' comments made it clear that some aspects of the model should be described in more detail and we plan to address this in a revision. Crucially, one missing item raised by all reviewers was a comparison of local connection probabilities to the literature. This will be provided in the revision. Additionally, the reviewers questioned our decision to use a connectivity algorithm that is not based on direct parameterization of target connection probabilities. While this is a limitation of the algorithm we employed, it also has unique strengths, providing non-random aspects of connectivity that have been proven to be impossible to model with algorithms that enforce given connection probabilities or degree distributions. We plan to explain this better in a revision.

      We will also comment on the challenges associated with the interpretation of experimentally measured connection probabilities and employing them for the parameterization of a biophysically detailed model spanning millimeters.

      The reviewers also suggested several aspects of the model that could be improved. Whilst we see merit in all of them, we would like to briefly comment on model completeness in general. First, this model - and any model - can probably never be considered complete. Instead, the model has to be continuously refined, which one reviewer phrased as the "live nature" of the model. However, to demonstrate the model's utility and justify the expense of modeling, we also have to use the model in projects that explore specific scientific questions. To undertake and complete such a project, one must select and "freeze" a given version of the model; otherwise the project will never conclude. Further, we believe that it is advantageous if several projects use the same version of the model. In that case, a reader who is already familiar with the model from one paper may find it easier to understand other papers using the same model. The goal of this manuscript is to describe the version of the model that we used in several ongoing and concluded follow-up projects, including its limitations and opportunities for refinement. As such, we do not plan to add further improvements to the model for this reviewed pre-print. We will, however, continue to refine the model outside of the scope of this publication. Since we believe the development of bottom-up models is best done in a community-driven manner, we encourage interested parties to participate.

      We invite anyone with ideas of how the model could be refined to contact us to discuss how we could integrate these changes into the model together using our tools.

    1. Author response:

      eLife assessment

      This important study reports numerous attempts to replicate reports on transgenerational inheritance of a learned behavior, pathogen avoidance, in C. elegans. While the authors observe parental effects that are limited to a single generation (also called intergenerational inheritance), the authors failed to find any evidence for transmission over multiple generations, or transgenerational inheritance. The experiments presented are meticulously described, making for compelling evidence that in the authors' hands transgenerational inheritance cannot be observed, although there remains the possibility that subtle differences in culture conditions or lab environment explain the failure to reproduce previous observations. Given the prominence of the original reports of transgenerational inheritance, the present study is of broad interest to anyone studying genetics, epigenetics, or learned behavior.

      Thank you for your considered reviews and advice on how to improve our manuscript. We appreciate that the editors and reviewers felt that our manuscript addressed an important issue and acknowledged the difficulty of publishing negative results. We will revise the manuscript and consider all the concerns raised by the editor and referees.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors report an inability to reproduce a transgenerational memory of avoidance of the pathogen PA14 in C. elegans. Instead, the authors demonstrate intergenerational inheritance for a single F1 generation, in embryos of mothers exposed to OP50 and PA14, where embryos isolated from these mothers by bleaching are capable of remembering to avoid PA14 in a manner that is dependent on systemic RNAi proteins sid-1 and sid-2. This could reflect systemic sRNAs generated by neuronal daf-7 signaling that are transmitted to F1 embryos. The authors note that transgenerational memory of PA14 was reported by the Murphy group at Princeton, but that environmental or strain variation (worms or bacteria) might explain the single generation of inheritance observed at Harvard. The Hunter group tried different bacterial growth conditions and different worm growth temperatures for independent PA14 strains, which they showed to be strongly pathogenic. However, the authors could not reproduce a transgenerational effect at Harvard. This important data will allow members of the scientific community to focus on the robust and reproducible inheritance of PA14 avoidance transmitted to F1 embryos of mothers exposed to PA14, which the authors demonstrate depends on small RNAs in a manner that is downstream of or in parallel to daf-7. This paper honestly and importantly alters expectations and questions the model that avoidance of PA14 is mediated by a bacterial ncRNA whose siRNAs target a C. elegans gene. Instead, endogenous C. elegans sRNAs that affect pathogen response may be the culprit that explains sRNA-mediated avoidance.

      Overall, this is an important paper that demonstrates that one model for transgenerational inheritance in C. elegans is not reproducible. This is important because it is not clear how many of the reported models of transgenerational inheritance reported in C. elegans are reproducible. The authors do demonstrate a memory for F1 embryos that could be a maternal effect, and the authors confirm that this is mediated by a systemic small RNA response. There are several points in the manuscript where a more positive tone might be helpful.

      We would like to correct the statement made in the second to last sentence. The demonstration of an F1 response to PA14 was first reported by Moore et al., (2019) and then by Pereira et al., (2020) using a different behavioral assay. We merely confirmed these results in our hands, and confirmed the observation, first reported by Kaletsky et al., (2020), that sid-1 and sid-2 are required for this F1 response; although we did find that sid-1 and sid-2 are not required for the PA14-induced increase in daf-7p::gfp expression in ASI neurons in the F1 progeny of trained adults, which had not been addressed in the published work.

      Yes, the intergenerational F1 response could be a maternal effect, but the in utero F1 embryos and their precursor germ cells were directly exposed to PA14 metabolites and toxins (non-maternal effect) as well as any parental response, whether mediated by small RNAs, prions,  hormones, or other unknown information carriers. While the F1 aversion response does require sid-1 and sid-2, we would not presume that the substrate is therefore an RNA molecule, particularly because the systemic RNAi response supported by sid-1 and sid-2 is via long double-stranded RNA. To date, no evidence suggests that either protein transports small RNAs, particularly single-stranded RNAs. 

      Strengths:

      The authors note that the high copy number daf-7::GFP transgene used by the Murphy group displayed variable expression and evidence for somatic silencing or transgene breakdown in the Hunter lab, as confirmed by the Murphy group. The authors nicely use single copy daf-7::GFP to show that neuronal daf-7::GFP is elevated in F1 but not F2 progeny with regards to the memory of PA14 avoidance, speaking to an intergenerational phenotype.

      The authors nicely confirm that sid-1 and sid-2 are generally required for intergenerational avoidance of F1 embryos of moms exposed to PA14. However, these small RNA proteins did not affect daf-7::GFP elevation in the F1 progeny. This result is unexpected given previous reports that single copy daf-7::GFP is not elevated in F1 progeny of sid mutants. Because the Murphy group reported that daf-7 mutation abolishes avoidance for F1 progeny, this means that the sid genes function downstream of daf-7 or in parallel, rather than upstream as previously suggested.

      The authors studied antisense small RNAs that change in Murphy data sets, identifying 116 mRNAs that might be regulated by sRNAs in response to PA14. Importantly, the authors show that the maco-1 gene, putatively targeted by piRNAs according to the Kaletsky 2020 paper, displays few siRNAs that change in response to PA14. The authors conclude that the P11 ncRNA of PA14, which was proposed to promote interkingdom RNA communication by the Murphy group, is unlikely to affect maco-1 expression by generating sRNAs that target maco-1 in C. elegans. The authors define 8 genes based on their analysis of sRNAs and mRNAs that might promote resistance to PA14, but they do not further characterize these genes' role in pathogen avoidance. The Murphy group might wish to consider following up on these genes and their possible relationship with P11.

      Weaknesses:

      This very thorough and interesting manuscript is at times pugnacious.

      We reiterate that we never claimed that Moore et al., (2019) did not obtain their reported results. We simply stated that we could not replicate their results using the published methods and then failed in our search to identify variable(s) that might account for our results. We will do better when revising the manuscript to make clear, unmuddied statements of facts and state that future investigations may provide independent evidence that supports the original claims and explains our divergent results.

      Please explain more clearly what is High Growth media for E. coli in the text and methods, conveying why it was used by the Murphy lab, and if Normal Growth or High Growth is better for intergenerational heritability assays.

      We used the standard recipes as described in Moore et al., (2021), and will add the recipes and some of the relevant commentary from the paragraphs below to the methods and text as appropriate.

      Normal Growth (NG) media minimally supports OP50 growth, resulting in a thin lawn that minimally obscures viewing larvae and embryos. High Growth (HG) media contains 8X more peptone, which supports much higher OP50 growth, resulting in a thick bacterial lawn that supports larger worm populations. The thicker bacterial lawn can also compromise agar integrity, and the higher worm density encourages worm burrowing behavior, thus the HG plates also have 75% more agar to inhibit worm burrowing. 

      Our results (Figure 4) show that worms grown on OP50 seeded NG or HG plates show different choice responses (PA14 vs OP50). As for experimental “advice”, we would caution our colleagues to not assume that OP50 is a neutral food and to be aware that how you grow and store OP50 (or any bacterial culture that is to be used as food for worms) may have a significant effect on the phenotype you are studying. 

      Reviewer #2 (Public Review):

      This paper examines the reproducibility of results reported by the Murphy lab regarding transgenerational inheritance of a learned avoidance behavior in C. elegans. It has been well established by multiple labs that worms can learn to avoid the pathogen Pseudomonas aeruginosa (PA14) after a single exposure. The Murphy lab has reported that learned avoidance is transmittable to 4 generations and dependent on a small RNA expressed by PA14 that elicits the transgenerational silencing of a gene in C. elegans. The Hunter lab now reports that although they can reproduce inheritance of the learned behavior by the first generation (F1), they cannot reproduce inheritance in subsequent generations.

      This is an important study that will be useful for the community. Although they fail to identify a "smoking gun", the study examines several possible sources for the discrepancy, and their findings will be useful to others interested in using these assays. The preference assay appears to work in their hands inasmuch as they are able to detect the learned behavior in the P0 and F1 generations, suggesting that the failure to reproduce the transgenerational effect is not due to trivial mistakes in the protocol. An obvious reason, however, to account for the differing results is that the culture conditions used by the authors are not permissive for the expression of the small RNA by PA14 that the Murphy lab identified as required for transgenerational inheritance. It would seem prudent for the authors to determine whether this small RNA is present in their cultures, or at least acknowledge this possibility.

      We note that Kaletsky et al., (2020) (Figure 3L) showed that PA14 ΔP11 bacteria failed to induce an F1 avoidance response. Thus, the fact that we observed F1 avoidance implies that our culture conditions successfully induced P11 expression. We believe that this addresses the concern raised here. We thank the reviewer for raising this issue and we will add a statement to this effect in the revised manuscript.

      The authors should also note that their protocol was significantly different from the Murphy protocol (see comments below) and therefore it remains possible that protocol differences cumulatively account for the different results.

      We disagree. Our adjustments to the core protocol were minor and, where possible, were explicitly tested in side-by-side experiments. To discover the source(s) of discrepancy between our results and the published results we subsequently introduced variations to this core protocol to exclude likely variables (worm and bacteria growth temperatures, assay conditions, worm handling methods, bacterial culture and storage conditions, and some minor developmental timing issues). To substantiate these assertions, we will, upon revision, add the precise protocol we followed for the aversion assay to the supplemental documents, provide some additional experimental results supporting these claims, and further clarify which presented experiments included protocol variations (e.g. sodium azide or cold immobilization). It remains possible that we misunderstood the published protocol, but we were highly motivated to replicate the results and read every published version with extreme care.

      Reviewer #3 (Public Review):

      Summary:

      It has been previously reported in many high-profile papers that C. elegans can learn to avoid pathogens. Moreover, this learned pathogen avoidance can be passed on to future generations - up to the F5 generation in some reports. In this paper, Gainey et al. set out to replicate these findings. They successfully replicated pathogen avoidance in the exposed animals, as well as a strong increase in daf-7 expression in ASI neurons in F1 animals, as determined by a daf-7::GFP reporter construct. However, they failed to see strong evidence for pathogen avoidance or daf-7 overexpression in the F2 generation. The failure of replication is the major focus of this work.

      Given their failure to replicate these findings, the authors embark on a thorough test of various experimental confounders that may have impacted their results. They also re-analyze the small RNA sequencing and mRNA sequencing data from one of the previously published papers and draw some new conclusions, extending this analysis.

      Strengths:

      (1) The authors provide a thorough description of their methods, and a marked-up version of a published protocol that describes how they adapted the protocol to their lab conditions. It should be easy to replicate the experiments.

      (2) The authors test the source of bacteria, growth temperature (of both C. elegans and bacteria), and light/dark husbandry conditions. They also supply all their raw data, so that the sample size for each testing plate can be easily seen (in the supplementary data). None of these variations appears to have a measurable effect on pathogen avoidance in the F2 generation, with all but one of the experiments failing to exhibit learned pathogen avoidance.

      (3) The small RNA seq and mRNA seq analysis is well performed and extends the results shown in the original paper. The original paper did not give many details of the small RNA analysis, which was an oversight. Although not a major focus of this paper, it is a worthwhile extension of the previous work.

      (4) It is rare that negative results such as these are accessible. Although the authors were unable to determine the reason that their results differ from those previously published, it is important to document these attempts in detail, as has been done here. Behavioral assays are notoriously difficult to perform and public discourse around these attempts may give clarity to the difficulties faced by a controversial field.

      Thank you for your support. Choosing to pursue publication of these negative results was not an easy decision, and we thank members of the community for their support and encouragement.

      Weaknesses:

      (1) Although the "standard" conditions have been tested over multiple biological replicates, many of the potential confounders that may have altered the results have been tested only once or twice. For example, changing the incubation temperature to 25°C was tested in only two biological replicates (Exp 5.1 and 5.2) - and one of these experiments actually resulted in apparent pathogen avoidance inheritance in the F2 generation (but not in the F1). An alternative pathogen source was tested in only one biological replicate (Exp 3). Given the variability observed in the F2 generation, increasing biological replicates would have added to the strengths of the report.

      We agree that our study was not exhaustive in our exploration of variables that might be interfering with our ability to detect F2 avoidance. We also note that some of these variables also failed (with many more independent experiments) to induce elevated daf-7p::gfp expression in ASI neurons in F2 progeny. Our goal was not to show that variation in some growth or assay condition would generate reproducible negative results; rather, the exploration was designed to tweak conditions to enable detection of a robust F2 response. Given the strength of the data presented in Moore et al. (2019), we expected that adjustment of the problematic variable would produce positive results apparent in a single replicate, which could then be followed up. If we had succeeded, then we would have documented the conditions that enabled robust F2 inheritance and would have explored molecular mechanisms that support this important but mysterious process.

      (2) A key difference between the methods used here and those published previously, is an increase in the age of the animals used for training - from mostly L4 to mostly young adults. I was unable to find a clear example of an experiment when these two conditions were compared, although the authors state that it made no difference to their results.

      We can state firmly that the apparent time delay did not affect P0 learned avoidance or, as documented in Table S1, daf-7p::gfp expression in ASI neurons. In our experience, training mostly L4’s on PA14 frequently failed to produce sufficient F1 embryos for both F1 avoidance assays or daf-7p::gfp measurements in ASI neurons and collection of F2 progeny. Indeed, in early attempts to detect heritable PA14 aversion, trained P0 and F1 progeny were not assayed in order to obtain sufficient F2’s for a choice assay. These animals failed to display aversion, but without evidence of successful P0 training or an F1 intergenerational response this was deemed a non-fruitful trouble-shooting approach. We will add to our supplemental figures P0 choice results from experiments using younger trained animals that failed to produce sufficient F1’s to continue the inheritance experiments. 

      The different timing between the two protocols may reflect the age of the recovered bleached P0 embryos. It is reasonable to assume that bleaching day 1 adults vs day 2 adults from the P-1 population could shift the average age of recovered P0 embryos by several hours. The Murphy protocol only states that P0 embryos were obtained by bleaching healthy adults. Regardless, if the hypothesis entertained here is true, that a several hour difference in larval/adult age during 24 hours of training affects F2 inheritance of learned aversion but does not affect P0 learned avoidance, then we would argue that this paradigm for heritable learned avoidance, as described in Moore et al. (2019, 2021), is not sufficiently robust for mechanistic investigations.

      (3) The original paper reports a transgenerational avoidance effect up to the F5 generation. Although in this work the authors failed to see avoidance in the F2 generation, it would have been prudent to extend their tests for more generations in at least a couple of their experiments to ensure that the F2 generation was not an aberration (although this reviewer acknowledges that this seems unlikely to be the case).

      Citations

      Moore, R.S., Kaletsky, R., and Murphy, C.T. (2019). Piwi/PRG-1 Argonaute and TGF-beta Mediate Transgenerational Learned Pathogenic Avoidance. Cell 177, 1827-1841 e1812.

      Pereira, A.G., Gracida, X., Kagias, K., and Zhang, Y. (2020). C. elegans aversive olfactory learning generates diverse intergenerational effects. J Neurogenet 34, 378-388.

      Kaletsky, R., Moore, R.S., Vrla, G.D., Parsons, L.R., Gitai, Z., and Murphy, C.T. (2020). C. elegans interprets bacterial non-coding RNAs to learn pathogenic avoidance. Nature 586, 445-451.

      Moore, R.S., Kaletsky, R., and Murphy, C.T. (2021). Protocol for transgenerational learned pathogen avoidance behavior assays in Caenorhabditis elegans. STAR Protoc 2, 100384.

    1. Author response:

      We appreciate the time and effort that you and the reviewers have dedicated to providing valuable feedback on our manuscript. We are grateful to the reviewers for their insightful comments.

      Reviewer #1: We thank the reviewer for the positive comments made on our manuscript.

      Reviewer #2: We thank the reviewer for these positive remarks.

      Concerning the main weakness highlighted by the reviewer:

      We presented results in our submitted work both without noise and with a signal-to-noise ratio (SNR) equal to 50. Figure 5 shows exemplar posterior distributions obtained in a noise-free scenario, and Table 1 reports the number of degeneracies for each model on 10000 noise-free simulations. These results highlight that the presence of degeneracies is inherent to the model definition. Figures 3, 6 and 7 present results considering an SNR of 50. Results with lower SNR have indeed not been included in this work. We agree that a figure showing the impact of noise on the posterior distributions will be a good addition, and we will include one in the second version, as suggested.

    1. Author response:

      Reviewer #1 (Recommendations For The Authors):

      (1) Figure 3B was not cited in the manuscript.

      We have now included the citation for Figure 3B in the main text: “….whereas NSP13-R567A (lost ATP consumption) and NSP13-K345A/K347A (obstructed the nucleic acid binding channel) failed to inhibit YAP activity (Figure 3B).” (Please see the revised manuscript) 

      Reviewer #2 (Recommendations For The Authors):

      (2) In Figure 1, ciliated cells are marked as a separate cluster from "epithelial cells". Since ciliated cells are epithelial cells, I suggest changing the nomenclature of the clusters.

      We have updated the label from “Ciliated” to “Ciliated Epithelial” in Figure 1A, as suggested. (Please see the revised manuscript)

      (3) Outlines of planned revisions: 1) Reanalyze snRNA-seq and bulk RNA-seq data from Figure 1 to investigate YAP target genes related to innate immune response; 2) Employ ChIP-seq to determine whether NSP13 WT or mutants (K131, K345/K347, and R567) prevent YAP/TEAD complex from binding to DNA by occupying the TEAD DNA binding site, providing insights into the mechanism; 3) Validate NSP13 interacting proteins using Immunoprecipitation-Western Blot (IP-WB) assays based on mass spectrum results; 4) Perform bulk RNA sequencing in cells with or without NSP13 expression to assess endogenous YAP target genes expression.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      My main concern is still in place. It is unclear whether the proposed method can find actual goal states, and as a result it is unclear what states it finds. Table S1 mentions the model BIOMD0000000454, which is a small metabolic pathway with known equations given in "Example One" in "Metabolic Control Analysis: Rereading Reder". In this model the goal states can be calculated analytically.

      Regarding your statements below: I am not concerned that your method will be less efficient than random search (or any other search..) on small models, but I think it is important for the readers to have evidence that your method is able to discover true goal states at least in small networks, used in your study. You do show that your method scales to complex models. So, in my opinion, the missing part is to show that it is able to find true goal states.

      "...For simple models whose true steady-state distribution can be derived numerically and/or analytically, it is very likely that their exploration will be much simpler and this is not where a lot of improvement over random search may be found, which explains our focus on more complex models..."

      We thank you for your response and for your concerns about the lack of evidence that our method is able to re-discover the true goal states of simple models when these are known a priori. We acknowledge that adding these simple cases is useful for completeness. We did not include these simple models in our main study because in most cases a basic random search over the initial conditions will lead to the re-discovery of these goal states. For instance, for the model BIOMD0000000454 described in "Example One" of the "Metabolic Control Analysis: Rereading Reder" paper, several simplifying assumptions are made such that the system has only one steady state (x1=0.056, x2=0.769, x3=4.231), which can be found analytically as shown in the paper. In that simple case, this goal state is also straightforward to find with numerical simulation, as any valid initial condition will converge to it.

      To address the concerns of the reviewer, we propose to add a "sanity check" figure in the supplementary material of the revised paper (Figure S4), as well as a “sanity check” subsection in the “Methods”, to present additional experiments on simple models such as this one. The new figure and subsection can be viewed in the paper’s interactive version available online at https://developmentalsystems.org/curious-exploration-of-grn-competencies, and we plan to include them as such in the next revision. We have also included the full code to reproduce this sanity check as a ‘sanity_check.ipynb’ Jupyter notebook in the GitHub repository (https://github.com/flowersteam/curious-exploration-of-grn-competencies/blob/main/notebooks/sanity_check.ipynb).

      In the new Figure S4-b, we show the results of our exploration pipeline on the suggested model BIOMD0000000454 as described in "Example One" of the paper. These results provide evidence that the curiosity search is able to recover the correct unique goal state (x1=0.056, x2=0.769, x3=4.231), as expected.

      We also include a second sanity check on BIOMD0000000341, which models the dynamics of beta-cell mass, insulin and glucose. This model has two stable fixed points representing physiological (B=300, I=10, G=100) and pathological (B=0, I=0, G=600) steady states, which are the known ground truth steady states as described in Figure 3 of the "A Model of b-Cell Mass, Insulin, and Glucose Kinetics: Pathways to Diabetes" paper. Again, as expected, curiosity search is able to recover those two steady states (Figure S4-a).
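      The ground truth quoted above can be checked independently of any search method by evaluating the model's right-hand side at the two stated points. The block below is a minimal sketch, not the authors' notebook: the equations and parameter values are assumptions (they are taken to be those of the original beta-cell mass, insulin and glucose publication) and should be verified against the BIOMD0000000341 SBML file. It only confirms that the points are fixed points; it does not test their stability.

```python
# Right-hand side of the beta-cell mass (B), insulin (I), glucose (G) model.
# ASSUMPTION: equations and parameter values are assumed to match the
# original publication; verify them against the SBML file before use.
R0, EG0, SI = 864.0, 1.44, 0.72          # glucose production, glucose effectiveness, insulin sensitivity
SIGMA, ALPHA, K = 43.2, 20000.0, 432.0   # insulin secretion capacity, half-saturation, clearance
D0, R1, R2 = 0.06, 0.84e-3, 0.24e-5      # beta-cell death rate and replication coefficients


def rhs(B, I, G):
    """Return the time derivatives (dB, dI, dG) at state (B, I, G)."""
    dG = R0 - (EG0 + SI * I) * G
    dI = B * SIGMA * G**2 / (ALPHA + G**2) - K * I
    dB = (-D0 + R1 * G - R2 * G**2) * B
    return dB, dI, dG


def is_fixed_point(state, tol=1e-6):
    """True if every derivative is numerically zero at `state`."""
    return all(abs(d) < tol for d in rhs(*state))


physiological = (300.0, 10.0, 100.0)  # (B, I, G) as stated in the response
pathological = (0.0, 0.0, 600.0)

print(is_fixed_point(physiological), is_fixed_point(pathological))
```

      A search method passes this kind of sanity check when the steady states it reports fall close to points for which `is_fixed_point` holds.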

      As stated in our previous answer, our main study focuses on more complex models that are not limited to one or a few attractors that can easily be discovered with random initial conditions. Regarding BIOMD0000000454, what may have been confusing for the reviewer is that we did include it in our main study but, as specified in the caption of Table S4, unlike what is done in "Example One" of the original paper, we let the metabolite concentrations y1,...,y5 evolve in time (instead of enforcing them as constants). When doing so, the resulting dynamics of the system are more complex and exhibit a spectrum of possible steady states (unknown a priori), which differs from the previous case with a single steady state. In that case, the new attractors are not easy to find analytically, and the proposed curiosity search becomes interesting as it is able to uncover the distribution of possible steady states much more efficiently than a random search baseline, as shown in the new Figures S4-c and S4-d.

      We hope that these new results will address the reviewer’s concerns and provide evidence to the readers on the validity of the approach on simple networks.

      eLife assessment

      This important study develops a machine learning method to reveal hidden unknown functions and behavior in gene regulatory networks by searching parameter space in an efficient way. The evidence for some parts of the paper is still incomplete and needs systematic comparison to other methods and to the ground truth, but the work will be of broad interest to anyone working in biology of all stripes since the ideas reach beyond gene regulatory networks to revealing hidden functions in any complex system with many interacting parts.

      We thank the editors and reviewers for their positive assessment and constructive suggestions. In our response, we acknowledge the importance of systematic comparison to other methods and to the ground truth, when available. However we also emphasize the challenges associated with evaluating such methods in the context of uncovering hidden behaviors in complex biological networks as the ground truth is often unknown. We hope that our explanations will clarify the potential of our approach in advancing the exploration of these systems.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: This paper suggests to apply intrinsically-motivated exploration for the discovery of robust goal states in gene regulatory networks.

      Strengths:

      The paper is well written. The biological motivation and the need for such methods are formulated extraordinarily well. The battery of experimental models is impressive.

      We thank the reviewer for sharing interest in the research problem and for recognizing the strengths of our work.

      Weaknesses:

      (1) The proposed method is compared to the random search. That says little about the performance with regard to the true steady-state goal sets. The latter could be calculated at least for a few simple ODE (e.g., BIOMD0000000454, `Metabolic Control Analysis: Rereading Reder'). The experiment with 'oscillator circuits' may not be directly interpolated to the other models.

      The lack of comparison to the ground truth goal set (attractors of ODE) from arbitrary initial conditions makes it hard to evaluate the true performance/contribution of the method. A part of the used models can be analyzed numerically using JAX, while there are models that can be analyzed analytically.

      "...The true versatility of the GRN is unknown and can only be inferred through empirical exploration and proxy metrics....": one could perform a sensitivity analysis of the ODEs, identifying stable equilibria. That could provide a proxy for the ground truth 'versatility'.

      We agree with the reviewer that one primary concern is to properly evaluate the effectiveness of the proposed method. However, as we move toward complex pathways, the “true” steady-state goal sets are often unknown, which is where machine learning methods such as the one we propose are particularly interesting (but challenging to evaluate).

      For simple models whose true steady-state distribution can be derived numerically and/or analytically, it is very likely that their exploration will be much simpler and this is not where a lot of improvement over random search may be found, which explains our focus on more complex models. While we agree that it is still interesting to evaluate exploration methods on these simple models for checking their behavior, it is not clear how to scale this analysis to the targeted more complex systems.

      For systems whose true steady state distribution cannot be derived analytically or numerically, we believe that random search is a pertinent baseline as it is commonly used in the literature to discover the attractors/trajectories of a biological network. For instance, Venkatachalapathy et al. [1] initialize stochastic simulations at multiple randomly sampled starting conditions (which is called a kinetic Monte Carlo-based method) to capture the steady states of a biological system. Similarly, Donzé et al. [29] use a Monte Carlo approach to compute the reachable set of a biological network «when the number of parameters is large and their uncertain range is not negligible». For the considered models, the true steady-state goal set is unknown, which is why we chose comparison with random search. We added a “Statistics” subsection in the Methods section providing additional details about the statistical analyses we perform between our method and the random search baseline.

      (2) The proposed method is based on 'Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning', which assumes state action trajectories [s_{t_0:t}, a_{t_0:t}] ('2.1 Notations and Assumptions' in the IMGEP paper). However, the models used in the current work do not include external control actions, but rather only the initial conditions can be set. It is not clear from the methods whether IMGEP was adapted to this setting, and how the exploration policy was designed w/o actual time-dependent actions. What does "...generates candidate intervention parameters to achieve the current goal....", mean considering that interventions 'Sets the initial state...' as explained in Table 2?

      We thank the reviewer for asking for clarification, as indeed the IMGEP methodology originates from developmental robotics scenarios which generally focus on the problem of robotic sequential decision-making, therefore assuming state action trajectories as presented in Forestier et al. [65]. However, in both cases, note that the IMGEP is responsible for sampling parameters which then govern the exploration of the dynamical system. In Forestier et al. [65], the IMGEP also sets only one vector at the start, which specifies the parameters of a movement (like the initial state of the GRN). The movement is then produced with dynamic motion primitives, which are dynamical system equations similar to GRN ODEs, so the two systems are mathematically equivalent. More generally, while in our case the “intervention” of the IMGEP only controls the initial state of the GRN, future work could consider more advanced sequential interventions simply by setting, at the start, the parameters of an action policy that would be called during the GRN’s trajectory to sample control actions conditioned on the current state of the GRN. In practice this would also require setting only one vector at the start, so it would remain the same exploration algorithm and only the space of parameters would change, which illustrates the generality of the approach.
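      The setting described in this answer, where the only intervention is a single vector fixed at the start of the rollout, can be sketched as a goal exploration loop. The block below is an illustrative simplification and not the authors' implementation: `rollout` is a hypothetical stand-in for simulating a GRN from an initial state, and the uniform goal sampling, nearest-neighbour selection and Gaussian mutation are assumed choices for one simple IMGEP variant.

```python
import random


def rollout(intervention):
    """Hypothetical stand-in for simulating the GRN from an initial state.

    Maps an initial state in [0, 1]^2 to an outcome in [0, 1]^2. The fourth
    power squeezes most initial states into a small corner of outcome space.
    """
    x, y = intervention
    return (x ** 4, y ** 4)


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def clip01(v):
    """Keep a coordinate inside the allowed range [0, 1]."""
    return min(1.0, max(0.0, v))


def imgep(budget, n_bootstrap=10, sigma=0.05, seed=0):
    """Run `budget` rollouts and return the list of (intervention, outcome)."""
    rng = random.Random(seed)
    history = []
    for step in range(budget):
        if step < n_bootstrap or not history:
            # Bootstrap phase: random interventions, as in random search.
            intervention = (rng.random(), rng.random())
        else:
            # 1) Sample a goal uniformly in outcome space Z.
            goal = (rng.random(), rng.random())
            # 2) Reuse the intervention whose outcome was closest to the goal.
            source, _ = min(history, key=lambda pair: sq_dist(pair[1], goal))
            # 3) Mutate it with small Gaussian noise, staying within bounds.
            intervention = tuple(clip01(v + rng.gauss(0.0, sigma)) for v in source)
        # 4) Run the system once and store what was reached.
        history.append((intervention, rollout(intervention)))
    return history


history = imgep(budget=200)
```

      Moving to sequential interventions, as suggested in the answer, would change only what the intervention vector parameterizes (a policy rather than an initial state); the loop itself would stay the same.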

      (3) Fig 2 shows the phase space for (ERK, RKIPP_RP) without mentioning the typical full scale of ERK, RKIPP_RP. It is unclear whether the path from (0, 0) to (~0.575, ~3.75) at t=1000 is significant on the typical scale of this phase space.

      The purpose of Figure 2 is to illustrate an example of GRN trajectory in transcriptional space, and to illustrate what “interventions” and “perturbations” can be in that context. To that end we have used the fixed initial conditions provided in the BIOMD0000000647, replicating Figure 5 of Cho et al. [56].

      While we are not sure what the reviewer means by the “typical” scale of this phase space, we would like to point the reviewer toward Figure 8, which shows examples of paths that indeed reach further points in the same phase space (up to ~10 in RKIPP_RP levels and ~300 in ERK levels). However, while the paths displayed in Figure 8 are possible (and were discovered with the IMGEP), note that they may be “rarer” to occur naturally, in the sense that a large portion of the initial conditions tested with random search tend to converge toward smaller (ERK, RKIPP_RP) steady-state values similar to the ones displayed in Figure 2.

      (4) Table 2:

      a. Where is 'effective intervention' used in the method?

      b. in my opinion 'controllability', 'trainability', and 'versatility' are different terms. If their correspondence is important I would suggest to extend/enhance the column "Proposed Isomorphism". otherwise, it may be confusing.

      a) We thank the reviewer for pointing out that “effective intervention” is not explicitly used in the method. The idea here is that as we are exploring a complex dynamical system (here the GRN), some of the sampled interventions will be particularly effective at revealing novel unseen outcomes whereas others will fail to produce a qualitative change to the distribution of discovered outcomes. What we show in this paper, for instance in Figure 3a and Figure 4, is that the IMGEP method is particularly sample-efficient in finding those “effective interventions”, at least more than a random exploration. However we agree that the term “effective intervention” is ambiguous (does not say effective in what) and we have replaced it with “salient intervention” in the revised version.

      b) We thank the reviewer for highlighting some confusing terms in our chosen vocabulary, and we have clarified those terms in the revised version. We agree that controllability/trainability and versatility are not exactly equivalent concepts, as controllability/trainability typically refers to the amount to which a system is externally controllable/trainable whereas versatility typically refers to the inherent adaptability or diversity of behaviors that a system can exhibit in response to inputs or conditions. However, they are both measuring the extent of states that can be reached by the system under a distribution of stimuli/conditions, whether natural conditions or engineered ones, which is why we believe that their correspondence is relevant.

      I don't see how this table generalizes "concepts from dynamical complex systems and behavioral sciences under a common navigation task perspective".

      We have replaced the verb “generalize” with “investigate” in the revised version.

      Reviewer #2 (Public Review):

      Summary:

      Etcheverry et al. present two computational frameworks for exploring the functional capabilities of gene regulatory networks (GRNs). The first is a framework based on intrinsically-motivated exploration, here used to reveal the set of steady states achievable by a given gene regulatory network as a function of initial conditions. The second is a behaviorist framework, here used to assess the robustness of steady states to dynamical perturbations experienced along typical trajectories to those steady states. In Figs. 1-5, the authors convincingly show how these frameworks can explore and quantify the diversity of behaviors that can be displayed by GRNs. In Figs. 6-9, the authors present applications of their framework to the analysis and control of GRNs, but the support presented for their case studies is often incomplete.

      Strengths:

      Overall, the paper presents an important development for exploring and understanding GRNs/dynamical systems broadly, with solid evidence supporting the first half of their paper in a narratively clear way.

      The behaviorist point of view for robustness is potentially of interest to a broad community, and to my knowledge introduces novel considerations for defining robustness in the GRN context.

      We thank the reviewer for recognizing the strengths and novelty of the proposed experimental framework for exploring and understanding GRNs, and complex dynamical systems more generally. We agree that the results presented in the section “Possible Reuses of the Behavioral Catalog and Framework” (Fig 6-9) can be seen as incomplete along certain aspects, which we tried to make as explicit as possible throughout the paper, and why we explicitly state that these are “preliminary experiments”. Despite the discussed limitations, we believe that these experiments are still very useful to illustrate the variety of potential use-cases in which the community could benefit from such computational methods and experimental framework, and build on for future work.

      Some specific weaknesses, mostly concerning incomplete analyses in the second half of the paper:

      (1) The analysis presented in Fig. 6 is exciting but preliminary. Are there other appropriate methods for constructing energy landscapes from dynamical trajectories in gene regulatory networks? How do the results in this particular case study compare to other GRNs studied in the paper?

      We are not aware of methods other than the one proposed by Venkatachalapathy et al. [1] for constructing an energy landscape given an input set of recorded dynamical trajectories, although others may exist. We want to emphasize that any such method would in any case depend on the input set of trajectories, and should therefore benefit from a set that is more representative of the diversity of behaviors that can be achieved by the GRN, which is why we believe the results presented in Figure 6 are interesting. As the IMGEP was able to find a higher diversity of reachable goal states (and corresponding trajectories) for many of the studied GRNs, we believe that similar effects should be observable when constructing the energy landscapes for these GRN models, with the discovery of additional or wider “valleys” of reachable steady states.

      Additionally, it is unclear whether the analysis presented in Fig. 6C is appropriate. In particular, if the pseudopotential landscapes are constructed from statistics of visited states along trajectories to the steady state, then the trajectories derived from dynamical perturbations do not only reflect the underlying pseudo-landscape of the GRN. Instead, they also include contributions from the perturbations themselves.

      We agree that the landscape displayed in Fig. 6C integrates contributions from the perturbations on the GRN’s behavior, and that these can shape the landscape in various ways, for instance affecting the paths that are accessible, the shape/depth of certain valleys, etc. But we believe that qualitatively or quantitatively analyzing the effect of these perturbations on the landscape is precisely what is interesting here: it might help 1) understand how a system responds to a range of perturbations and visualize which behaviors are robust to those perturbations, and 2) design better strategies for manipulating those systems to produce certain behaviors.
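      The reviewer's point and this answer both rest on how a landscape is built from the statistics of visited states. The block below is a minimal sketch under an assumed convention, U = -ln P, where P is the fraction of recorded states falling in a grid cell; it is not a reproduction of the construction in Venkatachalapathy et al. [1], and the function name and binning scheme are hypothetical.

```python
import math
from collections import Counter


def pseudopotential(trajectories, bin_width):
    """Estimate U(cell) = -ln P(cell) from states visited along trajectories.

    `trajectories` is a list of trajectories, each a list of 2D states.
    Returns a dict mapping a grid cell (ix, iy) to its pseudopotential.
    Cells that were never visited are absent (their potential is unbounded).
    """
    counts = Counter()
    for trajectory in trajectories:
        for x, y in trajectory:
            cell = (math.floor(x / bin_width), math.floor(y / bin_width))
            counts[cell] += 1
    total = sum(counts.values())
    return {cell: -math.log(n / total) for cell, n in counts.items()}


# Toy input: three visits near (0.1, 0.1) and one near (1.5, 0.2).
toy = [[(0.1, 0.1), (0.1, 0.1), (0.1, 0.1), (1.5, 0.2)]]
landscape = pseudopotential(toy, bin_width=1.0)
```

      Because the estimate depends only on which states were recorded, computing it once from unperturbed trajectories and once from perturbed ones gives two landscapes whose difference reflects the contribution of the perturbations.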

      (2) In Fig. 7, I'm not sure how much is possible to take away from the results as given here, as they depend sensitively on the cohort of 432 (GRN, Z) pairs used. The comparison against random networks is well-motivated. However, as the authors note, comparison between organismal categories is more difficult due to low sample size; for instance, the "plant" and "slime mold" categories each only have 1 associated GRN. Additionally, the "n/a" category is difficult to interpret.

      We acknowledge that this part is speculative as stated in the paper: “the surveyed database is relatively small with respect to the wealth of available models and biological pathways, so we can hardly claim that these results represent the true distribution of competencies across these organism categories”. However, when further data is available, the same methodology can be reused and we believe that the resulting statistical analyses could be very informative to compare organismal (or other) categories.

      (3) In Fig. 8, it is unclear whether the behavioral catalog generated is important to the intervention design problem of moving a system from one attractor basin to another. The authors note that evolutionary searches or SGD could also be used to solve the problem. Is the analysis somehow enabled by the behavioral catalog in a way that is complementary to those methods? If not, comparison against those methods (or others e.g. optimal control) would strengthen the paper.

      We thank the reviewer for asking to clarify this point, which might not be clearly explained in the paper. Here the behavioral catalog is indeed used in a complementary way to the optimization method, by identifying a representative set of reachable attractors which are then used to define the optimization problem. For instance here, thanks to the catalog, we 1) were able to identify a “disease” region and several possible reachable states in that region and 2) used several of these states as starting points of our optimization problem, where we want to find a single intervention that can successfully and robustly reset all those points, as illustrated in Figure 8. Please note that given this problem formulation, a simple random search was used as an optimization strategy. When we mention more advanced techniques such as EA or SGD, it is to say that they might be more efficient optimizers than random search. However, we agree that in many cases direct optimization will not work when starting from a random or bad initial guess, even with EA or SGD. In that case the discovered behavioral catalog can be useful to better initialize this local search and make it more efficient/useful, akin to what is done in Figure 9.

      (4) The analysis presented in Fig. 9 also is preliminary. The authors note that there exist many algorithms for choosing/identifying the parameter values of a dynamical system that give rise to a desired time-series. It would be a stronger result to compare their approach to more sophisticated methods, as opposed to random search and SGD. Other options from the recent literature include Bayesian techniques, sparse nonlinear regression techniques (e.g. SINDy), and evolutionary searches. The authors note that some methods require fine-tuning in order to be successful, but even so, it would be good to know the degree of fine-tuning which is necessary compared to their method.

      We agree that the analysis presented in Figure 9 is preliminary, and thank the reviewer for the suggestion. We would first like to refer to other papers from the ML literature that have more thoroughly analyzed this issue, such as Colas et al. [74] and Pugh et al. [34], and shown the interest of diversity-driven strategies as promising alternatives. Additionally, as suggested by the reviewer, we added a comparison to the CMA-ES algorithm in the revised version in order to complete our analysis. CMA-ES is an evolutionary algorithm which is self-adaptive in the optimization steps and is known to be better suited than SGD to escaping local minima when the number of parameters is not too high (here we only have 15 parameters). However, our results showed that while CMA-ES explores the solution space more than SGD does at the beginning of optimization, it also ultimately converges to a local minimum, similarly to SGD. The best solution converges toward a constant signal (of the target b) but fails to maintain the target oscillations, similar to the solutions discovered by gradient descent. We tried this for a few hyperparameters (initial mean and standard deviation) but always found similar results. We have updated the Figure 9 image and caption, as well as the descriptive text, to include these new results in the revised version. We also added a reference to the CMA-ES paper in the citations.

      Reviewer #1 (Recommendations For The Authors):

      I would suggest conducting a more rigorous analysis of the performance by estimating/approximating the ground truth robust goal sets in important GRNs.

      Also, the use of terminology from different disciplines can be improved. Please see my comments above. Specifically, the connection between controllability in dynamical control systems and versatility used in this paper is unclear.

      We hope to have addressed the reviewer's concerns in our previous answers.

      Reviewer #2 (Recommendations For The Authors):

      Fig 4b: I'm not sure if DBSCAN is the appropriate method to use here, as the visual focus on the core elements of the clusters downplays the full convex hull of the points that random sampling achieves in Z space. An analysis based on convex hulls or the ball-coverage from Fig. 3b would presumably generate plots that were more similar between random sampling and curiosity search. If the goal is to highlight redundancy/non-linearity in the mapping between Z and I, another approach might be to simply bin Z-space in a grid, or to use a clustering algorithm that is less stringent about core/noise distinctions.

      We thank the reviewer for the suggestion. This plot is intended to give the reader an understanding of why a method that uniformly samples goals in Z (what the IMGEP is doing) is more efficient than a method that uniformly samples parameters in I (what the random search is doing) in systems with high redundancy/non-linearity in the mapping between I and Z. We agree that binning the Z-space in a grid and counting the number of achieved bins is a way to measure this quantitatively, and it is very close to what we do in Figure 3 to measure the achieved diversity. We believe, however, that the clustering and coloring provide additional intuition about why this is the case: they illustrate that large regions of the intervention space map to small regions of the outcome space and vice versa.
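
      As a minimal sketch of the grid-binning measure discussed above (not the code used for Figure 3), the fraction of grid cells in Z-space reached by a set of outcomes can be computed as follows. The function names, the bounding box, and the number of bins per dimension are illustrative assumptions.

```python
from typing import Iterable, Sequence, Set, Tuple


def occupied_bins(
    points: Iterable[Sequence[float]],
    lower: Sequence[float],
    upper: Sequence[float],
    bins_per_dim: int,
) -> Set[Tuple[int, ...]]:
    """Return the set of grid cells that contain at least one point.

    Assumes upper[d] > lower[d] for every dimension d (hypothetical bounds
    of the outcome space Z). Points outside the box are clamped to the
    nearest edge cell.
    """
    cells: Set[Tuple[int, ...]] = set()
    for point in points:
        index = []
        for x, lo, hi in zip(point, lower, upper):
            frac = (x - lo) / (hi - lo)
            i = int(frac * bins_per_dim)
            # Clamp so that x == hi (or out-of-range values) stays on the grid.
            i = min(max(i, 0), bins_per_dim - 1)
            index.append(i)
        cells.add(tuple(index))
    return cells


def coverage(
    points: Iterable[Sequence[float]],
    lower: Sequence[float],
    upper: Sequence[float],
    bins_per_dim: int,
) -> float:
    """Fraction of all grid cells that were reached at least once."""
    total_cells = bins_per_dim ** len(lower)
    return len(occupied_bins(points, lower, upper, bins_per_dim)) / total_cells
```

      Under these assumptions, two sampling strategies can be compared by calling `coverage` on the outcomes each one reached, using the same bounds and bin count; many points falling into the same cell add nothing to the score, which is how redundancy in the mapping from I to Z shows up.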

      Additional changes in the revised version:

      We added a sentence to the Methods section and to the caption of Table S1 giving additional details about how we simulate the biological models from the BioModels website.

      We corrected an erroneous reference to Figure 4 in the Methods “Sensitivity measure” subsection; it now refers to Figure 5.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      The process of EMT is a major contributor to metastasis and chemoresistance in breast cancer. By using a modified PyMT model that allows the identification of cells undergoing EMT and their descendants via S100A4-Cre mediated recombination of the mTmG allele, Ban et al. tackle a very important question of how tumor metastasis and therapy resistance by EMT can be blocked. They identified that pathways associated with ribosome biogenesis (RiBi) are activated during transition cell states. This finding represents a promising therapeutic target to block any transition from E to M (activated during cell dissemination and invasion) as well as from M to E (activated during metastatic colonization). Inhibition of RiBi-blocked EMT also reduced the establishment of chemoresistance that is associated with an EMT phenotype. Hence, RiBi blockage together with standard chemotherapy showed synergistic effects, resulting in impaired colonization/metastatic outgrowth in an animal model. The study is of great interest and of high clinical relevance as the authors show that blocking the transition from E to M or vice versa targets both aspects of metastasis, dissemination from the primary tumor, and colonization in distant organs.

      We appreciate the positive acknowledgment of our work.

      The study is done with high skill using state-of-the-art technology and the conclusions are convincing and solid, but some aspects require some additional experimental support and clarification. It remains elusive whether blocking of EMT/MET is necessary for the synergistic effect of standard chemotherapy together with RiBi blockage or whether a general growth disadvantage of RiBi-treated cells independent of blocking transition is responsible. 

      We thank the reviewer for raising this pertinent question about the relationship between EMT/MET blocking by RiBi inhibition and its synergistic effect with chemotherapy drugs. Our experimental data suggest that the synergy may be a consequence of the EMT/MET blockade. Specifically, when assessing the potency of the RiBi inhibitors (BMH21 and CX-5461), we observed a pronounced EMT/MET blocking effect at concentrations below those at which cytotoxic effects emerged (refer to Fig. 4 and Supplementary Fig S8). Notably, the IC50 for BMH21 was approximately 200 nM, a concentration higher than those that produced the EMT/MET blocking effects. Crucially, the enhanced synergy of RiBi inhibitors with chemotherapy drugs was predominantly seen at these lower concentrations (as illustrated in Supplementary Fig S10). Therefore, the EMT/MET blocking by RiBi inhibition, rather than the cytotoxic effect, is likely instrumental for the synergy with chemotherapy drugs. This result is highlighted on page 16.

      How can specific effects on state transition by RiBI block be separated from global effects attributed to overall reduced protein biosynthesis, proliferation etc.? 

      We appreciate the reviewer's insightful query. We agree that RiBi activity and the associated protein synthesis are fundamental to cell viability, which makes it challenging to clearly separate the global effects of RiBi blockade from its specific effects on EMT state transition. Our results showed elevated RiBi activity during the EMT transitioning phases, concomitant with enhanced nascent protein synthesis, indicating that cells require more new protein than usual to switch phenotype. This gives us an opportunity to block EMT/MET transition by targeting the excess RiBi activity. For the same reason, we chose shRNA rather than CRISPR technology to modulate RiBi gene expression. Compared with scrambled controls, the growth rates of the Rps knockdown cells (both RFP+ and GFP+ cells) were not significantly affected, while EMT/MET transitioning was impaired (Supplementary Fig 9). These results suggest that inhibiting the RiBi pathway can uncouple cell proliferation from changes in EMT/MET status.

      Some other aspects are misleading or need extension. 

      Reviewer #1 (Recommendations For The Authors): 

      (1) The analysis of RiBi expression during EMT in Fig. 1K shows that transition states have high RiBi levels, whereas E and M states are low. Analyses of MET in Fig.2G indicate that M states have the lowest, transition states upregulate RiBi while E states have the highest levels of RiBi expression. This is puzzling and how can it be explained? It would be helpful to demonstrate how these two settings are related by combining results from Figs 1 and 2 in an E-Trans-M-Trans-E state graph (in a sequence of EMT/MET). Does it mean that the initial E state starts with lower RiBi and the final E state displays the highest RiBi expression? In other words, are the initial E state and the one after MET different? 

      We thank the reviewer for raising the question of which EMT/MET state exhibits the highest RiBi activity. Following the reviewer's suggestion, we merged the scRNA-seq data of EMT and MET cells and performed a trajectory analysis. Similar epithelial-mesenchymal spectra were detected in these cells (For reviewers Fig 1). Notably, the highest RiBi activity was detected in the early EMT-transitioning or the late MET-transitioning cells (revised For reviewers Fig 1D). To answer the reviewer's question, the initial E state (of EMT cells) did not differ significantly from the final E state (of MET cells) in EMT pseudotime or RiBi activity. In addition, the analysis of the merged cells revealed:

      (1) Both the EMT (In_Vitro_Mix) and MET (In_Vivo_GFP) cells were generally divided into two major clusters representing epithelial and mesenchymal phenotypes (For reviewers Fig 1A, 1B).

      (2) The EMT and MET cells exhibited similar EMT spectrums (EMT/MET status, and pseudotime) in the trajectory analysis (For reviewers Fig 1C, 1D).

      (3) Cells with high RiBi activity were mostly transitioning cells from the EMT (In_Vitro_Mix) population (For reviewers Fig 1D).

      (2) It needs to be elaborated on how the experiment in Fig. 4A was exactly done. Are there cells isolated directly from the autochthonous TriPyMT tumor in contrast to steady-state cultures from Fig. 1? Does the control graph represent 0d in culture or have the cells been cultured for the same amount of time as the treated samples? How are these observed 15% GFP+ cells related to the 15% GFP+ cells obtained at day 0 and 34% at d7 control condition in Fig. 5A?

      We apologize for the unclear description. Following the reviewer’s suggestion, we have amended the figure legend to clarify the experimental setup. In Fig. 4A, we initiated the experiment with sorted RFP+/Epcam+ cells. The control cells were cultured for the same period of time (5 days) as the drug-treated cells. The percentage of GFP+ cells in this experiment is not related to the experiment in Fig 5A, where the initial cell population comprised an unsorted mix of RFP/GFP cells.

      (3) Fig. 4B: Since the bulk population is loaded in the WB, does that suggest that the epithelial state is stabilized/enhanced or does it reflect only different cell ratios? So, it would be important to show the WB for RFP+ and GFP+ cells separately. 

      We thank the reviewer for the query regarding Fig. 4B and apologize for the unclear explanation. The experimental setup for Fig 4B was identical to that of Fig 4A, in which sorted RFP+ cells were used as the starting population. Indeed, the observed increase in epithelial markers and decrease in mesenchymal markers in cells treated with BMH and CX suggest that a higher proportion of cells maintained the RFP+ state.

      Performing WB for RFP+ and GFP+ cells separately may not address the question we asked, since the experiment was initiated with pure RFP+ cells. Also, the expression of the fluorescent markers is closely aligned with the EMT status of the cells, with and without drug treatment.

      (4) Figs. 4-6: The authors claim that there is less EMT under treatment. If the experiment was done over 5 days (as indicated in Fig.4b legend), it is necessary to rule out that shifts in E/M ratios are attributed to the effects of treatment on proliferation/survival affecting both populations differently. How do the same cells grow under treatment when injected orthotopically/subcutaneously? 

      We apologize for the unclear descriptions. The experiments on blocking the EMT transition with RiBi inhibitors were performed with purified RFP+/EpCam+ cells. All GFP+ cells in this experimental setting were derived from RFP+ cells. Given that the fluorescence switch correlated well with the EMT status of the cells, RFP and GFP were used as EMT reporters. Similarly, we used purified GFP+/EpCam- cells as the initial population to study the MET process of tumor cells.

      To address the reviewer's concern regarding how RiBi inhibition may differentially affect the growth of RFP+ and GFP+ cells, we conducted a cell cycle assay using Tri-PyMT cells, which include both RFP+ and GFP+ populations. Our results demonstrated that both RFP+ and GFP+ cells exhibited a trend towards G2/M phase accumulation when treated with BMH21. It is important to note that the impact of BMH21 on the cell cycle was less pronounced than previously reported by Fu et al. (Oncol Rep, 2017). This is likely because the dose used for EMT inhibition in our study was approximately one-tenth of the dose known to inhibit cell growth (For Reviewers Fig 2). Also, no significant differences in these effects were detected between RFP+ and GFP+ cells.

      We have previously characterized the proliferation rate of RFP+ and GFP+ populations (Lourenco et al 2020). RFP+ cells proliferate faster than GFP+ cells. Primary tumor cells derived from RFP+ cells also grew faster than GFP+ tumors (Lourenco et al 2020).

      (5) Fig. 6B: this image is puzzling. Only in the lower two panels the outline of the lung is visualized by DAPI staining. The upper two panels look like there is no lung tissue in ctrl (no DAPI+GFP-RFP- cells) or show almost exclusively DAPI+GFP-RFP- cells that are present in a clustered assembly. Do the latter represent lymphoid cell clusters or normal lung tissue? 

      To improve the clarity of the fluorescent images in Fig 6B, we enlarged the merged images and increased their contrast (Revised Fig. 6B). The DAPI+/RFP-/GFP- regions represent normal lung tissue. Nodules with either RFP or GFP signals represent tumor lesions.

      (6) Text: Several typos and sentences should be revised, including p. 3 "Le et al. discovered" which should read as "Li et al. discovered", p.8 "Vimten", p.10 "Cells were then classified cells into three main categories", GSEA should be spelled out as Gene Set Enrichment Analysis (not Assay), p. 13 "cells, suggesting the impaired MET capability with upon treatment". 

      We apologize for the typos. All were corrected in the revised manuscript.

      (7) Figures: Color gradient indicator in Fig. 1E does not reflect the colors of the cells, Fig. S5A+C are not referenced in the text, there is mislabeling of S5B,C,D in the legend, graph in Fig. 3D is placed two times and overlapping, Fig. 6C labeling needs adjustments, labeling of Fig. 6D should be similar to Fig. 6A: CTX blue and BMH21 green. 

      We apologize for these errors and have corrected them. The color in Fig. 1E represents the EMT status of tumor cells as indicated in the revised figure: red for more epithelial and green for more mesenchymal features. Fig S5 is now Fig S6 and is referenced in the revised manuscript. The figure legends were corrected. The labels of Fig 6 were adjusted.

      Reviewer #2 (Public Review): 

      (1) The current manuscript by Ban et al describes that cells undergoing EMT have increased rRNA synthesis, as analyzed by RNA seq-based gene expression analysis, and that the increased rRNA synthesis provides a therapeutic opportunity to target chemoresistance. The cells utilized in this manuscript were isolated from the authors' Tri-PyMT EMT lineage tracing model published a few years ago which demonstrated that cells undergoing EMT are not the cells that are contributing to metastasis but rather to tumor chemoresistance (Fischer, Nature 2015). This in vivo model has since then been criticized for not capturing all relevant EMT events which the authors also acknowledge in the introduction. The authors therefore reason that they use this lineage tracing model to better understand the role of EMT in chemoresistance. 

      A major problem with the current manuscript is that the authors present many of their findings as novel without proper acknowledgment of previously published literature, in particular Prakash et al., Nature Communications, 2019 and Dermitt, Dev Cell, 2020. In the studies by Prakash, the authors demonstrate that maintaining ongoing rRNA biogenesis is essential for the execution of the EMT program, and thus the ability of cancer cells to become migratory and invasive. Further, Prakash et al showed that blocking rRNA biogenesis with a small molecule inhibitor, CX-5461 (which is also used in the study by Ban et al) specifically inhibits breast cancer growth, invasion, EMT, and metastasis in animal models without significant toxicity to normal tissues. As such a significant revision that is necessary at this time is a rewrite of the manuscript especially the introduction and the discussion to more accurately describe and cite previously published findings and then highlight the current work by Ban et al which nicely builds on the previously published literature as it highlights the contribution of EMT to chemoresistance rather than metastasis. The suggestion for the authors is that they therefore should focus on highlighting the chemotherapy resistance angle as their Tri-PyMT EMT lineage tracing was chosen to test this angle and as such focus on both primary tumor growth and metastasis.

      We appreciate the reviewer’s insightful feedback. In response, we have revised a section of the discussion to better highlight how our study builds upon and extends the work of others. We acknowledge that the link between ribosome biogenesis (RiBi) and the epithelial-mesenchymal transition (EMT) pathway was noted in prior research (Prakash et al. 2019; Ebright et al. 2020), and we have added further discussion of this topic to the revised manuscript. Our findings, however, contribute to this knowledge by showing that RiBi activity increases during both EMT and mesenchymal-epithelial transition (MET), thereby deepening our understanding of its role. Additionally, we have clarified our novel stance on EMT-targeting strategies. Rather than solely targeting the mesenchymal phenotype, we propose that inhibiting the phenotypic switching ability of tumor cells (a round trip encompassing both EMT and MET) could be more effective, as described in the Introduction.

      Additional major revisions: 

      (2) The authors use the FSP1-Cre Model which in the field has been questioned as to not capture all the relevant EMT events and therefore their findings should be corroborated by another EMT model system. 

      We agree with the reviewer that the Fsp1-Cre model cannot capture ALL the relevant EMT events. However, the fidelity and accuracy of the Fsp1-Cre model in reporting the EMT process of Tri-PyMT cells have been demonstrated in our previous studies (Lourenco et al. 2020). We have also included additional results to further characterize this model: 1) continuous fluorescence switching from RFP+ to GFP+ was observed in Tri-PyMT cells (Supplementary Fig S1); 2) bulk RNA-seq data showed differential expression of EMT marker genes between the RFP+ and GFP+ cells (Supplementary Fig S2A); 3) single-cell RNA-seq data showed the EMT spectrum and EMT status distributions according to Fsp1(S100a4)/Epcam and Vim/Krt18 expression (revised Supplementary Fig S3B, 3C). We hope these results address the reviewer’s doubts about the Fsp1-Cre model in reporting EMT of tumor cells. Of note, the evaluation of EMT status with RiBi activity does not rely solely on the fluorescent marker switch but on the EMT-related transcriptome (EMTome) of the Tri-PyMT cells.

      Again, we agree with the reviewer that the Tri-PyMT model does not report ALL relevant EMT events. In the manuscript, we have included experiments with MDA-MB-231-LM2 cells (Fig 6D) and analyzed the sequencing databases of breast cancer patients (revised Supplementary Fig S13, S14) to validate the association between EMT status and RiBi activity.

      (3) In the current version of the manuscript, there are no measurements of rRNA synthesis, but the gene expression profiles are used as a proxy for rRNA synthesis. The authors therefore need to include measurements of rRNA synthesis corroborating the RNA sequencing data to support their scientific findings and claims. This can be accomplished by qPCR, Northern blot, or EU staining of the respective sorted cell population. Quantification of rRNA synthesis is also needed for the CX5461/BMH-21 and silencing studies. 

      We agree that direct measurement of rRNA synthesis is important to validate the association of RiBi activity with the EMT/MET process. Following the reviewer’s suggestion, we performed an EU incorporation assay with RFP+, Double+, and GFP+ Tri-PyMT cells, with and without RiBi inhibitors. Under the treatment-naïve condition, the Double+ (EMT-transitioning) cells exhibited the highest rRNA synthesis activity compared to either RFP+ (E) or GFP+ (M) cells (revised Supplementary Fig S7). Also, as expected, treatment with BMH21 or CX-5461 significantly inhibited rRNA synthesis (revised Supplementary Fig S8B).

      (4) Currently, there is no mechanistic insight as to how rRNA synthesis is increased during EMT, which would also strengthen the manuscript. This could be done through targeted ChIP analysis. 

      The experimental data in the current manuscript suggest that the activation of RiBi is upstream of the EMT process, as an impaired RiBi pathway hinders the EMT of tumor cells. We are uncertain about the suggestion regarding ChIP analysis. If the reviewer refers to ChIP analysis with EMT transcription factors (i.e., Snail, Twist, and Zeb1), it may not elucidate the mechanisms by which the EMT process is associated with rRNA synthesis. Using sorted GFP/RFP double-positive Tri-PyMT cells, we found enhanced activation of the ERK and mTOR pathways in the EMT-transitioning cells (Figure 3A). It is well documented that the ERK and mTOR pathways are key coordinators of EMT (Xie et al., Neoplasia 2004; Shin et al., PNAS 2019; Lamouille et al., J. Cell Sci. 2012; Roshan et al., Biochimie 2019). Interestingly, we also observed significantly higher phosphorylation of rpS6, a downstream indicator of mTOR pathway activation, in the Doub+ cells. Because rpS6 is an indispensable ribosomal protein, its phosphorylation could affect ribosome function in protein translation (Bohlen et al., Nucleic Acids Res. 2021; Mieulet et al., 2007).

      (5) rRNA synthesis has canonically been linked to the cell cycle therefore it will be necessary for the authors to determine the cell cycle state of their respective cell populations throughout the manuscript. 

      Following the reviewer's suggestion, we analyzed the cell cycles of RFP+, GFP+, and Doub+ Tri-PyMT cells. Our analysis revealed that the proportion of proliferating RFP+ cells (in the S phase) was higher than that of proliferating GFP+ cells. Interestingly, the Doub+ cells also exhibited a higher ratio of proliferation, which was significantly greater compared to both RFP+ and GFP+ cells (revised supplementary Figure S1B).

      (6) Statistics and quantifications are currently missing in several figures and need to be better explained throughout the manuscript to strengthen the scientific rigor of the studies. 

      We have improved the clarity of our manuscript. The statistical descriptions of the experiments have been carefully reviewed, and the necessary information has been added to the revised manuscript.

      (7) Only metastasis studies are shown in the current version of the manuscript. These studies should be complemented with primary tumor studies as the main focus of the paper is the contribution of EMT to chemoresistance. 

      We appreciate the reviewer's suggestion regarding the primary tumor studies and apologize for not stating this clearly in our manuscript. In response, we have revised the manuscript to outline the rationale for establishing a competitive model by injecting a mixture of RFP+ and GFP+ cells in a 1:1 ratio via the tail vein. This model is designed to study both EMT and MET processes under chemotherapy at a distal site, where tumor cells need phenotypic switches (both EMT and MET) to adapt to and overcome chemotherapeutic and environmental challenges. We have also studied primary tumor growth with the pre-EMT (RFP+) and post-EMT (GFP+) cells. Their differential contributions to tumor growth were published in another paper (Lourenco et al. Cancer Res 2020).

      Reviewer #2 (Recommendations For The Authors): 

      Figure 1 and associated supplementary figure panels 

      Fig. 1A. More details are needed about the Tri-PyMT model and the induction of EMT in vitro. The authors mention that when growing the isolated cells they spontaneously undergo EMT when grown in 10% FBS. What is the timeline for this transition and how reproducible is it? This information is not clear from Supp. 1. When were cells taken for analysis and also how long is plasticity maintained? According to Supp 1. cell generation 15-21 seems to have a stable cell population of green, red, and yellow cells. Are these cell populations changing if one stimulates the whole cell population with a pro-EMT stimulus? Since cell proliferation is linked to rRNA synthesis the authors also need to include markers of cell cycle for the individual cell population to identify which cell cycle state each sorted cell population is associated with. 

      We thank the reviewer for recommending further analysis of the cell cycle among RFP+, GFP+, and Doub+ cells. As illustrated in the revised Supplementary Figure 1B, a larger proportion of RFP+ cells than GFP+ cells was observed in S phase. Doub+ cells, in turn, demonstrated a proliferation rate even higher than that of RFP+ cells.

      Upon sorting, RFP+ cells were found to spontaneously undergo epithelial-mesenchymal transition (EMT) when cultured in 10% FBS media, thereby converting to GFP+. We quantified the GFP+ cell percentage within the total cell population, noting a consistent transition of a certain proportion of RFP+ cells to EMT, leading to an accumulation of GFP+ cells. This accumulation stabilizes as approximately 60-70% of the entire population become GFP+. Remarkably, re-sorting RFP+ cells from this balanced tumor cell population resulted in a similar fluorescent transition pattern as observed in the parental population. The mechanisms by which tumor cells regulate the EMT phenotypes across the entire population remain unclear. Nevertheless, the equilibrium between RFP+ and GFP+ cells may be attributed in part to the more rapid proliferation of RFP+ cells and the limited proportion of tumor cells undergoing EMT.

      We conducted repeated long-term cultures (up to 20 passages) of the Tri-PyMT cells, yielding consistent results. The fluorescence transition pattern in Tri-PyMT cells proved highly reliable. Further details regarding the Tri-PyMT cells have been incorporated into the Methods section.

      Fig. 1B. The loading control is not even and quantification is missing, in the text, it states Vimten instead of Vimentin. 

      The lower loading of Doub+ cells was due to the limited number of EMT-transitioning cells we could purify by flow sorting. Even so, the expression of both epithelial and mesenchymal markers in the Doub+ cells was clear. In the revised manuscript, we have quantified the Western blot results. We also apologize for the typographical errors and have corrected the spelling of "Vimentin."

      Fig. 1K. In this figure, the authors write: 'It is worth noting that with the 2-phase classifications (Epi or Mes), the elevated RiBi activity was associated with the transitioning cells still exhibiting overall epithelial phenotypes; RiBi activities diminished as cells completed their transition to the mesenchymal phase'. But in Fig. 1K, the Ribi activity is already at a peak during the epithelial state and starts declining already at the beginning of the transition, can the authors please explain this data a bit more? The finding that ribosome biogenesis diminishes once the cells have completed their transition was shown in Prakash et al, Fig. 1 J, I, and accordingly their scientific findings should be discussed in the context of published work. 

      We acknowledge the reviewer's concerns regarding the comparison of the timeline for EMT in our model with that in Prakash's study. In our model, EMT-transitioning cells are identified by their EMT marker genes and fluorescence expression. We enriched the EMT-transitioning cells by sorting the Doub+ cells. Due to the RFP protein's half-life, cells remain RFP+ for 2-3 days after the reporter cassette has switched to GFP expression. In Prakash's study, the EMT transitioning phase was defined by the duration of TGF-β stimulation.

      In Figure 1K, cells are categorized based on their EMT pseudotime, calculated from their expression of EMT marker genes in the EMTome. Ribosome biogenesis (RiBi) activity is highest in cells transitioning between phase 1 (Red) and phase 2 (Green), with both phases displaying predominantly epithelial phenotypes (Figures 1C, 1D, and 1E). RiBi activity declines in cells in phases 4, 5, and 3, which exhibit a mesenchymal phenotype. We have expanded the discussion to include more details in comparison with Prakash's study in the revised manuscript.

      Supp Fig S4. The authors should provide a rationale for how and why the specific marker genes were selected to calculate the AUC values. 

      We chose the specific EMT marker genes based on their overall expression levels in Tri-PyMT cells, ensuring consistency with the associations between their expression patterns and epithelial or mesenchymal phenotypes reported in the literature. We provide a detailed rationale for the selection of these genes in the Methods section of the revised manuscript (Page #7).

      Figure 2 and associated supplementary figure panel. In this figure, rRNA synthesis needs to be evaluated in the cells isolated from the lungs to corroborate the RNA sequencing findings. 

      Following the reviewer’s suggestion, we performed RT-PCR for RiBi-related genes including Bop1, Gemin4, Its1, Its2, Npm1, Rpl8, Rpl29, Rps9, Rps24, Rps28, Polr1a, Setd4, Utp6, and Xpo1. Consistent with the bulk and single-cell RNA sequencing, relatively higher expression of RiBi-related genes was detected in Doub+ cells compared with RFP+ and GFP+ cells (revised Supplementary Fig S5).

      Fig 2C, as per figure Supp Fig S4 please explain the rationale for how and why the specific marker genes were selected. 

      The same marker genes were used to calculate the EMT AUC value as in Fig. 1. These marker genes were selected because their overall expression levels are readily detectable in Tri-PyMT cells, their expression patterns are consistent with their epithelial or mesenchymal phenotypes, and the associations between marker gene expression and phenotype are in line with previous reports in the literature. A description of the AUCell value quantification was included in the revised manuscript (Page #7).

      Fig. 2G. The high Ribi during the epithelial state is most likely due to the resumption of cell proliferation of these cells. The authors should check the cell cycle states of these different sets of cells. 

      We agree with the reviewer that higher RiBi activity could be related to the resumption of cell proliferation of mesenchymal tumor cells. To clarify this, we revisited the scRNA-seq data and projected the S phase score onto the scatter plot of RiBi activity versus MET pseudotime. Indeed, cells in the far mesenchymal state showed a low S phase score, while the proliferating cells were mostly detected in the MET-transitioning phase and the epithelial phase (revised Supplementary Figure S6D).

      Suppl Fig. 5 Please correct the figure legends as there is no figure D. 

      We apologize for the mislabeling. We have corrected the figure legend accordingly.

      Figure 3. Please explain the rationale for stimulating cells with FBS for the selected time points. 

      Fig. 3A. The loading control is not even, and quantification is missing. In addition, the authors should explain why the different time points were chosen and why FBS was chosen as a stimulus. In addition, from which passage of cells were these cells? 

      RFP+ Tri-PyMT cells undergo EMT and switch their fluorescent marker expression to GFP when cultured with FBS. To investigate how cells at different EMT statuses respond to an FBS-enriched environment, we isolated RFP+, Doub+, and GFP+ cells from the 4th and 5th passages of Tri-PyMT cells and probed downstream signaling pathways after FBS stimulation. The stimulation time points, spanning 10 minutes to 1 hour, were chosen according to the innate activation profile of these phosphorylation-dependent signals. We noted that ERK signaling activation in RFP+ cells occurred within minutes of FBS exposure and diminished within approximately one hour. This ERK signal was more pronounced and persisted longer in Doub+ cells. In contrast, GFP+ cells exhibited a more transient and lower ERK activation (see revised Fig 3A). To address concerns regarding potential uneven loading in our previous assays, we have now included the quantification of Western blots in the revised Fig 3A.

      How and why were ERK and mTORC1 pathways chosen for analysis downstream of increased rRNA synthesis? ERK and mTORC1 have mostly been investigated in the role of cell proliferation which is why the cell cycle status of these cell populations will be important to consider in the context of their findings. 

      The regulation of ribosome biogenesis (RiBi) is mediated by multiple pathways, including the myelocytomatosis oncogene (Myc), mammalian targets of rapamycin (mTOR), and noncoding RNAs, as detailed by Jiao et al. in Signal Transduction and Targeted Therapy (2023). There was no significant difference in Myc expression between tumor cells with epithelial and mesenchymal phenotypes. We thus investigated the activation of the mTOR pathway in sorted RFP+, Doub+, and GFP+ cells. Additionally, given the recognized role of the ERK/MAPK signaling pathway in regulating protein synthesis and cell proliferation, we also analyzed the activation of ERK signals. 

      In alignment with the reviewer's observation regarding the potential correlation between cell proliferation rate and RiBi activation, we further characterized the cell cycle distributions of RFP+, Doub+, and GFP+ cells. Notably, the Doub+ cells exhibited a higher ratio of cells in the proliferative state (including S and G2/M phases) compared to RFP+ and GFP+ cells. In addition, a higher percentage of S-phase cells was detected in RFP+ cells than in GFP+ cells (revised Supplementary Figure S1B).

      Figure 3 B, C, D. Please provide more information about which cells are analyzed in this figure. 

      We apologize for the previous ambiguity regarding the cells analyzed in these figures. To clarify, the figure legend has been revised to specify that Tri-PyMT cells from the 5th to 10th passages were the subjects of analysis for cell size and nascent protein synthesis, utilizing flow cytometry.

      Figure 3D. The selected images show enlarged nucleoli/ fibrillarin which is an indicator of increased rRNA synthesis however, the authors need to show an increase in rRNA transcripts by q-PCR or Northern blot and also show EU staining in these different cell states to support their claim. 

      We appreciate the reviewer's recommendation to further validate the enhanced ribosome biogenesis (RiBi) in Doub+ cells. In response, we conducted RT-PCR analysis of several RiBi-related genes (revised Supplementary Fig S5). Additionally, we carried out an EU incorporation assay to illustrate the rRNA transcription activity within these cells. The new results have been incorporated into the revised manuscript (Supplementary Fig S7).

      Figure 4 and associated supplementary. In this figure, the authors show that using small molecule Pol I assembly inhibitors (BMH-21 and CX-5461) reduces the expression of mesenchymal proteins. As mentioned in previous comments these results should be put in the context of published work by Prakash et al which demonstrate that upon CX-5461 and genetic silencing of Pol I EMT is hampered as demonstrated by gene expression profiles as well as functional assays. 

      We revised the description of our experiments with Pol I inhibitors in the revised manuscript by including the citation context (Prakash et al Nat Commun, 2019) as mentioned above.  

      Figure 4A. Please provide an explanation of how the doses of Pol I assembly inhibitors were determined and also the selected time points. The Pol I assembly inhibitors should have an effect within a few hours (Drygin, Cancer Research, 2011, Peltonen, Cancer Cell, 24). The authors also need to show that the BMH-21 and CX5461 at selected doses are indeed inhibiting rRNA synthesis in the selected cell populations. The data would also be strengthened by performing ChIP analysis demonstrating that indeed the Pol I complex is disassociated from the rDNA genes upon inhibition. 

      In addition, why are there only 2 reports and how were the statistics done? Were the data normalized to the total number of cells? The graph visually shows a difference in cell numbers. Are cells dying at this concentration? More controls must be included including markers for cell stress, p53, autophagy, and apoptosis. 

      The doses of the Pol I inhibitors were selected based on prior studies, as noted by the reviewer. Peltonen et al. demonstrated that BMH-21 inhibits growth across a wide spectrum of cancer cell lines, achieving a mean half-maximal inhibition of cell proliferation (GI50) at 160 nM (Peltonen K., et al. Cancer Cell. 2014). Consistently, in our experiments, the growth inhibitory effect of BMH-21 on Tri-PyMT cells fell within this range, at approximately 200 nM (Fig 5B, Supplementary Fig S10). 

      To address the reviewer's suggestion and verify that the RiBi inhibitors effectively inhibit rRNA synthesis in our study, we conducted an EU incorporation assay. This assay revealed significant inhibition of rRNA synthesis by BMH-21 and CX5461 in Tri-PyMT cells (revised Supplementary Fig S8B). Furthermore, to enhance the robustness of our findings, we repeated the BMH-21 treatment on sorted RFP+ Tri-PyMT cells across three biological replicates, which yielded consistent results.

      Figure 4B. How many replicates were done for this experiment and please provide quantification as per previous comments on WB experiments. The authors should provide a rationale for why Snail and Vimentin were chosen for these studies. Also, the authors should provide a functional assay and demonstrate that cells are less migratory post-treatment and not only markers. 

      Western blots with sorted Tri-PyMT cells were performed twice. We have added the quantification of these blots in the revised manuscript. Snail and Vimentin were chosen to indicate EMT phenotype switches because they are well-studied and commonly used mesenchymal markers of EMT. The association between the fluorescent marker switch and EMT phenotypes such as cell migration was well established in our previous studies (Fischer et al., 2015, Lourenco et al., 2020). The morphology and migration properties of GFP+ cells were clearly distinguishable from those of their RFP+ counterparts. Also, following the reviewer's suggestion, we performed a migration assay with BMH21 treatment (revised Supplementary Fig 8C). Indeed, treatment with BMH21 or CX5461 inhibited cell migration as expected.

      Supplementary figure 7. The authors need to provide a rationale as to why the two Rps were chosen to inhibit ribosome biogenesis. 

      The two Rps targets were chosen based on their differential expression in Doub+ cells compared with RFP+ and GFP+ cells. We also considered the overall expression level of these genes in Tri-PyMT cells. We have edited the corresponding text in the revised manuscript.

      Figure S7B. In the images shown there does not appear to be a significant change in the number of nucleoli however the cells seem to be smaller. This should be explained. 

      We agree with the reviewer that the box plot does not clearly show the nucleoli differences between these cells. We now present the data as a violin plot, which shows the result more clearly (revised Supplementary Fig S9B). It is also true that the Rps knockdown cells were relatively smaller than control cells. This is consistent with the finding that EMT-transitioning cells were larger than non-transitioning cells (Fig 3B).

      Figure 5 and Supp 8. The authors should provide the background as to why the specific chemotherapeutic drugs were chosen. 

      The chemotherapeutic agents employed in this study are widely used in the treatment of breast cancer. For instance, Cyclophosphamide (CTX) hampers both DNA replication and RNA transcription; Doxorubicin inhibits DNA replication by disrupting topoisomerase activity; Paclitaxel prevents cell division by stabilizing microtubules; and 5-Fluorouracil (5-FU), a pyrimidine analog, blocks thymidylate synthase, thereby disrupting DNA synthesis. Additionally, some of these agents, such as CTX and 5-FU, may directly or indirectly affect RNA polymerase, prompting us to investigate the synergistic effects of these drugs when used in combination with BMH21. We have included this information in the revised manuscript. 

      Fig 5B/Supp 8. Can the authors please explain why only 2 replicates were done and provide a rationale for future statistics? 

      Because serial concentrations of each drug were tested (6 doses for BMH21 and 8 doses for CTX), it was practical to arrange the experiment in duplicate on 96-well plates. For the statistical analysis, we conducted dose-response analysis to determine the IC50 values for each drug alone and in combination. Additionally, we calculated the synergy score to assess the interactions between the drugs. The methodology section of the revised manuscript has been expanded to describe these procedures more clearly.
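
      To make the two quantities named above concrete, the following minimal sketch estimates an IC50 by interpolating on log-dose and computes an excess over the Bliss independence expectation as one possible synergy score. The Bliss model, the interpolation approach, the dose grid, and all viability values are assumptions for illustration only; the response does not state which synergy model or software was used, and none of the numbers are experimental data.

```python
import math


def ic50_by_interpolation(doses, viability):
    """Estimate the dose giving 50% viability.

    doses: ascending, positive concentrations.
    viability: fraction of untreated control (0-1) at each dose.
    Interpolates linearly in log10(dose) between the two doses that
    bracket 50% viability; returns None if the curve never crosses 50%.
    """
    points = list(zip(doses, viability))
    for (d_lo, v_lo), (d_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 0.5 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 0.5) / (v_lo - v_hi)
            log_ic50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_ic50
    return None


def bliss_excess(inhib_a, inhib_b, inhib_combo):
    """Observed combined inhibition minus the Bliss independence expectation.

    All inputs are fractional inhibitions (0-1). A positive value means the
    combination inhibited more than expected from independent action.
    """
    expected = inhib_a + inhib_b - inhib_a * inhib_b
    return inhib_combo - expected


# Hypothetical duplicate-averaged viabilities for a 6-dose series (nM).
doses_nm = [25, 50, 100, 200, 400, 800]
viability = [0.95, 0.85, 0.70, 0.50, 0.30, 0.15]

ic50 = ic50_by_interpolation(doses_nm, viability)

# Hypothetical inhibitions: drug A alone, drug B alone, and the combination.
excess = bliss_excess(0.30, 0.40, 0.75)
```

      With only two technical replicates per dose, a sketch like this relies on the shape of the whole dose series rather than on per-dose significance tests, which is the rationale given above for the plate layout.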

      Figure 6. The authors should provide a rationale of why tail veins were chosen as their in vivo model system as the EMT cells do not cause metastasis and if chemoresistance is the main focus of their studies both primary and secondary tumors should be considered. Why was not the MMTVPyMT mouse model chosen where the cells were originally isolated from to test the role of the dual treatment? How was the drug concentration decided and the interval of treatments? 

      We acknowledge the reviewer's concerns regarding the choice of experimental setup for our metastasis model. Using the original MMTV-PyMT mice for the combination therapy experiment would certainly be the ideal scenario. However, there are potential drawbacks to using these transgenic mice: 1) Multiple primary tumors develop without synchronized timelines (in mice aged 6-9 weeks), and lung metastases also develop asynchronously (from 10-16 weeks of age). This leads to uncontrollable variation in the experimental setup, particularly when establishing multiple treatment groups; 2) Gathering a sufficient number of female transgenic mice of a similar age poses another challenge; 3) The absence of tumor cell labeling complicates assays of EMT/MET phenotype changes during tumor progression. Consequently, we chose to employ our Tri-PyMT model for this experiment. The drug treatment protocol was established after reviewing the literature on the in vivo application of CTX and BMH21 treatment (Peltonen et al. Cancer Cell 2014; Jacobs et al. JBC 2022).

      Figure 6B, C. The authors should provide quantification for these data, how many mice were analyzed, and how many sections were stained and analyzed. 

      We have improved the quality of these fluorescent images and clarified in the legend the methodology for obtaining them, including the number of mice and sections per group. To quantify the differential impact of BMH21 on RFP+ and GFP+ tumor cells, we performed flow cytometry (revised Supplementary Fig S11). We have also changed the presentation of these flow data to improve the clarity of the results. 

      Fig 6D. How were the treatment timeline and dosing chosen? LM2 cells are derived from a metastatic site, so they are not transitioning cells they are stably mesenchymal why was this chosen as their in vivo model? 

      LM2 cells were derived from a lung metastasis of the MDA-MB-231 cell line. These cells exhibit a predominantly mesenchymal phenotype in culture. As they grew into metastases in the lung, expression of epithelial markers such as E-cad was upregulated (Supplementary Fig S12), suggesting that a MET process may be involved in the outgrowth of lung metastases. Therefore, we chose LM2 cells as our experimental model for assessing the effect of the RiBi inhibitor on MET. The treatment timeline was determined based on previous studies of BMH21 and chemotherapy applications in vivo (Peltonen et al. Cancer Cell 2014; Jacobs et al. JBC 2022).  

      Reviewer #3 (Public Review): 

      Summary: 

      Ban et al. investigated the role of ribosome biogenesis (RiBi) in epithelial-to-mesenchymal transition (EMT) and its contribution to chemoresistance in breast cancer. They used a Tri-PyMT EMT lineage-tracing model and scRNA-seq to analyze EMT status and found that RiBi was elevated during both EMT and mesenchymal-to-epithelial transition (MET) of cancer cells. They further revealed that nascent protein synthesis mediated by ERK and mTOR signaling pathways was essential for the completion of RiBi. Inhibiting excessive RiBi impaired EMT and MET capability. More importantly, combinatorial treatment with RiBi inhibitors and chemotherapy drugs reduced metastatic outgrowth of both epithelial and mesenchymal tumor cells. These results suggest that targeting the RiBi pathway may be an effective strategy for treating advanced breast cancer with EMT-related chemoresistance. 

      Strengths: 

      The conclusions of this study are generally supported by the data. However, some weaknesses still exist as mentioned below. 

      Weaknesses: 

      (1) The study predominantly focused on RiBi as a target for overcoming EMT-related chemoresistance. Thus, it will be necessary to provide some canonical outcomes after upregulating ribosome biogenesis, such as translation activity. I would suggest ribosome profiling or puromycin-incorporation assay, or other more suitable experiments. 

      EU incorporation assay (revised Supplementary Fig S7) and puromycin incorporation assay (Fig 3C) were performed.

      (2) The results were basically obtained from mice and in vitro experiments. While these results provide valuable insights, it will be valuable to validate part of the findings using some tissue samples from patients (e.g. RiBi activity) to determine the clinical relevance and potential therapeutic applications.  

      We agree. We have added the analyses on the correlation between patients’ survival and RiBi activation (revised Supplementary Fig S13, S14).

      (3) The results revealed that mTORC1 and ERK mediated RiBi activation. How about mTORC2? It will be informative to evaluate mTORC2 signaling. 

      We investigated the role of the mTORC1 pathway in regulating RiBi activation. It is pertinent to acknowledge that the mTORC1 complex is known to positively regulate protein synthesis through the phosphorylation of ribosomal protein S6 kinase, among other mechanisms. Additionally, Rps6 is recognized as an essential component of the 40S subunit in the ribosome. We agree with the reviewer that mTORC2 may also be involved in RiBi activity, as its activation is mediated through ribosome association (Zinzalla et al., Cell 2011; Prakash et al., Nat Comm 2019). However, this association is more likely to be downstream of RiBi activation, as the RiBi inhibitor CX5461 can block the translocation of Rictor into the nucleus (Prakash et al., Nat Comm 2019).

      We also revisited our sequencing data of RFP+, GFP+, and Doub+ cells. While there was no significant change in the expression of either Rptor or Rictor among these cells, the LSMean (overall expression level) of Rptor was higher than that of Rictor; for example, 163.77 vs 29.95 in RFP+ cells. This suggests that mTORC1 may play a dominant role in regulating RiBi activity in our model.

      Furthermore, we analyzed how rapamycin (an mTORC1 inhibitor) affects the EMT process in Tri-PyMT cells. As expected, rapamycin-treated cells exhibited higher expression of the epithelial marker E-cadherin (Ecad) and lower expression of the mesenchymal markers Snail and Vimentin (Vim) compared to the control (For Reviewers Figure 3).

      (4) The results also demonstrated promising synergic effects of Pol I inhibitor (BMH21) and chemotherapy drug (CTX) on chemo-resistant metastasis. How about using the inhibitors of mTORC1 together with CTX? 

      Several mTOR inhibitors (e.g., sirolimus, temsirolimus, ridaforolimus) have demonstrated antitumor activity. The combination of mTOR inhibitors with various targeted therapies or chemotherapies is being examined in numerous clinical trials, showing promising results. Although combination therapy with mTOR inhibitors and CTX is beyond the scope of our study, we analyzed how mTOR inhibition may affect the EMT process in our model, as mentioned above. Western blot analysis of EMT markers (E-cadherin, Snail, and Vimentin) showed that rapamycin treatment inhibited the EMT of Tri-PyMT cells (For Reviewers Figure 3).

      (5) While the results demonstrate the potential efficacy of RiBi inhibitors in reducing metastatic outgrowth, other factors and mechanisms contributing to chemoresistance may exist and need further investigation. I would suggest some discussion about this aspect. 

      Following the reviewer’s suggestion, we have expanded the discussion section to include additional future directions. 

      Reviewer #3 (Recommendations For The Authors): 

      (1) Please provide the quantified data for all western blots, rather than solely show some representative blots. 

      We have quantified the western blot images, as shown in the revised figures. We thank the reviewer for the suggestion.  

      (2) Please add a graphic abstract or schematic to help the readers understand the whole story. 

      We have summarized a schematic graph of our findings in the revised manuscript (Supplementary Fig S15).

      (3) It is hard to read the numbers inside all plots of flow cytometry. 

      High-resolution figures of flow plots are included in the revised manuscript.

      (4) Please provide high-resolution figures for all the synergy plots.

      High-resolution figures of synergy plots are included in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, Wang et al. demonstrate that knockdown of DYRK1A results in reduced cell size, which is mediated by mTORC1 activity. They found that DYRK1A interacts with TSC1/TSC2 proteins which leads to the phosphorylation of TSC2 at T1462. Phosphorylation of TSC2 at T1462 inhibits TSC2 activity leading to the activation of mTORC1. The authors complement their findings by demonstrating that overexpression of RHEB (positive regulator of mTORC1) rescues the phenotype of DYRK1A (mnb in flies) mutation in the NMJ.

      The authors' findings on the regulation of cell size and mTORC1 activity by DYRK1A reflect the previous findings of Levy et al. (PMID: 33840455) that cortical deletion of Dyrk1a in mice causes decreased neuronal size associated with a decreased activity of mTORC1 that can be rescued by the inhibition of Pten or supplementation of IGF1.

      The authors demonstrate that T1462 phospho-site at TSC2 is phosphorylated in response to the overexpression of WT but not kinase-dead DYRK1A. However, the authors do not provide any evidence that the regulation of mTORC1 is mediated via phosphorylation of this site. In addition, T1462 site is known to be phosphorylated by Akt. There is a possibility that Akt was co-purified with TSC1/TSC2 complex and DYRK1A promotes phosphorylation of TSC2 indirectly via the activation of AKT that can be tested by using AKT depleted cells.

      We thank the reviewer for reviewing this manuscript and for the critical comments. Various groups have reported the significance of TSC2 T1462 phosphorylation, along with four other phosphorylation sites, in regulating mTORC1; we therefore did not address this in the current manuscript (Manning et al. PMID: 12150915, Inoki et al. PMID: 12172553, Zhang et al. PMID: 19593385). Regarding co-purification of AKT with TSC1/TSC2: AKT phosphorylates T1462, S939, and S1387 (Manning et al. PMID: 12150915, Inoki et al. PMID: 12172553, Zhang et al. PMID: 19593385). However, in our in vitro kinase assay, the anti-TSC2 S939 and S1387 signal intensities showed no significant difference with or without ATP, suggesting that AKT is not pulled down with TSC1 or TSC2. DYRK1A and kinase-dead DYRK1A were expressed in and purified from bacteria. Moreover, multiple studies have purified TSC1 and TSC2 and reported that no AKT co-purified (Menon et al. PMID: 24529379, Chong-Kopera et al. PMID: 16464865).

      RHEB is the most proximal regulator of mTORC1 and can activate mTORC1 even under amino acid starvation. The fact that RHEB overexpression rescues the cell size under DYRK1A depletion or mnb (DYRK1A in Drosophila) mutant phenotype does not prove that DYRK1A regulates the cell size via TSC1 as it would rescue any inhibitory effects upstream to mTORC1.

      We agree with the reviewer that overexpression of RHEB may rescue any inhibitory effect upstream of mTORC1. In the results and discussion sections (page 7, last three lines), we state that RHEB overexpression only supports our suggestion that DYRK1A likely works upstream of RHEB. We have, however, performed another experiment to strengthen our hypothesis. We show that the increased cell size phenotype caused by DYRK1A overexpression can be suppressed by inhibiting the TORC1 pathway, suggesting that mTORC1 is necessary for DYRK1A-mediated cell growth. These results are presented in Supplementary Figure 4. The results of the two reciprocal experiments (suppression of DYRK1A/Mnb loss-of-function phenotypes by RHEB overexpression, and suppression of DYRK1A gain-of-function phenotypes by mTORC1 inhibition), together with the regulation of TSC phosphorylation by DYRK1A, strongly suggest that DYRK1A positively regulates the TSC pathway.

      Reviewer #2 (Public Review):

      This study aims to describe a physical interaction between the kinase DYRK1A and the Tuberous Sclerosis Complex proteins (TSC1, TSC2, TBC1D7). Furthermore, this study aims to demonstrate that DYRK1A, upon interaction with the TSC proteins regulates mTORC1 activity and cell size. Additionally, this study identifies T1462 on TSC2 as a phosphorylation target of DYRK1A. Finally, the authors demonstrate the role of DYRK1A on cell size using human, mouse, and Drosophila cells.

      This study, as it stands, requires further experimentation to support the conclusions on the role of DYRK1A on TSC interaction and subsequently on mTORC1 regulation. Weaknesses include, 1) The lack of an additional assessment of cell growth/size (eg. protein content, proliferation), 2) the limited data on the requirement of DYRK1A for TSC complex stability and function, and 3) the limited perturbations on the mTORC1 pathway upon DYRK1A deletion/overexpression.

      We thank the reviewer for reviewing this manuscript and for the comments. We have previously analyzed the effect of DYRK1A knockdown on the proliferation of THP cells (human leukemia monocytic cell line) (Li Shanshan et al. PMID: 30137413) and have shown that DYRK1A knockdown negatively affects cell proliferation. Other studies have also shown a role for DYRK1A in cell proliferation, including in foreskin fibroblasts (Chen et al. PMID: 24119401) and HepG2 cells (Frendo-Cumbo et al. PMID: 36248734). mTORC1 regulates several pathways, including protein synthesis, lipid synthesis, nucleotide synthesis, autophagy, and stress responses. We have not measured protein content because this parameter is directly affected by TORC1 activation and may not be a suitable measure of cell growth. A large number of studies of mTORC1 regulation analyze the levels of S6K and S6 phosphorylation, as these are direct readouts of mTORC1 function (Prentzell et al. PMID: 33497611, Zhang et al. PMID: 17052453, Ben-Sahra et al. PMID: 23429703, Düvel et al. PMID: 20670887, Zhang et al. PMID: 25043031). Therefore, we used these markers to assess the status of the mTORC1 pathway.

      (2) ..the limited data on the requirement of DYRK1A for TSC complex stability and function,

      We agree with this limitation in our study. We have not seen a significant difference in TSC1 or TSC2 protein levels in DYRK1A knockdown or overexpressing cells, so we did not follow up on this aspect.

      ..and 3) the limited perturbations on the mTORC1 pathway upon DYRK1A deletion /overexpression.

      We have performed an additional experiment in which we overexpressed DYRK1A and showed that the increased cell size phenotype caused by DYRK1A overexpression can be suppressed by inhibiting the TORC1 pathway, suggesting that mTORC1 is necessary for DYRK1A-mediated cell growth. These results are presented in Supplementary Figure 4. The results of the two reciprocal experiments (suppression of DYRK1A/Mnb loss-of-function phenotypes by RHEB overexpression, and suppression of DYRK1A gain-of-function phenotypes by mTORC1 inhibition), together with the regulation of TSC phosphorylation by DYRK1A, suggest that DYRK1A positively regulates the TSC pathway.

      Finally, this study would benefit from identifying under which nutrient conditions DYRK1A interacts with the TS complex to regulate mTORC1. The interaction described here is highly impactful to the field of mTORC1-regulated cell growth and uncovers a previously unrecognized TSC-associated interacting protein. Further characterization of the role that DYRK1A plays in regulating mTORC1 activation and the upstream signals that stimulate this interaction will be extremely important for multiple diseases that exhibit mTORC1 hyper-activation.

      We agree that identifying the nutrients (or physiological conditions) that affect DYRK1A-mediated TSC regulation will be important for understanding the additional complexity of context-dependent mTORC1 activation and deactivation. This study has not addressed those issues, mainly because of DYRK1A's pleiotropic nature. DYRK1A has many substrates, and both overexpression and loss of DYRK1A lead to multiple phenotypes. The nutrient conditions or growth factors that regulate DYRK1A activation are not yet known, and identifying them would require an independent investigation.

      Reviewer #3 (Public Review):

      The manuscript describes a combination of in vitro and in vivo results implicating Dyrk1a in the regulation of mTORC. A particular strength of the data is this combination of cell-based and whole-animal (Drosophila) studies. However, most of the experiments seem to lack a key additional experimental condition that could increase confidence in the authors' conclusions. Overall, some tantalizing data are presented. However, there are several issues that should be clarified or otherwise addressed with additional data.

      We thank the reviewer for reviewing and commenting on this manuscript.

      (1) In Figure 1G, why not test overexpression levels of Dyrk1a via western rather than only looking at the RNA levels?

      Induced overexpression of DYRK1A was assessed by analyzing mRNA levels, because the concentrations of doxycycline used (0-100 ng/ml) did not produce enough protein to be detected by an anti-FLAG antibody in a western blot. We have modified the sentence accordingly (page 5, paragraph 1).

      (2) In Figure 2, while there is clearly TSC1 protein in the Dyrk1a and FLAG-Dyrk1a IPs that supports an interaction between the proteins, it would be good to see the reciprocal IP experiment wherein TSC1 or TSC2 are pulled down and then the blot probed for Dyrk1a.

      In the revised manuscript, we have provided evidence that TSC1 and TSC2 can interact with endogenous DYRK1A. We have performed immunoprecipitation of affinity-tagged TSC1 or TSC2 and have probed for the enrichment of DYRK1A (Supplementary Figure S2).

      (3) Figures 3 A and D tested the effects of Dyrk1a knockdown using different methods in different cell lines. This is a reasonable approach to ascertain the generalizability of findings. However, each experiment is performed differently. For example, in 3A, the authors found no difference in baseline pS6, so they did a time course of treatment to induce phosphorylation and found differences depending on Dyrk1a expression. In 3D, they only show baseline effects from the CRISPR knockdown. Why not do the time course as well for consistency? Also, why the inconsistency in approaches wherein one shows baseline effects and the other does not? The authors could also consider the pharmacologic inhibition of Dyrk1a activity as well.

      We agree that different methods were used in different cell lines to assess the effect of DYRK1A. Since DYRK1A is a pleiotropic gene, its manipulation has diverse effects in different cell lines. Also, not all cell types have similar levels of mTORC activity. Hence, we had to adopt different strategies in different cell types, which accounts for the inconsistency in the methodology. However, various groups have used both approaches to determine mTORC1 activity from S6 and S6K phosphorylation: starvation followed by stimulation, and direct measurement in cycling cells (Prentzell et al. PMID: 33497611, Zhang et al. PMID: 17052453, Ben-Sahra et al. PMID: 23429703, Düvel et al. PMID: 20670887, Zhang et al. PMID: 25043031). shRNA-mediated knockdown in HEK293 cells does not change S6 or S6K phosphorylation levels in actively growing cells, whereas cycling NIH3T3 cells show a significant reduction in S6 and S6K phosphorylation. As suggested, we attempted pharmacological inhibition of DYRK1A by treating HEK293 cells with 1 µM harmine and then starving them. However, the treated and starved cells began to float and die in large numbers. Thus, we did not pursue this experiment further.

      (4) In Figure 4, RHEB overexpression increases cell size in both Dyrk1a wt and Dyrk1a shRNA treated cells, although the magnitude of the effect appears reduced in Dyrk1a shRNA cells. However, there is the possibility here that RHEB acts independently of Dyrk1a. Why not also do the experiment of Figure 1 wherein Dyrk1a is overexpressed and then knockdown RHEB in that context? If the hypothesis is supported, then RHEB knockdown should eliminate the cell size effect of Dyrk1a overexpression.

      We thank the reviewer for suggesting this experiment. We overexpressed DYRK1A using the inducible HEK293A-Flag-DYRK1A overexpression system and treated the cells with mTOR inhibitors (Rapamycin or Torin1). The results have been added as Supplementary Figure S4. Our results show that the increased cell size phenotype caused by DYRK1A overexpression can be suppressed by inhibiting the TORC1 pathway. This suggests that mTORC1 is necessary for DYRK1A-mediated cell growth. These data further support the hypothesis that DYRK1A is a positive regulator of the mTORC1 pathway.

      (5) The discussion should incorporate relevant findings from other models, such as Arabidopsis. Barrada et al., Development (2019), 146 (3).

      We have incorporated the findings from Arabidopsis (Barrada et al., Development (2019), 146 (3) PMID: 30705074) in the last paragraph of the discussion section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) To demonstrate that DYRK1A can phosphorylate T1462 phospho-site at TSC2 in the absence of Akt using genetic and pharmacological approaches (by using pan-Akt small molecule inhibitors).

      We have performed an in vitro kinase assay using recombinant DYRK1A and TSC1/TSC2 affinity-purified from HEK293 cells. However, we have not been able to perform this experiment by overexpressing DYRK1A in human cells, because 1) strong overexpression of DYRK1A leads to cell cycle exit, as demonstrated by various laboratories (Soppa et al. PMID: 24806449, Hämmerle et al. PMID: 21610031, Najas et al. PMID: 26137553, Park et al. PMID: 20696760) and by our own observations, and 2) the T1462 antibody signal is weak and cannot be detected in cellular extracts. We have attempted this experiment with at least three different batches of T1462 antibody from CST without success.

      (2) To demonstrate that endogenous phospho-mutant/mimetic substitution of T1462 phospho-site at TSC2 is sufficient to prevent the regulation of cell size/NMJ phenotype in Drosophila by DYRK1A (mnb).

      This is an interesting experiment, and we thank the reviewer for this suggestion. However, we are skeptical about interpreting the possible results. Since T1462 substitution will also block the regulation by other kinases, e.g., Akt, and it may constitutively suppress the mTORC1, any interpretation will be confusing.

      Reviewer #2 (Recommendations For The Authors):

      (1) In section 2.1 the authors claim that DYRK1A down-regulation enhances cell growth. An additional assessment of cell growth or size would strengthen this statement. Is total protein content also increased upon DYRK1A overexpression? Does DYRK1A KD also increase cell proliferation? In Figure 1, providing the median or mean size of cells in each condition will help the reader understand the impact of DYRK1A on cell size. In Supplementary Figure 1, the important statistical differences should be highlighted.

      We have not claimed that down-regulation of DYRK1A enhances cell growth. We have not tested the protein content in a cell directly. Knockdown of DYRK1A leads to a reduction in cell proliferation, as shown by various groups, including ours (Shanshan Li PMID: 30137413, Luna et al. PMID: 30343272). Cell size is a very dynamic process and is variable within the population. All the studies measuring cell size show the size using assays on a population of cells. We have not been able to figure out a way to display the median or mean cell size that accurately reflects the cell size of the whole population. 

      (2) In section 2.2 the authors describe the interaction between DYRK1A and the TSC proteins. Do the DYRK1A mutants impact interaction with TSC2 and TBC1D7 or is this specific to TSC1?

      We have not tested this possibility.

      (3) In section 2.3, more detailed perturbations of the mTORC1 pathway are needed. Is the mTORC1 activation observed sensitive to rapamycin treatment? Since mTORC1 regulates cell size via S6 ribosomal protein and transcription via 4EBP1, phosphorylation of 4EBP1 should also be considered. In Figure 3A, what is the level of DYRK1A down-regulation? It is unclear how many shRNA constructs were used or whether these were pooled constructs or single clones. If one shRNA/sgRNA is used, it would be very helpful to validate some of the key findings of this study with at least one more clone.

      Many research studies have measured the activity of various mTORC1 substrates, the most commonly used being the phosphorylation of S6 and S6K. We agree that analyzing 4EBP1 would make the study more comprehensive, but to complete the study with our limited resources and in a limited time, we have not attempted to establish the 4EBP1 phosphorylation status. We have used a previously described and validated DYRK1A shRNA (as mentioned in the methods section).

      (4) In section 2.3 is T1462 an activating or inhibiting phosphorylation event? If DYRK1A phosphorylates and activates mTORC1 via RHEB, shouldn't that result in the inhibition of mTORC1?

      Multiple laboratories have demonstrated that T1462 phosphorylation leads to a reduced TSC complex activity and, hence, increased mTORC1 activity (Manning et al. PMID: 12150915, Inoki, PMID: 12172553, Zhang PMID: 19593385).

      (5) In section 2.4, what is the status of AKT phosphorylation? Would an AKT inhibitor be useful in this scenario?

      AKT phosphorylates T1462, S939 and S1360, as demonstrated by others. However, in our in vitro kinase assay, the following facts suggest that AKT is not involved in the T1462 phosphorylation we observed:

      (1) Signal intensities of anti-TSC2 S939 and S1387 do not differ significantly with or without ATP, suggesting that AKT is not pulled down with TSC1 or TSC2.

      (2) Multiple studies have performed phosphorylation studies of TSC1 and TSC2 and have not reported any co-purification of AKT.

      (6) Very minor grammar errors were observed, mostly at the beginning of the manuscript.

      We tried our best to fix grammatical errors.

    1. Author response:

      Reviewer #1 (Public Review):

      In this manuscript, Yang et al. conduct a comprehensive investigation to demonstrate the role of adipose tissue Mir802 in obesity-associated inflammation and metabolic dysfunction. Using multiple models and techniques, they propose a mechanism where elevated levels of Mir802 in adipose tissue (both in mouse models and humans) trigger fat accumulation and inflammation, leading to increased adiposity and insulin resistance. They suggest that increased Mir802 levels in adipocytes during obesity result in the downregulation of TRAF3, a negative regulator of canonical and non-canonical NF-κB pathways. This downregulation induces inflammation through the production of cytokines/chemokines that attract and polarize macrophages. Concurrently, the NF-κB pathway induces the lipogenic transcriptional factor SREBP1, which promotes fat accumulation and further recruits pro-inflammatory macrophages. While the proposed model is supported by multiple experiments and consistent data, there are areas where the manuscript could be improved. Some improvements can be addressed in the text, while others require additional controls, experiments, or analyses.

      1) The manuscript should provide measurements of lipid droplet/adipocyte size for all models, both in vitro and in vivo. In vivo studies should also include fat weight measurements. This is crucial to determine whether Mir802, TRAF3, and SREBP1 promote adiposity/fat accumulation across all models.

      Thank you for your careful review. As suggested, we have measured lipid droplet and adipocyte size (Figures 1J, 2A, S2I, 3F, 3L, S3L, 5I), which should make the manuscript clearer to readers. The in vivo studies include fat weight measurements (Figure 2K, L; Figure 3C, D; Figure 5N). Our results indicate that adipose-selective overexpression of Mir802 induced adipogenesis under high-fat diet feeding.

      2) The rationale for co-culture experiments using WAT SVF is unclear, given that Mir802 is upregulated by obesity in adipocytes, not in the stromal-vascular fraction. These experiments would be more relevant if performed using isolated adipocytes or differentiated WAT SVF.

      Thank you for this important point. We apologize for our imprecise wording. In our study, we co-cultured differentiated WAT SVF with primary macrophages, as described in the Migration and invasion assays section of the methods. We have revised the flowchart of the co-culture experiments accordingly (Figure 4A). We hope that this modification will enhance readers' comprehension of our manuscript.

      3) Figures 1G and 1H lack a control group (time 0 or NCD). Without this control, it is impossible to determine if inflammation precedes Mir802 upregulation.

      Thank you for this insightful comment. We had previously measured the 0-week high-fat diet group for Figures 1I and 1J and have now added these data to the manuscript. We hope this addition strengthens our conclusion that inflammation precedes Mir802 upregulation.

      4) The statement, "The knockout of Mir802 in adipose tissue did not alter food intake, body weight, glucose level, and adiposity (data not shown)," needs more detail regarding the age and sex of the animals. These data are important and should be reported, perhaps in a supplementary figure.

      Thank you for your careful review. To strengthen our conclusions, we have added the food intake, body weight, glucose level, and adiposity data for Mir802 KO mice fed a normal chow diet (NCD, Supplementary Figure 3E-I).

      ….The knockout of Mir802 in adipose tissue did not alter food intake, body weight, glucose levels, and adiposity compared with their WT littermates in both males and females when they were fed with NCD (Figure S3E-I)……

      5) The terms "KO" (knockout) and "KI" (knock-in) are misleading for AAV models, as they do not modify the genome. "KD" (knockdown) and "OE" (overexpression) are more accurate.

      Thank you for your good advice. We apologize for our imprecise wording and have rewritten these terms as suggested. The AAV models for Mir802 knockdown (Figure 3) and Traf3 overexpression (Figure 5) are now referred to as KD and OE, respectively.

      6) The statement, "Mir802 expression was unaffected in other organs (Figure S3O)," should clarify that this is except for BAT.

      We thank you for this insightful comment. We have clarified that Mir802 expression was unaffected in other organs except for BAT (Figure S3T, revised manuscript).

      By addressing these points, the manuscript would present a more robust and clear demonstration of the role of Mir802 in obesity-associated inflammation and metabolic dysfunction.

      Thanks for your positive comments. As suggested, we have addressed all points.

      Reviewer #2 (Public Review):

      Yang et al. investigated the role of Mir802 in the development of adipose tissue (AT) inflammation during obesity. The authors found Mir802 levels are up-regulated in the AT of mouse models of obesity and insulin resistance as well as in the AT of humans. They further demonstrated that Mir802 regulates the intracellular levels of TRAF3 and downstream activation of the NF-kB pathway. Ultimately, controlling AT inflammation by manipulating Mir802 affected whole-body glucose homeostasis, highlighting the role of AT inflammatory status in whole-body metabolism. The study provides solid evidence on the role of adipocyte Mir802 in controlling inflammation and macrophage recruitment. However, how lipid mobilization from adipocytes and how engulfment of lipid droplets by macrophages control inflammatory phenotype in these cells could be better explored. The findings of this study will have a great impact in the field, contributing to the growing body of evidence on how microRNAs control the inflammatory microenvironment of AT and whole-body metabolism in obesity.

      Thanks for your positive comments.

      Reviewer #3 (Public Review):

      Mir802 appears to accumulate before macrophage numbers increase in adipose tissue in both mice and humans. The phenotype of Mir802 overexpression and deletion in vivo is striking and novel. Deletion of Mir802 in adipose tissue after obesity onset also attenuated adipose inflammation and improved systemic glucose homeostasis. Understanding how Mir802 affects the crosstalk between macrophages and adipocytes is a major point. For example, does Mir802 change the inflammatory state of macrophages as it increases Traf3 expression in adipocytes? This is important because macrophages are the input of inflammatory mediators that will activate the TNFR receptor signaling pathway, potentially Traf3, resulting in impaired insulin-stimulated Glut4 translocation and glucose uptake. Also, modulation of Mir802 levels in vivo leads to alterations in adiposity. Here, what is a direct effect of Mir802 and what is a result of simply reduced adiposity? One point that is key is what triggers Mir802 expression, especially in obesity.

      Thanks for your important suggestions. Following them, we have added further data to the revised manuscript to strengthen our conclusions.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment:

      This important study details an enrichment of the IL-6 signaling pathway in human tendinopathy and applies transcriptional profiling to an advanced in vitro model to test IL-6 specific phenotypes in tendinopathy. Overall, the strength of evidence is solid yet incomplete, as transcriptomic measurements provide clarity, though functional studies including analysis of proliferation are needed to confirm these findings. This work will be of interest to stem cell biologists and immunologists.

      To functionally assess the effect of IL-6 on Scx+ fibroblast proliferation in an acute injury, we repeated the in vivo studies with an EdU staining and a newly established IL-6 KO x ScxGFP+ mouse line. We found no evidence for this effect in acute injuries and acknowledge this in the revised manuscript.

      We further added data collected by combining fluorescence microscopy with human patient-derived tissue to strengthen the link between IL-6, IL-6R, and proliferation of CD90+ cells in chronic injuries.

      See comment 1.1.

      See comment 2.4.

      Changes:

      - Title

      - Abstract

      - Figure 2 and 3 (new data)

      - Figure 7 (new data)

      - Results

      - Discussion

      Reviewer 1

      (1.1) First, the experimental approach does not directly assess proliferation, as such the conclusions regarding proliferation are not well supported. In the ex-vivo model, the use of cell counting approaches is somewhat acceptable since the system is constrained by the absence of potential influx of new cells. However, given the nearly unlimited supply of extrinsically derived cells in vivo (vs. the explant model), assessment of actual proliferation (e.g. Edu, BrdU, Ki67) is critical to support this conclusion.

      To assess the effect of IL-6 on Scx+ fibroblast proliferation in an acute injury, we repeated the in vivo studies with an EdU staining and a newly established IL-6 KO x ScxGFP+ mouse line to combat the considerable background noise of currently available Scx antibodies.

      Under the improved design of these experiments, we could detect no effect of IL-6 on ScxGFP+ cells in an acute injury in vivo. We have therefore replaced figure 5 with the new results in figure 7 and moved figure 5F to the supplementary materials (Supplementary figure 9).

      We acknowledge and discuss this in the discussion section.

      See comment 2.4.

      See comment 2.11.

      Changes:

      - Title

      - Abstract

      - Figure 7 (new data)

      - Supplementary Figure 9

      - Results

      - Discussion

      (1.2) Second, the justification for the use of Scx-GFP+ cells as a progenitor population is not well supported. Indeed, in the discussion, Scx+ cells are treated as though they are uniformly a progenitor population, when the diversity of this population has been established by the cited studies, which do not suggest that these are progenitor populations. Additional definition/ delineation of these cells to identify the subset of these cells that may actually display other putative progenitor markers would support the conclusions. As it stands, the study currently provides important information on the impact of IL6 on Scx+ cells, but not tendon progenitors.

      We further delineated the extrinsic cell populations isolated from mouse Achilles tendons of ScxGFP+ mice using flow cytometric analysis and RT-qPCR. We used tendon population markers suggested by sc-RNA-seq of mouse Achilles tendons.

      (De Micheli et al., Am. J. Physiol. - Cell Physiol., 2020, 319(5), DOI: 10.1152/ajpcell.00372.2020)

      While a small subpopulation of these cells expressed typical progenitor markers (i.e. CD45 and CD146), we could detect no overlap with Scx+ cells. As suggested by the reviewer, we therefore replaced occurrences of “progenitor” in the manuscript with “fibroblast” and performed additional experiments with human patient-derived tissue sections and the fibroblast marker CD90.

      See comment 2.1.

      Changes:

      - Title

      - Abstract

      - Figure 2 (new data)

      - Figure 3 (new data)

      - Supplementary Figure 6 (new data)

      - Results

      - Discussion

      (1.3) Clarity regarding the relevance of the 'sheath-like' component of the assembloid would provide helpful context regarding which types of tendons are likely to have this type of communication vs. those that do not, and if there are differences in tendinopathy prevalence. Understanding why/how this communication between structures is relevant is important.

      Our assembloid concept is inspired by the structure of unsheathed tendons (i.e. biceps, semitendinosus, gracilis) and not sheathed tendons like the flexor tendons.

      We agree that clarity regarding the tendon type having this type of communication is important, so we sharpened previously blurry text passages in the revised manuscript.

      Text changes:

      - Introduction, page 3

      - Results, page 4

      - Results, page 8

      - Results, page 9

      - Results, page 11

      - Discussion, page 25

      - Discussion, page 26

      - Experimental section, page 28

      - Figure 1

      - Figure 2

      - Figure 3

      - Supplementary Table 1

      - Supplementary Figure 3

      - Supplementary Figure 4

      (1.4) Minor: in the text for Figure 6 (2nd paragraph), the comma in 19,694 is superscripted.

      Corrections were made throughout the manuscript.

      Text changes:

      - Results, page 4

      - Results, page 12

      - Results, page 19

      - Results, page 21

      (1.5) Minor: The inclusion of the Scx-GFP mouse should be included in the schematic Figure 5.

      The results presented in the previous draft did not feature tissues from ScxGFP mice but used a Scx antibody to visually detect Scx+ cells. In anticipation of the revision process, we bred a new IL-6 KO x ScxGFP+ mouse line and repeated the experiment. As suggested by the reviewer, the schematic in the new figure 7 now includes this mouse, as does the former figure 5, which has been moved to the supplementary material.

      Figure changes:

      - Supplementary Figure 9 (former figure 5)

      - Figure 7

      Reviewer 2

      (2.1) One question that comes to mind is whether the fibroblast progenitors in the extrinsic sheath of Achilles tendon is similar to those surrounding the tail tendon. The similarity of progenitors between different tendons is assumed with this model. I would consider this to be a minor issue.

      Tail tendon fascicles are thought to have a low number of reparative fibroblasts / progenitor cells because they lack a developed extrinsic compartment. Achilles tendons are supposed to have a higher number of reparative fibroblasts / progenitor cells, as their fascicles are surrounded by an extrinsic compartment.

      To verify this here, we added a better characterization and comparison of the cell populations isolated from the tail tendon fascicles and the Achilles tendons.

      First, we added representative light microscopy images of these cells at different timepoints after being cultured on tissue-culture plastic.

      Second, we performed flow cytometric analysis not only on the freshly digested tail tendon fascicles and Achilles tendons, but also on the cultured cells at the timepoint when they would have been embedded into the assembloids.

      Third, we compared the expression of population-specific markers in cells derived from tail tendon fascicle and Achilles tendons.

      As expected, tail tendon fascicle-derived cell populations appeared to be more elongated than Achilles tendon-derived populations shortly after isolation. Similarly, the “maintenance” fibroblasts in healthy tendons are more elongated than the reparative fibroblasts in diseased ones. After culture and priming in tendinopathic niche conditions, both populations assumed a more roundish, reparative phenotype.

      This was consistent with the flow cytometric analysis, which revealed a large difference between freshly isolated populations, that disappeared after extended culture and priming in tendinopathic niche conditions. Gene expression in tail tendon fascicle-derived and Achilles tendon-derived cells was similar after extended culture and priming in tendinopathic niche conditions.

      See comment 1.2.

      See comment 2.10.

      Changes:

      - Supplementary Figure 6 (new data)

      - Results, page 11

      (2.2) The authors use core tendons from IL-6 knockout mice and progenitors from wild-type mice. The reasoning behind this approach was a little confusing... is IL-6 expressed solely in the tendon core compared to the extrinsic sheath?

      Insights gained from human patient-derived tissues (Figure 2) suggest that in a healthy tendon most of the IL-6 is located in the extrinsic compartment, whereas in tendinopathic tendons it is distributed across compartments.

      Our assembloid design mimics this by embedding wildtype fibroblasts into the extrinsic compartment. Our hypothesis was that a wildtype core in tendinopathic niche conditions attracts reparative fibroblasts through IL-6, while an IL-6 knock-out core does not. Therefore, it was important to establish IL-6 gradients close to what they seem to be in vivo.

      Nevertheless, we have to acknowledge that the amount of IL-6 secreted by extrinsic fibroblasts in isolation is quite small compared to what is secreted by a wildtype core (Supplementary Figure 7). Attributing IL-6 in the supernatant of a WT core // WT fibroblast assembloid to the correct cell population is challenging but could be part of future research.  

      Changes:

      - Figure 2 (new data)

      - Supplementary Figure 7 (new data)

      - Results, page 12

      (2.3) Is a co-culture system for 7 days appropriate to model tendinopathy without the supplementation of exogenous inflammatory compounds? The transcriptomic differences in Figure 3 seem to be subtle, and may perhaps suggest that it could be a model that more closely resembles steady state compared to tendinopathy. If so, is IL-6 still relevant during steady state?

      The collective experience in our lab is that core explants exposed to tendinopathic niche conditions (i.e. serum, 37°C, high oxygen, and high glucose levels) assume a disease-like phenotype (see Wunderli et al., Matrix Biology, 2020, Volume 89, https://doi.org/10.1016/j.matbio.2019.12.003 and Blache et al., Sci. Rep., 2021, 11(1), DOI 10.1038/s41598-021-85331-1).

      Specifically for our core // fibroblast co-culture system, we have reported the emergence of exaggerated tendinopathic hallmarks in a previous publication (Stauber et al., Adv. Healthc. Mater., 2021, 10(20), https://doi.org/10.1002/adhm.202100741).

      We clarified the use of previously validated tendinopathic niche conditions in this manuscript.

      Changes:

      - Introduction, page 3

      - Results, page 12

      (2.4) The results presented in Figures 4 and 5 are impressive, demonstrating a link between IL-6 and fibroblast progenitor numbers and migration. Their experimental design in these figures show strong evidence, using Tocilizumab and recombinant IL-6 to rescue shown phenotypes. I would reduce the claims on proliferation, however, unless a proliferation-specific marker (e.g., Ki67, BrdU, EdU) is included in confocal analyses of Scx+ progenitors.

      As reviewer 1 pointed out as well, it is important to use a proliferation-specific marker “given the nearly unlimited supply of extrinsically derived cells in vivo (vs. the explant model)”.

      To assess the effect of IL-6 on Scx+ fibroblast proliferation in vivo, we repeated those experiments with a proliferation-specific EdU staining and a newly established IL-6 KO x ScxGFP+ mouse line.

      Under this improved design, we could not detect an effect of IL-6 on proliferation in an acute injury in vivo.

      We have therefore replaced figure 5 with the new results in figure 7 and moved figure 5F to the supplementary materials (Supplementary figure 9).

      We acknowledge and discuss this in the discussion section and softened our statements in the title and the abstract.

      See comment 1.1.

      See comment 2.11.

      Changes:

      - Title

      - Abstract

      - Figure 7 (new data)

      - Supplementary Figure 9

      - Results

      - Discussion

      (2.5) I think it would significantly strengthen the study if they could measure tendon healing in IL-6 knockouts or in wild-type mice treated with IL-6 inhibitors, since conventional ablation of IL-6 may lead to the elevation of compensatory IL-6 superfamily ligands that could activate STAT signaling. The authors claim that reducing IL-6 signaling decreases transcriptomic signatures of tendinopathy, but IL-6 may be necessary to promote normal healing of the tendon following injury. It is supposed that a lack of Scx+ progenitor migration would delay tendon healing.

      Indeed, another study using the same IL-6 knock-out strain showed that a lack of IL-6 signaling resulted in slightly inferior mechanical properties in healing patellar tendons (Lin et al., J. Biomech., 39(1), 2006, https://doi.org/10.1016/j.jbiomech.2004.11.009).

      Also, it might be due to the elevation of compensatory IL-6 superfamily ligands that we found no effect of IL-6 on the proliferation of Scx+ cells in an acute injury in vivo.

      Therefore, assessing the effects of IL-6 inhibitors on tendon healing following an acute injury would have been of great interest to us. Unfortunately, getting the necessary permission from the animal experimentation office for a new invasive treatment protocol was outside of our scope due to the severity degree and time limitations.

      We incorporated and acknowledged these important points in the discussion.

      Text changes:

      - Introduction, page 3

      - Discussion, page 26

      (2.6) Do IL-6 knockout mice and/or mice treated with IL-6 inhibitors have delayed healing following Achilles tendon resection? Please provide experimental evidence.

      See comment 2.5.

      (2.7) I would suggest reducing claims on proliferation, or include a proliferation specific marker (e.g., Ki67, BrdU, EdU) in confocal analyses of Scx+ progenitors.

      See comment 1.1.

      See comment 2.4.

      (2.8) Supplementary Figures 1 and 2: the authors removed outliers. Please specify exactly which outliers were removed in the figures, and provide additional information on the criteria used to identify these outliers.

      To address this comment, we sharpened our criteria for identifying outliers and re-did the analysis depicted in figure 1.

      Briefly, we excluded 5 normal and 5 tendinopathic samples from sheathed tendons which have a different compartmental structure than unsheathed tendons.

      A completely separate analysis of the sheathed tendons would have been beyond the scope of this manuscript, but early screening suggested that IL-6 transcripts are not increased in sheathed tendinopathic tendons.

      We made text changes throughout the manuscript and to the supplementary table 1 and supplementary figure 2 to clearly state our criteria for excluding samples / outliers.

      Changes:

      - Introduction, page 3

      - Results, page 4

      - Results, page 8

      - Results, page 9

      - Results, page 11

      - Discussion, page 25

      - Discussion, page 26

      - Experimental section, page 28

      - Figure 1

      - Figure 2

      - Figure 3

      - Supplementary table 1

      - Supplementary figure 2

      - Supplementary figure 3

      - Supplementary figure 4

      (2.9) Whenever "positive enrichment" is mentioned in the text, please specify in what group. It is presumed that the enrichment, for example, in the first figure is associated with tendinopathy samples compared to controls, though it is a bit unclear.

      The direction of the enrichment was added to the text.

      Text changes:

      - Abstract, page 1

      - Introduction, page 3

      - Results, page 4

      - Results, page 6

      - Results, page 12

      - Results, page 14

      - Results, page 19

      - Results, page 21

      - Discussion, page 25

      - Discussion, page 26

      - Discussion, page 27

      - Figure 1

      - Figure 5

      - Figure 8

      - Figure 9

      - Supplementary figure 3

      - Supplementary figure 4

      - Supplementary figure 6

      - Supplementary figure 8

      - Supplementary figure 11

      - Supplementary figure 12

      - Supplementary figure 14

      (2.10) Are tail tendon progenitors similar to Achilles tendon progenitors? Please provide a statement that shows similarity (in function, transcriptome, etc.) to support the in vitro tendon model.

      See comment 1.2.

      See comment 2.1.

      (2.11) Are the results in Figure 5F significant? It seems that your pictures show a dramatic change in migration, but the quantification does not?

      We repeated the in vivo studies with a newly established IL-6 KO x ScxGFP+ mouse line to combat the considerable background noise of currently available Scx antibodies.

      Under the improved design of these experiments, we could not detect an effect of IL-6 on ScxGFP+ cell migration in an acute injury in vivo.

      We have therefore replaced figure 5 with the new results in figure 7 and moved figure 5F to the supplementary materials (Supplementary figure 9).

      We acknowledge and discuss this in the discussion section.

      See comment 1.1.

      See comment 2.4.

      Changes:

      - Title

      - Abstract

      - Figure 7 (new data)

      - Supplementary Figure 9

      - Results

      - Discussion

      (2.12) Please provide additional discussion points on cis- versus trans-IL6 signaling in your results found in mouse. Do you think researchers/clinicians would want to target trans-IL6 signaling based on your results? Please support these statements with the expression of IL6R on cells found in the tendon core and external sheath progenitors.

      To address this comment, we performed flow cytometric analysis on Achilles tendon-derived fibroblasts expanded in 2D and digested sub-compartments of the assembloids (Supplementary Figure 7).

      These data suggest that IL6R is expressed by neither core nor extrinsic fibroblasts, but mainly comes from core-resident CD45+ tenophages.

      Human samples co-stained for IL6R and CD68 (an established human macrophage marker) confirmed macrophages as a source of IL-6R in vivo. However, human samples co-stained for IL6R and CD90 (an established marker of reparative fibroblasts in humans) also detected IL6R on CD90+ cells, which have not yet been reported to express IL6R themselves.

      Overall, it is likely that trans-IL-6 signaling is more important for the activation of reparative fibroblasts than cis-IL-6 signaling. We added these statements to the manuscript.

      Changes:

      - Results, page 9

      - Results, page 12

      - Discussion, page 25

      - Discussion, page 26

      - Figure 3 (new data)

      - Supplementary figure 7 (new data)

      (2.13) Please provide more detail on collagen isolation from rat tail in the methods section.

      We have provided more details on collagen isolation from rat tail in the experimental section (page 29).

      Changes:

      - Experimental section, page 29

      (2.14) Please comment on whether your in vitro system resembles tendinopathy or a steady state tendon. If it models more of a steady state system, would IL-6 still be relevant?

      See comment 2.3.

      Detailed feedback:

      Reviewer 1:

      This work by Stauber et al. is focused on understanding the signaling mechanisms that are associated with tendinopathy development, and by screening a panel of human tendinopathy samples, identified IL-6/JAK/STAT as a potential mediator of this pathology. Using an innovative explant model they delineated the requirement for IL-6 in the main body of the tendon to alter the dynamics of cells in the peritendinous synovial sheath space.

      The use of a publicly available existing dataset is considered a strength since this dataset includes expression data from several different human tendons experiencing tendinopathy. This facilitates the identification of potentially conserved regulators of the tendinopathy phenotype.

      The clear transcriptional shifts between WT and IL6-/- cores demonstrates the utility of the assembloid model, and supports the importance of IL6 in potentiating the cell response to this stimuli.

      Reviewer 2:

      The authors of this study describe a goal of elucidating the signaling pathways that are upregulated in tendinopathy in order to target these pathways for effective treatments. Their goal is honorable, as tendinopathy is a common debilitating condition with limited treatments. The authors find that IL-6 signaling is upregulated in human tendinopathy samples with transcriptomic and GSEA analyses. The evidence for their initial findings is strong, providing a clinically-relevant phenotype that can be further studied using animal models.

      Along these lines, the authors continue with an advanced in vitro system using the mouse tail tendon as the core with progenitors isolated from the Achilles tendon as the external sheath embedded in a hydrogel matrix. One question that comes to mind is whether the fibroblast progenitors in the extrinsic sheath of the Achilles tendon are similar to those surrounding the tail tendon. The similarity of progenitors between different tendons is assumed with this model. I would consider this to be a minor issue, and would consider the in vitro system to be an additional strength of this study.

      In order to address the IL-6 signaling pathway, the authors use core tendons from IL-6 knockout mice and progenitors from wild-type mice. The reasoning behind this approach was a little confusing... is IL-6 expressed solely in the tendon core compared to the extrinsic sheath? Furthermore, is a co-culture system for 7 days appropriate to model tendinopathy without the supplementation of exogenous inflammatory compounds? The transcriptomic differences in Figure 3 seem to be subtle, and may perhaps suggest that it could be a model that more closely resembles steady state compared to tendinopathy. If so, is IL-6 still relevant during steady state?

      Nevertheless, the results presented in Figures 4 and 5 are impressive, demonstrating a link between IL-6 and fibroblast progenitor numbers and migration. Their experimental design in these figures shows strong evidence, using Tocilizumab and recombinant IL-6 to rescue the shown phenotypes. I would reduce the claims on proliferation, however, unless a proliferation-specific marker (e.g., Ki67, BrdU, EdU) is included in confocal analyses of Scx+ progenitors. The Achilles tendon injury model provides a nice in vivo confirmation of Scx-progenitor migration to the neotendon.

      Given their goal to elucidate signaling pathways that could be targeted in the clinic, I think it would significantly strengthen the study if they could measure tendon healing in IL-6 knockouts or in wild-type mice treated with IL-6 inhibitors, since conventional ablation of IL-6 may lead to the elevation of compensatory IL-6 superfamily ligands that could activate STAT signaling. The authors claim that reducing IL-6 signaling decreases transcriptomic signatures of tendinopathy, but IL-6 may be necessary to promote normal healing of the tendon following injury. It is supposed that a lack of Scx+ progenitor migration would delay tendon healing.

      Overall, the authors of this study elucidated IL-6 signaling in tendinopathy and provided a strong level of evidence to support their conclusions at the transcriptomic level. However, functional studies are needed to confirm these phenotypes and fully support their aims and conclusions. With these additional studies, this work has the potential to significantly influence treatments for those suffering from tendinopathy.

    1. Author response:

      (1) First, we wish to point out that there has not been a model for quantifying genetic drift in multi-copy gene systems.  Hence, the first attempt using the Haldane model is not expected to be familiar and readily acceptable. Nevertheless, the standard WF (Wright-Fisher) model cannot handle drift in multi-copy gene systems, such as viruses, due to the two levels of genetic drift – within individuals as well as between individuals of the population.

      [Point 1 responds to the comments that we did not engage with the literature, in particular, publications like the Canning model, which are extensions of the WF model. As pointed out above, models based on the WF sampling cannot handle the two levels of genetic drift.]

      (2) A crucial aspect of the study is the nature of the rRNA gene cluster, which is also a multi-copy gene system. It is easy to see that some multi-copy gene systems, like viral particles or mtDNAs, have a sub-population of genes within each individual. It is less obvious that tandem arrays of gene copies like rRNA genes can be treated as sub-populations that are subjected to drift. Nevertheless, rRNA gene copies frequently transfer mutations among copies in the same cell via the homogenization process. Hence, rRNA genes do not have the property of "locus" of single-copy genes as they move about as well (a bit like transposons but via different mechanisms). Indeed, the collection of rRNA genes in a cell is referred to as the “community of genes” as cited in Fig. 1. Over hundreds of generations, rRNA genes are effectively a small gene pool like mtDNAs within cells. Furthermore, the copy number of rRNA genes also changes rapidly among individuals. For these reasons, genetic drift is operative within cells and this study aims to determine its strength (see Response 3 below).

      [Point 2 of the response addresses questions of Review #1 such as "(whether) the authors are referring to diversity in a single copy of an rRNA gene (or) diversity across the entire array of rRNA genes" or "(whether) the discussion of heterozygosity at rRNA ... is diversity per single copy locus or after collapsing loci together". The answer should be "the genetic diversity of the population of rRNA genes in the cell", noting that the single gene locus does not apply here. Similarly, a question like "Alignment to a single reference genome would likely lead to incorrect and even failed alignment for some reads" from Review #2 appears to be based on the homology concept of an rRNA gene locus. All rRNA gene copies are aligned against the consensus of the population of genes of the species. The consensus nucleotide nearly always accounts for > 90% of the gene copies in the population.]

      (3) We now clarify the meaning of C*, the effective copy number of rRNA genes. We apologize that the abstract is indeed unclear, and even misleading. In the abstract, we did not use different notations for the actual copy number (C) and the effective copy number (C*) of rRNA genes. Instead, we used the letter C to designate both. Furthermore, in the main text, the presentation of the effective number, C*, is overly complicated (in order to be realistic). We apologize. Slight modifications of the abstract should remove all the misunderstandings, as shown below.

      "On average, rDNAs have C ~ 150 - 300 copies per haploid in humans. While a neutral mutation of a single-copy gene would take 4N (N being the population size) generations to become fixed, the time should be 4NC* generations for rRNA genes where 1 << C (C* being the effective copy number; C* > C or C* < C will depend on the strength of drift). However, the observed fixation time in mouse and human is < 4N, implying the paradox of C* < 1. Genetic drift that encompasses all random neutral evolutionary forces appears as much as 100 times stronger for rRNA genes as for single-copy genes, thus reducing C* to < 1."

      [Point 3 responds to the key criticisms. From Review #1: "The authors frame the number of rRNA genes as roughly equivalent to expanding the population size, ... a mutation can spread among rRNA gene copies is fundamentally different …". Indeed, the abstract can be very misleading when it uses CN interchangeably with C*N, essentially by allowing C to mean both.

      From Review #2: "In Eq (1), although C is defined as the "effective copy number", it is unclear what it means in an empirical sense…". From the slightly revised text quoted above, it should be clear that the fixation time as well as the level of polymorphism represent the empirical measures of C*.]
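      As a rough illustration of the relationship quoted above (a sketch, not the authors' calculation): if a neutral mutation in a multi-copy system takes 4NC* generations to fix, then an observed fixation time T implies C* = T / (4N). The function name and the numeric inputs below are hypothetical and chosen only to show the arithmetic.

```python
def effective_copy_number(fixation_time_generations: float, population_size: float) -> float:
    """Return the C* implied by T = 4 * N * C* (neutral fixation time).

    Both arguments are illustrative inputs, not values from the study.
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    return fixation_time_generations / (4.0 * population_size)


# Hypothetical population size, used only for the example.
N = 10_000

# A single-copy neutral gene is expected to fix in about 4N generations,
# which corresponds to C* = 1.
c_star_single_copy = effective_copy_number(4 * N, N)

# An observed fixation time shorter than 4N implies C* < 1,
# which is the paradox described in the quoted abstract.
c_star_fast = effective_copy_number(0.5 * 4 * N, N)
```

      Under these assumptions, any observed fixation time below 4N generations maps to C* < 1, regardless of the actual copy number C.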

      (4) Lastly, we shall address the misunderstood "reproductive success" of rRNA genes, which is the number of progeny, K, in the Haldane model. K should be more accurately referred to as the transmission speed. For single-copy genes, reproductive success and transmission both mean the same thing, K. But the term reproductive success is not appropriate for rRNA genes even though the formulae for K are the same for all gene systems.

      [Point 4 responds to all criticisms using the term "reproductive success"]

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, Lee et al. compared encoding of odor identity and value by calcium signaling from neurons in the ventral pallidum (VP) in comparison to D1 and D2 neurons in the olfactory tubercle (OT).

      Strengths:

      They utilize a strong comparative approach, which allows the comparison of signals in two directly connected regions. First, they demonstrate that both D1 and D2 OT neurons project strongly to the VP, but not the VTA or other examined regions, in contrast to accumbal D1 neurons which project strongly to the VTA as well as the VP. They examine single unit calcium activity in a robust olfactory cue conditioning paradigm that allows them to differentiate encoding of olfactory identity versus value, by incorporating two different sucrose, neutral and air puff cues with different chemical characteristics. They then use multiple analytical approaches to demonstrate strong, low-dimensional encoding of cue value in the VP, and more robust, high-dimensional encoding of odor identity by both D1 and D2 OT neurons, though D1 OT neurons are still somewhat modulated by reward contingency/value. Finally, they utilize a modified conditioning paradigm that dissociates reward probability and lick vigor to demonstrate that VP encoding of cue value is not dependent on encoding of lick vigor during sucrose cues, and that separable populations of VP neurons encode cue value/sucrose probability and lick vigor. Direct comparisons of single unit responses between the two regions now utilize linear mixed effects models with random effects for subject,

      Weaknesses:

      The manuscript still includes mention of differences in effect size or differing "levels" of significance between VP and OT D1 neurons without reports of a direct comparison between the two populations. This is somewhat mitigated by the comprehensive statistical reporting in the supplemental information, but interpretation of some of these results is clouded by the inclusion of OT D2 neurons in these analyses, and the limited description or contextualization in the main text.

      We think the reviewer is mistaken and have clarified the text.  Each pairwise comparison between VP, OTD1 and OTD2, for each odor across days is shown as a heatmap in supplementary figure 8B, with further details in table 37. Absolute diff 3H no statistics

      Reviewer #2 (Public Review):

      We appreciate the authors' revision of this manuscript and toning down some of the statements regarding "contradictory" results. We still have some concerns about the major claims of this paper which lead us to suggest this paper undergo more revision as follows since, in its present form, we fear this paper is misleading for the field in two areas. Here is a brief outline:

      (1) Despite acknowledging that the injections only occurred in the anteromedial aspect of the tubercle, the authors still assert broad conclusions regarding where the tubercle projects and what the tubercle does. For instance, even the abstract states "both D1 and D2 neurons of the OT project primarily to the VP and minimally elsewhere" without mention that this is the "anteromedial OT". Every conclusion needs to specify this is stemming from evidence in just the anteromedial tubercle, as the authors do in some parts of the discussion.

      We have clarified in multiple locations that we recorded from the anteromedial OT, including the abstract, and further clarified this in the conclusions throughout the results and discussion. We refrain from stating “anteromedial OT” at every mention of the OT, but think we have now made it abundantly clear that our observations are from the anteromedial OT. It is worth noting that retrograde tracing from the VTA did not label any neuron in any part of the OT, suggesting that the conclusion may well extend beyond the anteromedial portion. However, we acknowledge that further work is needed to comprehensively characterize the OT outputs.

      (2) The authors now frame the 2P imaging data that D1 neuron activity reflects "increased contrast of identity or an intermediate and multiplexed encoding of valence and identity". I struggle to understand what the authors are actually concluding here. Later in discussion, the authors state that they saw that OT D1 and D2 neurons "encode odor valence" (line 510). 

      The point we aim to make is that valence encoding is different between the OT and VP. We do not think the reward-modulated activity in OT is valence encoding, at least not as it is in the VP. We do observe some valence encoding at the population level, which is different from individual valence-encoding neurons. The ability of classifiers to segregate population activity based on reward might be considered valence encoding, but we contrast it with that in VP where individual neurons signal reward prediction. This is more robust than that in the OT data where few neurons robustly encode valence. The increased response of the OTD1 neurons after reward association is more consistent with contrast enhancement than valence encoding. We believe this distinction is important and reflects a transformation between two reward-related brain areas. For clarification of the sentence in question we have changed it to: reflects “increased contrast of identity or an intermediate encoding of valence that also encodes identity” (line 488).

      We appreciate the authors' note that there is "poor standardization" when it comes to defining valence (line 521). We are ok with the authors speculating and think this revision is more forthcoming regarding the results and better caveats the conclusions. I suggest that in the abstract the authors adjust line 14/15 to conclude that, "While D1 OT neurons showed larger responses to rewarded odors, in line with prior work, we propose this might be interpreted as identity encoding with enhanced contrast." [eliminating "rather than valence encoding" since that is a speculation best reserved for discussion as the authors nicely do.]

      We accept this suggestion and have modified the abstract sentence to say, “Though D1 OT neurons showed larger responses to rewarded odors than other odors, consistent with prior findings, we interpret this as identity encoding with enhanced contrast.” We believe this is appropriately qualified as an interpretation, and should not be confusing.

      The above items stated, one issue comes to mind: why would the authors find that the anteromedial aspect of the tubercle does not strongly reflect valence? The anteromedial aspect of the tubercle, over all other aspects of the tubercle, is thought by many to partake more strongly in valence and other hedonic-driven behaviors given its dense reception of VTA DAergic fibers (as shown by Ikemoto, Kelsch, Zhang, and others). So this finding is paradoxical compared with what might be expected had the authors studied the anterolateral or posterior lateral tubercle, which receive less DA input.

      We agree that this seems surprising. This is why we focused on the anteromedial OT, expecting to find valence encoding. It remains possible that other parts of the OT, or more dorsal aspects of the anteromedial OT, encode valence, as has been reported by Murthy and colleagues. However, it remains unclear if their recordings are in the OT or VP. Nonetheless, our findings indicate that more work is required to understand the contribution of the OT to valence encoding. It is also important to note that our conclusions are drawn in comparison to the VP, which has more robust valence encoding than the OT. Thus, in comparison, the OT sample in our recordings lacks robust valence signaling. We think this comparison is important, given the lack of a clear framework for defining valence, which may have led to misleading statements in past OT work.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript describes a study of the olfactory tubercle in the context of reward representation in the brain. The authors do so by studying the responses of OT neurons to odors with various reward contingencies and compare systematically to the ventral pallidum. Through careful tracing, they present convincing anatomical evidence that the projection from the olfactory tubercle is restricted to the lateral portion of the ventral pallidum.

      Using a clever behavioral paradigm, the authors then investigate how D1 receptor- vs. D2 receptor-expressing neurons of the OT respond to odors as mice learn different contingencies. The authors find that, while the D1-expressing OT neurons are modulated marginally more by the rewarded odor than the D2-expressing OT neurons as mice learn the contingencies, this modulation is significantly less than is observed for the ventral pallidum. In addition, neither of the OT neuron classes shows a conspicuous amount of modulation by the reward itself. In contrast, the OT neurons contained information that could distinguish odor identities. These observations have led the authors to conclude that the primary feature represented in the OT may not be reward.

      Strengths:

      The highly localized projection pattern from olfactory tubercle to ventral pallidum is a valuable finding and suggests that studying this connection may give unique insights into the transformation of odor by reward association.

      Comparison of olfactory tubercle vs. ventral pallidum is a good strategy to further clarify the olfactory tubercle's position in value representation in the brain.

      Weaknesses:

      The study comes to a different conclusion about the olfactory tubercle regarding reward representations from several other prior works. Whether this stems from a difference in the experimental configurations such as behavioral paradigms used or indeed points to a conceptually different role for the olfactory tubercle remains to be seen.

      We acknowledge that our results lead us to conclusions that are different from those of prior work. But we note that our results are not directly at odds, as we see similar reward modulation of D1 OT neurons as has been reported previously. Our conclusion is different because we contrast our OT responses with those in the VP, where valence is more robustly encoded at the single neuron level. We also note that many of the past studies do not define valence as stringently as we do. Thus, increased activity with reward, as observed in our data and past studies, seems more like reward modulation than valence.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work explored intra- and interspecific niche partitioning along spatial, temporal, and dietary axes between apex carnivores and mesocarnivores in the Qilian Mountain National Park of China, using camera trapping data and DNA metabarcoding sequencing data. They conclude that spatial niche partitioning plays a key role in facilitating the coexistence of apex carnivore species, spatial and temporal niche partitioning facilitate the coexistence of mesocarnivore species, and spatial and dietary niche partitioning facilitate the coexistence between apex and mesocarnivore species. The information presented in this study is important for wildlife conservation and will contribute substantially to the current understanding of carnivore guilds and effective conservation management in fragile alpine ecosystems.

      Strengths:

      Extensive fieldwork is evident in the study. Aiming to cover a large percentage of the Qilian Mountain National Park, the study area was subdivided into squares, as a geographical reference to distribute the sampling points where the camera traps were placed and the excreta samples were collected.

      They were able to obtain many records in their camera traps and collected many samples of excreta. This diversity of data allowed them to conduct robust analyses. The data analyses carried out were adequate to obtain clear and meaningful results that enabled them to answer the research questions posed. The conclusions of this paper are mostly well supported by data.

      The study has demonstrated the coexistence of carnivore species in the landscapes of the Qilian Mountains National Park, complementing the findings of previous studies. The information presented in this study is important for wildlife conservation and will contribute substantially to the current understanding of carnivore guilds and effective conservation management in fragile alpine ecosystems.

      Weaknesses:

      It is necessary to better explain the methodology because it is not clear what is the total sampling effort. In methodology, they only claim to have used 280 camera traps, and in the results, they mention that there are 319 sampling sites. However, the total sampling effort (e.g. total time of active camera traps) carried out in the study and at each site is not specified.

      Thanks a lot for this detailed review! We apologize for not offering a clear description of the overall sampling effort. In this study, we deployed 280 camera traps, and these cameras were active for approximately 4 to 6 months. We visited each camera 2 to 3 times annually to download photos and check the batteries. When a camera failed to capture the targeted carnivores, we relocated it. In total, this yielded 322 camera trapping sites, of which 3 were excluded because the cameras were lost. As a result, we analyzed data from 319 camera sites and obtained 14,316 independent detections over 37,192 trap-days.

      We have added this information as follows in lines 132 to lines 143: “Taking into account the fact that mammalian communities are sensitive to seasonality, we used camera traps to monitor animals with an extensive survey effort from December 2016 to February 2022, covering the activity of animal species in different seasons, which can reflect the overall distribution of carnivores. We placed a total of 280 infrared cameras at the study site, set them to be active for 4 to 6 months, and considered possible relocation to another position based on animal detection in an effort to improve estimates of the occupancy and detection rates for both common and rare species (Figure 1) (Kays et al., 2020). The camera trap was set to record the time and date on a 24 hr clock when triggered, and to record a 15s video and 1 photo with an interval of 2 minutes between any two consecutive triggers. The sum of camera trap effective days was defined by the total amount of trapping effort during the sampling period, which was calculated from the time the camera was placed in operation to the time the last video or photograph was taken. We visited each camera 2 to 3 times a year to download photos and check batteries.” and lines 228 to lines 232: “A total of 322 camera trap sites were surveyed after relocating infrared cameras that did not capture any target carnivore species. A total of 3 cameras were considered to have failed due to loss. We analyzed data from 319 camera sites and obtained 14,316 independent detections during a total effort of 37,192 effective camera trap days. We recorded wolf in 26 sites, snow leopard in 109 sites, Eurasian lynx in 36 sites, red fox in 92 sites, and Tibetan fox in 34 sites.”
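      The effort definition quoted above (days from deployment to the last recorded photo or video, summed over cameras) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the two example deployments are hypothetical, and counting whole days between the two dates (rather than counting both endpoints) is an assumption. Only the totals of 14,316 detections and 37,192 trap-days come from the response.

```python
from datetime import date


def effective_days(deployed: date, last_record: date) -> int:
    """Days between deployment and the last photo/video taken by one camera."""
    if last_record < deployed:
        raise ValueError("last_record precedes deployment")
    return (last_record - deployed).days


def total_effort(cameras: list[tuple[date, date]]) -> int:
    """Sum of effective days over all (deployed, last_record) pairs."""
    return sum(effective_days(start, end) for start, end in cameras)


def detections_per_100_trap_days(detections: int, trap_days: int) -> float:
    """Relative detection rate, a common way to normalise by effort."""
    return 100.0 * detections / trap_days


# Hypothetical deployments, used only to exercise the functions.
example_cameras = [
    (date(2017, 1, 1), date(2017, 5, 1)),
    (date(2017, 1, 1), date(2017, 6, 30)),
]
example_effort = total_effort(example_cameras)

# Totals reported in the response: 14,316 detections over 37,192 trap-days.
reported_rate = detections_per_100_trap_days(14_316, 37_192)
```

      With the reported totals, this works out to roughly 38.5 independent detections per 100 trap-days across the 319 analysed sites.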

      Reviewer #2 (Public Review):

      Summary:

      The study entitled "Different coexistence patterns between apex carnivores and mesocarnivores based on temporal, spatial, and dietary niche partitioning analysis in Qilian Mountain National Park, China" by Cong et al. addresses the compelling topic of carnivores' coexistence in a biodiversity hotspot in China. The study is interesting given it considers all three components affecting sympatric carnivores' distribution and co-occurrence, namely the temporal, the spatial, and the dietary partition within the carnivore guild. The authors have found that spatial co-occurrence is generally low, which represents the major strategy for coexistence, while there is temporal and dietary overlap. I also appreciated the huge sampling effort carried out for this study by the authors: they were able to deploy 280 camera trapping sites (which became 322 in the result section?) and collect a total of 480 scat samples. However, I have some concerns about the study on the non-consideration of the human dimension and potential anthropogenic disturbance that could affect the spatial and temporal distribution of carnivores, the choice of the statistical model to test co-occurrence, and the lack of clearly stated ecological hypotheses.

      Strengths:

      The strengths of the study are the investigation of all three major strategies that can mitigate carnivores' coexistence, therefore, the use of multiple monitoring techniques (both camera trapping and DNA metabarcoding) and the big dataset produced that consists of a very large sampled area with a noteworthy number of camera trap stations and many scat samples for each species.

      Weaknesses:

      I think that some parts of the manuscript should be written better and more clearly. A clear statement of the ecological hypotheses that could affect the partitioning among the carnivore guild is lacking. I think that the human component (thus anthropogenic disturbance) should have been considered more in the spatial analyses given it can influence the use of the environment by some carnivores. Additionally, a multi-species co-occurrence model would have been a more robust approach to test for spatial co-occurrence given it also considers imperfect detection.

      Thank you very much for your valuable comments and suggestions. We have checked and edited the manuscript, and we believe the English has been improved.

      (1) Following your suggestion, we added the competitive exclusion and niche differentiation hypotheses along the spatial, temporal, and dietary axes to the introduction to explain co-occurrence relationships among species, as follows: “The competitive exclusion principle dictates that species with similar ecological requirements are unable to successfully coexist (Hardin, 1960; Gause, 1934). Thus, carnivores within a guild occupy different ecological niches based on a combination of three niche dimensions, i.e. spatial, temporal, and trophic (Schoener, 1974). Spatially, carnivore species within the same geographic area exhibit distinct distributions that minimize overlap in resource use and competition. For example, carnivores can partition habitats based on habitat feature preferences and availability of prey (De Satgé et al., 2017; Garrote and Pérez De Ayala, 2019; Gołdyn et al., 2003; Strampelli et al., 2023). Temporally, differences in seasonal or daily activity patterns among sympatric carnivores can reduce competitive interactions and facilitate coexistence. For example, carnivores can exhibit temporal segregation in their foraging behaviors, such as diurnal versus nocturnal activity, to avoid direct competition (Finnegan et al., 2021; Nasanbat et al., 2021; Searle et al., 2021). Trophically, carnivore species can diversify their diets to exploit different prey species or sizes, thereby reducing competition for food resources. For example, carnivores can exhibit dietary specialization to optimize their foraging efficiency and minimize competitive pressures (Steinmetz et al., 2021).”

      (2) In addition to distance from roads, we included the human dimension as a covariate influencing occupancy rates, based on the number of independent photos or videos of herders and livestock detected by infrared cameras (termed human disturbance and abbreviated hdis). According to the results of the occupancy models, red fox occupancy probability displayed a significant positive relationship with hdis. Moreover, the detection probability of snow leopard and Eurasian lynx decreased with increasing hdis.

      We have incorporated these results into the Results as follows: “According to the findings derived from single-season, single-species occupancy models, the snow leopard demonstrated a notably higher probability of occupancy compared to other carnivore species, estimated at 0.437 (Table 1). Conversely, the Eurasian lynx exhibited a lower occupancy probability, estimated at 0.161. Further analysis revealed that the occupancy probabilities of the wolf and Eurasian lynx declined with increasing Normalized Difference Vegetation Index (NDVI) (Table 2, Figure 2). Additionally, wolf occupancy probability displayed a negative relationship with roughness index and a positive relationship with prey availability. Snow leopard occupancy probabilities exhibited a negative relationship with distance to roads and NDVI. In contrast, both red fox and Tibetan fox demonstrated a positive relationship with distance to roads. Moreover, red fox occupancy probability increased with higher human disturbance and greater prey availability. The detection probabilities of wolf, snow leopard, red fox, and Tibetan fox exhibited an increase with elevation (Table 2). Moreover, there was a positive relationship between the detection probability of Tibetan fox and prey availability. The detection probabilities of snow leopard and Eurasian lynx declined as human disturbance increased.”

      (3) We appreciate the suggestion to use a multi-species co-occurrence model to test spatial co-occurrence. We attempted multispecies occupancy modeling to analyze the five species in our study, following the method of Rota et al. (2016). Initially, we simplified the candidate models by fitting single-season, single-species occupancy models. We took the occupancy covariates from the best model for each species and used them to build the multispecies occupancy models. Unfortunately, the final models did not converge. We are investigating potential solutions to resolve this problem.

      Rota CT, Ferreira MAR, Kays RW, Forrester TD, Kalies EL, McShea WJ, Parsons AW, Millspaugh JJ. 2016. A multispecies occupancy model for two or more interacting species. Methods Ecol Evol 7:1164–1173. doi:10.1111/2041-210X.12587
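      For readers unfamiliar with how occupancy models account for imperfect detection, the following is a minimal sketch of the standard single-season, single-species likelihood with constant occupancy (psi) and detection (p) probabilities. It is not the authors' analysis: their models include covariates, and the multispecies model of Rota et al. (2016) is considerably more involved. The function names, the toy detection histories, and the grid-search fit are all assumptions made for illustration.

```python
import math


def site_likelihood(history: list[int], psi: float, p: float) -> float:
    """Likelihood of one site's detection history (1 = detected, 0 = not).

    A site with at least one detection must be occupied; a site with no
    detections is either occupied but missed on every visit, or unoccupied.
    """
    visits = len(history)
    detections = sum(history)
    occupied_term = psi * p**detections * (1.0 - p) ** (visits - detections)
    if detections > 0:
        return occupied_term
    return occupied_term + (1.0 - psi)


def neg_log_likelihood(histories: list[list[int]], psi: float, p: float) -> float:
    """Negative log-likelihood summed over independent sites."""
    return -sum(math.log(site_likelihood(h, psi, p)) for h in histories)


def fit_grid(histories: list[list[int]], steps: int = 99) -> tuple[float, float]:
    """Crude maximum-likelihood fit by grid search over (psi, p) in (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(
        ((psi, p) for psi in grid for p in grid),
        key=lambda pair: neg_log_likelihood(histories, pair[0], pair[1]),
    )


# Toy data: five sites detected on every visit, five never detected.
toy_histories = [[1, 1, 1]] * 5 + [[0, 0, 0]] * 5
psi_hat, p_hat = fit_grid(toy_histories)
```

      A real analysis would use an optimizer and covariate-dependent psi and p; the grid search here only keeps the sketch dependency-free.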

      Temporal and dietary results are solid and this latter in particular highlights a big predation pressure on some prey species such as the pika. This implies important conservation and management implications for this species, and therefore for the trophic chain, given that i) the pika population should be conserved and ii) a potential poisoning campaign against small mammals could be incredibly dangerous also for mesocarnivores feeding on them due to secondary poisoning.

      Thank you for your thoughtful comments. We appreciate your recognition of the temporal and dietary findings, particularly the highlighted predation pressure on prey species like the pika. These observations indeed underscore critical implications for conservation and management. The necessity to conserve the pika population is paramount for its role in maintaining the stability of the trophic chain within its ecosystem. As you rightly pointed out, any disruption to this delicate balance, including through predation or indirect threats like poisoning campaigns, could have far-reaching consequences. Regarding the potential risks associated with poisoning campaigns targeting small mammals, we acknowledge the significant concerns raised about secondary poisoning affecting mesocarnivores. This underscores the need for careful consideration in pest control strategies and the adoption of measures that minimize unintended ecological impacts. Our findings suggest several practical implications for conservation and management. Conservation efforts should focus on vulnerable prey populations such as the pika, while management strategies could include regulatory frameworks and community education to mitigate risks associated with pest control methods. We believe our study contributes valuable insights into the complexities of predator-prey dynamics and the broader implications for ecosystem health. By integrating these findings into conservation practices, we can work towards ensuring the sustainability of natural systems and the species that depend on them.

      Reviewer #1 (Recommendations For The Authors):

      To better explain the methodology and the sampling effort I recommend reviewing e.g. Kays et al. 2020. An empirical evaluation of camera trap study design: How many, how long, and when?. Methods in Ecology and Evolution, 11(6), 700-713. https://besjournals.onlinelibrary.wiley.com/doi/epdf/10.1111/2041-210X.13370.

      Thank you for this valuable suggestion! Following this reference, we have added the information below to explain the methodology and the sampling effort: “Taking into account the fact that mammalian communities are sensitive to seasonality, we used camera traps to monitor animals with an extensive survey effort from December 2016 to February 2022, covering the activity of animal species in different seasons, which can reflect the overall distribution of carnivores. We placed a total of 280 infrared cameras at the study site, set them to be active for 4 to 6 months, and considered possible relocation to another position based on animal detection in an effort to improve estimates of the occupancy and detection rates for both common and rare species (Figure 1) (Kays et al., 2020). Each camera trap was set to record the time and date on a 24 hr clock when triggered, and to record a 15 s video and 1 photo with an interval of 2 minutes between any two consecutive triggers. The sum of camera trap effective days was defined by the total amount of trapping effort during the sampling period, which was calculated from the time the camera was placed in operation to the time the last video or photograph was taken. We visited each camera 2 to 3 times a year to download photos and check batteries.”
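
      The effort definition quoted above (time from placing a camera in operation to its last video or photograph, summed over cameras) can be sketched as follows. This is only an illustration: the camera IDs, dates, and the function name `effective_days` are hypothetical and are not taken from the manuscript.

```python
from datetime import datetime


def effective_days(deployed: datetime, last_record: datetime) -> float:
    """Effort for one camera: days from deployment to its last photo or video."""
    if last_record < deployed:
        raise ValueError("last record precedes deployment")
    return (last_record - deployed).total_seconds() / 86400.0


# Hypothetical deployments: (camera id, deployment time, time of last photo/video).
deployments = [
    ("cam01", datetime(2017, 1, 10, 9, 0), datetime(2017, 5, 10, 9, 0)),
    ("cam02", datetime(2017, 1, 12, 12, 0), datetime(2017, 6, 1, 12, 0)),
]

# Total trapping effort is the sum of per-camera effective days.
total_effort = sum(effective_days(start, end) for _, start, end in deployments)
```

      Ending the effort window at the last record, rather than at the retrieval visit, avoids counting days after a camera may have stopped working.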

      Reviewer #2 (Recommendations For The Authors):

      I have some concerns about the manuscript.

      I find that the manuscript should be written more clearly: some sentences are not straightforward to understand given the presence of structural errors that make the text hard to read; the paragraphs should be written in a more harmonic way (without logical leaps) with a smoother change of topic between paragraphs, especially in the introduction.

      We appreciate your constructive comments, which have helped us improve the clarity and coherence of the manuscript. We have revised the introduction to provide a clearer outline of the paper's structure and objectives. Specifically, we have rephrased complex sentences and removed ambiguities so that each idea is communicated more directly. We have also provided clearer links between ideas and avoided abrupt shifts in topic to ensure smoother transitions between paragraphs.

      I feel like the strength of merging the two techniques (camera trapping and DNA metabarcoding) is not brought up enough, while the disadvantage of this approach is not even mentioned (e.g., the increasing costs).

      Thanks a lot for this valuable comment! We have added this information to the Discussion (L356-L363) as follows: “Our study highlights the effectiveness of combining camera trapping with DNA metabarcoding for detecting and identifying both cryptic and rare species within a sympatric carnivore guild. This integrated approach allowed us to capture a more comprehensive view of species presence and interactions compared to traditional visual surveys. However, it is important to acknowledge the challenges associated with this technique, including the high costs of equipment and the need for specialized training and computational resources to manage and analyze the large volumes of sequence data. Despite these challenges, the benefits of this combined method in improving biodiversity assessments and understanding species coexistence outweigh the drawbacks.”

      The structure of the manuscript does not follow the structure of the journal (Intro, Material and Method, Results, Discussion); instead, it reports the methods at the end of the main manuscript. Most critically, I found that a clear explanation of the research hypothesis is missing: the authors should clearly state their ecological hypotheses. What are your hypotheses on the co-occurrence relationship among species? What would specifically affect and change the sympatric relationships among carnivores?

      Thank you for this valuable suggestion! We have revised the manuscript by moving the methods section into the main body so that it follows the standard section order (Introduction, Materials and Methods, Results, Discussion).

      Our main ecological hypotheses concerning the co-occurrence relationships among carnivore species are based on the niche differentiation hypothesis. We hypothesize that differentiation along one or more niche axes is beneficial for the coexistence of the carnivore guild in the Qilian Mountains. We expected that spatial niche differentiation promotes the coexistence of large carnivores in the Qilian Mountain region, as they are more likely than small carnivores to spatially avoid interspecific competition (Davis et al., 2018). Mesocarnivores may coexist either spatially or temporally due to increased interspecific competition for similar prey (Di Bitetti et al., 2010; Donadio and Buskirk, 2006). Nutritional niche differentiation may be a significant factor promoting coexistence between large carnivore and mesocarnivore species due to differences in body size (Gómez-Ortiz et al., 2015; Lanszki et al., 2019). We have added these ecological hypotheses in lines 101 to 110.

      Another concern is that all pictures with people have been removed from the dataset, but I think that this could be a bit biased as human presence (or also the presence of livestock) could affect the spatial or temporal presence of carnivores, changing their co-occurrence dynamics. On one side, humans can be perceived as a source of disturbance by carnivores and, therefore, can cause a shift in distribution towards locations with lower human presence (or lower anthropogenic disturbance) that could further concentrate the presence of carnivores, increasing the competitive interaction. Conversely, mesocarnivores could take advantage of an increasing human presence - following the human shield hypothesis - finding a refugium from larger-bodied carnivores. From this perspective, important information on the potential anthropogenic pressure is lacking in the description of the study area: how effective is the protection effort of the park? How intense is the potential human disturbance in and around the park? Is there poaching? Intensive livestock grazing? Resource extraction? These are all factors that could affect the interactions among carnivores. Do not forget the possibility and risk of retaliatory killing by humans due to the presence of livestock in the area. I think that incorporating the human dimension is important because it could strongly affect how carnivores perceive and use the environment. Here only the distance to the closest road has been considered. However, for example, recent research (Gorczynski et al. 2022, Global Change Biology) has indeed found that co-occurrence of ecologically similar species differed in relation to increasing human density. Therefore, I think that anthropogenic disturbance is an aspect to be reckoned with and more variables as proxies of human disturbance should be considered.

      Thanks a lot for this valuable comment! We acknowledge that humans can act both as a disturbance factor, potentially driving carnivores away from highly populated areas, and as a source of indirect refuge for mesocarnivores, thereby affecting competitive interactions among carnivores. To our knowledge, poaching and resource extraction are prohibited within the study area, whereas livestock grazing is a significant human activity there. Therefore, we added the human dimension as an occupancy covariate based on the number of independent photos or videos of herders and livestock detected by infrared cameras (termed human disturbance and abbreviated hdis). According to the results of the occupancy models, red fox occupancy probability displayed a significant positive relationship with hdis. Moreover, the detection probability of snow leopard and Eurasian lynx decreased with increasing hdis.

      In the statistical analyses section, I don't find that the statistical procedure is well described: it is not clear which occupancy model has been used (probably a single-species single-season occupancy model for each target species?), which covariates have been tested for each species and following which hypotheses. Additionally, I think that when modelling the spatial distribution of subordinate species, it should be important to include information on the spatial distribution of apex species given this could affect their occurrence on the territory. This could have been done by using the Relative Abundance Index of the apex predators as a covariate when modelling the distribution of subordinate species. Additionally, why haven't the authors used prey as a covariate for occupancy? I think that prey distribution should affect the occupancy probability more than the detection rate. Also, the authors used the Sørensen similarity index to measure associations between species. However, this association metric has been criticized (see the recent paper of Mainali et al 2022, Science Advances). I am therefore wondering: given the authors are using the occupancy framework, why don't they use a multi-species co-occurrence model that allows them to directly estimate both single-species occupancy and the co-occurrence parameter as a function of covariates (examples are Rota et al. 2016, Methods Ecol. Evol. Or Tobler et al. 2019, Ecology)? For the temporal overlap, I think that adding Figure S2 (pairwise temporal overlap) in the main text would help deliver the results of the temporal analyses more straightforwardly.

      Thanks a lot for this valuable comment!

      (1) The current manuscript utilizes a single-species single-season occupancy model for each target species. Additionally, we have added prey and human disturbance as occupancy covariates. We have revised the statistical analyses section to explicitly state this model choice and clarify the covariates tested for each species in lines 153 to 170. The details are as follows: “To investigate the spatial distribution of carnivores, as well as the influence of environmental factors on the site occupancy of species in the study area, we performed single-season, single-species occupancy models to estimate carnivores’ occupancy (ψ) and detection (Pr) probability (Li et al., 2022b; MacKenzie, 2018; Moreno-Sosa et al., 2022). To ensure capture independence, only photo or video records at intervals of 30 min were included in the data analysis (Li et al., 2020). We created a matrix recording whether each carnivore species was detected (1) or not (0) across several 30-day intervals (that is 0-30, 31-60, 61-90, 91-120, 121-150, >150 days) for each camera location. Based on the previous studies of habitat use of carnivores (Greenspan and Giordano, 2021; Alexander et al., 2016; Gorczynski et al., 2022), we selected terrain, vegetation, biological factors and disturbance to construct the model. Terrain is a fundamental element of wildlife habitat and closely linked to other environmental factors (Chen et al., 2024). Terrain variables include elevation (ele) and roughness index (rix). Vegetation variables include normalized difference vegetation index (ndvi), and provide information on the level of habitat concealment. Biological variables include prey abundance (the number of independent photos of their preferred prey based on dietary analysis in this study; wolf and snow leopard: artiodactyla including livestock; Eurasian lynx and Pallas’s cat: lagomorpha; red fox and Tibetan fox: lagomorpha and rodentia) and reflect habitat preference and distribution patterns of carnivores. Disturbance variables include distance to roads (disrd) and human disturbances (hdis, the number of independent photos of herdsman and livestock) and can provide insight into the habitat selection and behavior patterns of carnivores.”
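
      The detection matrix described in the quoted methods can be sketched as below for a single species at a single camera. This is a minimal illustration under stated assumptions, not the authors' pipeline: independence is taken to mean at least 30 min since the last retained record, records are binned by fractional days since deployment, and all function names and example timestamps are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: a record is independent if it falls at least 30 min after the last kept record.
INDEPENDENCE_GAP = timedelta(minutes=30)
# Upper day bound of each 30-day occasion; the final occasion (>150 days) is open-ended.
OCCASION_EDGES = [30, 60, 90, 120, 150]


def independent_records(times):
    """Drop records that follow the previously kept record by less than 30 min."""
    kept = []
    for t in sorted(times):
        if not kept or t - kept[-1] >= INDEPENDENCE_GAP:
            kept.append(t)
    return kept


def occasion_index(days_since_deployment):
    """Map days since deployment to one of the six sampling occasions (0..5)."""
    for i, upper in enumerate(OCCASION_EDGES):
        if days_since_deployment <= upper:
            return i
    return len(OCCASION_EDGES)


def detection_history(deployed, times):
    """Return the 0/1 detection history of one species at one camera."""
    # Note: occasions in which the camera was not operating are coded 0 here for
    # simplicity; an occupancy analysis would normally code them as missing.
    history = [0] * (len(OCCASION_EDGES) + 1)
    for t in independent_records(times):
        days = (t - deployed).total_seconds() / 86400.0
        history[occasion_index(days)] = 1
    return history


# Hypothetical records for one camera deployed on 1 January 2017.
deployed = datetime(2017, 1, 1)
records = [
    datetime(2017, 1, 5, 8, 0),
    datetime(2017, 1, 5, 8, 10),   # within 30 min of the previous record, dropped
    datetime(2017, 3, 15, 20, 0),
    datetime(2017, 7, 1, 6, 0),
]
history = detection_history(deployed, records)
```

      Stacking one such row per camera location gives the site-by-occasion matrix that a single-season occupancy model takes as input.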

      (2) Thank you for your valuable suggestions. We acknowledge the importance of considering apex species in models of subordinate species' spatial distributions.

      Nonetheless, considering the consistency of covariates for each species and the lack of interspecies interactions in single-species occupancy models, we did not include the Relative Abundance Index of the apex predators as a covariate affecting the occupancy of mesopredators. As you recommended, multi-species occupancy models that account for interspecies interactions are a robust approach. However, when we attempted to use the multi-species occupancy method of Rota et al. (2016), the final models did not converge. Specifically, we took the occupancy covariates from the best single-species model for each species and used them to build the multi-species occupancy models. We are investigating potential solutions to resolve this problem.

      (3) We used the Sørensen similarity index to measure associations between species based on support from previous literature. As counted by Mainali et al., the Sørensen index has been used in more than 700 papers across journals such as Science, Nature, and PNAS. We believe this index holds broad applicability in describing relationships between species.
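
      For readers unfamiliar with the index, the standard presence/absence form of the Sørensen similarity, 2a / (2a + b + c), where a is the number of sites shared by both species and b and c are the sites occupied by only one of them, can be sketched as follows. The response does not give the authors' exact computation, so this shows the textbook definition only; the site sets and the function name are hypothetical.

```python
def sorensen(sites_a, sites_b):
    """Sørensen similarity between two species from the sites where each was detected."""
    a, b = set(sites_a), set(sites_b)
    if not a and not b:
        raise ValueError("index is undefined when neither species was detected")
    shared = len(a & b)
    # 2a / (2a + b + c) is equivalent to 2 * |A ∩ B| / (|A| + |B|).
    return 2 * shared / (len(a) + len(b))


# Hypothetical camera sites where each species was detected.
wolf_sites = {1, 2, 3, 4}
snow_leopard_sites = {3, 4, 5, 6, 7, 8}
similarity = sorensen(wolf_sites, snow_leopard_sites)
```

      The index ranges from 0 (no shared sites) to 1 (identical site sets), so lower values indicate stronger spatial segregation.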

      (4) We agree that presenting pairwise temporal overlap in the main text would enhance clarity. We have revised the manuscript to include Figure S2 in the main text so that the temporal analyses are presented more directly.

      Regarding the sampling collection of the scats, I'm just curious to know why you decided to use silica desiccant instead of keeping the samples frozen. I'm not familiar with this method and I guess it works fine because the environment is generally freezing cold. Yet, I would like to know more. How fresh do scat samples need to be in order to be suitable for DNA metabarcoding analyses? Additionally, what do you mean by "scats were collected within camera trapping area", could you be more specific? Have you specified a buffer around camera stations?

      Thanks a lot for this specific inquiry! We followed the scat collection method described by Janecka et al. (2008; 2011). Silica is used to dry the scats and so minimize DNA degradation. Because field conditions did not allow us to carry equipment for freezing samples, the collected scat samples were kept dry and cool in the shade and transferred to the laboratory as soon as possible after sampling. We selected relatively fresh samples based on the color of the scat, and broke off small pieces from the outer part of the scat, including pieces not directly exposed to the sun. We placed scat material about the size of a pinkie nail in each tube; an overfilled tube is unlikely to dry, which leads to DNA degradation.

      The study area was subdivided into sample squares of 25 km² (5×5 km) as a geographical reference for placing camera survey sites and collecting scat samples. Camera traps were set in areas believed to be important to and heavily used by wildlife, such as the bottoms of cliffs, sides of boulders, valleys and ridges along movement corridors. We also focused on sites with known or suspected carnivore activity to maximize the probability of finding scat samples. Therefore, transects were set around the infrared cameras to collect scat samples. The length of each transect was determined by terrain, amount of scat, and available time. On each transect, we aimed to collect about 18 samples or cover 5 km of terrain to avoid uneven representation among transects and to ensure that the team had sufficient time to return to base camp (Janečka et al., 2011).

      Janecka J, Jackson R, Yuquang Z, Li D, Munkhtsog B, Buckley-Beason V, Murphy W. 2008. Population monitoring of snow leopards using noninvasive collection of scat samples: A pilot study. Animal Conservation 11:401–411. doi:10.1111/j.1469-1795.2008.00195.x

      Janečka JE, Munkhtsog B, Jackson RM, Naranbaatar G, Mallon DP, Murphy WJ. 2011. Comparison of noninvasive genetic and camera-trapping techniques for surveying snow leopards. J Mammal 92:771–783. doi:10.1644/10-MAMM-A-036.1

      Kays R, Arbogast BS, Baker‐Whatton M, Beirne C, Boone HM, Bowler M, Burneo SF, Cove MV, Ding P, Espinosa S, Gonçalves ALS, Hansen CP, Jansen PA, Kolowski JM, Knowles TW, Lima MGM, Millspaugh J, McShea WJ, Pacifici K, Parsons AW, Pease BS, Rovero F, Santos F, Schuttler SG, Sheil D, Si X, Snider M, Spironello WR. 2020. An empirical evaluation of camera trap study design: How many, how long and when? Methods Ecol Evol 11:700–713. doi:10.1111/2041-210X.13370

      Regarding the discussion, the authors have information for 1) spatial distribution, 2) temporal overlap, 3) dietary requirement, they should use this information to support the discussion. Instead, sometimes it feels that authors go by exclusion or make a suggestion. For example: the authors have found dietary and temporal overlap between two apex predators (i.e., wolf and snow leopard), and they said that this suggests that spatial partitioning is responsible for their successful coexistence in this area (lines 195-196). But why "suggesting", what the co-occurrence metric says? Another example: "Apex carnivores and mesocarnivores showed substantial overlap in time overall, indicating that spatial and dietary partitioning may play a large role in facilitating their coexistence" (lines 241 - 242). However, this should not be a suggestion: your Sørensen similarity index is low proving spatial divergence. So, when data supports the hypotheses, the authors should be firmer in their discussion. Generally, when reading the discussion, it felt that a figure summarizing the partitioning would be much needed to digest which type of partitioning strategy the species are using.

      Thank you for your thoughtful comments and suggestions.

      (1) We appreciate your insights on the discussion section, particularly concerning the interpretation of our findings on spatial distribution and on temporal and dietary overlap. We acknowledge the need for a clearer interpretation of our findings and have revised the discussion section to provide more direct support. For example, in lines 294-295, we modified the text to “We found dietary and temporal overlap among apex carnivores, showing that spatial partitioning is responsible for their successful coexistence in this area.” In lines 341-342, we modified it to “Apex carnivores and mesocarnivores exhibited considerable overlap in time overall, showing that spatial and dietary partitioning may play a large role in facilitating their coexistence.”

      (2) We appreciate your suggestion regarding the inclusion of a figure summarizing partitioning strategies among species discussed. In our study, we organized the overlap index of space, time, and diet among carnivores in Table 3, which directly reflects the overlap of carnivore species in these three dimensions by summarizing them in a single table. Additionally, Figure 3 illustrates the activity patterns and overlap among species, while Figure 4 displays the primary prey of carnivores and the frequency of food utilization.

      About lines 228 - 229, just as a side note, the Pallas's cat, as the red fox, selects the environment according to a greater distribution of prey species, while also selecting primarily meadows and natural environment (Greco et al. 2022, Journal of wildlife management) additionally it is not strictly diurnal (Anile et al. 2020, Wildlife Research; Greco et al. 2022, Journal of wildlife management). Regarding the Pallas's cat and its exclusion from the temporal and spatial analyses, can you specify how many independent detection events you had?

      Thanks a lot for this valuable comment!

      (1) We appreciate the references to recent studies highlighting its habitat preferences and activity patterns. We have revised the manuscript to acknowledge these points and provide context regarding its habitat selection strategies. Specifically, we modified the text as follows: “Pallas’s cat hunts during crepuscular and diurnal periods and inhabits meadows with greater prey abundance (Anile et al., 2021; Greco et al., 2022; Ross et al., 2019).”

      (2) The low detection rate of Pallas's cat (0.072) estimated by the single-species occupancy model raised concerns regarding the reliability of the results. The high standard errors estimated for each environmental variable and the wide confidence intervals around the detection rate further indicated potential bias or randomness. Consequently, we decided to exclude the Pallas's cat data from further analysis. On closer examination of the data, only 27 of the 319 camera sites surveyed detected Pallas's cat. Notably, only 3 of 193 sites in Gansu Province recorded detections, whereas 24 of 126 sites in Qinghai Province did. This skewed distribution of data likely contributed to the unsatisfactory outcomes observed in our models.
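
      The skew between provinces is easy to see from the site counts reported above. The short sketch below computes the naive proportion of sites with at least one detection, which is not corrected for imperfect detection and is therefore distinct from the model-based detection rate of 0.072; the variable names are illustrative only.

```python
# Site counts reported in the response: (sites with detections, sites surveyed).
sites = {"Gansu": (3, 193), "Qinghai": (24, 126)}

# Naive proportion of sites with at least one Pallas's cat detection, per province.
naive = {province: detected / surveyed for province, (detected, surveyed) in sites.items()}

# Pooled totals across both provinces.
total_detected = sum(detected for detected, _ in sites.values())
total_surveyed = sum(surveyed for _, surveyed in sites.values())
naive_overall = total_detected / total_surveyed
```

      Under these counts, roughly 2% of Gansu sites and 19% of Qinghai sites recorded the species, against about 8% overall.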

      About the diet and results of scat analyses, have you found any sign of intra-guild predation (i.e., apex predators that kill and sometimes consume subordinate carnivores to reduce competition), this could actually represent proof of competition and spatial overlap.

      Thanks a lot for your thoughtful comments!

      We observed intraguild predation in the diet of wolves and snow leopards. Specifically, we found Pallas’s cat, red fox, and Tibetan fox in the diet of wolves, and Pallas’s cat, Eurasian badger and Tibetan fox in the diet of snow leopards. However, these intraguild predation events accounted for only 1.89% of the diet composition of apex carnivores. We suggest that the rarity of these observations may be influenced by various factors and does not necessarily provide sufficient evidence of competition and spatial overlap. Therefore, further data collection and in-depth research are needed to better understand this phenomenon.

      Some minor comments: Figure 2 is really nice, while some abbreviations are missing in the caption of Table 2.

      Thank you for your feedback and positive comments on Figure 2. Unfortunately, we have removed Figure 2 from the manuscript. The prey abundance and human disturbance variables that we added as occupancy covariates were derived solely from infrared camera trap data and do not cover the entire national park. Therefore, we were unable to accurately project carnivore occupancy probability across the park.

      We apologize for the missing abbreviations in the caption of Table 2. We have added them to the caption as follows: “Abbreviations: Disrd-distance to roads, Ele-elevation, NDVI-normalized difference vegetation index, Rix-roughness index, hdis-human disturbance.”

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, the authors employ a combined proteomic and genetic approach to identify the glycoprotein QC factor malectin as an important protein involved in promoting coronavirus infection. Using proteomic approaches, they show that the non-structural protein NSP2 and malectin interact in the absence of viral infection, but not in the presence of viral infection. However, both NSP2 and malectin engage the OST complex during viral infection, with malectin also showing reduced interactions with other glycoprotein QC proteins. Malectin KD reduces replication of coronaviruses, including SARS-COV2. Collectively, these results identify Malectin as a glycoprotein QC protein involved in regulating coronavirus replication that could potentially be targeted to mitigate coronavirus replication.

      Overall, the experiments described appear well performed and the interpretations generally reflect the results. Moreover, this work identifies Malectin as an important pro-viral protein whose activity could potentially be therapeutically targeted for the broad treatment of coronavirus infection. However, there are some weaknesses in the work that, if addressed, would improve the impact of the manuscript.

      Notably, the mechanism by which malectin regulates viral replication is not well described. It is clear from the work presented that malectin is a pro-viral protein, but the mechanistic basis of this activity is not pursued. Some potential mechanisms are proposed in the discussion, but the manuscript would be strengthened if additional insight was included. For example, is the UPR activated to higher levels in infected cells depleted of malectin? Do glycosylation patterns of viral (or non-viral) proteins change in malectin-depleted cells? Additional insight into this specific question would significantly improve the manuscript.

      We concur with the reviewer that the mechanism by which Malectin regulates viral replication remains unclear. It will be worth pursuing the molecular mechanisms underlying this phenotype in future studies. Our existing proteomics data sets can potentially offer additional insight into the questions posed here. Namely, we plan to analyze levels of protein markers of the UPR and other ER stress pathways in infected cells depleted of Malectin in our existing global proteomics data set. In addition, we will attempt to compare glycosylation patterns of endogenous proteins in Malectin-depleted cells. One caveat to this will be that it may be difficult to differentiate between spontaneous chemical deamidation and enzymatic PNGase F mediated deamidation.

      Further, the evidence for increased interactions between OST and malectin during viral infection is fairly weak, despite being a major talking point throughout the manuscript. The reduced interactions between malectin and other glycoproteostasis QC factors are evident, but the increased interactions with OST are not well supported. I'd recommend backing off on this point throughout the text and instead continuing to highlight the reduced interactions.

      We note that the fold-change increase in OST interactions with malectin is small compared to the fold-change decrease for other glycoproteostasis factors. If this modest increase is consistent across replicates, we believe that would bolster the claim that it is a noteworthy change. If not, we can modify the text as suggested to emphasize the reduced interactions.

      I was also curious as to why non-structural proteins, nsp2 and nsp4, showed robust interactions with host proteins localized to both the ER and mitochondria? Do these proteins localize to different organelles or do these interactions reflect some other type of dysregulation? It would be useful to provide a bit of speculation on this point.

      We also find these ER and mitochondrial protein interactions curious, which we initially reported on (Davies, Almasy et al. 2020 ACS Infectious Diseases). In this prior report, we found that when expressed in HEK293T cells, SARS-CoV-2 nsp2 and nsp4 have partial localization to mitochondrial-associated ER membranes (MAMs), as determined by subcellular fractionation. Given that malectin has also been shown to have MAMs localization (Carreras-Sureda, et al. 2019 Nature Cell Biology), we can insert some speculation on this in the Discussion section.

      Again, the overall identification of malectin as a pro-viral protein involved in the replication of multiple different coronaviruses is interesting and important, but additional insights into the mechanism of this activity would strengthen the overall impact of this work.

      Reviewer #2 (Public Review):

      Summary:

      A strong case is presented to establish that the endoplasmic reticulum carbohydrate binding protein malectin is an important factor for coronavirus propagation. Malectin was identified as a coronavirus nsp2 protein interactor using quantitative proteomics and its importance in the viral life cycle was supported by using a functional genetic screen and viral assays. Malectin binds diglucosylated proteins, an early glycoform thought to transiently exist on nascent chains shortly after translation and translocation; yet a role for malectin has previously been proposed in later quality control decisions and degradation targeting. These two observations have been difficult to reconcile temporally. In agreement with results from the Locher lab, the malectin-interactome shown here includes a number of subunits of the oligosaccharyltransferase complex (OST). These results place malectin in close proximity to both the co-translational (STT3A or OST-A) and post-translational (STT3B or OST-B) complexes. It follows that malectin knockdown was associated with coronavirus Spike protein hypoglycosylation.

      Strengths:

      Strengths include using multiple viruses to identify interactors of nsp2 and quantitative proteomics along with multiple viral assays to monitor the viral life cycle.

      Weaknesses:

      Malectin knockdown was shown to be associated with Spike protein hypoglycosylation. This was further supported by malectin interactions with the OSTs. However, no specific role of malectin in glycosylation was discussed or proposed.

      We will emphasize our hypotheses on this point in the discussion and add a summary figure to highlight the specific role of malectin.

      Given the likelihood that malectin plays a role in the glycosylation of heavily glycosylated proteins like Spike, it is unfortunate that only 5 glycosites on Spike were identified using the MS deamidation assay when Spike has a large number of glycans (~22 sites). The mass spec data set would also include endogenous proteins. Were any heavily glycosylated endogenous proteins hypoglycosylated in the MS analysis in Fig 5D?

      We plan to interrogate this question in our existing MS deamidation proteomics data set as outlined above.

      The inclusion of the nsp4 interactome and its partial characterization is a distraction from the storyline that focuses on malectin and nsp2.

      We believe the nsp4 comparative interactome and functional genomics data, if made public, offer a rich resource for further functional investigation by others. While we found the malectin and nsp2 storyline the most compelling to pursue, we believe the inclusion of the nsp4 data strengthens the overall approach, in agreement with Reviewer #3’s comments.

      Reviewer #3 (Public Review):

      Summary:

      In this study, Davies and Plate set out to discover conserved host interactors of coronavirus non-structural proteins (Nsp). They used 293T cells to ectopically express flag-tagged Nsp2 and Nsp4 from five human and mouse coronaviruses, including SARS-CoV-1 and 2, and analyzed their interaction with host proteins by affinity purification mass-spectrometry (AP-MS). To confirm whether such interactors play a role in coronavirus infection, the authors measured the effects of individual knockdowns on replication of murine hepatitis virus (MHV) in mouse Delayed Brain Tumor cells. Using this approach, they identified a previously undescribed interactor of Nsp2, Malectin (Mlec), which is involved in glycoprotein processing and shows a potent pro-viral function in both MHV and SARS-CoV-2. Although the authors were unable to confirm this interaction in MHV-infected cells, they show that infection remodels many other Mlec interactions, recruiting it to the ER complex that catalyzes protein glycosylation (OST). Mlec knockdown reduced viral RNA and protein levels during MHV infection, although such effects were not limited to specific viral proteins. However, knockdown reduced the levels of five viral glycopeptides that map to Spike protein, suggesting it may be affected by Mlec.

      Strengths:

      This is an elegant study that uses a state-of-the-art quantitative proteomic approach to identify host proteins that play critical roles in viral infection. Instead of focusing on a single protein from a single virus, it compares the interactomes of two viral proteins from five related viruses, generating a high confidence dataset. The functional follow-ups using multiple live and reporter viruses, including MHV and CoV2 variants, convincingly depict a pro-viral role for Mlec, a protein not previously implicated in coronavirus biology.

      Weaknesses:

      Although a commonly used approach, AP-MS of ectopically expressed viral proteins may not accurately capture infection-related interactions. The authors observed Mlec-Nsp2 interactions in transfected 293T cells (1C) but were unable to reproduce those in mouse cells infected with MHV (3C). EIF4E2/GIGYF2, two bona fide interactors of CoV2 Nsp2 from previous studies, are listed as depleted compared to negative controls (S1D). Most other CoV2 Nsp2 interactors are also depleted by the same analysis (S1D). Previously reported MERS Nsp2 interactors, including ASCC1 and TCF25, are also not detected (S1D). Furthermore, although GIGYF2 was not identified as an interactor of MHV Nsp2/4 in human cells (S1D), its knockdown in mouse cells reduced MHV titers about 1000 fold (S4). The authors should attempt to explain these discrepancies.

      We plan to address these discrepancies with further elaboration in the text.

      More importantly, the authors were unable to establish a direct link between Mlec and the biogenesis of any viral or host proteins, by mass-spectrometry or otherwise. Although it is clear that Mlec promotes coronavirus infection, the mechanism remains unclear. Its knockdown does not affect the proteome composition of uninfected cells (S15B), suggesting it is not required for proteome maintenance under normal conditions. The only viral glycopeptides detected during MHV infection originated from Spike (5D), although other viral proteins are also known to be glycosylated. Cells depleted for Mlec produce ~4-fold less Spike protein (4E) but no more than 2-fold less glycosylated spike peptides (5D), confounding the interpretation of Mlec effects on viral protein biogenesis. Furthermore, Spike is not essential for the pro-viral role of Mlec, given that Mlec knockdown reduces replication of SARS-CoV-2 replicons that express all viral proteins except for Spike (6A/B).

      These are all important points. We plan to acknowledge some of these confounding factors in the Discussion.

      Any of the observed effects on viral protein levels could be secondary to multiple other processes. Interventions that delay infection for any reason could lead to an imbalance of viral protein levels because Spike and other structural proteins are produced at a much higher rate than non-structural proteins due to the higher abundance of their cognate subgenomic RNAs. Similarly, the observation that Mlec depletion attenuates MHV-mediated changes to the host proteome (S15C/D) can also be attributed to indirect effects on viral replication, regardless of glycoprotein processing. In the discussion, the authors acknowledge that Mlec may indirectly affect infection through modulation of replication complex formation or ER stress, but do not offer any supporting evidence. Interestingly, plant homologs of Mlec are implicated in innate immunity, favoring a more global role for Mlec in mammalian coronavirus infections.

      We plan to interrogate our existing proteomics data for signatures of ER stress in Mlec-depleted cells (as outlined above).

      Finally, the observation that both Nsp2 (3C) and Mlec (3E/F) are recruited to the OST complex during MHV infection neither supports nor refutes any of these alternate hypotheses, given that Mlec is known to interact with OST in uninfected cells and that Nsp2 may interact with OST as part of the full length unprocessed Orf1a, as it co-translationally translocates into the ER. Therefore, the main claims about the role of Mlec in coronavirus protein biogenesis are only partially supported.

      We plan to acknowledge this alternative hypothesis in the Discussion.

    1. Author response:

      We are grateful to the reviewers for their insightful comments on our manuscript and are encouraged by their overall favorable assessments. For the eLife Version of Record, we will make the following revisions to address reviewers’ comments and broaden the applicability of our technique in the zebrafish research community:

      (1) We will elaborate on various facets with additional details:

      a) Experimental conditions | We will specify the transgenic background, injected plasmids, larval stage, viral type, and viral titer clearly for each related experiment.

      b) Experimental methods | We will describe in more detail how to inject the virus into a target area in larval zebrafish.

      c) Data analysis | We will provide more detailed information on the paired electrical stimulation-calcium imaging study and on identifying connected Purkinje cells and granule cells during circuit reconstruction.

      d) Discussion | We will elaborate on trans-synaptic specificity concerning glial cell labeling, toxicity related to viral dose and temperature, and the potential issue of secondary starters and multi-step circuit tracing.

      (2) We will address the issue of glial cell labeling by adding more discussion and characterization, including potential mechanisms and implications, cell distribution, labeling progress, survival, and capability for viral transmission as starter cells.

      (3) We will modify the text of the manuscript to clarify additional points raised by the reviewers.

      (4) We will provide public repositories for accessing both the items and information on zebrafish lines, plasmids, viral vectors, and reconstructed data generated in this study.

      In the end, we will submit full responses to the reviewer comments along with the revised version of the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Satoshi Yamashita et al., investigate the physical mechanisms driving tissue bending using the cellular Potts Model, starting from a planar cellular monolayer. They argue that apical length-independent tension control alone cannot explain bending phenomena in the cellular Potts Model, contrasting with the vertex model. However, the evidence supporting this claim is incomplete. They conclude that an apical elastic term, with zero rest value (due to endocytosis/exocytosis), is necessary in constricting cells and that tissue bending can be enhanced by adding a supracellular myosin cable. Notably, a very high apical elastic constant promotes planar tissue configurations, opposing bending.

      Strengths:

      - The finding of the required mechanisms for tissue bending in the cellular Potts Model provides a more natural alternative for studying bending processes in situations with highly curved cells.

      - Despite viewing cellular delamination as an undesired outcome in this particular manuscript, the model's capability to naturally allow T1 events might prove useful for studying cell mechanics during out-of-plane extrusion.

      We thank the reviewer for the careful comments and insightful suggestions.

      Weaknesses:

      - The authors claim that the cellular Potts Model is unable to obtain the vertex model simulation results, but the lack of a substantial comparison undermines this assertion. No references are provided with vertex model simulations, employing similar setups and rules, and explaining tissue bending solely through an increase in a length-independent apical tension.

      Studies cited in a previous paragraph included simulations that employed increased length-independent apical tension. For the sake of clarity, we added citations to them as shown below.

      P4L174: “In contrast to the simulations in the preceding studies (Sherrard et al., 2010; Conte et al., 2012; Perez-Mockus et al., 2017; Pérez-González et al., 2021), our simulations could not reproduce the apical constriction”.

      We did not copy the parameters of the vertex models in the preceding studies because we also found that the apical, lateral, and basal surface tensions must be balanced; otherwise, the epithelial cells could not maintain their integrity (Figure 1—figure supplement 1), and the ratio used in the preceding studies was outside the suitable range.

      - The apparent disparity between the two models is attributed to straight versus curved cellular junctions, with cells with a curved lateral junction achieving lower minimum energies at steady-state. However, a critical discussion on the impact of T1 events, allowing cellular delamination, is absent. Note that some of the cited vertex model works do not allow T1 events while allowing curvature.

      We appreciate the comment and added it to the discussion as suggested.

      P12L301: “Even when the vertex model allowed the curved lateral surface, the model did not assume the cells to be rearranged and change neighbors, limiting the cell delamination (Pérez-González et al., 2021).”

      P12L311: “Note that the vertex model could also be extended to incorporate the curved edges and rearrangement of the cells by specifically programming them, and would reproduce the cell delamination. That is, we could find the importance of the balanced pressure because the cellular Potts model intrinsically included a high degree of freedom for the cell shape, the cell rearrangement, and the fluctuation.”

      - The suggested mechanism for inducing tissue bending in the cellular Potts Model, involving an apical elastic term, has been utilized in earlier studies, including a cited vertex model paper (Polyakov 2014). Consequently, the physical concept behind this implementation is not novel and warrants discussion.

      The reviewer is correct but Polyakov et al. assumed “that the cytoskeletal components lining the inside membrane surfaces of the cells provide these surfaces with springlike elastic properties” without justification. We assumed that the myosin activity generated not the elasticity but the contractility based on Labouesse et al. (2015), and expected that the surface elasticity corresponded with the membrane elasticity. Also, in the physical concept, we clarified how the contractility and the elasticity differently deformed the cells and tissue, and demonstrated why the elasticity was important for the apical constriction. We added it to the discussion as below.

      P12L316: “In the preceding studies, the apically localized myosin was assumed to generate either the contractile force (Sherrard et al., 2010; Conte et al., 2012; Perez-Mockus et al., 2017; Pérez-González et al., 2021) or the elastic force (Polyakov et al., 2014; Inoue et al., 2016; Nematbakhsh et al., 2020). However, the limited cell shape in the vertex model made them similar in terms of the energy change during the apical constriction, i.e., the effective force to decrease the apical surface. In this study, we showed that the contractile force and the elastic force differently deformed the cells and tissue, and demonstrated why and how the elasticity was important for the apical constriction.”

      - The absence of information on parameter values, initial condition creation, and boundary conditions in the manuscript hinders reproducibility. Additionally, the explanation for the chosen values and their unit conversion is lacking.

      We agree with the comment.

      For the initial configuration, we added an explanation to Tissue deformation by increased apical contractility with cellular Potts model section in the Results as below.

      P4L170: “A simulation started from a flat monolayer of cells beneath the apical ECM, and was continued until resulting deformation of cells and tissue could be evaluated for success or failure of reproducing the apical constriction.”

      For the parameter values we added a section “Parameters for the simulations” in the Methods.

      For the unit conversion of the parameters, we did not measure the surface tension and cell pressure in an actual tissue and thus could not compare the parameters to the actual forces. Instead, we varied the parameters and demonstrated that the apical constriction was reproduced over a wide range of parameter values. We added it to the discussion as below.

      P12L310: “It succeeded with a wide range of parameter values, indicating a robustness of the model.”

      Reviewer #2 (Public Review):

      Summary:

      In their work, the authors study local mechanics in an invaginating epithelial tissue. The mostly computational work relies on the Cellular Potts model. The main result shows that an increased apical "contractility" is not sufficient to properly drive apical constriction and subsequent tissue invagination. The authors propose an alternative model, where they consider an alternative driver, namely the "apical surface elasticity".

      Strengths:

      It is surprising that despite the fact that apical constriction and tissue invagination are probably most studied processes in tissue morphogenesis, the underlying physical mechanisms are still not entirely understood. This work supports this notion by showing that simply increasing apical tension is perhaps not sufficient to locally constrict and invaginate a tissue.

      We thank the reviewer for recognizing the importance and novelty of our work.

      Weaknesses:

      The findings and claims in the manuscript are only partially supported. With the computational methodology for studying tissue mechanics being so well developed in the field, the authors could probably have done a more thorough job of supporting the main findings of their work.

      We thank the reviewer for the careful assessment and suggestions. However, our simulation was computationally expensive, and modeling the epithelium in an analytically calculable expression would require a substantial amount of work that is beyond the scope of the present study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Reference line 648: Correct the author's name (Pérez-González).

      We thank the reviewer and corrected the reference.

      (2) "Pale" colors are challenging to discern.

      We updated the figures.

      (3) Figure 1j: What does the yellow color in the cellular junction represent?

      We used the apical lateral site colored yellow in Fig. 1e-f’ to simulate the effect of the adherens junction. We updated the figure legend.

      (4) Figure 2c - left: Why is there a red apical junction?

      Our simulation model marked the apical junction in the initial configuration and updated the marking based on connectedness to other surrounding sites marked as apical in the same cell. However, once a cell was delaminated and lost its apical junction, any surface site not adjacent to other epithelial cells was marked as basal junction because it was not adjacent to the apical junction.

      We added it to Cellular Potts model with partial surface elasticity section in the Methods as below.

      P17L430: “To simulate the differential physical properties of the apical, lateral, and basal surfaces, the subcellular locations are marked automatically, and the marking is updated during the simulation. In each cell, sites adjacent to different cells but not to the medium are marked as lateral.

      At the initial configuration, sites adjacent to the apical ECM are marked as apical, and during the simulation, sites adjacent to medium and other apical sites in the same cell are marked as apical.

      The rest of the sites, which are adjacent to medium but not marked as apical, are marked as basal.

      Therefore, once a cell is delaminated and loses its apical surface, afterwards all sites in the cell adjacent to the medium are marked as basal even if they are adjacent to the apical ECM or the outer body fluid.”
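
      The marking rule quoted above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the lattice encoding (`MEDIUM = 0`, `ECM = -1`, positive integers for cell IDs), the 8-site neighborhood, and the names `neighbors`, `initial_apical`, and `mark_sites` are all hypothetical. Both the ECM and the body fluid are treated as "medium" when testing adjacency.

```python
MEDIUM = 0   # assumed label for body fluid / empty lattice sites
ECM = -1     # assumed label for the apical ECM

def neighbors(grid, r, c):
    """Yield coordinates of the (up to 8) lattice neighbors of site (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                yield rr, cc

def initial_apical(grid):
    """Initial configuration: cell sites touching the apical ECM are apical."""
    return {
        (r, c)
        for r, row in enumerate(grid)
        for c, cell in enumerate(row)
        if cell > 0 and any(grid[rr][cc] == ECM for rr, cc in neighbors(grid, r, c))
    }

def mark_sites(grid, prev_apical):
    """Return {(r, c): 'apical' | 'lateral' | 'basal'} for surface sites.

    prev_apical is the set of sites marked apical in the previous update.
    Interior sites (no medium and no other cell nearby) get no entry.
    """
    marks = {}
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell <= 0:
                continue  # medium and ECM sites carry no marking
            nbrs = list(neighbors(grid, r, c))
            touches_medium = any(grid[rr][cc] <= 0 for rr, cc in nbrs)
            touches_other_cell = any(
                grid[rr][cc] > 0 and grid[rr][cc] != cell for rr, cc in nbrs
            )
            if touches_medium:
                # Apical only if next to another apical site of the same cell.
                near_apical = any(
                    grid[rr][cc] == cell and (rr, cc) in prev_apical
                    for rr, cc in nbrs
                )
                marks[(r, c)] = "apical" if near_apical else "basal"
            elif touches_other_cell:
                marks[(r, c)] = "lateral"
    return marks

# Two cells, three sites tall, with ECM above and medium below.
grid = [
    [ECM, ECM, ECM, ECM],
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [MEDIUM, MEDIUM, MEDIUM, MEDIUM],
]
marks = mark_sites(grid, initial_apical(grid))

# A cell that has lost all apical sites (here cell 2) is marked basal
# on its ECM-facing side as well.
delaminated = mark_sites(grid, {(1, 0), (1, 1)})
```

      In this toy lattice, cell 2 keeps no apical site in the second call, so its ECM-facing sites come out as basal, which mirrors the behavior described for delaminated cells.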

      (5) Figure 4a: The snapshots are not in a steady state but in the middle of deformation. Is the time the same for all snapshots? The motivation to change P_0a is related to endocytosis. However, this could be achieved by decreasing P_0a to a non-zero value. Here, in the more drastic limit, the depth (a measure of bending) is very slight, approximately half of a cell size. What physically limits further invagination? Is it the number of cells or the range of parameters under study?

      The time length was the same for simulations in each figure, and we added this to the Parameters for the simulations section in the Methods as below.

      P18L466: “In each figure, snapshots of the simulations show deformation by the same time length unless specified.”

      For P_0a, the reviewer is correct that iterated ratcheting may decrease P_0a step by step instead of setting it to 0 immediately. Still, with P_0a > 0, the energy function and its derivative both increase with the apical width as long as P_a > P_0a, and thus the apical shrinkage would be synchronized, even though the deformation would be smaller. We also ran simulations in which P_0a was decreased to 0.6 times the initial P_a, and observed smaller deformation as expected. On the other hand, the non-zero P_0a made the invagination deeper when it was combined with the effect of the surrounding supracellular myosin cable, possibly due to resistance of the apical surface against compression. One of the novel and important findings of this study is the synergistic effect of the elasticity-based apical constriction and the surrounding supracellular myosin cable. To demonstrate that the deep invagination was not due to the apical surface resistance against the compression, we showed the simulations with P_0a = 0.
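
      The argument about the energy and its derivative can be illustrated with a short Python sketch. The functional forms are assumptions inferred from this response (a quadratic elastic term in the apical width and a linear contractile term), and the names `elastic_energy`, `elastic_force`, `contractile_force`, and all numerical values are hypothetical rather than taken from the manuscript.

```python
def elastic_energy(p_a, p_0a, e_s=1.0):
    """Assumed elastic term: E_s * (P_a - P_0a)^2."""
    return e_s * (p_a - p_0a) ** 2

def elastic_force(p_a, p_0a, e_s=1.0):
    """Derivative of the elastic term with respect to the apical width P_a."""
    return 2.0 * e_s * (p_a - p_0a)

def contractile_force(p_a, j=1.0):
    """Derivative of an assumed linear term J * P_a: independent of P_a."""
    return j

widths = [4.0, 6.0, 8.0]  # apical widths of three cells (arbitrary units)
for p_0a in (0.0, 2.4):   # zero rest value, and 0.6 times a width of 4.0
    forces = [elastic_force(w, p_0a) for w in widths]
    print(f"P_0a = {p_0a}: elastic forces = {forces}")
print("contractile forces =", [contractile_force(w) for w in widths])
```

      Under these assumptions the elastic force grows with the apical width, so wider cells are driven to shrink faster and the cells tend to converge, whereas the contractile force is the same for every width. A positive rest value lowers the force for every cell, consistent with the smaller deformation noted above.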

      For the conditions for further invagination, it may include the number of cells, a ratio between the cell height and width (Figure 5—figure supplement 1), interaction with ECM (Figure 5—figure supplement 2), etc. For the parameter, there might be an upper limit (Figure 4). We did not test the number of cells because of its computational cost. Among the conditions we tested, we found the planar compression by surrounding supracellular myosin the most influential rather than the mechanical property of apically constricting cells themselves.

      How each condition and parameter contributes to the invagination should be studied in the future. We added it to the conclusion as below.

      P15L395: “The depth, curvature, and speed of the invagination might be influenced by the cell shape, configuration, and parameters, and how each condition contributes to the invagination should be studied in the future.”

      (6) Figure 6b: What does the cell-surface color represent? If the idea was to represent junction tension, it would be clearer to color the junctions only.

      The junction tension may vary differently in different situations. For example, T1 transition is accompanied by enriched myosin along a shrinking cell-cell junction, and the junction bears higher tension, but other junctions of the same cell do not and thus the cell does not decrease its apical surface. In chick embryo neural tube closure, the junction tension is also polarized, and the cells shrink the apical surface along medial-lateral axis, driving the apical constriction (Nishimura et al., 2012, doi:10.1016/j.cell.2012.04.021). In the case of Drosophila embryo tracheal invagination, the cells shrank their apical surface isotropically (Figure 6a). If the junction tension was responsible for the shrinkage, all junctions of the cell must bear higher tension. Based on this assumption, the junction tension was averaged in each cell to check if the tracheal cells bore the higher average tension than surrounding cells.

      We also plotted stress tensor and calculated nematic order to check if there was radial or encircling tension alignment in the tracheal pit, but there was not.

      (7) Figure 6c: What does the junction color represent here?

      The junction color represents the relative junctional tension. We updated the figure legend.

      (8) Figure 6d-e: It is challenging to understand which error bar corresponds to each dataset.

      We updated the figure.

      (9) What is the definition of relative pressure?

      The geometrical tension inference method assumes that the tissue is in mechanical equilibrium and a sum of the junctional tensions and cell pressures pulling/pushing a vertex (tricellular junction) is 0. Therefore the calculated tensions and pressures are proportional to each other but not absolute values. We added it to the 3D Bayesian tension inference section of Methods as below.

      P24L567: “Since Equation 13 and Equation 14 only evaluate the balance among the forces, they cannot estimate absolute values but only relative values of the tension and pressure.”
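
      Why a force balance yields only relative values can be seen in a much simpler setting than the 3D Bayesian inference used in the study. The sketch below is an assumption-laden illustration, not the authors' method: it considers a single 2D vertex where three straight junctions meet, ignores cell pressure, and uses Lami's theorem to obtain tensions up to a common factor. The function names and the example angles are hypothetical.

```python
import math

def relative_tensions(edge_angles_deg):
    """Relative tensions of three junctions meeting at one vertex.

    edge_angles_deg: directions (in degrees) of the three edges leaving
    the vertex. By Lami's theorem, each tension is proportional to the sine
    of the angle between the other two edges. The result is normalized to
    mean 1 because the balance fixes ratios only.
    Assumes the edges do not all lie within one half-plane.
    """
    a = [math.radians(x) for x in edge_angles_deg]
    raw = [abs(math.sin(a[(i + 1) % 3] - a[(i + 2) % 3])) for i in range(3)]
    mean = sum(raw) / 3.0
    return [t / mean for t in raw]

def net_force(tensions, edge_angles_deg):
    """Vector sum of the junctional tensions pulling on the vertex."""
    fx = sum(t * math.cos(math.radians(x)) for t, x in zip(tensions, edge_angles_deg))
    fy = sum(t * math.sin(math.radians(x)) for t, x in zip(tensions, edge_angles_deg))
    return fx, fy

angles = [0.0, 120.0, 210.0]
rel = relative_tensions(angles)
scaled = [5.0 * t for t in rel]  # any common factor balances equally well

print("relative tensions:", rel)
print("net force (relative):", net_force(rel, angles))
print("net force (scaled):  ", net_force(scaled, angles))
```

      Both the normalized and the rescaled tensions balance at the vertex, so the geometry alone cannot distinguish between them.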

      (10) In the main text, it is mentioned that a large Es (apical elastic constant) leads to flat surfaces, avoiding bending, but the abstract says "strong apical surface tension," which, according to the rest of the text, would seem to be J_apical. Clarification is needed.

      The surface tension includes both the surface contractility and the surface elasticity.

      We added it to Extended cellular Potts model to simulate epithelial deformations section in the Results as below.

      P3L122: “Note that in some studies the tension and the contractility are considered as equivalent, but they are distinguished in this study.”

      and

      P4L151: “The energy H included only the terms of the contact energy (Equation 1) and the area constraint (Equation 5), but neither the surface elasticity (Equation 2) nor (Equation 3) was included, and thus the surface tension was determined by the contact energy.”

      Reviewer #2 (Recommendations For The Authors):

      (1) The model used is rather specific and it is rather confusing whether the issue is in the methodology or fundamental biophysics of apical constriction. For instance, one of the main narratives of the manuscript is that the Cellular Potts model better predicts apical constriction and tissue invagination than the vertex model. As I understand it, and as the authors state in p7 (line 210), "the difference between the vertex model and the cellular Potts model results was due to the straight lateral surface...". I assume that if apical constriction and tissue invagination were modelled with a vertex model with curved edges, while also allowing for cell rearrangements out of the tissue plane (some sort of epithelium-to-mesenchyme transition), the vertex model would yield exactly the same results as in the authors' cellular Potts model. If my understanding is correct, the authors should change the narrative of their manuscript and focus more on the comparison of a model with flat vs. curved edges, with "contractility" vs. "surface elasticity", with patterned apical contractility vs. non-patterned contractility (see my comment in point 2 below)... and not on comparison between CPM and VM.

      We appreciate the comments. The reviewer is correct that the vertex model can include the curved edges and the cell rearrangement, and it would reproduce the result of our cellular Potts model simulations. For the cellular Potts model, there was no need to specifically design how much the cell surface could be curved in a large arc, zigzag, or other shape, and that enabled us to find the conditions of delamination and bending.

      We added it to the discussion as below.

      P12L311: “Note that the vertex model could also be extended to incorporate the curved edges and rearrangement of the cells by specifically programming them, and would reproduce the cell delamination. That is, we could find the importance of the balanced pressure because the cellular Potts model intrinsically included a high degree of freedom for the cell shape, the cell rearrangement, and the fluctuation.”

      (2) About physics... and I think this is a really important point: one of the observations in the model was that in the "contractility" model, only "edge cells" shrank their apical surfaces, while inner cells remained quadrilateral. Related to this, the authors say that one of the requirements for proper apical constriction is a mechanism that "simultaneously shrinks the apical surface among cells in a cluster". What would happen if the authors assumed patterned contractility, meaning that cells in the center of the cluster would be most apically-contractile, while those further away from the center, would not be contractile? Features like this were investigated in studies of ventral-furrow invagination [see, for instance, Spahn and Reuter PLOS ONE (2013) and Rauzi et al. Nat Commun (2015)-Fig. S13d].

      We thank the reviewer for the critical comment, and ran simulations with the patterned apical contractility. The apical contractility following a gradient of parabola shape succeeded in the simultaneous apical shrinkage. However, it was weak against fluctuations and the cells were delaminated by chance.

      We added it to Apical constriction by modified apical elasticity section in the result as below.

      P9L252: “We also tested another model for the simultaneous apical shrinkage, a gradient contractility model (Spahn and Reuter, 2013; Rauzi et al., 2015). If the inner cells bear higher apical surface contractility than the edge cells, the inner cells may shrink their apical surface. To synchronize the apical shrinkage, the apical contractility must follow a parabola shape gradient. Even though the gradient contractility enabled the cells to shrink the apical surface simultaneously, often some of the cells shrank faster than neighbors and were delaminated by chance (Figure 4—figure supplement 1).”
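
      One way to see why a parabolic profile is a natural candidate for synchronized shrinkage is sketched below. This is an illustrative reading, not the authors' derivation. It assumes a 1D row of equally wide cells with cell 0 centered at the origin, a junction between cells i and i+1 located at position i + 0.5 (in units of cell width), and a net inward pull on that junction equal to the difference between the contractilities of the two cells. The function names and parameter values are hypothetical.

```python
def parabolic_contractility(i, j_center=10.0, a=0.5):
    """Assumed apical contractility of cell i, highest at the center (i = 0)."""
    return j_center - a * i ** 2

def net_inward_force(i):
    """Net pull toward the center on the junction between cells i and i + 1.

    The inner cell i pulls the junction inward and the outer cell i + 1
    pulls it outward, so the net inward force is the difference.
    """
    return parabolic_contractility(i) - parabolic_contractility(i + 1)

# Force divided by junction position: constant for a parabolic profile,
# i.e. every junction is pulled in proportion to its distance from the center.
ratios = [net_inward_force(i) / (i + 0.5) for i in range(4)]
print(ratios)
```

      With a force proportional to the distance from the center, every cell in this idealized row would shrink at the same relative rate. The sketch contains no fluctuations, which the response identifies as the reason the gradient model still led to occasional delamination.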

      (3) The quality of the figures should be improved. Especially, Figure 3 and the related explanation in lines 183-192. This explanation is way too complicated and it is not clear what Figure 3c shows. For instance: if the arrows are indeed showing contractile forces (as written in the caption) then they are not illustrated correctly, but should be tangential to the cell membrane.

      We updated the figure.

      (4) The figures mostly show steady-state cross-sections from simulations. I miss a more dedicated study with model parameters being varied through wider ranges and some phase diagrams being shown etc. Also, some results could probably be supported by analytic calculations. For instance, the condition for stability (discussed in p4 lines 145-151), cells' preferred aspect ratio, cells' preferred "wedgeness" i.e., local curvature etc... I am sure some of these, if not all, could be calculated analytically and then these analytic results could help to interpret the phase diagrams.

      For the simulation results shown in the figures, we were not sure whether the simulation results were in a steady state. We added it to the Tissue deformation by increased apical contractility simulated with cellular Potts model section in the Results as below.

      P4L170: “A simulation started from a flat monolayer of cells beneath the apical ECM, and was continued until resulting deformation of cells and tissue could be evaluated for success or failure of reproducing the apical constriction.”

      For the ranges of parameters, we ran the simulation in wider range and showed results from sub-range. We added it to Parameters for the simulations section in Methods as below.

      P18L464: “The parameters were varied in a range, and the figures showed simulations with parameter values within a sub-range so that the results showed both success and failure in a development of interest.”

      For the analytical calculations, the Figure 3f shows a kind of phase diagram for shapes of a single cell. To clarify this, we rephrased “map of cell shapes” to “Phase diagram of cell shapes” in the figure legend, and added an explanation to the Results section as below.

      P6L207: “For the analysis of the cell shape in motion, we plotted a phase diagram for shapes of a single cell (Figure 3f).”

      For the analytical evaluation of the cellular Potts model simulations, a previous study performed a similar analysis, but it concerned a cell of isotropic shape in a steady state (Magno et al., 2015, doi:10.1186/s13628-015-0022-x). Also, our simulation framework is computationally expensive and we could not vary the parameters in fine resolution. Therefore we could not include it in this study.

      (5) I am not sure about the terminology "contractility" vs. "elasticity". In Farhadifar et al. (2007) "contractility" is described by a squared apical-perimeter energy term, while in this work, the authors describe it by a surface-energy-like term.

      In general, elasticity is the ability of a material to resist deformation and to return to its original shape/size. In Farhadifar et al. (2007), the cell apical area was assigned the area elasticity in this sense. Contractility is the ability to decrease the size/length, and thus it could be expressed as either a linear or a quadratic term depending on the model. In this study, we assumed cell-cell/cell-ECM adhesion and myosin activity to generate the surface contractility, and thus employed the linear expression. In Farhadifar et al. (2007) it was described as a line tension.

      We used the terms surface ‘elasticity’ and ‘contractility’ as distinctive elements composing the surface ‘tension’. We added it to the Extended cellular Potts model to simulate epithelial deformations section in the Results as below.

      P3L122: “Note that in some studies the tension and the contractility are considered as equivalent, but they are distinguished in this study.”

      (6) It is not entirely clear what are apical, basal, lateral, and cell "perimeters". This is a 2D model, so I assume all P-s are in fact interface lengths. In either case, this needs to be explained more clearly.

      We updated the explanation in Extended cellular Potts model to simulate epithelial deformations section in the Results as below.

      P3L111: “The cell's perimeter was partitioned automatically based on adjacency with other cells, and it was marked as apical, lateral, or basal. Also, apico-lateral sites were marked as a location for the adherens junction. This cell representation also cast the vertical section of the cell. Therefore an area of the cell corresponded with a body of the cell, and a perimeter of the cell corresponded with the cell surface. Likewise the apical, lateral, and basal parts of the perimeter corresponded with the apical surface, cell-cell interface, and the basal surface of the cell respectively.”

      (7) The term H_{mc} is not clear at all. Why is this term called potential energy? What is U(i)? What is the exact biophysical interpretation of this term in 2D vs 3D?

      In 3D, the supracellular myosin cable is formed encircling the cells deformed by the apical constriction. Shrinking of the supracellular myosin cable makes the circle small, and it moves the cable toward the center of the circle. To simulate this motion of the supracellular myosin cable in the 2D cross section, we assigned the force exerted on the adherens junction of the boundary cells pulling toward the center, and because the force is relative to the position of the adherens junction and the center, it was expressed by the potential energy in the simulation.

      We updated Extended cellular Potts model to simulate epithelial deformation section in Results and Cellular Potts model with potential energy section in Methods as below.

      P4L140: “The potential energy was defined by a scalar field which made a horizontal gradient decreasing toward the center,”

      and

      P17L449: “In 3D, tension on a circular actomyosin cable would shrink the circle, and the shrinkage would pull the cable toward the center of the circle. In 2D cross section, the cable is pulled horizontally toward the middle line.”
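
      As a rough illustration of how such a position-dependent term can enter a cellular Potts update, the sketch below assumes a linear field U(x) = k·|x − x_center|. The functional form, the coefficient `k`, and all function names are assumptions made for illustration and are not the manuscript's definitions. The acceptance step is the standard Metropolis rule used in cellular Potts models.

```python
import math
import random

def potential(x, x_center, k=1.0):
    """Hypothetical scalar field: grows linearly with horizontal distance
    from the midline, so it decreases toward the center."""
    return k * abs(x - x_center)

def delta_h_mc(x_old, x_new, x_center, k=1.0):
    """Energy change when a marked adherens-junction site moves
    horizontally from x_old to x_new."""
    return potential(x_new, x_center, k) - potential(x_old, x_center, k)

def accept(delta_h, temperature, rng=random):
    """Metropolis rule: always accept energy-lowering moves, otherwise
    accept with probability exp(-delta_h / temperature)."""
    if delta_h <= 0:
        return True
    return rng.random() < math.exp(-delta_h / temperature)
```

      Under this assumed form, a move toward the midline from either side lowers the energy and is always accepted, which mimics the inward pull of the cable in the cross section.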

      (8) Highten->increased

      We updated the text.

      (9) "It seems natural to consider that the myosin generates a force proportional to its density but not to the surface width nor the strain". This sentence should be supported by a reference. Also, if the force is proportional to myosin density, then it must depend on surface width, since density, I assume, is the number of motors per area.

      Regarding the myosin density and the generated force: in all preceding studies cited in this manuscript, and in others to the best of our knowledge, the density of myosin and actin filaments visualized by staining or labeling has been assumed to be related to the generated contractility, without supporting references. It therefore appears to be a well-established and widely shared assumption.

      Regarding the independence from surface width and strain, the reviewer's comment is correct, but the results would be the same. If we presume that the number of motors on the apical surface of a cell is constant during apical constriction, then the density would increase as the apical surface contracts, which would make the apical contractility more unbalanced and promote delamination. We added this point to the Results and Discussion as below.

      P4L166: “For the sake of simplicity, we ignored an effect of the constriction on the apical myosin density, and discussed it later.”

      P14L328: “In our model, for the sake of simplicity, we ignored an effect of the constriction on the apical myosin density. If we presumed that the apical myosin would be condensed by the shrinkage of the apical surface, it would increase the apical tension in the shrinking cell and is expected to promote the cell delamination further. Therefore it would not change the results.”
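
      The scaling argument can be made concrete with a small sketch. It assumes a fixed number of motors spread uniformly over the apical width of the 2D cell; the function name and the numbers are hypothetical.

```python
def apical_myosin_density(n_motors, apical_width):
    """Motors per unit apical width, assuming a fixed motor count
    distributed uniformly along the apical surface."""
    return n_motors / apical_width

# Halving the apical width doubles the density for a fixed motor count.
before = apical_myosin_density(100, 10.0)  # 10.0
after = apical_myosin_density(100, 5.0)    # 20.0
```

      If the tension scales with density, the shrinking cell becomes more contractile, consistent with the statement that this effect would reinforce rather than oppose delamination.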

      Reviewing Editor (Recommendations For The Authors):

      Please note also the following excerpts from discussions amongst the reviewers and the Reviewing Editor:

      Regarding Reviewer #2's Point 2:

      I believe the authors have assumed patterned contractility in their simulations, and this is shown by the "pale blue" cell color (see also lines 162-163). However, as Reviewer #2 points out in their point 2), the pale colors are very hard to see and therefore easy to miss.

      We updated the figure coloring and also added the gradient pattern of contractility.

      Regarding Reviewer #2's point 5:

      It is indeed unconventional to call the "J" terms contractility, they are usually called contact energy or adhesive energy.

      In this study, we included both the contact energy of cell-cell/cell-ECM adhesion and the actomyosin activity in the surface contractility, and used the “J” term as is conventional in the cellular Potts model.

      On the other hand, due to the parameters chosen for J_apical and J_basal in the pale blue cells, the apical membrane area will tend to shrink and the basal membrane will tend to enlarge. Because the lateral membrane energy J_lateral is constant among all cells (I think?), this will effectively drive cells to apically contract in the center.

      That expectation was an initial motivation of our study, but we found that the differential J alone could not drive the cells to apically contract in the center.

      I agree that extra clarification by the authors would be very helpful here.

      Reviewer #2:

      Regarding the patterned contractility: indeed, I missed this point (the pale blue region is really poorly visible).

      Nevertheless, it seems that contractility in the authors' model changes in a step-like fashion.

      [...] There may be important differences between furrowing under step-like patterning profile versus smooth "bell-like" patterning (see Supplementary Figure 13 in Rauzi et al. Nat Commun 2015). In particular, in the case of a step-like patterning, [there are] constrictions of side cells (similar to what the authors in this manuscript report), whereas in the bell-like patterning, [...] such side constrictions [do not occur].

      As stated in our reply to Reviewer #2's comment (2), we added simulations with gradient-pattern contractility.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We extend our sincere gratitude to the reviewers for their constructive feedback and valuable suggestions, which have significantly contributed to enhancing the quality of our work. In response to the comments, we have meticulously revised our manuscript with the following updates:

      (1) New Data Inclusion: We have incorporated new immunofluorescent staining images, FACS analysis of monocytes, and single-cell RNA sequencing (scRNAseq) expression analysis focusing on genes related to IFNGR, as well as T cell memory subsets (Trm, Tcm, and Tem).

      (2) Comparative Analysis: We have conducted a comparative analysis between the active vitiligo dFBs and the ACD pAd (r5) identified in our study, which provides further insight into the immune response mechanisms.

      (3) Discussion Expansion: We have expanded the discussion to include the role of tissue-resident memory (Trm) T cells in our model and have addressed the limitations of our animal model and in vitro studies.

      (4) Supplemental Material: As requested by the reviewers, we have provided four new supplemental tables (Table S2 ~ S5) and specific information for antibodies used in our study.

      Please see our Point-to-Point Responses to Reviewers' comments below:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Liu et al. used scRNA-seq to characterize cell type-specific responses during allergic contact dermatitis (ACD) in a mouse model, specifically the hapten-induced DNFB model. Using the scRNA-seq data, they deconvolved the cell types responsible for the expression of major inflammatory cytokines such as IFNG (from CD4 and CD8 T cells), IL4/13 (from basophils), IL17A (from gd T cells), and IL1B from neutrophils and macrophages. They found the highest upregulation of a type 1 inflammatory response, centering around IFNG produced by CD4 and CD8 T cells. They further identified a subpopulation of dermal fibroblasts that upregulate CXCL9/10 during ACD and provided functional genetic evidence in their mouse model that disrupting IFNG signaling to fibroblasts decreases CD8 T cell infiltration and overall inflammation. They identify an increase in IFNG-expressing CD8 T cells in human patient samples of ACD vs. healthy control skin and co-localization of CD8 T cells with PDGFRA+ fibroblasts, which suggests this mechanism is relevant to human ACD. This mechanism is reminiscent of recent work (Xu et al., Nature 2022) showing that IFNG signaling in dermal fibroblasts upregulates CXCL9/10 to recruit CD8 T cells in a mouse model of vitiligo. Overall, this is a very well-presented, clear, and comprehensive manuscript. The conclusions of the study are mostly well supported by data, but some aspects of the work could be improved by additional clarification of the identity of the cell types shown to be involved, including the exact subpopulation discovered by scRNA-seq and the subtype of CD8 T cell involved. The study was limited by its use of one ACD model (DNFB), which prevents an assessment of how broadly relevant this axis is. The human sample validation is slightly circumstantial and limited by the multiplexing capacity of immunofluorescence markers.

      Strengths:

      Through deep characterization of the in vivo ACD model, the authors were able to determine which cell types were expressing the major cytokines involved in ACD inflammation, such as IFNG, IL4/13, IL17A, and IL1B. These analyses are well-presented and thoughtful, showing first that the response is IFNG-dominant, then focusing on deeper characterization of lymphocytes, myeloid cells, and fibroblasts, which are also validated and complemented by FACS experiments using canonical markers of these cell types as well as IF staining. Crosstalk analyses from the scRNA-seq data led the authors to focus on IFNG signaling fibroblasts, and in vitro experiments demonstrate that CXCL9 and CXCL10 are expressed by fibroblasts stimulated by IFNG. In vivo functional genetic evidence demonstrates an important role for IFNG signaling in fibroblasts, as KO of Ifngr1 using Pdgfra-Cre Ifngr1 fl/fl mice, showed a reduction in inflammation and CD8 T cell recruitment.

      Weaknesses:

      (1) The use of one model limits an understanding of how broad this fibroblast-T cell axis is during ACD. However, the authors chose the most commonly employed model and cited additional work in a vitiligo model (another type 1 immune response).

      We thank the reviewer for pointing out this limitation. Although the DNFB-elicited ACD model is the most commonly used animal model for ACD, our study is limited by the use of only one type 1 immune response model. We have now added new data (Figure 5-figure supplement 1A) showing that the active ACD pAd (r5) and the active IFNγ-responsive vitiligo dFBs (Xu et al., 2022) are enriched with a highly similar panel of IFNγ-inducible genes. Future studies are still needed to determine whether this fibroblast-T cell axis applies broadly to other ACD models or to other type-1 immune response-related inflammatory skin diseases.

      (2) The identity of the involved fibroblasts and T cells in the mouse model is difficult to assess as scRNA-seq identified subpopulations of these cell types, but most work in the Pdgfra-Cre Ifngr1 fl/fl mice used broad markers for these cell types as opposed to matched subpopulation markers from their scRNA-seq data.

      We thank the reviewer for the constructive comments. To better showcase the dWAT layer where PDGFRA+ pAds are enriched, we have included new histological staining and PLIN1 (adipocyte marker) staining in new Figure 4 - figure supplement 1F-G. As shown in Figure 4 - figure supplement 1G, the PLIN1+ dWAT layer is located in the lower dermis right above the cartilage layer. In Figure 4-figure supplement 1I and J, we have shown that phospho-STAT1 (pSTAT1), a key signaling molecule activated by IFNγ, was detected primarily in PDGFRA+Ly6A+ pAds in the lower dermis where dWAT is located. In addition, we have now included new data showing that the pAd (dFB_r5) cluster preferentially expressed the highest levels of both Ifngr1 and Ifngr2 among all dFB subclusters (new Figure 5 - figure supplement 1B). Furthermore, we have included new co-staining data showing that CXCL9 largely co-localized with ICAM1 (new Figure 4 - figure supplement 1K), a marker for committed pAds (Merrick et al., 2019), in the reticular dermis and dWAT region of the ACD skin, further confirming that CXCL9 is specifically induced in the pAd subset of dFBs. Additionally, we included new staining data showing that ACD-mediated induction of CXCL9 in ICAM1+ dFBs was largely suppressed upon targeted deletion of Ifngr1 in Pdgfra+ dFBs (new Figure 6 - figure supplement 1D-E).

      (3) Human patient samples of ACD were co-stained with two markers at a time, demonstrating the presence of CD8+IFNG+ T cells, PDGFRA+CXCL10+ fibroblasts, and co-localization of PDGFRA+ fibroblasts and CD8+ T cells. However, no IF staining demonstrates co-expression of all 4 markers at once; thus, the human validation of co-localization of CD8+IFNG+ T cells and PDGFRA+CXCL10+ fibroblasts is ultimately indirect, although not a huge leap of faith. Although n=3 samples of healthy control and ACD samples are used, there is no quantification of any results to demonstrate the robustness of differences.

      We thank the reviewer for the constructive comments. We have shown that PDGFRA colocalizes with CXCL10 in the dermal microvascular structures, where CD8+ T cells infiltrate around PDGFRA+ dFBs. Unfortunately, owing to technical issues (antibody compatibility), we cannot provide the four-color co-staining suggested by the reviewer. To demonstrate the robustness and reproducibility of the staining presented, we have now added 4 independent images for both Fig. 7A and Fig. 7E in the updated Figure 7-figure supplement 1A-B.

      Reviewer #2 (Public Review):

      Summary:

      The investigators apply scRNA seq and bioinformatics to identify biomarkers associated with DNFB-induced contact dermatitis in mice. The bioinformatics component of the study appears reasonable and may provide new insights regarding TH1-driven immune reactions in ACD in mice. However, the IF data and images of tissue sections are not clear and should be improved to validate the model.

      Strengths:

      The bioinformatics analysis.

      Weaknesses:

      The IF data presented in 4H, 6H, 7E and 7F are not convincing and need to be correlated with routine staining on histology and different IF markers for PDGFR. Some of the IF staining data demonstrates a pattern inconsistent with its target.

      We apologize for the confusion. Fig. 4H and 6H show staining of mouse skin sections, whereas Fig. 7E and 7F show staining of human skin sections; the patterns of PDGFRA+ dFBs therefore appear inconsistent between species. As shown in Fig. 4H, in mouse skin, PDGFRA+CXCL9/10+ dFBs are located between the lower reticular dermis and dWAT region, where preadipocytes are located (Sun et al., 2023). To better showcase the dWAT layer where PDGFRA+ pAds are enriched, we have included new histological staining and PLIN1 (adipocyte marker) staining in new Figure 4 - figure supplement 1F-G. As shown in Figure 4 - figure supplement 1G, the PLIN1+ dWAT layer is located in the lower dermis right above the cartilage layer. Furthermore, we have included new co-staining data showing that CXCL9 largely co-localized with ICAM1 (new Figure 4 - figure supplement 1K), a marker for committed pAds (Merrick et al., 2019), in the reticular dermis and dWAT region of the ACD skin, further confirming that CXCL9 is specifically induced in the pAd subset of dFBs.

      As shown in Fig. 7E, in human skin, PDGFRA+CXCL10+ dFBs are located within the microvascular structures at the dermal-epidermal junction (DEJ) region, where mesenchymal stem cells are enriched (Russell-Goldman & Murphy, 2020). We have included the corresponding H&E histological staining image for Fig. 4H in new Figure 4-figure supplement 1F. The histological staining corresponding to Fig. 6H is the H&E image in Fig. 6F. The histological staining corresponding to Fig. 7E and 7F is the Masson’s trichrome staining (a three-colour histological stain) in Fig. 7C.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      (1) While the focus on fibroblast and T cell interactions and overall biological findings regarding these interactions (IFNG - CXCL9/10 - CXCR3) is sound, it is slightly confusing about which exact subpopulations of these cells are involved in ACD pathogenesis as both scRNA-seq and IF are used but very broad markers are used for IF. Regarding fibroblasts, the scRNA-seq identifies the pAd (r5) cluster of fibroblasts as the main producer of CXCL9/10. However, the expression of IFNGR1 was not shown for this subpopulation as well as for other fibroblast subpopulations. Figure 6C shows IFNGR1 staining in the Ifngr1 fl/fl control mice which appears quite broad. With the seemingly broad expression of IFNGR1, why is it that only a subpopulation of fibroblasts upregulate CXCL9/10? Is there a specific location of these pAd fibroblasts that help drive this IFNG response? Please show the expression of Ifngr1 in the fibroblast scRNA-seq data.

      We thank the reviewer for the constructive comments. We have now included new data showing that the pAd (dFB_r5) cluster preferentially expressed higher levels of both Ifngr1 and Ifngr2 among all dFB subclusters (new Figure 5 - figure supplement 1B). In addition, we included new co-staining data showing that CXCL9 largely co-localized with ICAM1, a marker for committed pAds (Merrick et al., 2019), in the reticular dermis and dWAT region of the ACD skin, further confirming that CXCL9 is specifically induced in the pAd subset of dFBs.

      (2) Regarding T cells, it is slightly confusing regarding what role the fibroblast-produced CXCL9/10 plays on T cell migration vs. activation. This is mainly because in vitro work focuses on T cell activation, while in vivo work seems to mainly assess T cell migration into the tissue. The in vivo studies have nicely shown that CD8 T cells are the main cell type affected by Ifngr1 iKO (i.e., a reduction of these cells), but T cell activity in vivo is not assessed (in the form of IFNG production). I have the following related questions:

      a. Authors do not discuss whether T cells involved in ACD in their model are tissue-resident memory T cells (Trm) or whether these are recruited from circulation. This may be possible to assess via additional analysis of the scRNA-seq data (looking for expression of Trm markers). 

      Thanks for the reviewer’s constructive comments. We have now included new data showing the expression of marker genes of various memory T cells in various T cell subclusters (new Figure 2 - figure supplement 1C-D). Antigen-specific CD8 or CD4 memory T cells can be classified into CD62hi/CCR7hi/CD28hi/CD27hi/CX3CR1lo central memory T cells (Tcm), CX3CR1hi/Cd28hi/Cd27lo/CD62lo/CCR7lo effector memory T cells (Tem), and CD49ahi/CD103hi/ CD69hi/BLIMP1hi tissue-resident memory T cells (Trm) (Benichou, Gonzalez, Marino, Ayasoufi, & Valujskikh, 2017; Cheon, Son, & Sun, 2023; Mackay et al., 2013; Martin & Badovinac, 2018; Park et al., 2023). We observed that in ACD skin, CD4+ and CD8+ T cells predominantly expressed marker genes associated with Tcm including Cd28, Cd27, Ccr7, and S1pr1/Cd62l. In contrast, marker genes associated with Tem (Cx3cr1) and Trm (Itga1/Cd49a, Itgae/Cd103, Cd69 and Prdm1/Blimp1, Cd127/Il7r) were only scarcely expressed in these αβ T cells, suggesting that ACD predominantly triggers a central memory T cell response in the skin.

      Furthermore, this hypothesis is supported by new lymph node gene expression results. We showed that the expression of Ifng, but not Il4 or Il17a, was rapidly induced in skin draining lymph nodes at 24 hours after ACD elicitation (new Figure 1-figure supplement 1H). This suggests a robust and systemic activation of type 1 memory T cell response in the early stage of ACD, and the migration of these lymphatic memory T cells to the skin may contribute to the exacerbation of skin inflammation.

      b. Authors have focused on CXCR3 axis involvement in IFNG production (Figures 5G-H) without assessing the presumed migratory role of this axis. Presumably, CD8 T cells are recruited to the skin via the CXCL9/10-CXCR3 axis, but this would be important to clarify given other work that has demonstrated Trm involvement in ACD. Authors should at least discuss how their model and findings support, refine, or even contradict the current paradigm of Trm involvement in ACD (Lefevre et al., 2021; PMID: 34155157).

      We are grateful for the constructive feedback provided by the reviewer. CXCR3 is a chemokine receptor on T cells and not only plays a pivotal role in the trafficking of type 1 T cells, but also is required for optimal generation of IFNG-secreting type 1 T cells in vivo (Groom et al., 2012). Our in vitro study is limited by only focusing on CXCL9/10-CXCR3 axis involvement in IFNγ production without studying its role in driving T cell migration. We have now addressed this limitation in the discussion section.

      In the murine model of ACD, the initial sensitization phase involves exposing mouse skin to a high dose of DNFB to prime effector T cells in lymphoid organs. This is followed by a later challenge/elicitation phase, during which the mice are re-exposed to a lower dose of DNFB in a different area of the skin, distal from the original sensitization site (Manresa, 2021; Vocanson, Hennino, Rozieres, Poyet, & Nicolas, 2009). Our updated analysis of the expression of marker genes associated with central memory T cells (Tcm), effector memory T cells (Tem), and tissue-resident memory T cells (Trm), as presented in the revised Figure 2-figure supplement 1C-D, indicates that the type-1 inflammation observed upon ACD elicitation is predominantly driven by memory T cells recruited from lymphoid organs, rather than by skin-resident memory T cells. We have read the reference provided by the reviewer along with several related studies indicating that Trm cells are involved in ACD. We found that these studies performed the elicitation phase on the same skin area where the initial sensitization was conducted, and that only under this condition does elicitation result in a rapid allergen-induced skin inflammatory response that is primarily mediated by IL17A-producing and IFNγ-producing CD8+ skin-resident memory T cells (Gadsboll et al., 2020; Murata & Hayashi, 2020; Schmidt et al., 2017; Wongchang et al., 2023). These studies suggest that Trm cells establish a long-lasting local memory during the initial sensitization, and upon re-exposure to the hapten in the same skin area, these site-specific Trm cells can rapidly contribute to a robust type-1 skin inflammatory response. Therefore, a robust involvement of Trm cells in ACD requires repeated exposure to the same hapten on the same skin area. We have now added a related discussion to the Discussion section.

      c. While it may be difficult to assess given reduced numbers of CD8 T cells in the Ifngr1 iKO, is the CXCL9/10-CXCR3 axis affecting IFNG production by T cells in vivo?

      Yes, we have shown in Fig. 6G that ACD-mediated induction of Ifng was significantly suppressed in the Ifngr1-iKO mice compared to the control mice.

      (3) The authors cite prior work (Xu et al. Nature 2022) that demonstrated a similar mechanism for fibroblasts in recruiting vitiligo-inducing T cells. Are the pAd (r5) cluster of fibroblasts similar to the fibroblast subpopulation that drives vitiligo?

      The study on the mouse model of vitiligo (Xu et al. Nature 2022) did not perform single-cell RNAseq of the vitiligo mouse skin. Instead, the authors conducted RNAseq analysis on sorted PDGFRA+ dFBs. Therefore, we cannot directly compare our pAd (r5) cluster with the fibroblast subpopulation that drives vitiligo. Nevertheless, by using a Venn diagram to compare the top 100 IFNγ signaling-dependent genes upregulated in the active vitiligo mouse dFBs with the top 100 genes enriched in our ACD pAd (dFB_r5) cells, we identified 29 commonly upregulated genes between the two conditions (Figure 5-figure supplement 1A). Furthermore, all 29 of these genes were among the top IFNγ-inducible genes in primary dFBs. These shared genes include CXCL9, CXCL10, and several other downstream targets of IFNγ signaling, such as B2M, BST2, CD274, as well as the GBP family members GBP3, GBP4, GBP5, GBP7, and additional genes like H2-K1, H2-Q4, H2-Q7, H2-T23, IFIT3, ISG15, and STAT1. This result suggests that the pAd (dFB_r5) cells share a common IFNγ-pathway gene signature with the active vitiligo mouse dFBs, indicating a potential overlap in molecular pathways.
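
      The comparison behind a two-set Venn diagram amounts to intersecting the two top-N gene lists. The sketch below is a minimal illustration of that operation only: the function name is hypothetical, and the toy lists (including the placeholder names GeneA to GeneD) are invented for the example and do not reproduce the actual rankings.

```python
def shared_genes(ranked_a, ranked_b, top_n=100):
    """Return, in sorted order, the genes found in the top_n entries
    of both ranked lists."""
    return sorted(set(ranked_a[:top_n]) & set(ranked_b[:top_n]))

# Toy ranked lists for illustration only, not the real data.
vitiligo_top = ["Cxcl9", "Cxcl10", "Stat1", "GeneA", "GeneB"]
acd_pad_top = ["Cxcl10", "GeneC", "Cxcl9", "GeneD", "Stat1"]

print(shared_genes(vitiligo_top, acd_pad_top))
# ['Cxcl10', 'Cxcl9', 'Stat1']
```

      With the real top-100 lists as input, the length of the returned list would correspond to the number in the overlapping region of the Venn diagram.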

      (4) The authors should include bulk RNA-seq data from fibroblast stimulation (Figure 5b) at a minimum in the GEO submission. They should ideally include the differentially expressed genes in a supplementary table.

      Thanks for the reviewer’s constructive comments. We have now included the raw FPKM file for the bulk RNAseq data shown in Fig. 5 in Supplemental Table S3, and the list for differentially expressed genes in Supplemental Table S4.

      (5) The authors state that human sample stainings were n = 3 per group for healthy control and ACD (Figure 7), but no quantification or statistical testing is provided to demonstrate significant differences in findings such as co-localization of fibroblasts and T cells, IFNG+CD8+ T cells, etc.

      Thanks for the reviewer’s constructive comments. We have now supplemented 4 independent images for both Fig. 7A and Fig. 7E in the new Figure 7-figure supplement 1A-B to demonstrate the robustness and reproducibility of the staining presented.

      Minor comments:

      (1) Figure 1G, possible typos, Il14 and Il11b are on the violin plots when I believe authors meant Il4 and Il1b.

      Thanks a lot for pointing out these typos. We have now made the correction in the updated Figure 1.

      (2) The authors label cluster 27 as neutrophils based on the expression of Ly6g and S100a8. These markers are also expressed by Cd14+ inflammatory monocytes. I believe the authors need to additionally validate that these cells are neutrophils (via staining or additional analyses). Neutrophils are notoriously difficult to capture in scRNA-seq given low RNA content. Later, they are quantified by FACS using CD11b+Ly6G+ markers, but I do not believe this would distinguish them from CD14+ monocytes. As this is a relatively minor aspect of the manuscript, I consider this a minor concern, but a finding that should be as accurate as possible as Il1b is likely important, and identifying its accurate source likewise.

      We thank the reviewer for the constructive comments. Following the reviewer’s suggestion, we have now added Cd14 expression to Figure 1C, and found that cluster 27 indeed expressed not only Ly6G but also Cd14. According to the literature, the expression of Ly6G in circulating blood, spleen, and peripheral tissues is limited to neutrophils, whereas monocytes, macrophages, and lymphocytes are negative for Ly6G (Ikeda et al., 2023; Lee, Wang, Parisini, Dascher, & Nigrovic, 2013). Therefore, Ly6G can be used as a marker to distinguish neutrophils from monocytes. Although CD14 is highly expressed in monocytes, neutrophils can also express CD14 at a lower level (Antal-Szalmas, Strijp, Weersink, Verhoef, & Van Kessel, 1997). Therefore, cluster 27 is likely a mixed population of neutrophils and monocytes, and we have relabeled this cluster as NEU/Mon in the updated manuscript.

      To confirm the presence of neutrophils and monocytes in ACD, we have included new FACS analysis of inflammatory monocytes, which are gated as CD11B+Ly6G-F4/80-CD11C-Ly6Chi, according to a published FACS protocol (Rose, Misharin, & Perlman, 2012). We found that elicitation of ACD led to a transient influx of monocytes at 24 hrs post treatment, whereas the percentage of neutrophils continued to increase up to 60 hours post-treatment (Figure 3L, and Figure 3-figure supplement 1G). In addition, at 60 hrs, the percentage of neutrophils (~5%) was > 10 times greater than the percentage of monocytes (~0.4%), indicating that neutrophils are the dominant granulocytes at 60 hours post ACD elicitation.
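
      The gating logic and the fold comparison can be written out explicitly. This is a sketch only: representing a cell as a dictionary of marker calls is an assumption made for illustration, the function names are hypothetical, and the neutrophil gate (CD11B+Ly6G+) follows the markers named in the reviewer's comment.

```python
def is_inflammatory_monocyte(cell):
    """Gate: CD11B+ Ly6G- F4/80- CD11C- Ly6C-high."""
    return (cell["CD11B"] and not cell["Ly6G"] and not cell["F4/80"]
            and not cell["CD11C"] and cell["Ly6C_hi"])

def is_neutrophil(cell):
    """Gate: CD11B+ Ly6G+."""
    return cell["CD11B"] and cell["Ly6G"]

def fold_difference(pct_a, pct_b):
    """Ratio of two population percentages."""
    return pct_a / pct_b

# Approximate 60 h values quoted above: ~5% neutrophils, ~0.4% monocytes.
ratio = fold_difference(5.0, 0.4)  # roughly 12.5, i.e. more than 10-fold
```

      Because the two gates differ in the Ly6G call, a cell cannot satisfy both at once under this scheme.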

      (3) The authors should include a cluster marker table as a supplementary file to accompany Figure 1C. Only top cluster markers are shown in 1C.

      We thank the reviewer for the constructive comments. We have now included the top 5 enriched genes in each cell cluster shown in Fig. 1C in Supplementary Table S2.

      (4) Figures 2A/B have mismatched labels. There is a gdT/ILC2 label in the 2B, but not in 2A. Please match these. Along these lines, which gdT cluster is the IL17A expressing cluster as shown in 1D? Matching these labels will clarify which population is doing what.

      We thank the reviewer for pointing out this mistake. To avoid confusion about the T cell clusters, we have added specific recluster numbers, r0~r7, for the T cell clusters (Figure 2A-B). The r4 cluster is a mixed population of δγT and ILC2, and is therefore termed δγT/ILC2. As shown in Figure 2-figure supplement 1E, IL17A is primarily expressed in the δγT cell cluster (r5). We have now corrected δγT2 to δγT/ILC2 throughout the manuscript, and have added cluster numbers in the updated Figure 2D.

      (5) In Figure 3E, the authors used CD11B as a distinguishing marker for basophils (CD11B+) vs. mast cells (CD11B-). Mcpt8 is a better distinguishing marker, so I am wondering why the authors chose CD11B.

      We thank the reviewer for the comments. In scRNAseq, we did use Mcpt8 as a basophil-specific marker to distinguish basophils from mast cells (see Figure 1C). However, Mcpt8 is not a surface receptor that can be used in FACS analysis. Therefore, to distinguish basophils from mast cells by FACS, we had to choose surface markers expressed on these cells. FcεR1a is a highly specific marker expressed exclusively on basophils and mast cells, and CD11B is expressed on basophils but not on mature mast cells (Hamey et al., 2021). As a result, FACS analysis of the surface expression of CD11B and FcεR1a can distinguish basophils (CD11B+ FcεR1a+) from mast cells (CD11B- FcεR1a+). The use of CD11B and FcεR1a to distinguish basophils from mast cells can also be seen in a published reference study (Arinobu et al., 2005).

      (6) Antibody information is missing for IF studies. No clones, catalog numbers, vendors, RRIDs, or dilutions are included in the Methods section for any of the IF data.

      We thank the reviewer for the constructive comments. We have now added the relevant information for all the antibodies used for FACS or IF in the Methods section.

      (7) Figure 3 supplement E and F appear to be reversed based on legend descriptions.

      Thanks a lot for pointing this out. We have now made the correction in the updated Supplementary file.

      References:

      Antal-Szalmas, P., Strijp, J. A., Weersink, A. J., Verhoef, J., & Van Kessel, K. P. (1997). Quantitation of surface CD14 on human monocytes and neutrophils. J Leukoc Biol, 61(6), 721-728. doi:10.1002/jlb.61.6.721

      Arinobu, Y., Iwasaki, H., Gurish, M. F., Mizuno, S., Shigematsu, H., Ozawa, H., . . . Akashi, K. (2005). Developmental checkpoints of the basophil/mast cell lineages in adult murine hematopoiesis. Proc Natl Acad Sci U S A, 102(50), 18105-18110. doi:10.1073/pnas.0509148102

      Benichou, G., Gonzalez, B., Marino, J., Ayasoufi, K., & Valujskikh, A. (2017). Role of Memory T Cells in Allograft Rejection and Tolerance. Front Immunol, 8, 170. doi:10.3389/fimmu.2017.00170

      Cheon, I. S., Son, Y. M., & Sun, J. (2023). Tissue-resident memory T cells and lung immunopathology. Immunol Rev, 316(1), 63-83. doi:10.1111/imr.13201

      Gadsboll, A. O., Jee, M. H., Funch, A. B., Alhede, M., Mraz, V., Weber, J. F., . . . Bonefeld, C. M. (2020). Pathogenic CD8(+) Epidermis-Resident Memory T Cells Displace Dendritic Epidermal T Cells in Allergic Dermatitis. J Invest Dermatol, 140(4), 806-815 e805. doi:10.1016/j.jid.2019.07.722

      Groom, J. R., Richmond, J., Murooka, T. T., Sorensen, E. W., Sung, J. H., Bankert, K., . . . Luster, A. D. (2012). CXCR3 chemokine receptor-ligand interactions in the lymph node optimize CD4+ T helper 1 cell differentiation. Immunity, 37(6), 1091-1103. doi:10.1016/j.immuni.2012.08.016

      Hamey, F. K., Lau, W. W. Y., Kucinski, I., Wang, X., Diamanti, E., Wilson, N. K., . . . Dahlin, J. S. (2021). Single-cell molecular profiling provides a high-resolution map of basophil and mast cell development. Allergy, 76(6), 1731-1742. doi:10.1111/all.14633

      Ikeda, N., Kubota, H., Suzuki, R., Morita, M., Yoshimura, A., Osada, Y., . . . Asano, K. (2023). The early neutrophil-committed progenitors aberrantly differentiate into immunoregulatory monocytes during emergency myelopoiesis. Cell Rep, 42(3), 112165. doi:10.1016/j.celrep.2023.112165

      Lee, P. Y., Wang, J. X., Parisini, E., Dascher, C. C., & Nigrovic, P. A. (2013). Ly6 family proteins in neutrophil biology. J Leukoc Biol, 94(4), 585-594. doi:10.1189/jlb.0113014

      Mackay, L. K., Rahimpour, A., Ma, J. Z., Collins, N., Stock, A. T., Hafon, M. L., . . . Gebhardt, T. (2013). The developmental pathway for CD103(+)CD8+ tissue-resident memory T cells of skin. Nat Immunol, 14(12), 1294-1301. doi:10.1038/ni.2744

      Manresa, M. C. (2021). Animal Models of Contact Dermatitis: 2,4-Dinitrofluorobenzene-Induced Contact Hypersensitivity. Methods Mol Biol, 2223, 87-100. doi:10.1007/978-1-0716-1001-5_7

      Martin, M. D., & Badovinac, V. P. (2018). Defining Memory CD8 T Cell. Front Immunol, 9, 2692. doi:10.3389/fimmu.2018.02692

      Merrick, D., Sakers, A., Irgebay, Z., Okada, C., Calvert, C., Morley, M. P., . . . Seale, P. (2019). Identification of a mesenchymal progenitor cell hierarchy in adipose tissue. Science, 364(6438). doi:10.1126/science.aav2501

      Murata, A., & Hayashi, S. I. (2020). CD4(+) Resident Memory T Cells Mediate Long-Term Local Skin Immune Memory of Contact Hypersensitivity in BALB/c Mice. Front Immunol, 11, 775. doi:10.3389/fimmu.2020.00775

      Park, S. L., Christo, S. N., Wells, A. C., Gandolfo, L. C., Zaid, A., Alexandre, Y. O., . . . Mackay, L. K. (2023). Divergent molecular networks program functionally distinct CD8(+) skin-resident memory T cells. Science, 382(6674), 1073-1079. doi:10.1126/science.adi8885

      Rose, S., Misharin, A., & Perlman, H. (2012). A novel Ly6C/Ly6G-based strategy to analyze the mouse splenic myeloid compartment. Cytometry A, 81(4), 343-350. doi:10.1002/cyto.a.22012

      Russell-Goldman, E., & Murphy, G. F. (2020). The Pathobiology of Skin Aging: New Insights into an Old Dilemma. Am J Pathol, 190(7), 1356-1369. doi:10.1016/j.ajpath.2020.03.007

      Schmidt, J. D., Ahlstrom, M. G., Johansen, J. D., Dyring-Andersen, B., Agerbeck, C., Nielsen, M. M., . . . Bonefeld, C. M. (2017). Rapid allergen-induced interleukin-17 and interferon-gamma secretion by skin-resident memory CD8(+) T cells. Contact Dermatitis, 76(4), 218-227. doi:10.1111/cod.12715

      Sun, L., Zhang, X., Wu, S., Liu, Y., Guerrero-Juarez, C. F., Liu, W., . . . Zhang, L. J. (2023). Dynamic interplay between IL-1 and WNT pathways in regulating dermal adipocyte lineage cells during skin development and wound regeneration. Cell Rep, 42(6), 112647. doi:10.1016/j.celrep.2023.112647

      Vocanson, M., Hennino, A., Rozieres, A., Poyet, G., & Nicolas, J. F. (2009). Effector and regulatory mechanisms in allergic contact dermatitis. Allergy, 64(12), 1699-1714. doi:10.1111/j.1398-9995.2009.02082.x

      Wongchang, T., Pluangnooch, P., Hongeng, S., Wongkajornsilp, A., Thumkeo, D., & Soontrapa, K. (2023). Inhibition of DYRK1B suppresses inflammation in allergic contact dermatitis model and Th1/Th17 immune response. Sci Rep, 13(1), 7058. doi:10.1038/s41598-023-34211-x

      Xu, Z., Chen, D., Hu, Y., Jiang, K., Huang, H., Du, Y., . . . Chen, T. (2022). Anatomically distinct fibroblast subsets determine skin autoimmune patterns. Nature, 601(7891), 118-124. doi:10.1038/s41586-021-04221-8

    1. Author response:

      The following is the authors’ response to the original reviews.

      Main points:

      (1) We have added data for fructose in Fig. 1

      (2) We have added statistics (red stars and NS) comparing Tp between fed and refed flies.

      (3) We have changed the individual data points in the figures to small open circles.

      (4) We have moved the data from Fig. S3 to Fig. 2 and 3.

      (5) We have added schematic diagrams depicting the behavioral assay in Fig. S1.

      (6) We have added heatmaps for WT and Gr64f-Gal4>UAS-CsChrimson flies in Fig. S2.

      (7) We have added Orco1 mutant data in Fig. S4.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper presents valuable findings that gustation and feeding state influence the preferred environmental temperature preference in flies. Interestingly, the authors showed that by refeeding starved animals with the non-nutritive sugar sucralose, they are able to tune their preference towards a higher temperature in addition to nutrient-dependent warm preference. The authors show that temperature-sensing and sweet-sensing gustatory neurons (SGNs) are involved in the former but not the latter. In addition, their data indicate that peptidergic signals involved in internal state and clock genes are required for taste-dependent warm preference behavior.

      The authors made an analogy of their results to the cephalic phase response (CPR) in mammals where the thought, sight, and taste of food prepare the animal for the consumption of food and nutrients. They further linked this behavior to core regulatory genes and peptides controlling hunger and sleep in flies having homologues in mammals. These valuable behavioral results can be further investigated in flies with the advantage of being able to dissect the neural circuitry underlying CPR and nutrient homeostasis.

      Strengths: 

      (1) The authors convincingly showed that tasting is sufficient to drive warm temperature preference behavior in starved flies and that it is independent of nutrient-driven warm preference. 

      (2) By using the genetic manipulation of key internal sensors and genes controlling internal feeding and sleep states such as DH44 neurons and the per genes for example, the authors linked gustation and temperature preference behavior control to the internal state of the animal. 

      Weaknesses: 

      (1) The title is somewhat misleading, as the term homeostatic temperature control linked to gustation only applies to starved flies. 

      We agree with the reviewer's suggestion and have changed the title to "Taste triggers a homeostatic temperature control in hungry flies".

      (2) The authors used a temperature preference assay and refeeding for 5 minutes, 10 minutes, and 1 hour.

      Experimentally, it makes a difference if the flies are tested immediately after 10 minutes or at the same time point as flies allowed to feed for 1 hour. Is 10 minutes enough to change the internal state in a nutrition-dependent manner? Some of the authors' data hint at it (e.g. refeeding with fly food for 10 minutes), but it might be relevant to feed for 5/10 minutes and wait for 55/50min to do the assays at comparable time points.

      Thank you for your suggestions. The temperature preference behavioral test itself takes 30 minutes from the time the flies are placed in the apparatus until the final choice is made. This means that after the hungry flies have been refed for 5 minutes, they will determine their preferred temperature within 35 minutes. It has been shown that insulin levels peak at 10 minutes and gradually decline (Tsao, et al., PLoS Genetics 2023). However, it is unclear how subtle changes in insulin levels affect behavior and how quickly the flies are able to consume food. These factors may contribute to temperature preference in flies. Therefore, to minimize "extraneous" effects, we decided to run the behavioral assay immediately after the flies had eaten the food. We have noted in the Materials and Methods section why we chose this condition, based on the duration of the behavioral assay and the insulin effect.

      (3) A figure depicting the temperature preference assay in Figure 1 would help illustrate the experimental approach. It is also not clear why Figure 1E is shown instead of full statistics on the individual panels shown above (the data is the same). 

      We have revised Figure 1A and added statistics in Figure 1BCD. We also added a figure depicting the temperature preference assay (Fig. S1).

      (4) The authors state that feeding rate and amount were not changed with sucralose and glucose. However, the FLIC assay they employed does not measure consumption, so this statement is not correct, and it is unclear if the intake of sucralose and glucose is indeed comparable. This limits some of the conclusions. 

      We agree; we have removed “amount” and revised the MS.

      (5) The authors make a distinction between taste-induced and nutrient-induced warm preference. Yet the statistics in most figures only show the significance between the starved and refed flies, not the fed controls. As the recovery is in many cases incomplete and used as a distinction of nutritive vs nonnutritive signals (see Figure 1E) it will be important to also show these additional statistics to allow conclusions about how complete the recovery is. 

      We agree with the comments and have revised the MS and figures. 

      (6) The starvation period used is ranging from 1 to 3 days, as in some cases no effect was seen upon 1 day of starvation (e.g. with clock genes or temperature sensing neurons). While the authors do provide a comparison between 18-21 and 26-29 hours old flies in Figure S1, a comparison for 42-49 and 66-69 hours of starvation is missing. This also limits the conclusion as the "state" of the animal is likely quite different after 1 day vs. 3 days of starvation and, as stated by the authors, many flies die under these conditions.  

      We mainly used 2 overnights of starvation. Some flies (e.g. Ilp6 mutants) were completely healthy even after 2 overnights of starvation, so we had to starve them for 3 overnights. For example, Ilp6 mutants needed 3 overnights of starvation to show a significant difference in Tp between fed and starved flies. On the other hand, some flies (e.g. w1118 control flies) were very sick after 2 overnights of starvation, so we had to starve them for 1 overnight. Therefore, the starvation conditions used in this manuscript range from 1 to 3 overnights.

      First, we confirmed the starvation time by focusing on Tp, choosing a duration that resulted in a statistically significant Tp difference between fed and starved flies; as mentioned above, flies prefer lower temperatures when starvation is prolonged (Umezaki et al., Current Biology 2018). Therefore, if Tp was not statistically different between fed and starved flies, we extended the starvation time from 1 to 3 overnights. Importantly, we show in Fig. S3 that the duration of starvation did not affect the recovery effect. Furthermore, since control flies do not survive 42-49 or 66-69 hours of starvation, we cannot test the reviewer's suggestion. We have carefully documented the conditions in the Materials and Methods and figure legends.

      (7) In Figure 2, glucose-induced refeeding was not tested in Gr mutants or silenced animals, which would hint at post-ingestive recovery mechanisms related to nutritional intake. This is only shown later (in Figure S3) but I think it would be more fitting to address this point here. The data presented in Figure S3 regarding the taste-evoked vs nutrient-dependent warm preference is quite important while in some parts preliminary. It would nonetheless be justified to put this data in the main figures. However, some of the conclusions here are not fully supported, in part due to different and low n numbers, which due to the inherent variability of the behavior do not allow statistically sound conclusions. The authors claim that sweet GRNs are only involved in taste-induced warm preference, however, glucose is also nutritive but, in several cases, does not rescue warm preference at all upon removal of GRN function (see Figures S3A-C). This indicates that the Gal4 lines and also the involved GRs are potentially expressed in tissues/neurons required for internal nutrient sensing. 

      Thank you for your suggestion. We have added Figure S3ABC (glucose refeeding using Gr mutants and silenced animals) to Figure 2. The N is not low: each condition was tested more than 5 times, i.e. more than 100 flies were tested. The variation in Tp is probably due to the effect of starvation on temperature preference.

      We did not state that "The authors claim that sweet GRNs are only involved in taste-induced warm preference...". However, our writing may not have been clear enough. We agree that "...GRs may be expressed in tissues/neurons required for internal nutrient sensing. ..." We have rewritten and revised the section.

      (8) In Figure 4, fly food and glucose refeeding do not fully recover temperature preference after refeeding. With the statistical comparison to the fed control missing, this result is not consistent with the statement made in line 252. I feel this is an important point to distinguish between state-dependent and taste/nutrition-dependent changes.  

      We have added the statistics comparing the fed condition with the other conditions.

      (9) The conclusion that clock genes are required for taste-evoked warm preference is limited by the observation that they ingest less sucralose. In addition, the FLIC assay does not allow conclusions about the feeding amount, only the number of food interactions. Therefore, I think these results do not allow clear-cut conclusions about the impact of clock genes in this assay.  

      We agree; we have removed “amount” and revised the MS. The per01 mutants ate (touched) sucralose more often than glucose. On the other hand, tim01 mutants ate glucose more often than sucralose (Figure S6BC). However, these mutants still showed a similar Tp pattern for sucralose and glucose refeeding (Fig. 5CD). The results suggest that the tim01 flies eat enough sucralose relative to glucose that their food intake does not affect the Tp behavioral phenotype. We have rewritten and revised the section.

      (10) CPR is known to be influenced by taste, thought, smell, and sight of food. As the discussion focused extensively on the CPR link to flies it would be interesting to find out whether the smell and sight of food also influence temperature preference behavior in animals with different feeding states.  

      We have added data using Olfactory receptor co-receptor (Orco1) mutants, which lack olfaction, in Fig. S4. They failed to show the taste-evoked warm preference but exhibited the nutrient-induced warm preference. Therefore, the data suggest that olfactory detection is also involved in taste-evoked warm preference. On the other hand, "seeing food" is probably more complicated, since light dramatically affects temperature preference behavior and the circadian clock that regulates temperature preference rhythms. Therefore, it is unlikely that a solid conclusion can be drawn from a short set of experiments. We will address this issue in the next study.

      (11) In the discussion in line 410ff the authors claim that "internal state is more likely to be associated with taste-evoked warm preference than nutrient-induced warm preference." This statement is not clear to me, as neuropeptides are involved in mediating internal state signals, both in the brain itself as well as from gut to brain. Thus, neuropeptidergic signals are also involved in nutrient-dependent state changes, the authors might just not have identified the peptides involved here. The global and developmental removal of these signals also limits the conclusions that can be drawn from the experiments, as many of these signals affect different states, circuits, and developmental progression.  

      We agree with the comments. We have removed the sentences and revised the MS.  

      Reviewer #2 (Public Review): 

      Animals constantly adjust their behavior and physiology based on internal states. Hungry animals, desperate for food, exhibit physiological changes immediately upon sensing, smelling, or chewing food, known as the cephalic phase response (CPR), involving processes like increased saliva and gastrointestinal secretions. While starvation lowers body temperature, the mechanisms underlying how the sensation of food without nutrients induces behavioral responses remain unclear. Hunger stress induces changes in both behavior and physiological responses, which in flies (or at least in Drosophila melanogaster) leads to a preference for lower temperatures, analogous to the hunger-driven lower body temperature observed in mammals. In this manuscript, the authors have used Drosophila melanogaster to investigate the issue of whether taste cues can robustly trigger behavioral recovery of temperature preference in starving animals. The authors find that food detection triggers a warm preference in flies. Starved flies recover their temperature preference after food intake, with a distinction between partial and full recovery based on the duration of refeeding. Sucralose, an artificial sweetener, induces a warm preference, suggesting the importance of food-sensing cues. The paper compares the effects of sucralose and glucose refeeding, indicating that both taste cues and nutrients contribute to temperature preference recovery. The authors show that sweet gustatory receptors (Grs) and sweet GRNs (Gustatory Receptor Neurons) play a crucial role in taste-evoked warm preference. Optogenetic experiments with CsChrimson support the idea that the excitation of sweet GRNs leads to a warm preference. The authors then examine the internal state's influence on taste-evoked warm preference, focusing on neuropeptide F (NPF) and small neuropeptide F (sNPF), analogous to mammalian neuropeptide Y. 
      Mutations in NPF and sNPF result in a failure to exhibit taste-evoked warm preference, emphasizing their role in this process. However, these neuropeptides appear not to be critical for nutrient-induced warm preference, as indicated by increased temperature preference during glucose and fly food refeeding in mutant flies. The authors also explore the role of hunger-related factors in regulating taste-evoked warm preference. Hunger signals, including diuretic hormone (DH44) and adipokinetic hormone (AKH) neurons, are found to be essential for taste-evoked warm preference but not for nutrient-induced warm preference. Additionally, insulin-like peptides 6 (Ilp6) and Unpaired3 (Upd3), related to nutritional stress, are identified as crucial for taste-evoked warm preference. The investigation then extends into circadian rhythms, revealing that taste-evoked warm preference does not align with the feeding rhythm. While flies exhibit a rhythmic feeding pattern, taste-evoked warm preference occurs consistently, suggesting a lack of parallel coordination. Clock genes, crucial for circadian rhythms, are found to be necessary for taste-evoked warm preference but not for nutrient-induced warm preference.

      Strengths: 

      A well-written and interesting study, investigating an intriguing issue. The claims, none of which are, to the best of my knowledge, controversial, are backed by a substantial number of experiments.

      Weakness: 

      The experimental setup used and the procedures for assessing the temperature preferences of flies are rather sparingly described. Additional details and data presentation would enhance the clarity and replicability of the study. I kindly request the authors to consider the following points: 

      i) A schematic drawing or diagram illustrating the experimental setup for the temperature preference assay would greatly aid readers in understanding the spatial arrangement of the apparatus, temperature points, and the positioning of flies during the assay. The drawing should also be accompanied by specific details about the setup (dimensions, material, etc). 

      Thank you for your suggestions. We have added the schematic drawing in Fig. S1.

      ii) It would be beneficial to include a visual representation of the distribution of flies within the temperature gradient on the apparatus. A graphical representation, such as heatmaps or histograms, showing the percentage of flies within each one-degree temperature bin, would offer insights into the preferences and behaviors of the flies during the assay. In addition to the detailed description of the assay and data analysis, the inclusion of actual data plots, especially for key findings or representative trials, would provide readers with a more direct visualization of the experimental outcomes. These additions will not only enhance the clarity of the presented information but also provide the reader with a more comprehensive understanding of the experimental setup and results. I appreciate the authors' attention to these points and look forward to the potential inclusion of these elements in the revised manuscript.

      Thank you for the advice. We have added heatmaps for WT and Gr64f-Gal4>CsChrimson flies in Fig. S2.
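
      The per-bin distribution requested in point ii) could be tabulated along the following lines. This is only a minimal sketch in Python, not the authors' analysis: the function names, the 16-34 °C gradient range, and the ten fly positions are all hypothetical, and taking the preferred temperature as the mean of the individual fly positions is an assumption made for illustration.

```python
import math
from collections import Counter


def bin_percentages(temps_c, lo=16, hi=34):
    """Percentage of flies in each one-degree bin [t, t+1) for lo <= t < hi.

    temps_c: temperatures (deg C) at which individual flies settled.
    lo, hi:  hypothetical limits of the gradient, not taken from the paper.
    """
    counts = Counter(math.floor(t) for t in temps_c if lo <= t < hi)
    total = sum(counts.values())
    if total == 0:
        # No flies inside the gradient: report empty bins rather than divide by zero.
        return {t: 0.0 for t in range(lo, hi)}
    return {t: 100.0 * counts.get(t, 0) / total for t in range(lo, hi)}


def preferred_temperature(temps_c):
    """Mean position on the gradient (assumed definition of Tp for this sketch)."""
    return sum(temps_c) / len(temps_c)


# Hypothetical positions of ten flies in a single trial (deg C).
positions = [21.2, 21.8, 22.5, 22.9, 23.1, 24.4, 24.6, 25.0, 25.3, 26.7]

pct = bin_percentages(positions)
tp = preferred_temperature(positions)

# Print only the occupied bins, one row per one-degree bin.
for temp, share in pct.items():
    if share > 0:
        print(f"{temp}-{temp + 1} C: {share:.0f}%")
print(f"Tp = {tp:.2f} C")
```

      Stacking one such row of percentages per replicate gives the kind of heatmap described by the reviewer, while a single row corresponds to a histogram.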

      Reviewer #3 (Public Review): 

      Summary: 

      The manuscript by Yujiro Umezaki and colleagues aims to describe how taste stimuli influence temperature preference in Drosophila. Under starvation flies display a strong preference for cooler temperatures than under fed conditions that can be reversed by refeeding, demonstrating the strong impact of metabolism on temperature preference. In their present study, Umezaki and colleagues observed that such changes in temperature preference are not solely triggered by the metabolic state of the animal but that gustatory circuits and peptidergic signalling play a pivotal role in gustation-evoked alteration in temperature preference. 

      The study of Umezaki is definitively interesting and the findings in this manuscript will be of interest to a broad readership. 

      Strengths: 

      The authors demonstrate interesting new data on how taste input can influence temperature preference during starvation. They propose how gustatory pathways may work together with thermosensitive neurons, peptidergic neurons and finally try to bridge the gap between these neurons and clock genes. The study is very interesting and the data for each experiment alone are very convincing. 

      Weaknesses: 

      In my opinion, the authors have opened many new questions but did not fully answer the initial question - how do taste-sensing neurons influence temperature preferences? What are the mechanisms underlying this observation? Instead of jumping from gustatory neurons to thermosensitive neurons to peptidergic neurons to clock genes, the authors should have stayed within the one question they were asking at the beginning. How does sugar sensing influence the physiology of thermosensation in order to change temperature preference? Before addressing all the following questions of the manuscript the authors should first directly decipher the neuronal interplay between these two types of neurons.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Figure S3D is cited before S2, so please rearrange the numbering.

      Thank you. We have changed the numbering.

      I would also suggest a different color to visualize the data points in Figure S3, as some are barely visible on the dark bars (e.g. on a dark green background). 

      We have revised the figures. The data points were changed to smaller open circles.

      Reviewer #2 (Recommendations For The Authors): 

      *Please, expand on the experimental procedure, and describe the assay in detail. 

      We have added a scheme for the assay in Fig. S1 and also have revised the manuscript and figures.

      *Show the distribution of the gradient data that the preference values are based upon. Not necessarily for all, but for select key experiments. Heatmaps for each replicate (stacked on top of each other) would be a nice way of showing this. Simple histograms would of course work as well. 

      We have added heatmaps of selected key experiments in Fig. S2 and have revised the manuscript and figures accordingly.

      Reviewer #3 (Recommendations For The Authors):

      The manuscript by Yujiro Umezaki and colleagues aims at describing how taste stimuli influence temperature preference in Drosophila. Under starvation, flies display a strong preference for cooler temperatures than under fed conditions that can be reversed by refeeding, demonstrating the strong impact of metabolism on temperature preference. In their present study, Umezaki and colleagues observed that such changes in temperature preference are not solely triggered by the metabolic state of the animal but that gustatory circuits play a pivotal role in temperature preference. The study of Umezaki is definitively interesting and the findings in this manuscript will be of interest to a broad readership. However, I would like to draw the authors' attention to some points of concern:

      The title to me sounds somehow inadequate. The definition of homeostasis (Cambridge Dictionary) is as follows: "the ability or tendency of a living organism, cell, or group to keep the conditions INSIDE it the same despite any changes in the conditions around it, or this state of internal balance". What do the authors mean by homeostatic temperature control? Reading the title not knowing much about poikilotherm insects I would understand that the authors claim that Drosophila can indeed keep a temperature homeostasis as mammals do. As Drosophila is not a homoiotherm animal and thus cannot keep its body temperature stable the title should be amended.  

      Homeostasis means a state of balance between all the body systems necessary for the body to survive and function properly. Drosophila are ectotherms, so the source of temperature comes from the environment, and their body temperature is very similar to that of their environment. However, the flies' temperature regulation is not simply a passive response to temperature. Instead, they actively seek a temperature based on their internal state. We have shown that the preferred temperature increases during the day and decreases during the night, showing a circadian rhythm of temperature preference (TPR). Because their environmental temperature is very close to their body temperature, TPR gives rise to body temperature rhythms (BTR). We have shown that TPR is similar to BTR in mammals (Kaneko et al., Current Biology 2012 and Goda et al., JBR 2023). Similarly, we showed that the hungry flies choose a lower temperature so that the body temperature is also lower. Therefore, our data suggest that the fly maintains its homeostasis by using the environmental temperature to adjust its body temperature to an appropriate temperature depending on its internal state. Therefore, we would like to keep the title as "Taste triggers a homeostatic temperature control in hungry flies". We have added more explanation in the Introduction and Discussion.

      Accordingly, the authors compare the preference of flies to cooler temperatures to the reduced body temperature of mammals (Lines 64 - 65). However, according to the cited literature the reduced body temperature in starved rats is discussed to reduce metabolic heat production (Sakurada et al., 2000). The authors should more rigorously give a short summary of the findings in the cited papers and the original interpretation to help the reader not get confused.

      In flies, it has been shown that a lower temperature means a lower metabolic rate, and a higher temperature means a higher metabolic rate. Therefore, hungry flies choose a lower temperature where their metabolic rate is lower and they do not need as much heat.

      Similarly, in mammals, starvation causes a lower body temperature, hypothermia. Body temperature is controlled by the balance between heat loss and heat production. The starved mammals showed lower heat production. We have added this information to the introduction. 

      The authors show that 5 min fly food refeeding causes a partial recovery of the naïve temperature preference of the flies (Figure 1B) and that feeding of sucralose partially rescues the preference whereas glucose rescues the preference similar to refeeding with fly food would do. As glucose is both sweet and metabolically valuable it would be clearer for the reader if the authors start with the fly food experiment and then show the glucose experiment to show that the altered temperature preference depends on the food component glucose. From there they can further argue that glucose is both sweet (hedonic value) and metabolically valuable. And to disentangle sweetness from metabolism one needs a sugar that is sweet but cannot be metabolized - sucralose.

      Thank you for your advice. Since the data with sucralose is the one we want to highlight the most, we decided to present it in the order of sucralose, glucose, and fly food.

      In the sucralose experiment the authors omit the 5 min data point and only show the 10 min time point. As Figure 1F indicates that both Glucose and Sucralose elicit the same attractiveness in the flies and that sweetness influences the temperature preference, it is important that the authors show the 5 min temperature preference too to underline the effect of the sweet taste stimulus on the fly behavior independent from the caloric value. Further, the authors should demonstrate not only the cumulative touches but how much sucralose or glucose may already be consumed by the fly in the depicted time frames. 

      It is interesting to see how much sucralose or glucose the flies consume over the time frames shown. Although the cumulative exposure to sugar is ideally equivalent to the amount of sugar, we need a different way to actually measure the amount of sugar. We will now emphasize "cumulative touches" rather than "amount of sugar" in the text. In the next study, we will look at how much sucralose or glucose the fly has already consumed.

      Sucralose and Glucose have a similar molecular structure - it would be interesting to see how the sweet taste of a sugar with a different molecular structure like fructose and its receptor Gr43b (Miyamoto & Amrein 2014) may contribute to temperature preferences.

      Sucralose and Glucose are not structurally similar. That said, we tested fructose refeeding anyway. The hungry flies showed a taste-evoked warm preference after fructose refeeding. We have added data in Figure 1E and F. The data suggest that sweet taste is more important than sugar structure. We also tested Gr43b>CsChrimson. However, the flies do not show the taste-evoked warm preference (data not shown). The data suggest that Gr43b is not the major receptor controlling taste-evoked warm preference. We have revised the manuscript.

      Both sugars appear similarly attractive to the flies (Figure 1F) - are water, sucralose, and glucose presented in a choice assay or are these individually in separate experiments? 

      Water, sucralose, and glucose were individually presented in separate experiments. We clarified it in the figure legend.

      Subsequently, the authors address the question of how sweet taste may influence temperature preferences in flies. To this end, the authors first employ gustatory receptor mutants for Gr5a, Gr64a, and Gr61a and demonstrate that sucralose feeding does not rescue temperature preference in the absence of sweet taste receptors. In an alternative approach, the authors do not use mutants but an expression of UAS:Kir in Gr64f neurons. Taking a closer look at the graph it appears that the Kir expressing flies have an increased (nearly 1 °C) temperature preference than the starved mutant flies. Is this preference change related to the mutation directly and what would be the result if Kir would be conditionally only expressed after development is completed, or is the observed temperature preference related to the Gr64f-Gal4 line? If the latter would be the case perhaps the authors may want to bring the flies to the same genetic background to allow for a more direct comparison of the temperature preferences.

      The Gr64fGal4>Kir flies show a ~one degree higher preferred temperature under starvation compared to the mutants. However, the phenotype is similar to the controls, Gr64fGal4/+ flies, under starvation. Therefore, this phenotype is not due to either the mutation or the Kir effect. Most importantly, the Gr64fGal4>Kir flies failed to show a taste-evoked warm preference. Together with other mutant data, we concluded that sweet GRNs are required for taste-evoked warm preference.

      Overall, the figure legend for Figure 2 is very cryptic and should be more detailed.

      We have revised the figure legend for Figure 2. 

      To shed light on the mechanisms underlying the changes in temperature preferences through gustatory stimuli the authors next blocked heat and cold sensing neurons in fed and starved flies and found out that TrpA1 expressing anterior cells and R11F02-Gal4 expressing neurons both participate in sweetness-induced alteration of temperature preference in starved animals. At this point, it should be explicitly indicated in the figure that the flies need more than one overnight starvation to display the behavior (Figure 3A).

      We have revised the manuscript.

      The data provided by the authors indicate a kind of push-and-pull mechanism between heat- and cold-sensing neurons under starvation that is somehow influenced by sweet taste sensing. Further, the authors demonstrate that TrpA1- as well as R11F02-Gal4-driven Chrimson activation is sufficient to partially rescue temperature preference under starvation. At this point it is unclear why the authors use a tubGal80ts expression system for R11F02-Gal4 but not for the TrpA1SH-Gal4-driven Chrimson. As the development itself and the conditions under which the animals were raised may influence the temperature preference, it is important that both groups are raised equally if the authors want to compare them directly with each other.

      As we wrote in the Materials and Methods, the R11F02-Gal4>uas-CsChrimson flies died during development. Therefore, we had to use tubGal80ts. On the other hand, the TrpA1-Gal4>CsChrimson flies can survive to adulthood. As we mentioned in the MS, all flies were treated with ATR after they had fully developed into adults. This means that both TrpA1-Gal4 and R11F02-Gal4 expressing cells are activated by red light via CsChrimson only in adult stages. We carefully revised the MS.

      It is a pity that the authors at this point have decided to not deepen the understanding of the circuitry between thermo-sensation and metabolic homeostasis but subsequently change the focus of their study to investigate how internal state influences taste-evoked warm preference in hungry flies. Using mutants for NPF and sNPF the authors demonstrate that both peptides play a pivotal role in taste-evoked warm preference after sucrose feeding but not for nutrient-induced warm preference. Similarly, they found that DH44, AKH and dILP6, Upd2 and Upd3 neurons are also required for taste-evoked warm preference but not for nutrient-induced warm preference. Here again, the authors do not keep the systems stable and change between inhibition of neurons through Kir and mutants for peptides. For a better comparison, it would be preferable to use always exactly the same technique to inhibit neuron signalling.

      It would be interesting to find the neural circuitry of thermosensation and metabolic homeostasis, but we have not had any luck so far. We will continue to look into the neural circuits that control taste-evoked warm preference and nutrient-induced warm preference. Since UAS-Kir is such a strong effector, it sometimes kills the flies. Therefore, we could not use UAS-Kir for all Gal4 flies.

      DH44 is expressed in the brain and in the abdominal ganglion where they share the expression pattern with 4 Lk neurons per hemisphere. Seeing the impact of Lk signalling in metabolism (AlAnzi et al., 2010) the authors should provide evidence that the observed effect is indeed because of DH44 and not Lk.

      It would be interesting to see if Lk may play a role in taste-evoked warm preference and/or nutrient-induced warm preference. We would like to systematically screen which neuropeptides and receptors are involved in the behavior in the next study. 

      Seeing the results on dILP6 it is interesting that Li and Gong (2015) could show in larvae that cold-sensing neurons directly interact with dILP neurons in the brain. It would be interesting to see whether similar circuitry may exist in adult flies to regulate temperature preferences and these peptidergic neurons. Further, it appears interesting that again these animals need much longer time to display the observed shift in temperature (which again should be clearly indicated in the figure legend too). These observations should be more carefully considered in the discussion part too.

      We have revised the manuscript.

      In the last part of the study, the authors investigate how sensory input from temperature-sensitive cells may transmit information to central clock neurons and how these in turn may influence temperature preference under starvation. The experiments assume that DH44-expressing neurons play a role in the output pathway of the central clock. Using the clock gene null mutants per and tim the authors show that even though the animals display a significant starvation response neither per nor tim mutants exhibited taste-evoked warm preference, indicating a taste but not nutrient-evoked temperature preference regulation. 

      The authors demonstrate interesting new data on how taste input can influence temperature preference during starvation. They propose how gustatory pathways may work together with thermosensitive neurons and peptidergic neurons, and finally try to bridge the gap between these neurons and clock genes. The study is very interesting and the data for each experiment alone are very convincing. However, in my opinion, the authors have opened many new questions but did not fully answer the initial question: how do taste-sensing neurons influence temperature preferences? What are the mechanisms underlying this observation? Instead of jumping from gustatory neurons to thermosensitive neurons to peptidergic neurons to clock genes, the authors should have stayed within the one question they were asking at the beginning. How does sugar sensing influence the physiology of thermosensation? Before addressing all the following questions of the manuscript, the authors should first directly decipher the neuronal interplay between these two types of neurons.

      Thank you for your suggestion. It would be interesting to find the neural circuitry of thermosensation and metabolic homeostasis. We have tried, but have had no luck so far.

      The authors could, e.g., employ Ca or cAMP imaging in anterior or cold-sensitive cells and see how the responsiveness of these cells may be altered after sugar feeding. Or at least follow the idea of Li and Gong about the thermoregulation of dILP-expressing neurons.

      Thank you for your suggestion. Since we do not know how dILP-expressing neurons are involved in the temperature response in adult flies, we will focus on these cells using calcium imaging in the next study.

      Anatomical analysis using the GRASP technique may further help to understand the interplay of these neurons and give new insights into the circuitry underlying food preference alteration under starvation. 

      Thank you for your suggestion. It would be interesting to find the neural circuitry of thermosensation and metabolic homeostasis. We have tried, but have had no luck so far.

      Minor comments: 

      Line 51: Hungry animals are desperate for food - I think the authors should not anthropomorphize too much at this point but rather strictly describe how the animals change their behavior without any interpretation of the mental state of the animal.

      We have modified the manuscript.

      Line 80: Hunger and satiety dramatically affect animal behavior and physiology and control feeding - please not only cite the papers but also give a short overview of the cited papers on which behaviors are altered and how. 

      We have revised the manuscript. 

      Overall statistics: The authors always perform comparative statistics against starved animals but often state in the text a comparison against fed animals (Line 111: "but did not reach that of the fed flies"). I think the authors should describe the data according to their statistics and keep this constant throughout the paper.

      Sorry for the confusion. We originally had it, but we removed it. We have added the additional statistical analyses.  

      Figure legends: Overall the figure legends could be more developed and more detailed.

      We have revised the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      As adult-born granule neurons have been shown to play diverse roles, both positive and negative, to modulate hippocampal circuitry and function in epilepsy, understanding the mechanisms by which altered neurogenesis contributes to seizures is important for future therapeutic strategies. The work by Jain et al. demonstrates that increasing adult neurogenesis before status epilepticus (SE) leads to a suppression of chronic seizures in the pilocarpine model of temporal lobe epilepsy. This work is potentially interesting because previous studies showed suppressing neurogenesis led to reduced chronic seizures.

      To increase neurogenesis, the authors conditionally delete the pro-apoptotic gene Bax using a tamoxifen-inducible Nestin-CreERT2 which has been previously published to increase proliferation and survival of adult-born neurons by Sahay et al. After 6 weeks of tamoxifen injection, the authors subjected male and female mice to pilocarpine-induced SE. In the first study, at 2 hours after pilocarpine, the authors examine latency to the first seizure, severity and total number of acute seizures, and power during SE. In the second study in a separate group of mice, at 3 weeks after pilocarpine, the authors examine chronic seizure number and frequency, seizure duration, postictal depression, and seizure distribution/cluster seizures. Overall, the study concludes that increasing adult neurogenesis in the normal adult brain can reduce epilepsy in females specifically. However, important BrdU birthdating experiments in both male and female mice need to be included to support the conclusions made by the authors. Furthermore, speculative mechanisms lacking direct evidence reduce enthusiasm for the findings.

      There are two suggestions. First, BrdU birthdating of newborn neurons is important to add to the paper so that there is support for the conclusions. Second, speculative text reduced enthusiasm. In response, we clarified the conclusions. We do not think that the clarified conclusions require BrdU birthdating (discussed further below). We also removed two schematics (and associated text) that we think the reviewer was referring to when speculation was mentioned.

      We also want to point out something minor: the times of injections listed above are not correct.

      a. Seizures were not measured 2 hrs after pilocarpine; that is when the anticonvulsant diazepam was administered to males. 

      b. Seizures were not measured 3 weeks after pilocarpine; the duration of recording was 3 weeks.  

      (1) BrdU birthdating is required for conclusions.

      We think that the Reviewer was suggesting birthdating because we were not clear about our conclusions, and we apologize for the confusion. The Reviewer stated that we concluded: “conditionally deleting Bax in Nestin-Cre+ cells leads to increased neurogenesis and hilar ectopic granule cells, thereby reducing chronic seizures.”  (Note this is a quote from the review).

      However, we did not intend to conclude that. We intended to conclude that conditionally deleting Bax in Nestin-Cre+ mice reduced chronic seizures in the mouse model of epilepsy that we used. Also, that conclusion only pertained to females. Please note we did not conclude that hilar ectopic granule cells led to reduced seizures. We also concluded that Bax deletion increased neurogenesis in female mice. We have revised the text to make the conclusions clear.

      Abstract, starting on line 67:

      The results suggest that selective Bax deletion to increase adult neurogenesis can reduce experimental epilepsy, and the effect shows a striking sex difference.

      Results, starting on line 448:

      Because Cre+ epileptic females had increased numbers of immature neurons relative to Cre- females at the time of SE, and prior studies show that Cre+ females had less neuronal damage after SE (Jain et al., 2019), female Cre+ mice might have had reduced chronic seizures because of high numbers of immature neurons. However, the data do not prove a causal role.

      Starting on line 477:

      ...we hypothesized that female Cre+ mice would have fewer hilar ectopic GCs than female Cre- mice. However, female Cre+ mice did not have fewer hilar ectopic GCs.

      Discussion, starting on line 563:

      The chronic seizures, measured 4-7 weeks after pilocarpine, were reduced in frequency by about 50% in females. Therefore, increasing young adult-born neurons before the epileptogenic insult can protect against epilepsy. However, we do not know if the protective effect was due to the greater number of new neurons before SE or other effects. Past data would suggest that increased numbers of newborn neurons before SE leads to a reduced SE duration and less neuronal damage in the days after SE. That would be likely to lessen the epilepsy after SE. However, there may have been additional effects of larger numbers of newborn neurons prior to SE.

      Conclusions, starting on line 745:

      In the past, suppressing adult neurogenesis before SE was followed by fewer hilar ectopic GCs and reduced chronic seizures. Here, we show that the opposite - enhancing adult neurogenesis before SE and increased hilar ectopic GCs - do not necessarily reduce seizures. We suggest instead that protection of the hilar neurons from SE-induced excitotoxicity was critical to reducing seizures. The reason for the suggestion is that the survival of hilar neurons would lead to persistence of the normal inhibitory functions of hilar neurons, protecting against seizures. However, this is only a suggestion at the present time because we do not have data to prove it. Additionally, because protection was in females, sex differences are likely to have played an important role. Regardless, the results show that enhancing neurogenesis of young adult-born neurons in Nestin-Cre+ mice had a striking effect in the pilocarpine model, reducing chronic seizures in female mice.

      The Reviewer is correct that it would be interesting to know when the increase in adult neurogenesis occurred that was critical to the effect. For example, was it the initial increase following Bax deletion but before pilocarpine-induced SE, the increase in neurogenesis following SE, or increased adult neurogenesis in the chronic stage of epilepsy? It also might be that related aspects of neurogenesis played a role, such as the degree to which maturation was normal in adult-born neurons. We have not pursued the experiments to identify these aspects of neurogenesis because of how much work it would entail. Also, approaches to establish cause-effect relationships are going to be difficult.

      (2) Speculation.

      We removed the text and supplemental figures with schematics that we think were the overly speculative parts of the paper the Reviewer mentioned.

      Strengths:

      (1) The study is sex-matched and reveals differences in response to increasing adult neurogenesis in chronic seizures between males and females.

      (2) The EEG recording parameters are stringent, and the analysis of chronic seizures is comprehensive. In two separate experiments, the electrodes were implanted to record EEG from the cortex as well as the hippocampus. The recording was done for 10 hours post pilocarpine to analyze acute seizures, and for 3 weeks continuous video EEG recording was done to analyze chronic seizures.

      Weaknesses:

      (1) Cells generated during acute seizures have different properties to cells generated in chronic seizures. In this study, the authors employ two bouts of neurogenesis stimuli (Bax deletion dependent and SE dependent), with two phases of epilepsy (acute and chronic). There are multiple confounding variables to effectively conclude that conditionally deleting Bax in Nestin-Cre+ cells leads to increased neurogenesis and hilar ectopic granule cells, thereby reducing chronic seizures.

      As mentioned above, with a clarification of our conclusions we think we have addressed the concern. We believe that we conditionally deleted Bax in Nestin-expressing cells. We believe we found that female mice had reduced loss of hilar mossy cells and somatostatin-expressing neurons after SE, and fewer chronic seizures after SE. While it makes sense that increased neurogenesis caused the reduced seizures, we acknowledge it was not proved.

      We do not make conclusions about the role of hilar ectopic granule cells. However, we note that they appear to have been similar in number across groups, which suggests they played no role in the results. This is very surprising and therefore adds novelty.

      (2) Related to this is the degree of neurogenesis between Cre+ and Cre- mice and the nature of the sex differences. It is crucial to know the rate/fold change of increased neurogenesis before pilocarpine treatment and whether it is different between male and female mice.

      We agree that sex differences in adult neurogenesis could be shown by a sex difference in rate, fold change, maturation, and other characteristics. However, sex differences can also be shown by a change in doublecortin (DCX), which is what we did. We respectfully submit that we do not think an exhaustive study is critical.

      As a result, we have clarified DCX was studied either before SE or in the period of chronic seizures:

      Results, starting on line 406:

      III. Before and after epileptogenesis, Cre+ female mice exhibited more immature neurons than Cre- female mice but that was not true for male mice.

      Starting on line 446:

      Therefore, elevated DCX occurred after chronic seizures had developed in Cre+ mice but the effect was limited to females.

      Discussion, starting on line 592:

      This study showed that conditional deletion of Bax from Nestin-expressing progenitors increased young adult-born neurons in the DG when studied 6 weeks after deletion and using DCX as a marker of immature neurons.

      (3) The authors observe more hilar Prox1 cells in Cre+ mice compared to Cre- mice. The authors should confirm the source of the hilar Prox1+ cells.

      This is an excellent question but it is unclear that it is critical to the seizures since both sexes showed more hilar Prox1 cells in Cre+ mice but only the females had fewer seizures than Cre- mice. This is the additional text to describe the results (starting on Line 493):

      In past studies, hilar ectopic GCs have been suggested to promote seizures (Scharfman et al., 2000; Jung et al., 2006; Cho et al., 2015). Therefore, we asked if the numbers of hilar ectopic GCs correlated with the numbers of chronic seizures. When Cre- and Cre+ mice were compared (both sexes pooled), there was a correlation with numbers of chronic seizures (Fig. 6D1) but it suggested that more hilar ectopic GCs improved rather than worsened seizures. However, the correlation was only in Cre- mice, and when sexes were separated there was no correlation (Fig. 6D3).

      When seizure-free interval was examined with sexes pooled, there was a correlation for Cre+ mice (Fig. 6D2) but not Cre- mice. Strangely, the correlations of Cre+ mice with seizure-free interval (Fig. 6D2, D4) suggest ectopic GCs shorten the seizure-free interval and therefore worsen epilepsy, opposite of the correlative data for numbers of chronic seizures. In light of these inconsistent results it seems that hilar ectopic granule cells had no consistent effect on chronic seizures.

      (4) The biggest weakness is the lack of mechanism. The authors postulate a hypothetical mechanism to reconcile how increasing and decreasing adult-born neurons in GCL and hilus and loss of hilar mossy and SOM cells would lead to opposite effects - more or fewer seizures. The authors suggest the reason could be due to rewiring or no rewiring of hilar ectopic GCs, respectively, but do not provide clear-cut evidence.

      As we mention above, we removed the supplemental figures with schematics because they probably were what seemed overly speculative.

      We acknowledge that mechanism is not proven by our study. However, we would like to mention that in our view, showing preservation of hilar mossy cells and SOM cells, but not PV cells, does add mechanistic data to the paper. We understand more experiments are necessary.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Jain et al explore whether increasing adult neurogenesis is protective against status epilepticus (SE) and the development of spontaneous recurrent seizures (chronic epilepsy) in a mouse pilocarpine model of TLE. The authors increase adult neurogenesis via conditional deletion of Bax, a pro-apoptotic gene, in Nestin-CreERT2Baxfl/fl mice. Cre- littermates are used as controls for comparisons. In addition to characterizing seizure phenotypes, the authors also compare the abundance of hilar ectopic granule cells, mossy cells, hilar SOM interneurons, and the degree of neuronal damage between mice with increased neurogenesis (Cre+) vs Cre- controls. The authors find less severe SE and a reduction in chronic seizures in female mice with pre-insult increased adult-born neurons. Immunolabeling experiments show these females also have preservation of hilar mossy cells and somatostatin interneurons, suggesting the pre-insult increase in adult neurogenesis is protective.

      Strengths:

      (1) The finding that female mice with increased neurogenesis at the time of pilocarpine exposure have fewer seizures despite having increased hilar ectopic granule cells is very interesting.

      (2) The work builds nicely on the group's prior studies.

      (3) Apparent sex differences are a potentially important finding.

      (4) The immunohistochemistry data are compelling.

      (5) Good controls for EEG electrode implantation effects.

      (6) Nice analysis of most of the SE EEG data.

      Weaknesses:

      (1) In addition to the Cre- littermate controls, a no Tamoxifen treatment group is necessary to control for both insertional effects and leaky expression of the Nestin-CreERT2 transgene.

      About “leaky” expression, we have not found expression to be leaky. We checked by injecting a Cre-dependent virus so that mCherry would be expressed in those cells that had Cre.  The results were published as Supplemental Figure 9 in Jain et al. (2019).

      In the revised manuscript we also mention a study that examined three Nestin-CreERT2 mouse lines (Sun et al., 2014). One of the mouse lines was ours. The leaky expression was not in the mouse line we use. We have added these points to the revised manuscript:

      Methods, section II starting on line 791:

      Although Nestin-Cre-ERT2 mouse lines have been criticized because they can have leaky expression, the mouse line used in the present study did not (Sun et al., 2014), which we confirmed (Jain et al., 2019).

      (2) The authors suggest sex differences; however, experimental procedures differed between male and female mice (as the authors note). Female mice received diazepam 40 minutes after the first pilocarpine-induced seizure onset, whereas male mice did not receive diazepam until 2 hours post-onset. The former would likely lessen the effects of SE on the female mice. Therefore, sex differences cannot be accurately assessed by comparing these two groups, and instead, should be compared between mice with matching diazepam time courses.

      We agree that a shorter delay between pilocarpine and diazepam would be likely to lead to less damage. However, the latency from pilocarpine to SE varied, making the time from the onset of SE to diazepam variable. Most of the variability was in females. By timing the diazepam injection differently in males and females, we could make the time from the onset of SE to diazepam similar between females and males. We have added a supplemental figure to show that our approach led to no significant differences between females and males in the latency to SE, time between SE and diazepam injection, and time between pilocarpine and diazepam injection. We also show that Cre+ females and Cre- females were not different in these times, so the timing could not explain the neuroprotection of Cre+ females.

      Additionally, the authors state that female mice that received diazepam 2 hours post-onset had severe brain damage. This is concerning as it would suggest that SE is more severe in the female than in the male mice.

      We regret that our language was misleading. We intended to say that females had more morbidity and mortality than males (lack of appetite and grooming, death in the days after SE) when we gave DZP 2 hrs after Pilo. We do not actually know why, because there were no differences in the severity of SE. We think the females had a worse outcome when they had a short latency to SE. These females had a longer period of SE before DZP than males, probably leading to a worse outcome. To correct this, we gave DZP to females sooner. Morbidity and mortality then improved in females.

      Interestingly, after we did this we saw that females did not always have a short latency to SE. However, we maintained the same regimen to be consistent. As the new supplemental figure (above) shows, there were no significant sex differences in the latency to SE, time between SE and DZP, and time between pilocarpine and DZP.

      (3) Some sample sizes are low, particularly when sex and genotypes are split (n=3-5), which could cause a type II statistical error.

      We agree and have noted this limitation in the Discussion:

      Additional considerations, starting on line 739:

      This study is limited by the possibilities of type II statistical errors in those instances where we divided groups by genotype and sex, leading to comparisons of 3-5 mice/group.
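
      To illustrate why groups of 3-5 mice constrain statistical conclusions, the following sketch computes the smallest two-sided p-value that an exact rank-sum (Mann-Whitney) test can produce for a given pair of group sizes. This is an illustration only: it assumes an exact rank-based test with no tied values, which is not necessarily the test used in the study, and the function name is hypothetical.

```python
from math import comb


def min_two_sided_p(n_a: int, n_b: int) -> float:
    """Smallest two-sided p-value attainable by an exact rank-sum test.

    Assumes no ties. Complete separation of the two groups is the most
    extreme outcome; it can occur in exactly one way in each direction
    out of C(n_a + n_b, n_a) equally likely rank assignments.
    """
    if n_a < 1 or n_b < 1:
        raise ValueError("both groups need at least one observation")
    total_assignments = comb(n_a + n_b, n_a)
    return min(1.0, 2 / total_assignments)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"n = {n} per group: minimum two-sided p = {min_two_sided_p(n, n):.4f}")
```

      Under these assumptions, a comparison of 3 vs. 3 animals cannot fall below p = 0.10 no matter how large the true difference is, whereas 4 or 5 animals per group can reach p < 0.05 only when the groups barely overlap.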

      (4) Several figures show a datapoint in the sex and genotype-separated graphs that is missing from the corresponding male and female pooled graphs (Figs. 2C, 2D, 4B).

      We are very grateful to the Reviewer for pointing out the errors. They are corrected.

      (5) In Suppl Figs. 1B & 1C, subsections 1c and 2c, the EEG trace recording is described as the end of SE; however, SE appears to still be ongoing in these traces in the form of periodic discharges in the EEG.

      The Reviewer is correct.  It is a misconception that SE actually ends completely. The most intense seizure activity may, but what remains is abnormal activity that can last for days. Other investigators observe the same and have suggested that it argues against the concept of a silent period between SE and chronic epilepsy. We had discussed this in our prior papers and had referenced how we define SE.  In the revised manuscript we add the information to the Methods section instead of referencing a prior study:

      Methods, starting on line 899:

      SE duration was defined in light of the fact that the EEG did not return to normal after the initial period of intense activity. Instead, intermittent spiking occurred for at least 24 hrs, as we previously described (Jain et al., 2019) and has been described by others (Mazzuferi et al., 2012; Bumanglag and Sloviter, 2018; Smith et al., 2018). We therefore chose a definition that captured the initial, intense activity. We defined the end of this time as the point when the amplitude of the EEG deflections was reduced to 50% or less of the peak deflections during the initial hour of SE. Specifically, we selected the time after the onset of SE when the EEG amplitude in at least 3 channels had dropped to approximately 2 times the amplitude of the EEG during the first hour of SE, and remained depressed for at least 10 min (Fig. S2 in Jain et al., 2019). Thus, the duration of SE was defined as the time between the onset and this definition of the "end" of SE.
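
      The criterion above can be sketched in code. The following is an illustrative sketch only, not the analysis code used in the study. It assumes one amplitude value per minute per channel (a hypothetical input format), applies the 50%-of-peak threshold relative to each channel's peak in the first hour of SE, and, as a further assumption, begins searching only after that first hour. The function name and all parameter names are hypothetical.

```python
from typing import Optional, Sequence


def find_se_end(
    amplitudes: Sequence[Sequence[float]],  # amplitudes[channel][minute]
    onset_min: int,
    threshold_fraction: float = 0.5,   # "50% or less of the peak"
    min_channels: int = 3,             # "at least 3 channels"
    hold_min: int = 10,                # "remained depressed for at least 10 min"
    reference_window_min: int = 60,    # "initial hour of SE"
) -> Optional[int]:
    """Return the minute at which SE is considered ended, or None if never."""
    n_minutes = min(len(channel) for channel in amplitudes)
    if not 0 <= onset_min < n_minutes:
        raise ValueError("onset_min must fall inside the recording")

    ref_end = min(onset_min + reference_window_min, n_minutes)
    # Peak amplitude of each channel during the reference window.
    peaks = [max(channel[onset_min:ref_end]) for channel in amplitudes]

    def enough_channels_reduced(t: int) -> bool:
        reduced = sum(
            1
            for channel, peak in zip(amplitudes, peaks)
            if channel[t] <= threshold_fraction * peak
        )
        return reduced >= min_channels

    run_length = 0
    # Assumption: start searching after the reference window has elapsed.
    for t in range(ref_end, n_minutes):
        run_length = run_length + 1 if enough_channels_reduced(t) else 0
        if run_length >= hold_min:
            return t - hold_min + 1  # first minute of the sustained reduction
    return None


def se_duration_min(amplitudes: Sequence[Sequence[float]], onset_min: int) -> Optional[int]:
    """Duration of SE in minutes under the definition sketched above."""
    end = find_se_end(amplitudes, onset_min)
    return None if end is None else end - onset_min
```

      Requiring the reduction to persist for the full hold period keeps brief dips in amplitude from being scored as the end of SE.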

      (6) In Results section II.D and associated Fig.3, what the authors refer to as "postictal EEG depression" is more appropriately termed "postictal EEG suppression". Also, postictal EEG suppression has established criteria to define it that should be used.

      We find that "suppression" is typical in studies of ECT or humans (Esmaeili et al., 2023; Gascoigne et al., 2023; Hahn et al., 2023; Kavakbasi et al., 2023; Langroudi et al., 2023; Karl et al., 2024; Vilan et al., 2024; Zhao et al., 2024), whereas animal research uses the term postictal depression (Kanner et al., 2010; Krishnan and Bazhenov, 2011; Riljak et al., 2012; Singh et al., 2012; Carballosa-Gonzalez et al., 2013; Kommajosyula et al., 2016; Smith et al., 2018; Uva and de Curtis, 2020; Medvedeva et al., 2023). Therefore, we think depression is the more suitable term.

      The example traces in Fig. 3A and B should also be expanded to better show this potential phenomenon.

      We expanded traces in Fig. 3 as suggested. They are in Fig 3A.

      (7) In Fig.5D, the area fraction of DCX in Cre+ female mice is comparable to that of Cre- and Cre+ male mice. Is it possible that there is a ceiling effect in DCX expression that may explain why male Cre+ mice do not have a significant increase compared to male Cre- mice?

      We thank the Reviewer for the intriguing possibility. We now mention it in the manuscript:

      Results, starting on line 456:

      It is notable that the Cre+ male mice did not show increased numbers of immature neurons at the time of chronic seizures but Cre+ females did. It is possible that there was a “ceiling” effect in DCX expression that would explain why male Cre+ mice did not have a significant increase in immature neurons relative to male Cre- mice.

      (8) In Suppl. Fig 6, the authors should include DCX immunolabeling quantification from conditional Cre+ male mice used in this study, rather than showing data from a previous publication.

      We have made this revision.

      (9) In Fig 8, please also include Fluorojade-C staining and quantification for male mice.

      The additional data for males have been added to part D.

      (10) Page 13: Please specify in the first paragraph of the discussion that findings were specific to female mice with pre-insult increases in adult-born neurogenesis.

      This has been done.

      Minor:

      (11) In Fig. 1 and suppl. figure 1, please clarify whether traces are from male or female mice.

      We have clarified.

      (12) Please be consistent with indicating whether immunolabeling images are from female or male mice.

      a. Fig 5B images labeled as from "Cre- Females" and "Cre+ Females".

      b. Suppl. Fig 8: Images labeled as "Cre- F" and "Cre+ F".

      c. Fig 6: sex not specified.

      d. Fig. 7: sex only specified in the figure legend.

      e. Fig 8: only female mice were included in these experiments, but this is not clear from the figure title or legend.

      We revised all figures according to the comments.

      (13) Page 4: the last paragraph of the introduction belongs within the discussion section.

      We recognize there is a classic view that any discussion of Results should not be in the Introduction. However, we find that view has faded and more authors make a brief summary statement about the Results at the end of the Introduction. We would like to do so because it allows readers to understand the direction of the study at the outset, which we find helpful.

      (14) Page 6: The sentence "The data are consistent with prior studies..." is unnecessary.

      We have removed the text.

      (15) Suppl. Fig 6A: Please include representative images of normal condition DCX immunolabeling.

      We have added these data. There is an image of a Cre- female, Cre+ female, Cre- male and Cre+ male in the new figure, Supplemental Figure 6. All mice had tamoxifen at 6 weeks of age and were perfused 6 weeks later. None of the mice had pilocarpine.

      (16) In Suppl. Fig 7C, I believe the authors mean "no loss of hilar mossy and SOM cells" instead of "loss of hilar mossy and SOM cells".

      This Figure was removed because of the input from Reviewer 1 suggesting it was too speculative.

      Reviewer #1 (Recommendations For The Authors):

      (1) The main claim of the study is that increasing adult neurogenesis decreases chronic seizures. However, to quantify adult-born neurons, DCX immunoreactivity is used as the sole metric to determine neurogenesis. This is insufficient as changes in DCX-expressing cells could also be an indicator of altered maturation, survival, and/or migration, not proliferation per se. To claim that increasing adult neurogenesis is associated with a reduction of chronic seizures, the authors should perform a pulse/chase (birth dating) experiment with BrdU and co-labeling with DCX.

      We think that increased DCX does reflect increased adult neurogenesis. However, we agree that one does not know if it was due to increased proliferation, survival, etc. We also note that this mouse line has been studied thoroughly to show there was increased neurogenesis with BrdU, Ki67 and DCX. We mention that paper in the revised text:

      Methods, starting on line 786:

      It was shown that after tamoxifen injection in adult mice there is an increase in dentate gyrus neurogenesis based on studies of bromo-deoxyuridine, Ki67, and doublecortin (Sahay et al., 2011).

      (2) As mentioned above, analysis of DCX staining alone months after TAM injections is limited. Instead, the cells could be labelled by BrdU prior to TAM injection, following which quantification of BrdU+/Prox1+ cells at 6 weeks post TAM injection should be performed in Cre+ and Cre- mice (males and females) to yield the rate of neurogenesis increase.

      We respectfully disagree that birthdating cells is critical. Using DCX staining just before SE, we know the size of the population of cells that are immature at the time of SE. This is what we think is most important because these immature neurons are those that appear to affect SE, as we have already shown.

      (3) To confirm the source of the hilar Prox1+ cells, a dual BrdU/EdU labeling approach would be beneficial. BrdU injection could be given before TAM injection and EdU injection before pilocarpine to label different cohorts of neural stem cells. Co-staining with Prox1 at different time points will help in identifying the origin of hilar ectopic cells.

      We are grateful for the ideas of the Reviewer. We hesitate to do these experiments now because it seems like a new study to find out where hilar granule cells come from.

      REFERENCES

      Bumanglag AV, Sloviter RS (2018) No latency to dentate granule cell epileptogenesis in experimental temporal lobe epilepsy with hippocampal sclerosis. Epilepsia 59:2019-2034.

      Carballosa-Gonzalez MM, Munoz LJ, Lopez-Alburquerque T, Pardal-Fernandez JM, Nava E, de Cabo C, Sancho C, Lopez DE (2013) EEG characterization of audiogenic seizures in the hamster strain gash:Sal. Epilepsy Res 106:318-325.

      Cho KO, Lybrand ZR, Ito N, Brulet R, Tafacory F, Zhang L, Good L, Ure K, Kernie SG, Birnbaum SG, Scharfman HE, Eisch AJ, Hsieh J (2015) Aberrant hippocampal neurogenesis contributes to epilepsy and associated cognitive decline. Nat Commun 6:6606.

      Esmaeili B, Weisholtz D, Tobochnik S, Dworetzky B, Friedman D, Kaffashi F, Cash S, Cha B, Laze J, Reich D, Farooque P, Gholipour T, Singleton M, Loparo K, Koubeissi M, Devinsky O, Lee JW (2023) Association between postictal EEG suppression, postictal autonomic dysfunction, and sudden unexpected death in epilepsy: Evidence from intracranial EEG. Clin Neurophysiol 146:109-117.

      Gascoigne SJ, Waldmann L, Schroeder GM, Panagiotopoulou M, Blickwedel J, Chowdhury F, Cronie A, Diehl B, Duncan JS, Falconer J, Faulder R, Guan Y, Leach V, Livingstone S, Papasavvas C, Thomas RH, Wilson K, Taylor PN, Wang Y (2023) A library of quantitative markers of seizure severity. Epilepsia 64:1074-1086.

      Hahn T et al. (2023) Towards a network control theory of electroconvulsive therapy response. PNAS Nexus 2:pgad032.

      Jain S, LaFrancois JJ, Botterill JJ, Alcantara-Gonzalez D, Scharfman HE (2019) Adult neurogenesis in the mouse dentate gyrus protects the hippocampus from neuronal injury following severe seizures. Hippocampus 29:683-709.

      Jung KH, Chu K, Lee ST, Kim J, Sinn DI, Kim JM, Park DK, Lee JJ, Kim SU, Kim M, Lee SK, Roh JK (2006) Cyclooxygenase-2 inhibitor, celecoxib, inhibits the altered hippocampal neurogenesis with attenuation of spontaneous recurrent seizures following pilocarpine-induced status epilepticus. Neurobiol Dis 23:237-246.

      Kanner AM, Trimble M, Schmitz B (2010) Postictal affective episodes. Epilepsy Behav 19:156-158.

      Karl S, Sartorius A, Aksay SS (2024) No effect of serum electrolyte levels on electroconvulsive therapy seizure quality parameters. J ECT 40:47-50.

      Kavakbasi E, Stoelck A, Wagner NM, Baune BT (2023) Differences in cognitive adverse effects and seizure parameters between thiopental and propofol anesthesia for electroconvulsive therapy. J ECT 39:97-101.

      Kommajosyula SP, Randall ME, Tupal S, Faingold CL (2016) Alcohol withdrawal in epileptic rats - effects on postictal depression, respiration, and death. Epilepsy Behav 64:9-14.

      Krishnan GP, Bazhenov M (2011) Ionic dynamics mediate spontaneous termination of seizures and postictal depression state. J Neurosci 31:8870-8882.

      Langroudi ME, Shams-Alizadeh N, Maroufi A, Rahmani K, Rahchamani M (2023) Association between postictal suppression and the therapeutic effects of electroconvulsive therapy: A systematic review. Asia Pac Psychiatry 15:e12544.

      Mazzuferi M, Kumar G, Rospo C, Kaminski RM (2012) Rapid epileptogenesis in the mouse pilocarpine model: Video-EEG, pharmacokinetic and histopathological characterization. Exp Neurol 238:156-167.

      Medvedeva TM, Sysoeva MV, Sysoev IV, Vinogradova LV (2023) Intracortical functional connectivity dynamics induced by reflex seizures. Exp Neurol 368:114480.

      Riljak V, Maresova D, Jandova K, Bortelova J, Pokorny J (2012) Impact of chronic ethanol intake of rat mothers on the seizure susceptibility of their immature male offspring. Gen Physiol Biophys 31:173-177.

      Sahay A, Scobie KN, Hill AS, O'Carroll CM, Kheirbek MA, Burghardt NS, Fenton AA, Dranovsky A, Hen R (2011) Increasing adult hippocampal neurogenesis is sufficient to improve pattern separation. Nature 472:466-470.

      Scharfman HE, Goodman JH, Sollas AL (2000) Granule-like neurons at the hilar/CA3 border after status epilepticus and their synchrony with area CA3 pyramidal cells: Functional implications of seizure-induced neurogenesis. J Neurosci 20:6144-6158.

      Singh B, Singh D, Goel RK (2012) Dual protective effect of passiflora incarnata in epilepsy and associated post-ictal depression. J Ethnopharmacol 139:273-279.

      Smith ZZ, Benison AM, Bercum FM, Dudek FE, Barth DS (2018) Progression of convulsive and nonconvulsive seizures during epileptogenesis after pilocarpine-induced status epilepticus. J Neurophysiol 119:1818-1835.

      Sun MY, Yetman MJ, Lee TC, Chen Y, Jankowsky JL (2014) Specificity and efficiency of reporter expression in adult neural progenitors vary substantially among nestin-creer(t2) lines. J Comp Neurol 522:1191-1208.

      Uva L, de Curtis M (2020) Activity- and ph-dependent adenosine shifts at the end of a focal seizure in the entorhinal cortex. Epilepsy Res 165:106401.

      Vilan A, Grangeia A, Ribeiro JM, Cilio MR, de Vries LS (2024) Distinctive amplitude-integrated EEG ictal pattern and targeted therapy with carbamazepine in kcnq2 and kcnq3 neonatal epilepsy: A case series. Neuropediatrics 55:32-41.

      Zhao C, Tang Y, Xiao Y, Jiang P, Zhang Z, Gong Q, Zhou D (2024) Asymmetrical cortical surface area decrease in epilepsy patients with postictal generalized electroencephalography suppression. Cereb Cortex 34.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Comment 1: One of the only demonstrations of the expression and physiological significance of TRPCs in VTA DA neurons was published by (Rasmus et al., 2011; Klipec et al., 2016) which are not cited in this paper. In their study, TRPC4 expression was detected in a uniformly distributed subset of VTA DA neurons, and TRPC4 KO rats showed decreased VTA DA neuron tonic firing and deficits in cocaine reward and social behaviors. Update: The authors say they have added a discussion of these papers, but I do not see it in the updated manuscript.

      We thank the reviewer for the suggestion. A discussion of these papers has been added (lines 557-565).

      Comment 2: The authors should report the results (exact data values) of female mice in the results text, or pool the male and female data if the sex differences are not significant.

      We agree with the reviewer. Some experiments were repeated in female mice, and the data from male and female mice are now reported in the Results text.

      Comment 3: The selectivity of drugs should be referred as "selective" rather than "specific". 

      Thanks, “specific” has been changed to “selective”.  

      Comment 4: Line 62: typo, "substantia nigra". 

      Thanks, “substantial nigra” has been changed to “substantia nigra” in line 65.  

      Comment 5: Line 77: some new studies suggest that NALCN might have voltage dependency (rectification).

      Thanks, the description of NALCN voltage dependence has been corrected in lines 81-83.

      Comment 6: Line 175: change "less" to "fewer". 

      Thanks, “less” has been changed to “fewer”.

      Comment 7: Line 299: choose one - "was not ... or" or "was neither ... nor". 

      Thanks, this error has been corrected. 

      Comment 8: In Figure 1Aii and Figure 3Bi, it was not specified in the results text or figure legend that C1-C5 represent individual cells until the legend for Figure 4.

      Thanks, these descriptions of the gels have been added to the figure legends.

      Reviewer #2 (Public Review): 

      Comment 1: From the previous review, we mentioned that " 'The HCN' as written in line 69 is a bit misleading, as HCN channels in the heart and brain are different members of a family of channels, although as written in the text, it seems that they are identical." This is still the case (now line 73).

      We agree with the reviewer’s comment. The introduction of HCN channels has been corrected (lines 74-78).

      Comment 2: The authors state in line 112 that "most of the experiments were also repeated in female mice" - this is true in the case of most electrophysiological experiments, although not behavioral experiments. Authors should amend the statement in line 112 and clarify in the Discussion section which findings are generalizable between sexes; e.g.:

      a.  Discussion of HCN contribution to VTA DA activity (beginning line 453) should clarify male mice. 

      b.  Similarly, any discussion of behavioral findings should clarify male mice. 

      We agree with the reviewer’s comments. The sex of the mice used is now noted in the Results and Discussion.

      Comment 3: The authors' statement in lines 179-183 ("In contrast, fewer GABAergic neuronal markers (Glutamic acid decarboxylase, GAD1/2 and vesicular GABA transporter, VGAT) co-expressed with the DA neurons, which is consistent with previous studies that VTA DA neurons co-expressing GABAergic neuronal markers mainly project to the lateral habenula") is a little confusing - as stated, it seems that the authors are confirming DA/GABA coexpression in VTA-LHb neurons, which is not the case.

      We agree with the reviewer’s comment and have corrected this statement (lines 182-186).

      Comment 4: Additional information could be included in the Methods section description of Western Blotting procedures - e.g., what thickness of tissue and what size gauge were used to dissect VTA for these experiments?

      Thanks. A description of the tissue has been added to the Western blotting procedures.

      Comment 5:

      a. Grammatical errors in line 23 of Abstract (also lines 31-32)

      b. "drove" should read "strove" in line 92 

      c. Grammatical errors in lines 401, 444, and 448 

      We thank the reviewer for pointing out grammatical errors and we corrected them.

      Reviewer #3 (Public Review): 

      Comment 1: The main strength of this study lies in a comprehensive bottom-up approach ranging from patch-clamp recordings to behavioral tasks. These tasks mainly address anxiety-like behaviors and so-called depression-like behaviors (sucrose choice, forced swim test, tail suspension test). The results gathered by means of these procedures are clear-cut. However, the reviewer believes that the authors should be more cautious when interpreting immobility responses to stress (forced swim, tail suspension) as "depression-like" responses. These stress models have been routinely used (and validated) in the past to detect the antidepressant properties of compounds under investigation, which by no means indicates that these are depression models. For readers interested in this debate, I suggest reading e.g. De Kloet and Molendijk (Biol. Psychiatry 2021).

      We thank the reviewer for the suggestion. We will be more careful and rigorous in the selection of stress models in our subsequent research work.

      Editor's note:

      Should you choose to revise your manuscript, please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.

      We have added the full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals into the results and the figure legends of the revised manuscript.

    1. Author response:

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      In Drosophila melanogaster, expression of Sex-lethal (Sxl) protein determines sexual identity and drives female development. Functional Sxl protein is absent from males where splicing includes a termination codon-containing "poison" exon. Early during development, in the soma of female individuals, Sxl expression is initiated by an X chromosome counting mechanism that activates the Sxl establishment promoter (SxlPE) to produce an initial amount of Sxl protein. This then suppresses the inclusion of the "poison" exon, directing the constructive splicing of Sxl transcripts emerging from the Sxl maintenance promotor (SxlPM) which is activated at a later stage during development irrespective of sex. This autoregulatory loop maintains Sxl expression and commits to female development. 

      Sxl also determines the sexual identity of the germline. Here Sxl expression generally follows the same principles as in somatic tissues, but the way expression is initiated differs from the soma. This regulation has so far remained elusive. 

      In the presented manuscript, Goyal et al. show that activation of Sxl expression in the germline depends on additional regulatory DNA sequences, or sequences different from the ones driving initial Sxl expression in the soma. They further demonstrate that sisterless A (sisA), a transcription factor that is required for activation of Sxl expression in the soma, is also necessary, but not sufficient, to initiate the expression of functional Sxl protein in female germ cells. sisA expression precedes Sxl induction in the germline and its ablation by RNAi results in impaired expression of Sxl, formation of ovarian tumors, and germline loss, phenocopying the loss of Sxl. Intriguingly, this phenotype can be rescued by the forced expression of Sxl, demonstrating that the primary function of sisA in the germline is the induction of Sxl expression. 

      Strengths: 

      The clever design of probes (for RNA FISH) and reporters allowed the authors to dissect Sxl expression from different promoters to get novel insight into sex-specific gene regulation in the germline. All experiments are carefully controlled. Since Sxl regulation differs between the soma and the germline, somatic tissues provide elegant internal controls in many experiments, ensuring e.g. functionality of the reporters. Similarly, animals carrying newly generated alleles (e.g. genomic tagging of the Sxl locus) are fertile and viable, demonstrating that the genetic manipulation does not interfere with protein function. The conclusions drawn from the experimental data are sound and advance our understanding of how Sxl expression is induced in the female germline. 

      Weaknesses: 

      The assays employed by the authors provide valuable information on when Sxl promoters become active. However, since no information on the stability of the gene products (i.e. RNA and protein) is available, it remains unclear when the SxlPE promoter is switched off in the germline (conceptually it only needs to be active for a short time period to initiate production of functional Sxl protein). As correctly stated by the authors, the persisting signals observed in the germline might therefore not reflect the continuous activity of the SxlPE promoter. 

      Mapping of regulatory elements and their function: SxlPE with 1.5 kb of flanking upstream sequence is sufficient to recapitulate early Sxl expression in the soma. The authors now provide evidence that beyond that, additional DNA sequences flanking the SxlPE promoter are required for germline expression. However, a more precise mapping was not performed. Also, due to technical limitations, the authors could not precisely map the sisA binding sites. Since this protein is also involved in the somatic induction of Sxl, its binding sites likely reside in the region 1.5kb upstream of the SxlPE promoter, which has been reported to be sufficient for somatic regulation. The regulatory role of the sequences beyond SxlPE-1.5kb therefore remains unaddressed and it remains to be investigated which trans-acting factor(s) exert(s) its/their function(s) via this region. 

      We agree that a more precise mapping of the essential elements within the 10.2 kb reporter is an important direction in which to proceed. Unfortunately, this is outside the scope of the current manuscript given current lab personnel. In regard to the 1.5 kb promoter that activates SxlPE in the soma, we do not feel that the SisA binding sites are necessarily in this region. It is important to note that, while the 1.5 kb promoter is sufficient for female-specific expression in the soma, it may not contain all of the regulatory elements that normally regulate PE from the endogenous locus. Activation of PE in the soma is thought to be regulated by a combination of positive-acting factors (SisA, SisB, etc.) and repressive factors (e.g. Dpn) that set a threshold for PE activation. Much more work would need to be done to determine whether all of these factors bind to the 1.5 kb promoter, or whether additional sequences are also involved to control the proper timing and robustness of normal Sxl PE activation in the soma.

      The central question of how Sxl expression is initiated and controlled in the germline still remains unanswered. Since sisA is zygotically expressed in both the male and the female germline (Figure 4D), it is unlikely the factor that restricts Sxl expression to the female germline. 

      X chromosome “counting” elements like SisA are always expressed in both males and females, but it is thought that the 2X dose of them in females activates PE, while the 1X dose in males does not. Thus, we do expect SisA to be expressed in both males and females, as we observed.

      How does weak expression of Sxl in male tissues or expression above background after knockdown of sisA reconcile with the model that an autoregulatory feedback loop enforces constant and clonally inheritable Sxl expression once Sxl is induced? Is the current model for Sxl expression too simple or are we missing additional factors that modulate Sxl expression (such as e.g. Sister of Sex-lethal)? While I do not expect the authors to answer these questions, I would expect them to appropriately address these intriguing aspects in the discussion. 

      It is difficult to know what is “background” and what is actual weak Sxl expression in males. We agree that, if it is real, it is mysterious why it does not activate autoregulation of the Sxl PM transcript. And yes, the current model for female-specific expression of Sxl in the soma may well be incomplete. Sxl PM transcript is present in the testis based on community RNA-seq data and our own analysis of male vs. female bam-mutant gonads (PMID 31329582), but it is at lower levels. Whether the lower level in the testis is due to tissue differences or sex-specific regulation of RNA levels is unknown. Our observations that the HA-tagged Sxl Early protein remains present in somatic cells in L1 larvae, and that GFP expression from the 10.2 kb Sxl PE-GFP can be detected in the soma until L2, could either be due to perdurance of the protein products or to continued sex-specific expression of PE long after the time that it was thought to shut off. This is also long after dosage compensation should have equalized X chromosome gene expression, meaning that X chromosomes can no longer be “counted” by factors like SisA and SisB. Thus, sex-specific expression of PE at this time would require another mechanism besides the current model (such as feedback regulation of Sxl PE transcription from downstream factors).

      Reviewer #2 (Public Review): 

      Summary: 

      The authors wanted to determine whether cis-acting factors of Sxl - two different Sxl promoters in somatic cells - regulate Sxl in a similar way in germ cells. They also wanted to determine whether trans-acting factors known to regulate Sxl in the soma also regulate Sxl in the germline. 

      Regarding the cis-acting factors, they examine the Sxl "establishment promoter" (SxlPE) that is activated in female somatic cells by the presence of two X chromosomes. Slightly later in development, dosage compensation equalizes X chromosome expression in males and females and so X chromosomes can no longer be counted. The second Sxl promoter is the "maintenance promoter" (SxlPM), which is activated in both sexes. The mRNA produced from the maintenance promoter has to be alternatively spliced by the early Sxl protein generated earlier in development from the PE. This leads to an autoregulatory loop that maintains Sxl expression in female somatic cells. The authors used fluorescent in situ hybridization (FISH) with oligopaints to determine the temporal activation of the PE or PM promoters. They find that, unlike in the soma, the PE does not precede the PM and instead is activated contemporaneously with or later than the PM; this is confusing in light of the later results (see below). Next, they generated transcriptional reporter constructs containing large segments of the Sxl locus: the 1.5 kb used in somatic studies, a 5.2 kb reporter, and a 10.2 kb reporter. Interestingly, expression of the 1.5 kb reporter, which was reported to recapitulate Sxl expression in soma and germline, was not observed by the authors. The 5.2 kb reporter was observed in female somatic cells but not in germ cells. Only when they included an additional 5 kb downstream of the 5.2 kb reporter (here the 10.2 kb reporter) did they see expression in germ cells, but this occurred at the L1 stage. Their data indicate that Sxl activity in the germline requires different cis-regulation than in the soma and that the PE is activated later in germ cells than in somatic cells. The authors next use gene editing to insert epitope tags in two distinct strains in the hopes of creating an early Sxl and a later Sxl protein derived from the PE and PM, respectively. The HA-tagged protein from the PE was seen in somatic cells but never in the germline, possibly due to very low expression. The FLAG-tagged late Sxl protein is observed in L2 germ cells. Because the early HA-Sxl protein is not perceptible in germ cells, it is not possible to conclude its role in the germline. However, because late FLAG-Sxl was only observed in L2 germ cells and the PE was detected in L1, this leaves open the possibility that PE produces early HA-Sxl (which currently cannot be detected), which then alternatively splices the transcript from the PM. In other words, the soma and germline could have a similar temporal relationship between the two Sxl promoters. While I agree with the authors about this conclusion, the earlier work with the oligopaints leads to the conclusion that PE is active after PM. This is confusing.

      The temporal relationship between Sxl PE and Sxl PM in the germline is indeed confusing. One source of confusion comes from whether one is discussing Sxl protein production or promoter activity. As the reviewer nicely summarizes, our transcription analysis with oligopaints indicates that, unlike in the soma, Sxl PE is NOT on in the germline prior to PM. Our other data indicate that PE is instead likely only active well after transcription from PM has begun. However, this still means that the temporal order of the EARLY and LATE Sxl proteins can be the same as in the soma. Even if PM is active well before PE in the germline, the PM transcript cannot produce any functional protein unless it is alternatively spliced by the Sxl protein (Sxl autoregulation). Thus, even if PM is active before PE in the germline, we would not expect to observe any LATE Sxl protein until the PE promoter comes on and produces a pulse of EARLY Sxl protein. The fact that we observe LATE Sxl protein at L2 is consistent with our observation that the 10.2 kb Sxl PE reporter is active at L1. We will attempt to explain all of this better in a revised manuscript.

      Next, the authors wanted to turn their attention to the trans-acting factors that regulate Sxl in the soma, including Sisterless A (SisA), SisB, Runt, and the JAK/STAT ligand Unpaired. Using germline RNAi, the authors found that only knockdown of SisA causes ovarian tumors, similar to the loss of Sxl, suggesting that SisA regulates Sxl (i.e. the PE) in both the soma and the germline. They generated a SisA null allele using CRISPR/Cas9 and these animals had ovarian tumors and germ cell-less ovaries. FISH revealed that sisA is activated in primordial germ cells in stages 3-6 before the activation of Sxl. They used CRISPR-Cas9 to generate an endogenously-tagged SisA and found that tagged SisA was expressed in stage 3-6 PGCs, which is consistent with activating PE in the germline. They showed that sisA is upstream of Sxl, as germline depletion of sisA led to a significant decrease in expression from the 10.2 kb PE reporter and in Sxl protein. The authors could rescue the ovarian tumors and loss of Sxl protein upon germline depletion of sisA by supplying Sxl from another promoter (the otu promoter). These data indicate that sisA is necessary for Sxl activation in the germline. However, ectopic sisA in germ cells in the testis did not lead to ectopic Sxl, suggesting that sisA is not sufficient to activate Sxl in the germline.

      Strengths: 

      (1) The genetic and genomic approaches in this study are top-notch and they have generated reagents that will be very useful for the field. 

      (2) Excellent use of powerful approaches (oligo paint, reporter constructs, CRISPR-Cas9 alleles). 

      (3) The combination of state of art approaches and quantification of phenotypes allows the authors to make important conclusions. 

      Weaknesses: 

      (1) Confusion in line 127 (this indicates that SxlPE is not activated before SxlPM in the germline) about PE not being activated before the PM in the germline when later figures show that PE is activated in L1 and late Sxl protein is seen in L2. It would be helpful to the readers if the authors edited the text to avoid this confusion. Perhaps more explanation of the results at specific points would be helpful. 

      We agree--see response above.

      Reviewer #3 (Public Review): 

      Summary: 

      The mechanisms governing the initial female-specific activation of Sex-lethal (Sxl) in the soma, the subsequent maintenance of female-specific expression and the various functions of Sxl in somatic sex determination and dosage compensation are well documented. While Sxl is also expressed in the female germline where it plays a critical role during oogenesis, the pathway that is responsible for turning Sxl on in germ cells has been a long-standing mystery. This manuscript from Goyal et al describes studies aimed at elucidating the mechanism(s) for the sex-specific activation of the Sex-lethal (Sxl) gene in the female germline of Drosophila. 

      In the soma, the Sxl establishment promoter, Sxl-Pe, is regulated in pre-cellular blastoderm embryos in somatic cells by several X-linked transcription factors (sis-a, sis-b, sis-c and runt). At this stage of development, the expression of these transcription factors is proportional to gene dose, 2x in females and 1x in males. The cumulative two-fold difference in the expression of these transcription factors is sufficient to turn Sxl-Pe on in female embryos. Transcripts from the Sxl-Pe promoter encode an "early" version of the female Sxl protein, and they function to activate a positive splicing autoregulatory loop by promoting the female-specific splicing of the initial pre-mRNAs derived from the Sxl maintenance promoter, Sxl-Pm (which is located upstream of Sxl-Pe). These female Sxl-Pm mRNAs encode a Sxl protein with a different N-terminus from the Sxl-Pe mRNAs, and they function to maintain female-specific splicing in the soma during the remainder of development.

      In this manuscript, the authors are trying to understand how the Sxl-Pm positive autoregulatory loop is established in germ cells. If Sxl-Pe is used and its activation precedes Sxl-Pm as is true in the soma, they should be able to detect Sxl-Pe transcripts in germ cells before Sxl-Pm transcripts appear. To test this possibility, they generated RNA FISH probes complementary to the Sxl-Pe first exon (which is part of an intron sequence in the Sxl-Pm transcript) and to a "common sequence" that labels both Sxl-Pe and Sxl-Pm transcripts. Transcripts labeled by both probes were detected in germ cells beginning at stage 5 (and reaching a peak at stage 10), so either the Sxl-Pm and Sxl-Pe promoters turn on simultaneously, or Sxl-Pe is not active. 

      They next switched to Sxl-Pe reporters. The first Sxl-Pe:gfp reporter they used has a 1.5 kb upstream region which in other studies was found to be sufficient to drive sex-specific expression in the soma of blastoderm embryos. Also like the endogenous Sxl gene it is not expressed in germ cells at this early stage. In 2011, Hashiyama et al reported that this 1.5 kb promoter fragment was able to drive gfp expression in Vasa-positive germ cells later in development in stage 9/10 embryos. However, because of the high background of gfp in the nearby soma, their result wasn't especially convincing. Though they don't show the data, Goyal et al indicated that unlike Hashiyama et al they were unable to detect gfp expressed from this reporter in germ cells. Goyal et al extended the upstream sequences in the reporter to 5 kb, but they were still unable to detect germline expression of gfp. 

      Goyal et al then generated a more complicated reporter which extends 5 kb upstream of the Sxl-Pe start site and 5 kb downstream, ending at or near the 4th exon of the Sxl-Pm transcript (the Sxl-Pe10 kb reporter). (The authors were not explicit as to whether the 5 kb downstream sequence extended beyond the 4th exon splice junction, in which case splicing could potentially occur with an upstream exon(s), or terminated prior to the splice junction, as seems to be indicated in their diagram.) With this reporter, they were able to detect sex-specific gfp expression in the germline beginning in L1 (first instar larva). With the caveat that gfp detection might be delayed compared to the onset of reporter activation, these findings indicated that the sequences in the reporter are able to drive sex-specific transcription in the germline at least as early as L1.

      The authors next tagged the N-terminal end of the Sxl-Pe protein with HA (using CRISPR/Cas9) and the N-terminal end of the Sxl-Pm protein with Flag. They report that the HA-Sxl-Pe protein is first detected in the soma at stage 9 of embryogenesis. Somatic HA-Sxl-Pe protein persists into L1, but is no longer detected in L2. However, while somatic HA-Sxl-Pe protein is detected, they were unable to detect HA-Sxl-Pe protein in germ cells. In the case of Flag-Sxl-Pm, it could first be detected in L2 germ cells, indicating that at this juncture the Sxl-positive autoregulatory loop has been activated. This contrasts with Sxl-Pm transcripts which are observed in a few germ cells at stage 5 of embryogenesis, and in most germ cells by stage 10. The authors propose (based on the expression pattern of the Sxl-Pe10kb reporter and the appearance of Flag-Sxl-Pm protein) that Sxl-Pe comes on in germ cells in L1, and that the Sxl-Pe protein activates the female splicing of Sxl-Pm transcripts, giving detectable Flag-Sxl-Pm proteins beginning in L2. 

      To investigate the signals that activate Sxl-Pe in germ cells, the authors tested four of the X-linked genes (sis-a, sis-b, sis-c, and runt) that function to activate Sxl-Pe in the soma in early embryos. RNAi knockdown of sis-b, sis-c, and runt had no apparent effect on oogenesis. In contrast, knockdown of sis-a resulted in tumorous ovaries, a phenotype associated with Sxl mutations. (Three different RNAi transgenes were tested; two gave this phenotype, the third did not.) Sxl-Pe10kb reporter activity in L1 female germ cells is also dependent on sis-a. 

      Several approaches were used to confirm a role for sis-a in a) oogenesis and b) the activation of the Sxl-Pm autoregulatory loop. They showed that sis-a germline clones (using tissue-specific CRISPR/Cas9 editing) resulted in the tumorous ovary phenotype and reduced the expression of Sxl protein in these ovaries. They found that sis-a transcripts and GFP-tagged Sis-A protein are present in germ cells. Finally, they showed that the tumorous ovary phenotype induced by germline RNAi knockdown of sis-a can be partially rescued by expressing Sxl in the germ cells. 

      Critique: 

      While this manuscript addresses a longstanding puzzle (the mechanism activating the Sxl autoregulatory loop in female germ cells) and likely identified an important germline transcriptional activator of Sxl, sis-a, the data that they've generated don't make a compelling story. At every step, there are puzzle pieces that don't fit the narrative. In addition, some of their findings are inconsistent with many previous studies. 

      We respect and appreciate this reviewer for the detailed comments. However, we feel that the claim that our work doesn’t “make a compelling story” and that many “pieces…don’t fit the narrative” is incorrect. The main issue that this reviewer raises is that we do not know if Sxl “early” transcription in the germline initiates from the Pe promoter. This is true, which we fully acknowledge, but the detail of whether “germline early” transcription of Sxl initiates from Pe or from another, as yet undefined, germline promoter does not affect the main conclusions of the paper. These conclusions are that 1) regulation of Sxl in the germline is fundamentally different from that in the soma and 2) despite point (1), sisA acts as an activator of Sxl in both the soma and the germline. Neither of these main points is disputed by this reviewer.

      (1) The authors used RNA FISH to time the expression of Sxl-Pe and Sxl-Pm transcripts in germ cells. Transcripts complementary to Sxl-Pe and Sxl-Pm were detected at the same time in embryos beginning at stage 5. This is not a definitive experiment as it could mean a) that Sxl-Pe and Sxl-Pm turn on at the same time, b) that Sxl-Pe comes on after Sxl-Pm (as suggested by the Sxl-Pe10kb reporter) or c) Sxl-Pe never comes on. 

      When designing this experiment, we wanted to test whether the “soma model” of Pe activation before Pm was also true in the germ cells. Our data clearly demonstrate that transcripts beginning downstream of Pe are not expressed prior to transcripts beginning downstream of Pm. Thus, we can state that the “soma model” of Pe first and then Pm does not occur in the germline, which is very interesting. However, we cannot make any other conclusions about Pe in the germline from these data, as the reviewer indicates.

      (2) Hashiyama et al reported that they detected gfp expression in stage 9/10 germ cells from a 1.5 kb Sxl-Pe-gfp. As noted above, this result wasn't entirely convincing and thus it isn't surprising that Goyal et al were unable to reproduce it. Extending the upstream sequences to just before the 1st exon of Sxl-Pm transcripts also didn't give gfp expression in germ cells. Only when they added 5 kb downstream did they detect gfp expression. However, from this result, it isn't possible to conclude that the Sxl-Pe promoter is actually driving gfp expression in L1 germ cells. Instead, the Sxl promoter active in the germ line could be anywhere in their 10 kb reporter. 

      We agree that we have not determined the transcriptional start sites for Sxl in the germline and it is possible that the 10.2 kb reporter uses a different promoter than Pe, as long as that transcript can also be spliced into exon 4 where the GFP tag has been placed. The three types of experiments conducted (FISH to regions of the nascent transcripts, tagged versions of the different predicted ORFs, and promoter-GFP constructs) are extensive, but all have different limitations. Indeed, it would be challenging to determine the transcription start sites in the germline, as it would require obtaining enough L1 larvae to be able to dissociate the animals, or isolated gonads, into single cells in order to FACS purify the germ cells for RACE or long-read sequencing (we are not sure that L1 larval single-nucleus sequencing would be enough for calling start sites). Otherwise, there would be no way to determine if expected or unexpected transcripts came from the soma or the germline. We can consider these experiments in the future.

      Fortunately, the main conclusions from this paper do not require knowing whether the germline uses Pe or some other “germline early” promoter that can produce Sxl protein in the absence of autoregulation by existing Sxl protein. The observations that a nascent transcript including the region downstream of Pm is observed in embryonic germ cells, but that the tagged LATE protein is not observed until L2, suggest that the transcript produced in early germ cells cannot produce a functional protein. This is consistent with the need for Sxl autoregulation of the Pm transcript in the germline as in the soma, as was previously thought. This is further supported by the observations that activity of the 10.2 kb reporter is only observed in L1 germ cells, and that the LATE Sxl protein is only observed in germ cells after this point. Thus, we can conclude that either Pe, or another “germline early” promoter, acts to produce female-specific Sxl protein to initiate autoregulation of Sxl splicing and protein production in the germline. We feel that this is a significant advance for the field, and we will make it more clear in the text that the initial expression of Sxl in the germline may not be from the Pe promoter.

      Other conclusions of the manuscript are unaffected by the start site for “germline early” Sxl transcription, including that the germline activates Sxl protein expression much later than the soma, which calls into question previous work indicating an early role for Sxl in the germline. Also unaffected is our conclusion that different enhancer sequences are required for activation of Sxl expression in the germline than in the soma, consistent with previous work demonstrating that the genetics of Sxl activation in the germline are different than in the soma. Lastly, our conclusions that sisA acts upstream of Sxl, and is required for Sxl germline expression, either directly or indirectly, are also unaffected by the nature of the Sxl “germline early” start site.

      (3) At least one experiment suggests that Sxl-Pe never comes on in germ cells. The authors tagged the N-terminus of the Sxl-Pe protein with HA and the N-terminus of the Sxl-Pm protein with Flag. Though they could detect HA-Sxl-Pe protein in the soma, they didn't detect it in germ cells. On the other hand, the Flag-Sxl-Pm protein was detected in L2 germ cells (but not earlier). These results would more or less fit with those obtained for the 10 kb reporter and would support the following model: Prior to L1, Sxl-Pm transcripts are expressed and spliced in the male pattern in both male and female germ cells. During L1, Sxl protein expressed via a mechanism that depends upon a 10 kb region spanning Sxl-Pe (but not on Sxl-Pe) is produced and by L2 there are sufficient amounts of this protein to switch the splicing of Sxl-Pm transcripts from a male to a female pattern-generating Flag-tagged Sxl-Pm protein. 

      As described above, it is indeed possible that another promoter besides Pe is active as the “germline early” promoter. We will make this more clear in a revised version, but the major conclusions of the manuscript are unaffected.

      (4) The 10kb reporter is sex-specific, but not germline-specific. The levels of gfp in female L1 somatic cells are equal to if not greater than those in L1 female germ cells. That the Sxl-Pe10kb reporter is active in the soma complicates the conclusion that it represents a germline-specific promoter. Germline activity is, however, sensitive to sis-a knockdowns, which is a plus. Presumably, somatic expression of the reporter wouldn't be sensitive to a (late) sis-a knockdown, but this wasn't shown. 

      We are confused by this comment because we do not conclude that Pe is a germline-specific promoter. Pe is known to be expressed in the soma, from considerable previous work cited by this reviewer, and the simplest model is that Pe is used in both the soma and the germline, as reflected by our 10.2 kb reporter. It is actually quite interesting how late this promoter seems active in the soma, contrary to current dogma, but we did not study somatic activation of Sxl in this work.

      (5) Their results with the HA-Sxl-Pe protein don't fit with many previous studies, assuming that the authors have explained their results properly. They report that HA-Sxl-Pe protein is first detected in the soma at stage 9 of embryogenesis and that it then persists till L2. However, previous studies have shown that Sxl-Pe transcripts and then Sxl-Pe proteins are first detected in ~NC11-NC12 embryos. In RNase protection experiments, the Sxl-Pe exon is observed in 2-4 hr embryos, but not detected in 5-8 hr, 14-12 hr, L1, L2, L3, or pupae. Northerns give pretty much the same picture. Western blots also show that Sxl-Pe proteins are first detectable around the blastoderm stage. So it is not at all clear why HA-Sxl-Pe proteins are first observed at stage 9 which, of course, is well after the time that the Sxl-Pm autoregulatory loop is established. 

      Given the obvious problems with the initial timing of somatic expression described here, it is hard to know what to make of the fact that HA-tagged Sxl-Pe proteins aren't observed in germ cells. 

      As for the presence of HA-Sxl-Pe proteins later than expected: While RNase protection/Northern experiments showed that Sxl-Pe mRNAs are expressed in 2-4 hr embryos and disappear thereafter, one could argue from the published Western experiments that the Sxl-Pe proteins expressed at the blastoderm stage persist at least until the end of embryogenesis, though perhaps at somewhat lower levels than at earlier points in development. So the fact that Goyal et al were able to detect HA-Sxl-Pe proteins in stage 9 embryos and later on in L1 larvae probably isn't completely unexpected. What is unexpected is that the HA-Sxl-Pe proteins weren't present earlier. 

      We thank the reviewer for this detailed analysis. Since we were not focused on somatic expression of Sxl in this work, it is possible that stage 9 was the earliest stage we observed in our experiments, rather than the earliest stage in which it is ever observed. We will repeat these experiments to verify when the HA-tagged early Sxl protein is first observed. However, these comments have no bearing on our conclusions about Sxl expression in the germline, which is the focus of this manuscript.

      (6) The authors use RNAi and germline clones to demonstrate that sis-A is required for proper oogenesis: when sis-A activity is compromised in germ cells, i) tumorous ovary phenotypes are observed and ii) there is a reduction in the expression of Sxl-Pm protein. They are also able to rescue the phenotypic effects of sis-a knockdown by expressing a Sxl-Pm protein. While the experiments indicating sis-a is important for normal oogenesis and that at least one of its functions is to ensure that sufficient Sxl is present in the germline stem cells seem convincing, other findings would make the reader wonder whether Sis-A is actually functioning (directly) to activate Sxl transcription from promoter X. 

      It is true that we do not know the binding specificity for SisA, which is why we have made no claims about the directness of SisA regulation of Sxl. This does not change our conclusions that sisA is upstream of Sxl activation, since loss of sisA function has a similar phenotype to loss of Sxl, loss of sisA blocks Sxl protein expression, and expression of Sxl rescues the sisA mutant phenotype.

      The authors show that sis-a mRNAs and proteins are expressed in stage 3-5 germ cells (PGCs). This is not unexpected as the X-linked transcription factors that turn Sxl-Pe on are expressed prior to nuclear migration, so their protein products should be present in early PGCs. The available evidence suggests that their transcription is shut down in PGCs by the factors responsible for transcriptional quiescence (e.g., nos and pgc) in which case transcripts might be detected in only one or two PGCs, which fits with their images. However, it is hard to believe that expression of Sis-A protein in pre-blastoderm embryos is relevant to the observed activation of the Sxl-Pm autoregulatory loop hours later in L2 larvae. 

      It is also not clear how the very low level of gfp-Sis-A seen in only a small subset of migrating germ cells in stage 10 embryos (Figure S6) would be responsible for activating the Sxl-Pe10kb reporter in L1. It seems likely that the small amount of protein seen in stage 10 embryos is left over from the pre-cellular blastoderm stage. In this case, it would not be surprising to discover that the residual protein is present in both female and male stage 10 germ cells. This would raise further doubts about the relevance of the gfp-Sis-A at these early stages. 

      In fact, given the evidence presented implicating sis-a in activating Sxl, (the germline activation of the Sxl-Pe10kb reporter, the RNAi knockdowns, and the germ cell-specific sis-a clones) it is clear that the sis-A RNAs and proteins seen in pre-cellular blastoderm PGCs aren't relevant. The germline clone experiment (and also the RNAi knockdowns) indicates that sis-A must be transcribed in germ cells after Cas9 editing has taken place. Presumably, this would be after transcription is reactivated in the germline (~stage 10) and after the formation of the embryonic gonad (stage 14) so that the somatic gonadal cells can signal to the germ cells. With respect to the reporter, the relevant time frame for showing that sis-A is present in germ cells would be even later in L1. 

      The reviewer is correct in wondering how early sisA transcription can affect late Sxl activation, and we are clear about this conundrum in our manuscript. However, they are incorrect about the early sisA expression. Our experiments examining nascent sisA transcripts indicate that sisA is zygotically expressed in the formed germ cells rather than being leftover from expression in early nuclei. The fact that only a portion of germ cells express sisA at any time may well be due to a timing issue, where not all germ cells express sisA at the same time. They are also incorrect about the timing of Cas9 editing in the germline—the guide RNAs are expressed from a general promoter that is active both maternally and in the early embryo, and the Cas9 RNA from the nos promoter is deposited in the germ plasm where it is translated long before cellularization, meaning that sisA CRISPR knockout can begin at the earliest stages of germ cell formation or before.

      (7) As noted above, the data in this manuscript do not support the idea that Sxl-Pe proteins activate the Sxl-Pm female splicing in the germline. Flybase indicates that there is at least one other Sxl promoter that could potentially generate a transcript that includes the male exon but still could encode a Sxl protein. This promoter, "Sxl-Px", is located downstream of Sxl-Pm and from its position it could have been included in the authors' 10 kb reporter. The reported splicing pattern of the endogenous transcript skips exon 2, and instead links an exon just downstream of Sxl-Px to the male exon. The male exon is then spliced to exon 4. If translation doesn't start and end at one of the small upstream ORFs in the exons close to Sxl-Px and the male exon, translation could begin with an AUG codon in exon 4 that is in frame with the Sxl protein coding sequence. This would produce a Sxl protein that lacks amino acid sequences from the N-terminus, but still retains some function. 

      Another possible explanation for how gfp is expressed from the 10 kb reporter is that the transcript includes the "z" exon described by Cline et al., 2010.

      As discussed above, the exact location of the start site for the Sxl transcript in the germline remains to be determined, but does not affect the main conclusions of the paper.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      First, all the experiments are performed in Jurkat T cells that may not recapitulate the regulation of polarization in primary T cells.

      To extend our results in Jurkat cells forming IS to primary cells, we have now performed experiments using synapses established by Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). These experiments clearly show the presence of FMNL1 at these two different IS classes (new Suppl. Fig. S7), similar to what was found in Jurkat-Raji synapses. In addition, since most of the experiments were performed in Jurkat cells, we have changed the title of our manuscript to be faithful to the main body of our results. New sentences dealing with this important issue have been included in the Results and Discussion sections.

      Moreover, all the experiments analyzing the role of PKCdelta are performed in one clone of wt or PKCdelta KO Jurkat cells. This is problematic since clonal variation has been reported in Jurkat T cells.

      The referee is right; this is why we studied three different control clones (C3, C9, C7) and three PKCdelta-interfered clones (P5, P6 and S4), all derived from the JE6.1 clone, and these results have been previously published (Herranz et al 2019)(Bello-Gamboa et al 2020). All these clones expressed similar levels of the relevant cell surface molecules and formed synaptic conjugates with similar efficiency (Herranz et al 2019). The P5, P6 and S4 clones exhibited a similar defect in MVB/MTOC polarization when compared with the control clones (Herranz et al 2019)(Bello-Gamboa et al 2020). Experiments performed by other researchers using a different clone of Jurkat (JE6.1) and primary CD4+ and CD8+ lymphocytes interfered in FMNL1 (Gomez et al. 2007) showed a defect in MTOC polarization comparable to that found in our control clones when they were transiently interfered in FMNL1 (Bello-Gamboa et al 2020, this manuscript). In this manuscript we have studied, instead of the canonical JE6.1 clone, the C3 and C9 control clones derived from JE6.1, since the puromycin-resistant control clones (containing a scramble shRNA) were isolated by limiting dilution together with the PKCdelta-interfered clones (Herranz et al. 2019); thus, C3 and C9 are the best possible controls to compare with the P5 and P6 clones. Please note that microsatellite analyses, available upon request, support the identity of our C3 clone with JE6.1. Moreover, when GFP-PKCdelta was transiently expressed in the three PKCdelta-interfered clones, MTOC/MVB polarization was recovered to control levels (Herranz et al. 2019). Therefore, the deficient MTOC/MVB polarization in all these clones is exclusively due to the reduction in PKCdelta expression (Herranz et al 2019), and thus clonal variation cannot underlie our results in stable clones. 
      We have now included new sentences to address this important point and to mention the inability of FMNL1betaS1086D to revert the deficient MTOC polarization occurring in the P6 PKCdelta-interfered clone, as occurred in the P5 clone. Because we have now included more figures and panels to satisfy the editor's and referees' comments, we have not included the dot plot data corresponding to the C9 and P6 clones, to avoid an overly long and repetitive manuscript. Since all the FMNL1 interference and FMNL1 variant re-expression experiments were performed in transient assays (2-4 days after transfection), there was no chance for any clonal variation in these short-time experiments. Moreover, internal controls using untransfected cells or Raji cells unpulsed with SEE were carried out in all these transient experiments.

      Finally, although convincing, the defect in the secretion of vesicles by T cells lacking phosphorylation of FMNL1beta on S1086 is preliminary. It would be interesting to analyze more precisely this defect. The expression of the CD63‑GFP in mutants by WB is not completely convincing. Are other markers of extracellular vesicles affected, e.g. CD3 positive?

      We acknowledge this comment. It is true that the mentioned results do not directly demonstrate the presence of exosomes at the synaptic cleft of the synapses, since the nanovesicles were harvested from the cell culture supernatants from synaptic conjugates and these nanovesicles could be produced by multi-directional degranulation of MVBs. To address this important issue, we have performed STED super-resolution imaging of the immune synapses made by control and FMNL1-interfered cells. Nanosized (100-150 nm) CD63+ vesicles can be found in the synaptic cleft between APC and control cells with polarized MVBs, whereas we could not detect these vesicles in the synaptic cleft from FMNL1-interfered cells that maintain unpolarized MVBs (new Fig. 10). New sentences have been included in the Results and Discussion dealing with this important point. Regarding the use of CD3 as a marker of extracellular vesicles, please note that CD3 is neither an enriched nor a specific marker of exosomes, since it is also present in plasma membrane shedding vesicles, molting vesicles from microvilli, apoptotic bodies and small cell fragments, apart from exosomes. We have therefore preferred to use the canonical exosome marker CD63 as a general exosome reporter readout for WB and immunofluorescence (MVBs, exosomes), time-lapse of MVBs (Suppl. Video 8) and super-resolution experiments (Fig. 10).

      Reviewer #2 (Public Review):

      Summary:

      The authors have addressed the role of S1086 in the FMNL1beta DAD domain in F-actin dynamics, MVB polarization, and exosome secretion, and investigated the potential implication of PKCdelta, which they had previously shown to regulate these processes, in FMNL1beta S1086 phosphorylation. This is based on:

      (1) the documented role of FMNL1 proteins in IS formation

      (2) their ability to regulate F-actin dynamics

      (3) the implication of PKCdelta in MVB polarization to the IS and FMNL1beta phosphorylation

      (4) the homology of the C-terminal DAD domain of FMNL1beta with FMNL2, where a phosphorylatable serine residue regulating its auto-inhibitory function had been previously identified.

      They demonstrate that FMNL1beta is indeed phosphorylated on S1086 in a PKCdelta-dependent manner and that S1086-phosphorylated FMNL1beta acts downstream of PKCdelta to regulate centrosome and MVB polarization to the IS and exosome release. They provide evidence that FMNL1beta accumulates at the IS where it promotes F-actin clearance from the IS center, thus allowing for MVB secretion.

      Strengths

      The work is based on a solid rationale, which includes previous findings by the authors establishing a link between PKCdelta, FMNL1beta phosphorylation, synaptic F-actin clearance, and MVB polarization to the IS. The authors have thoroughly addressed the working hypotheses using robust tools. Among these, of particular value is an expression vector that allows for simultaneous RNAi-based knockdown of the endogenous protein of interest (here all FMNL1 isoforms) and expression of wild-type or mutated versions of the protein as YFP-tagged proteins to facilitate imaging studies. The imaging analyses, which are the core of the manuscript, have been complemented by immunoblot and immunoprecipitation studies, as well as by the measurement of exosome release (using a transfected MVB/exosome reporter to discriminate exosomes secreted by T cells).

      Weaknesses

      The data on F-actin clearance in Jurkat T cells knocked down for FMNL1 and expressing wild-type FMNL1 or the non-phosphorylatable or phosphomimetic mutants thereof would need to be further strengthened, as this is a key message of the manuscript. Also, the entire work has been carried out on Jurkat cells. Although this is an excellent model easily amenable to genetic manipulation and biochemical studies, the key finding should be validated on primary T cells.

      Referee’s global assessment is right. To extend our results in Jurkat cells forming IS, we have now performed experiments using synapses established by Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). These experiments clearly show the presence of FMNL1 at these two different IS classes (new Suppl. Fig. S7), similar to what was found in Jurkat-Raji synapses. In addition, since most of the experiments were performed in Jurkat cells, we have changed the title of our manuscript, to be faithful to the main body of our results. New sentences have been included in Results and Discussion to address these important points.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This study shows the role of the phosphorylation of FMNL1b on S1086 in the polarity of T lymphocytes, which is a new and interesting finding. It would be important to confirm some of the key results in primary T cells and to analyze in depth the defect in actin remodeling (quantification of the images, analysis of some key actors of actin remodeling). The description of the defect in the secretion of extracellular vesicles would also benefit from a more accurate analysis of the content of vesicles. 

      The referee is right. We have now performed experiments using synapses containing Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). These experiments clearly show the presence of FMNL1 at these two different IS classes, similar to what was found in Jurkat-Raji synapses. Moreover, since most of the experiments were performed in Jurkat cells, we have changed the title of our manuscript to be faithful to the main body of our results. Regarding the use of CD63 instead of other markers such as CD3 (as stated by the other referee), please note that CD3 is neither an enriched nor a specific marker of exosomes, since it is also present in plasma membrane shedding vesicles, molting vesicles from microvilli, apoptotic bodies and small cell fragments, apart from exosomes. We have therefore preferred to use the accepted consensus, canonical extracellular vesicle marker CD63 (International Society of Extracellular Vesicles positioning, Thery et al 2018, doi: 10.1080/20013078.2018.1535750. eCollection 2018., Alonso et al. 2011) as a general exosome reporter readout for WB, immunofluorescence (MVBs, exosomes) and super-resolution experiments. Accordingly, the GFP-CD63 reporter plasmid was used for exosome secretion in transient expression studies and living cell time-lapse experiments (Suppl. Video 8). Any other exosome marker would also be present in Raji cells and would not allow us to analyse exclusively the secretion of exosomes by the effector Jurkat cells, since B lymphocytes produce a large quantity of exosomes upon MHC-II stimulation by Th lymphocytes (Calvo et al, 2020, doi:10.3390/ijms21072631). To reinforce the exosome data in the context of the immune synapse, STED super-resolution imaging of the immune synapses made by control and FMNL1-interfered cells was performed. 
      Nanosized (100-150 nm) CD63+ vesicles can be found in the synaptic cleft of control cells with polarized MVBs, whereas we could not detect these vesicles in the synaptic cleft from FMNL1-interfered cells that maintain unpolarized MVBs (new Fig. 10).

      Moreover, not all the videos are completely illustrative. For example, in video 2 it would be more appropriate to show only the z plane corresponding to the IS to see more precisely the F-actin remodeling relative to CD63 labeling.

      The referee is right. It is true that the upper rows in some videos may distract the reader from the main message contained in the lower row, which includes the zx plane, generated by a 90º turn, corresponding to the IS interface. Accordingly, we have maintained the still images of the whole synaptic conjugates in the first row of video 2; this will allow the reader to perceive a general view of the fluorochromes on the whole cell conjugates, as a reference, and to compare precisely the F-actin remodeling relative to CD63 labeling only at the zx interface (lower row). We have now processed videos 1 and 5 following similar criteria.

      The quality of videos 3 and 4 is not good enough. For video 7, it seems that the labeling of phospho-Ser is very broad at the IS, which is expected since it should label all the proteins that are phosphorylated by PKCs. The resolution of microscopy (at best 200 to 300 nm) does not allow us to conclude on the co-localization of FMNL1b with phospho-Ser, and the result is thus not conclusive. Finally, the study would benefit from a more careful statistical analysis. The dot plots showing polarity are presented for one experiment. Yet, the distribution of the polarity is broad. Results of the 3 independent experiments should be shown and a statistical analysis performed on the independent experiments.

      Referee is right, we have amended video settings (brightness/contrast) in videos 3 and 4 to improve this issue. In addition, we would like to remark that the translocation of proteins to cellular substructures in living cells is not a trivial issue, since certain protein localizations are too dynamic to be properly imaged with enough spatial resolution. The equilibrium resulting from the association/dissociation of a certain protein to the membrane, in addition to the protein diffusion naturally occurring in living cells, as well as signal intensity fluctuations inherent to the stochastic nature of fluorescence emission often provide barriers for image quality (Shroff et al, 2024). Thus, additional image blurring is expected when compared with that observed in fixed samples. However, we think it is important to provide the potential readers with a dynamic view of FMNL1 localization, which can only be achieved through real-time videos, in addition to the still frames from the same videos provided in Fig. 6A (the referee did not argue against the inclusion of these frames), together with images from fixed cells in Fig 6B, for comparison. This is the reason why we have preferred to maintain the improved videos to complement the results of some spare frames from the videos, together with images from fixed cells in the same figure (Fig. 6).

      Regarding video 7, we agree that colocalization is limited by the spatial resolution of confocal microscopy, and this fact does not allow us to infer that FMNL1beta is phosphorylated at the IS. However, please note that we never concluded this in our manuscript. Instead, we claimed that “colocalization of endogenous FMNL1 and YFP‑FMNL1βWT with anti‑phospho‑Ser …is compatible with the idea that both endogenous FMNL1 and YFP‑FMNL1βWT are specifically phosphorylated at the cIS”. Moreover, we have now performed colocalization in super‑resolved STED microscopy images, which reduces the XY resolution down to 30-40 nm (Suppl. Fig. S12), and the results also support colocalization of endogenous FMNL1 with anti-phospho‑Ser PKC at the IS within a 30 nm resolution limit. We have now somewhat softened our conclusion: “Although all these data did not allow us to infer that FMNL1β is phosphorylated at the IS due to the resolution limit of confocal and STED microscopes, the results are compatible with the idea that both endogenous FMNL1 and YFP-FMNL1βWT are specifically phosphorylated at the cIS”.

      Regarding the statistical analyses, we agree that the dot distribution in the polarity experiments is quite broad, but this is consistent with the end-point strategy used by a myriad of research groups (including ourselves) to image an intrinsically stochastic, rapid and asynchronous process such as immune synapse formation and to score MTOC/MVB polarization (Calvo et al 2018, https://doi.org/10.3389/fimmu.2018.00684). Despite this fact, ANOVA analyses have underscored the statistical significance of all the experiments represented by dot plots. We cannot average or perform meta-statistical analyses by combining the equivalent cohort results from independent experiments, since we have observed that small variations of certain variables (SEE concentration, cell recovery, time after transfection, etc.) affect synapse formation and PI values among experiments without altering the final outcome in each case. Please note that our manuscript now includes 10 multi‑panel figures, 12 multi‑panel supplementary figures and 8 videos, and it is already quite large. Thus, we feel the inclusion of redundant, triplicate dot plot figures would dilute the main message of our already comprehensive contribution and distract potential readers from it. We have now included new sentences in the figure legends to remark that ANOVA analyses were executed separately in all 3 independent experiments.

      Reviewer #2 (Recommendations For The Authors):

      (1) The key findings should be validated on primary CD4+ T cells (of which Jurkat is a transformed model).

      The referee is right. However, as commented by the other referee, the data from activating surfaces clearly show that the synaptic actin architecture of the immune synapse from primary CD8+ T cells is essentially indistinguishable from that of Jurkat T cells, but different from that of primary CD4+ cells (Murugesan, 2016). Thus, our data in Jurkat T cells are directly applicable to the synaptic architecture of primary CD8+ cells. In addition, to extend our results in Jurkat cells forming IS, we have performed experiments using synapses established between Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). We have preferred to work with mixed CD4+ and CD8+ cells in order to maintain potential interactions in trans between these subpopulations that may affect or influence IS formation. These experiments clearly show the presence of FMNL1 at these two different IS classes (new Suppl. Fig. S7), similar to what was found in Jurkat-Raji synapses. Moreover, since most of the experiments were performed in Jurkat cells, as stated by the referee, we have changed the title of our manuscript to circumscribe our results to the model we have used and to be faithful to the main body of our results.

      (2) The image of wt YFP-FMNL1beta in Figure 4A displays a weak CD63 signal and shows an asymmetric polarization of both the centrosome and MVBs. It should be replaced with a more representative one.

      The referee is right. Accordingly, we have modified the CD63 channel settings (brightness/contrast) in this panel to make it comparable to the other panels in the same figure. In addition, thanks to this referee's comment, we have realized that the position of the MTOC (yellow dot) in the diagram on the right side of the YFP-FMNL1betaWT panel row was misplaced, producing the apparent asymmetry with respect to the position of the MVBs' center of mass (green dot). This mistake leads to an apparent segregation between the centers of mass of these organelles which does not correspond to the real image. We have now amended the scheme and we apologize for this mistake.

      (3) The images showing F-actin clearance at the IS (Figure 8, S4, S5) are not very convincing, also when looking at the MFI along the T cell-APC interface in the en-face views. Since the F-actin signal also includes some signal from the APC, transfecting T cells with an actin reporter to selectively image T cell actin could better clarify this key point.

      The referee's point is correct. However, we (83), and other researchers using the proposed actin reporter approach in the same Raji/Jurkat IS model (Fig. 4 in ref 84), have already excluded the possibility that the actin cytoskeleton of Raji cells contributes to the measurements of synaptic F-actin. In Materials and Methods, page 37, lines 1048-1055, we included this related passage: “It is important to remark that MHC-II-antigen triggering on the B cell side of the Th synapse does not induce noticeable F-actin changes along the synapse (i.e. F-actin clearing at the central IS), in contrast to TCR stimulation on T cell side (84) (85) (3). In addition, we have observed that majority of F-actin changes along the IS belongs to the Jurkat cell (83). Thus, the contribution to the analyses of the residual, invariant F-actin from the B cell is negligible using our protocol (83).”

      Thus, we can exclude the possibility that this caveat affects our results.

      (4) A similar consideration applies to the MVB distribution in the en‑face images. For example, in Figure S5 the MVB profile, with some peripheral distribution, does not appear very different in cells expressing wt YFP‑tagged FMNL1beta versus the S1086A‑expressing cells.

      The referee's assessment regarding Supp. Figure S5 is valid. Using only the plot profile, the outcomes obtained with YFP-FMNL1βWT may appear comparable to those derived from YFP-FMNL1βS1086A. Nonetheless, this resemblance is attributed to the plot profile's exclusive consideration of the MVBs signal in the interface from the immune synapse region (white rectangle). The upper images (second row), where the whole cell is displayed, illustrate that in YFP-FMNL1βWT, MVB are specifically accumulated within this specific region, in contrast to the scattered distribution observed in YFP-FMNL1βS1086A, where MVB are dispersed throughout the cell without distinction. While MVBs are evident in both instances within the synapse region, the reason behind this observation is different. The YFP-FMNL1βWT transfected cell (third column) shows a pronounced MVB concentration within the synaptic area (white rectangle), which leads to MVB PI=0.52, whereas the YFP-FMNL1βS1086A transfected cell (fourth column), as it presents a scattered distribution of MVB throughout the cell, also exhibits some MVB (but only a small proportion of the total cellular MVB) in the synaptic area, which yields MVB PI=-0.09. Please realise that the position of the center of mass of the distribution of MVB (MVBC) labelled in this figure (white squares) is an unbiased parameter that mirrors MVB center of mass polarization. A new sentence has been included in the figure legend to clarify this important point.

      (5) The image in the first row in Figure 6B does not show a clear accumulation of FMNL1beta at the IS, possibly because the T cell is in contact with two APCs. This image should be replaced.

      The referee is right. Therefore, we have replaced the quoted example with a single cell:cell synapse that shows a clearer and more localized accumulation in the cIS, thereby avoiding the mentioned caveat.

      (6) In Figure 2A the last row shows what appears to be a T:T cell conjugate (with one cell expressing the YFP-tagged protein). The image should be replaced with another showing a T cell-APC (blue) conjugate.

      The referee is right. We have accordingly replaced the mentioned image with a T cell:APC conjugate.

      (7) The Discussion is very long and dispersive. It would benefit from shortening it and making it more focused.

      The referee is right. We have shortened and focused the Discussion by eliminating its entire second and third paragraphs. Moreover, a whole paragraph on page 24 has also been deleted.

      We have also focussed the discussion towards the new data in primary T lymphocytes.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This paper uses a model of binge alcohol consumption in mice to examine how the behaviour and its control by a pathway between the anterior insular cortex (AIC) to the dorsolateral striatum (DLS) may differ between males and females. Photometry is used to measure the activity of AIC terminals in the DLS when animals are drinking and this activity seems to correspond to drink bouts in males but not females. The effects appear to be lateralized with inputs to the left DLS being of particular interest. 

      Strengths: 

      Increasing alcohol intake in females is of concern and the consequences for substance use disorder and brain health are not fully understood, so this is an area that needs further study. The attempt to link fine-grained drinking behaviour with neural activity has the potential to enrich our understanding of the neural basis of behaviour, beyond what can be gleaned from coarser measures of volumes consumed etc. 

      Weaknesses: 

      The introduction to the drinking in the dark (DID) paradigm is rather narrow in scope (starting line 47). This would be improved if the authors framed this in the context of other common intermittent access paradigms and gave due credit to important studies and authors that were responsible for the innovation in this area (particularly studies by Wise, 1973 and returned to popular use by Simms et al 2010 and related papers; e.g., Wise RA (1973). Voluntary ethanol intake in rats following exposure to ethanol on various schedules. Psychopharmacologia 29: 203-210; Simms, J., Bito-Onon, J., Chatterjee, S. et al. Long-Evans Rats Acquire Operant Self-Administration of 20% Ethanol Without Sucrose Fading. Neuropsychopharmacol 35, 1453-1463 (2010).)

      We appreciate the reviewer’s perspective on the history of the alcohol research field. There are hundreds of papers that could be cited regarding all the numerous different permutations of alcohol drinking paradigms. This study is an eLife “Research Advances” manuscript that is a direct follow-up study to a previously published study in eLife (Haggerty et al., 2022) that focused on the Drinking in the Dark model of binge alcohol drinking. This study must be considered in the context of that previous study (they are linked), and thus we feel that a comprehensive review of the literature is not appropriate for this study.

      The original drinking in the dark demonstrations should also be referenced (Rhodes et al., 2005). Line 154 Theile & Navarro 2014 is a review and not the original demonstration. 

      This is a good recommendation. We have added this citation to Line 33 and changed Line 154.

      When sex differences in alcohol intake are described, more care should be taken to be clear about whether this is in terms of volume (e.g. ml) or blood alcohol levels (BAC, or at least g/kg as a proxy measure). This distinction was often lost when lick responses were being considered. If licking is similar (assuming a single lick from a male and female brings in a similar volume?), this might mean males and females consume similar volumes, but females due to their smaller size would become more intoxicated so the implications of these details need far closer consideration. What is described as identical in one measure, is not in another. 

      As shown in Figure 1, all measures of intake are reported as g/kg for both water and alcohol to assess intakes across fluids that are controlled by body weights. We do not reference changes in fluid volume or BACs to compare differences in measured lickometry or photometric signals, except in one instance where we suggest that the total volume of water (ml) is greater than the total amount of alcohol (ml) consumed in DID sessions, but this applies generally to all animals, regardless of sex, across all the experimental procedures.

      In Figure 2 – Figure Supplement 1 we show drinking microstructures across single DID sessions, and that males and females drink similarly, but not identically, when assessing drinking measures at the smallest timescale that we have the power to detect with the hardware we used for these experiments. Admittedly, the variability seen in these measures is certainly non-zero, and while we are tempted to assume that there exist at least some singular drinks that occur identically between males and females in the dataset, which would support the idea that females are simply consuming more volume of fluid per singular drink, we don’t have the sampling resolution to support that claim statistically. Further, even if females did consume more volume per singular drink than males, we do not believe that is enough information to claim that such behavior leads to more “intoxication” in females compared to males, as we know that alcohol behaviors, metabolism, and uptake/clearance all differ significantly by sex and are contributing factors towards defining an intoxication state. We’ve amended the manuscript to remove any language describing these drinking behaviors as identical.

      No conclusions regarding the photometry results can be drawn based on the histology provided. Localization and quantification of viral expression are required at a minimum to verify the efficacy of the dual virus approach (the panel in Supplementary Figure 1 is very small and doesn't allow terminals to be seen, and there is no quantification). Whether these might differ by sex is also necessary before we can be confident about any sex differences in neural activity. 

      We provide hit maps of our fiber placements and viral injection centers based on histological verification, as we and many other investigators regularly do for publication. Figure 1A clearly shows the viral strategy taken to label AIC to DLS projections with GCaMP7s, and a representative image shows green GCaMP-positive terminals below the fiber placement. In the experiments, animals without proper viral expression displayed no or very little GCaMP signal, which serves as an additional expression-based control on top of the typical histology performed to confirm “hits”. Animals with poor expression or obvious misplacement of the fiber probes were removed as described in the methods. Further, we report our calcium signals as z-scored changes in observed fluorescence, so we are comparing scaled averages of signals across sexes and days, which helps minimize any differences between “low” and “high” viral transduction levels at the terminals directly underneath the tips of the fibers.
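
      The scaling argument above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function name `zscore_trace`, the optional baseline window, and the two toy recordings are assumptions made for illustration. It shows that two traces with the same shape but different raw brightness, as might arise from different expression levels, give the same z-scored trace.

```python
from statistics import mean, pstdev


def zscore_trace(trace, baseline=None):
    """Z-score a fluorescence trace.

    `baseline` is an optional slice selecting the samples used to estimate
    the mean and standard deviation; by default the whole trace is used.
    """
    reference = trace[baseline] if baseline is not None else trace
    mu = mean(reference)
    sigma = pstdev(reference)
    if sigma == 0:
        raise ValueError("reference window has zero variance")
    return [(x - mu) / sigma for x in trace]


# Hypothetical recordings: same shape, tenfold difference in raw signal.
low_expression = [1.0, 1.2, 0.8, 1.0, 2.0]
high_expression = [10.0, 12.0, 8.0, 10.0, 20.0]

z_low = zscore_trace(low_expression)
z_high = zscore_trace(high_expression)
```

      Z-scoring removes a multiplicative difference in signal amplitude, but it cannot rescue a recording with no detectable signal, which is why exclusion on histological grounds is still needed.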

      While the authors have some previous data on the AIC to DLS pathway, there are many brain regions and pathways impacted by alcohol and so the focus on this one in particular was not strongly justified. Since photometry is really an observational method, it's important to note that no causal link between activity in the pathway and drinking has been established here. 

      As mentioned above, this article is an eLife Research Advances article that builds on our previous AIC to DLS work published in eLife (Haggerty et al., 2022). Considering that this is a linked article, a justification for why this brain pathway was chosen is superfluous. In addition, an exhaustive review of all the different brain regions and pathways that are affected by binge alcohol consumption to justify this pathway seems more appropriate to a review article than an article such as this.  

      We make no claims that photometric recordings are anything but observational, but we did observe these signals to be different when time-locked to the beginning of drinking behaviors. We describe this link between activity in the pathway and drinking throughout the manuscript. It is indeed correlational, but just because it is not causal does not mean that our findings are invalid or unimportant.

      It would be helpful if the authors could further explain whether their modified lickometers actually measure individual licks. While in some systems contact with the tongue closes a circuit which is recorded, the interruption of a photobeam was used here. It's not clear to me whether the nose close to the spout would be sufficient to interrupt that beam, or whether a tongue protrusion is required. This detail is important for understanding how the photometry data is linked to behaviour. The temporal resolution of the GCaMP signal is likely not good enough to capture individual licks but I think more caution or detail in the discussion of the correspondence of these events is required. 

      The lickometers do not capture individual licks, but a robust quantification of the information they capture is described in Godynyuk et al. 2019 and referenced in multiple other papers (Flanigan et al. 2023, Haggerty et al. 2022, Grecco et al. 2022, Holloway et al. 2023) where these lickometers have been used. However, individual lick tracking is not a requirement for tracking drinking behaviors more generally. The lickometers used clearly track when the animals are at the bottles, drinking fluids, and we have used the start of that lickometer signal to time-lock our photometry signals to drinking behaviors. We make no claims or have any data on how photometric signals may be altered on timescales of single licks. In regard to how AIC to DLS signals change on the second time scale when animals initiate drinking behaviors, we believe we explain these signals with caution and in context of the behaviors they aim to describe.
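
      Time-locking a signal to the start of drinking events can be sketched as a peri-event average. The code below is a generic illustration under stated assumptions, not the authors' code: the function name `peri_event_average`, the window sizes, and the toy signal and onset indices are all hypothetical.

```python
def peri_event_average(signal, onsets, pre, post):
    """Average windows of `signal` aligned to event onsets.

    Each window spans `pre` samples before an onset index and `post`
    samples from the onset onward. Onsets whose window would run past
    either end of the recording are skipped.
    """
    windows = [
        signal[t - pre:t + post]
        for t in onsets
        if t - pre >= 0 and t + post <= len(signal)
    ]
    if not windows:
        raise ValueError("no onset leaves a complete window")
    n = len(windows)
    return [sum(w[i] for w in windows) / n for i in range(pre + post)]


# Hypothetical z-scored signal with a transient after each drinking onset.
signal = [0, 0, 1, 3, 1, 0, 0, 0, 1, 3, 1, 0]
drink_onsets = [2, 8]  # sample indices where the lickometer signal starts

average = peri_event_average(signal, drink_onsets, pre=1, post=3)
```

      Because the alignment uses only the onset of the lickometer signal, this kind of average says nothing about single licks, in line with the response above.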

      Even if the pattern of drinking differs between males and females, the use of the word "strategy" implies a cognitive process that was never described or measured. 

      We use the word strategy to describe a plan of action that is executed by some chunking of motor sequences that amounts to a behavioral event, in this case drinking a fluid. We do not mean to imply anything further than this by using this specific word.

      Reviewer #2 (Public Review): 

      Summary: 

      This study looks at sex differences in alcohol drinking behaviour in a well-validated model of binge drinking. They provide a comprehensive analysis of drinking behaviour within and between sessions for males and females, as well as looking at the calcium dynamics in neurons projecting from the anterior insula cortex to the dorsolateral striatum. 

      Strengths: 

      Examining specific sex differences in drinking behaviour is important. This research question is currently a major focus for preclinical researchers looking at substance use. Although we have made a lot of progress over the last few years, there is still a lot that is not understood about sex-differences in alcohol consumption and the clinical implications of this. 

      Identifying the lateralisation of activity is novel, and has fundamental importance for researchers investigating functional anatomy underlying alcohol-driven behaviour (and other reward-driven behaviours). 

      Weaknesses: 

      Very small and unequal sample sizes, especially females (9 males, 5 females). This is probably ok for the calcium imaging, especially with the G-power figures provided, however, I would be cautious with the outcomes of the drinking behaviour, which can be quite variable. 

      For female drinking behaviour, rather than this being labelled "more efficient", could this just be that female mice (being substantially smaller than male mice) just don't need to consume as much liquid to reach the same g/kg. In which case, the interpretation might not be so much that females are more efficient, as that mice are very good at titrating their intake to achieve the desired dose of alcohol. 

      We agree that the “more efficient” drinking language could be bolstered by additional discussion in the text, and thus have added this to the manuscript starting at line 440.

      I may be mistaken, but is ANCOVA, with sex as the covariate, the appropriate way to test for sex differences? My understanding was that with an ANCOVA, the covariate is a continuous variable that you are controlling for, not looking for differences in. In that regard, given that sex is not continuous, can it be used as a covariate? I note that in the results, sex is defined as the "grouping variable" rather than the covariate. The analysis strategy should be clarified. 

      In lines 265-267, we explicitly state that the covariate factor was sex, which is mathematically correct based on the analyses we ran. We made an in-text error where we referred to sex as a grouping variable on Line 352, when it should have been the covariate. Thank you for the catch and we have corrected the manuscript.

      But, to reiterate, we are attempting to determine if the regression fits by sex are significantly different, which would be reported as a significant covariate. Sex is certainly a categorical variable, but the two measures we are relating to each other are continuous, so we believe it is valid to run an ANCOVA here.
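
      One common way to ask whether regression fits differ between two groups is a homogeneity-of-slopes F-test: a model with a separate line per group is compared against a model with one shared slope and group-specific intercepts. The sketch below illustrates that generic test only. It is not a reconstruction of the authors' ANCOVA, and the function names and toy data are assumptions.

```python
def _centered_sums(x, y):
    """Return (Sxx, Sxy, Syy) for one group."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    return sxx, sxy, syy


def slope_homogeneity_f(groups):
    """F statistic for equal slopes across groups.

    `groups` maps a group label to a pair (x values, y values).
    Returns (F, df_numerator, df_denominator).
    """
    sums = [_centered_sums(x, y) for x, y in groups.values()]
    n_total = sum(len(x) for x, _ in groups.values())
    k = len(groups)

    # Full model: each group has its own slope and intercept.
    sse_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)

    # Reduced model: one pooled slope, group-specific intercepts.
    b = sum(sxy for _, sxy, _ in sums) / sum(sxx for sxx, _, _ in sums)
    sse_reduced = sum(syy - 2 * b * sxy + b * b * sxx for sxx, sxy, syy in sums)

    df_num = k - 1
    df_den = n_total - 2 * k
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den


# Hypothetical data: the two groups have clearly different slopes.
toy = {
    "group_a": ([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]),
    "group_b": ([1, 2, 3, 4], [2.0, 2.1, 1.9, 2.2]),
}
f_stat, df_num, df_den = slope_homogeneity_f(toy)
```

      A p-value would follow from the upper tail of the F distribution with these degrees of freedom, for example via `scipy.stats.f.sf(f_stat, df_num, df_den)` if SciPy is available.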

      Reviewer #3 (Public Review): 

      Summary: 

      In this manuscript by Haggerty and Atwood, the authors use a repeated binge drinking paradigm to assess how water and ethanol intake changes in male and female mice as well as measure changes in anterior insular cortex to dorsolateral striatum terminal activity using fiber photometry. They find that overall, males and females have similar overall water and ethanol intake, but females appear to be more efficient alcohol drinkers. Using fiber photometry, they show that the anterior insular cortex (AIC) to dorsolateral striatum (DLS) projections have sex, fluid, and lateralization differences. The male left circuit was most robust when aligned to ethanol drinking, and water was somewhat less robust. Male right, and female left and right, had essentially no change in photometry activity. To some degree, the changes in terminal activity appear to be related to fluid exposure over time, as well as within-session differences in trial-by-trial intake. Overall, the authors provide an exhaustive analysis of the behavioral and photometric data, thus providing the scientific community with a rich information set to continue to study this interesting circuit. However, although the analysis is impressive, there are a few inconsistencies regarding specific measures (e.g., AUC, duration of licking) that do not quite fit together across analytic domains. This does not reduce the rigor of the work, but it does somewhat limit the interpretability of the data, at least within the scope of this single manuscript. 

      Strengths: 

      - The authors use high-resolution licking data to characterize ingestive behaviors. 

      - The authors account for a variety of important variables, such as fluid type, brain lateralization, and sex. 

      - The authors provide a nice discussion on how this data fits with other data, both from their laboratory and others'. 

      - The lateralization discovery is particularly novel. 

      Weaknesses: 

      - The volume of data and number of variables provided makes it difficult to find a cohesive link between data sets. This limits interpretability.

      We agree there is a lot of data and variables within the study design, but also believe it is important to display the null and positive findings together to describe the changes we measured holistically across water and alcohol drinking.

      - The authors describe a clear sex difference in the photometry circuit activity. However, I am curious about whether female mice that drink more similarly to males (e.g., less efficiently?) also show increased activity in the left circuit, similar to males. Oppositely, do very efficient males show weaker calcium activity in the circuit? Ultimately, I am curious about how the circuit activity maps to the behaviors described in Figures 1 and 2. 

      In Figure 3C, we show that across the time window of drinking behaviors, female mice who drink alcohol do have a higher baseline calcium activity compared to water-drinking female mice, so we believe there are certainly alcohol-induced changes in AIC to DLS inputs within females, but there remains a lack of engagement (as measured by changes in amplitude) compared to males. So, when comparing consummatory patterns that are similar by sex, we still see the lack of calcium signaling near the drinking bouts, but small shifts in baseline activity that we aren’t truly powered to resolve (using an AUC or similar measurements for quantification) because the shifts are so small. Ultimately, we presume that the AIC to DLS inputs in females aren’t the primary node for encoding this behavior, and some recent work out of David Werner’s group (Towner et al. 2023) suggests that for males who drink, the AIC becomes a primary node of control, whereas in females, the PFC and ACC are more engaged. Thus, the mapping of the circuit activity onto the drinking behaviors more generally represented in Figures 1 and 2 may be sexually dimorphic, and further studies will be needed to resolve how females engage differential circuitry to encode ongoing binge drinking behaviors.

      - What does the change in water-drinking calcium imaging across time in males mean? Especially considering that alcohol-related signals do not seem to change much over time, I am not sure what it means to have water drinking change. 

      The AIC seems to encode many physiologically relevant, interoceptive signals, and the change in water drinking signals in males was puzzling to us as well. Currently, we think it may reflect both the animals becoming more efficient at drinking out of the lickometers in early weeks and signaling changes due to thirst states or taste associated with the fluid. This is speculation, and we would need to perform more in-depth studies to determine how thirst states or taste may modulate AIC to DLS inputs, but we believe that is beyond the scope of this current study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Line 45 - states alcohol use rates are increasing in females across the past half-decade. I thought this trend was apparent over the past half-century? Please consider revising this. 

      According to NIAAA, the gap between rates of alcohol consumption in females and males has been closing for about the past 100 years, but only recently have those trends started to reverse, with females drinking similar amounts to or more than males.

      Placing more of the null findings into supplemental data would make the long paper more accessible to the reader. 

      As noted in response to Reviewer #3's point, we present a lot of data, and we hope that others will use this data, both null and positive findings, in their future work. Given how articles are formatted on eLife’s website, we think it is important to place these findings in-line as well.

      Reviewer #2 (Recommendations For The Authors): 

      In addition to the points raised about analysis and interpretation in the Public Review, I have a minor concern about the written content. I find the final sentence of the introduction "together these findings represent targets for future pharmacotherapies.." a bit unjustified and meaningless. The findings are important for a basic understanding of alcohol drinking behaviour, but it's unclear how pharmacotherapies could target lateralised AIC inputs into the DLS. 

      There are on-going studies (CANON-Pilot Study, BRAVE Lab, Stanford) for targeted therapies that use technologies like TMS and focused ultrasound to activate the AIC to alleviate alcohol cravings and decrease heavy drinking days. The difficulty with these next-generation therapeutics is often targeting, and thus we think this work may be of use to those in the clinic to further develop these treatments. We agree that this data does not support the development of pharmacotherapies in a traditional sense, and thus have removed the word and added text to reference TMS and ultrasound approaches to bolster this statement in lines 101+.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      We appreciate the feedback provided and refer to our previous response for detailed explanations regarding our decisions on some of the recommendations made by the referees and editors. We have introduced changes as follows:

      • We added a supplementary Figure to Figure 5 to show inhibition by Astemizole at the single channel level.

      • We have corrected Figure 7A, where the normalized current did not reach 1 as a maximum. We had overlooked that this is expected when the prepulse was -160 mV, and the IV is strongly biphasic, but not when coming from -100 mV. We are thankful for this observation, which served to identify that the values for one of the cells were inverted with respect to the others (the sequence of stimuli was different during recording, and this information got lost in the analysis procedure). We have corrected this and made sure that such a mistake had not happened anywhere else.

      • Finally, we have corrected a typo in the discussion, as indicated in the review.

      We include a version with changes marked and a clean version of the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors present a useful analysis of the phenotype of sheep in which the muscle developmental regulator myostatin has been mutated in a FGF5 knockout background. The goal was to produce sheep with a "double-muscled" phenotype, yet the genetically engineered sheep exhibited meat with a smaller cross-sectional area and higher number of muscle fibers. The work extends the extensive body of knowledge already published in this area. The authors provide evidence using in vitro experiments that Fosl1 regulates myogenesis, but the strength of evidence relating to the muscle phenotype and underlying cellular and molecular mechanism is inadequate.

      Thank you for the assessment. According to the reviewers' comments, we have supplemented and updated the data on muscle phenotypes, and the molecular mechanisms have also been supplemented accordingly, such as FOSL1 silencing and inhibition, as well as possible secondary fusion of myoblasts regulated by calcium signaling. Meanwhile, considering the suggestions of editors and reviewers, we have also supplemented the data on serum MSTN regulation. Given that the phenotype of MSTN gene editing is mutation site dependent, we directly cultured skeletal muscle satellite cells using serum from WT and MF+/- sheep, and showed that the serum regulation cannot be ignored after MSTNDel273C mutation with FGF5 knockout.

      Public Review:

      Chen and collaborators first analysed embryonic gene editing in sheep using CRISPR-Cas9 technology to invalidate the two alleles of the Mstn and Fgf5 genes by using different ratios of Cas9 mRNA and sgRNA. They showed that a ratio of 1:10 had the highest efficiency and they successfully generated two sheep with biallelic mutations of both genes. Materials and Methods on the generation of gene-edited sheep is entirely missing. The data on these gene-edited sheep have already been published twice by the authors in different contexts. Other groups reported on gene editing of Mstn or Fgf5 in sheep embryos and the resulting phenotypes.

      We thank the reviewers for pointing out our negligence and shortcomings. We have provided detailed information on the generation method of gene-edited sheep in the Materials and Methods. Briefly, gene-edited sheep were produced by injecting MSTN sgRNA, FGF5 sgRNA, and Cas9 mRNA into embryos at different ratios.

      Although the findings are interesting, they do not provide sufficiently new scientific information or advancements in producing genetically modified livestock with improved production characteristics. While the MSTNDel273 sheep exhibited an increased number of muscle fibers, the data provided did not demonstrate a significant improvement in meat productions, quality or quantity in the MSTNDel273 sheep vs WT.

      Thank you very much for your constructive comments. Considering the lack of data on improved production traits, we have added data on the meat yield and quality of MSTN_Del273C mutation with FGF5 knockout sheep in Tables S6-10. Although these improvements were not significant enough, our data showed increased meat production traits in MSTN_Del273C mutation with FGF5 knockout sheep, such as the proportion of hind leg meat to carcass and the proportion of gluteus medius to carcass. For example, the proportion of hind leg meat was significantly increased by 21.2% (Table S7), and the proportion of gluteus medius in the carcass of MF+/- sheep was significantly (P<0.01) increased by 26.3% compared to WT sheep (Figure 2K). In addition, there were no significant (P>0.05) differences in pH, color, drip loss, cooking loss, shearing force, or amino acid content of the longissimus dorsi between WT and MF+/- sheep (Tables S8-10). All these results demonstrate that the MSTN_Del273C mutation with FGF5 knockout sheep had well-developed hip muscles with smaller muscle fibers, which does not affect meat quality, and this phenotype may be dominated by the MSTN gene.

      The authors indicate that sgRNA design changes in addition to changing the molar ratio of Cas9 mRNA:sgRNA improved the ability to generate biallelic homozygous mutant sheep; however, the data provided do not demonstrate any significant difference. Given the small number of sheep that were actually produced and evaluated, it is extremely difficult to demonstrate anything that was analyzed to be significantly (statistically) different between MSTNDel273 sheep and WT, yet the authors seem to ignore this in much of their discussion. There is no explanation as to why the authors started with sheep that were FGF5 knockouts. The reviewer assumes that this was simply a line of sheep available from previous studies and the goal was to produce sheep with both improved hair/wool characteristics in addition to improved muscle development. However, the use of FGF5 knockout sheep complicates the ability to accurately decipher the unique aspects associated with targeting only myostatin for knock-out. At minimum, this is a variable that has to be considered in the statistical analysis. No information is provided on the methods used to produce the MSTNDel273 sheep, which is fundamentally important. It is assumed they were produced by injecting one-cell zygotes then transferring these into surrogate females. The methods employed might have a profound effect on the outcome.

      We greatly appreciate your review. In the current study, we did not discuss the impact of changes in sgRNA design on the ability to generate biallelic homozygous mutant sheep. Instead, we focused on the delivery molar ratio of Cas9 mRNA to sgRNA and found that increasing the molar ratio of Cas9:sgRNA can improve the ability to produce homozygous biallelic mutations in sheep. We apologize for neglecting this statistical analysis; in the revised version, the significance of the differences was tested with the chi-square test. Other limitations related to the number of sheep actually produced and evaluated are analyzed in our additional discussion. We would like to clarify that the gene-edited sheep we produced did not start from FGF5 knockout sheep. As hypothesized by the reviewers, we used a one-step method to edit the MSTN and FGF5 genes simultaneously, aiming to increase muscle yield and improve wool characteristics in sheep at the same time, which resulted in knockout of the FGF5 gene and mutation of the MSTN gene. As speculated by the reviewers, the MSTN_Del273C mutation with FGF5 knockout sheep were generated by injecting sgRNA and Cas9 mRNA for MSTN and FGF5 into a single fertilized egg, which was then transplanted into a surrogate mother. We have provided detailed information on the generation method of gene-edited sheep in the Materials and Methods section.
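
      As an illustration of the kind of chi-square comparison mentioned above, the following is a minimal sketch and not the authors' analysis. The function name and the counts are hypothetical, since the response does not report the underlying numbers. It computes the Pearson chi-square statistic for a 2x2 table (two Cas9 mRNA:sgRNA ratios versus homozygous biallelic mutation yes/no) without continuity correction, and the p-value for one degree of freedom.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Rows are the two groups, columns are the outcome (yes, no):
        group 1: a, b
        group 2: c, d
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (NOT from the manuscript):
# ratio A: 2 homozygous biallelic mutants out of 20 embryos
# ratio B: 9 homozygous biallelic mutants out of 20 embryos
chi2, p_value = chi_square_2x2(2, 18, 9, 11)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

      With counts as small as those typical of large-animal editing experiments, expected cell frequencies can fall below 5, in which case an exact test such as Fisher's is often preferred over the chi-square approximation.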

      Authors genotyped one sheep with a biallelic three base pair deletion in Mstn exon 3 and a compound heterozygote mutation in Fgf5 with a 5 nucleotides deletion on one allele and 37 nucleotides deletion on the other allele, partially spanning over the same region. This sheep developed a double muscle phenotype, which was documented using photography and CT scan. The hair phenotype was not further addressed, but authors referred to a previous publication.

      Thank you for your review. In the current study, we only focused our perspective on the muscle phenotype, while the data on the hair phenotype involved another study. Therefore, we referred to our previous publication on hair phenotypes, in which the mutation locus in FGF5 gene-edited sheep is the same as in the current study.

      Authors performed morphometric studies on two distinct muscles, longissimus dorsi and gluteus medius, and found a profound fiber hypotrophy in the Mstn-/-;Fgf5-/- double mutants, with a shift from larger fiber diameter to smaller fiber sizes. Morphometric studies showed only a low percentage of fibers in wt and mutant sheep had fiber cross sectional areas larger than 800 µm2, whereas about 30% in wt and about 60% in the mutant had CSA of <400 µm2. The report of one case, without reproducing the phenotype in other sheep, is scientifically insufficient. The fiber sizes in wt sheep remains far below previously published reports in sheep (about 3-5 times smaller) and as compared to other species, which suggests a methodological error in morphometric methods.

      We greatly appreciate your careful review. There was indeed an error in the morphological analysis of the MF-/- sheep longissimus dorsi and gluteus medius muscles. After careful checking, we found that the fiber sizes in WT sheep were far below previously published reports in sheep because an incorrect scale had been used. We therefore re-scanned the tissue sections and recalculated the cross-sectional area of muscle fibers and the number of muscle fiber cells per unit area with the correct scale. With this correction, the average cross-sectional area of muscle fibers in WT sheep was approximately 1800 μm2, which is consistent with the previous report. We thank the reviewer again for such a careful and conscientious review. Considering the profound fiber hypotrophy in MSTN_Del273C mutation with FGF5 knockout sheep pointed out by the reviewer, we performed a statistical analysis of the proportion of centrally nucleated myofibres in WT and MF+/- sheep, which can characterize the occurrence of muscle fiber hypotrophy. The results showed no significant difference in the proportion of centrally nucleated myofibres between WT and MF+/- sheep (Figure S2D). We also analyzed the mRNA expression levels of genes related to muscle fiber hypotrophy and muscle atrophy, such as MTM1, DMD, IGF1, SMN1, and GAA. Although the levels of MTM1, IGF1, SMN1, and GAA were significantly increased (Figure S2E), this elevation did not result in muscle fiber hypotrophy or muscle atrophy, but was beneficial for muscle formation. Therefore, we suggest that the phenomenon produced by MSTN_Del273C mutation with FGF5 knockout may not be muscle fiber hypotrophy, because it significantly promotes the proliferation of sheep skeletal muscle satellite cells (Figure 3A-F) and, more importantly, the muscle phenotype of MF-/- and MF+/- sheep was improved, including the "double-muscle" phenotype of the rump (Figure 2A), the proportion of gluteus medius in the carcass (Figure 2K), and the proportion of hind leg meat (Table S7).

      The authors also investigated the influence of Fgf5 mutation on muscle development. They determined fiber cross sectional area in heterozygous Fgf5 mutant (number of investigated animals not given) and conclude that Mstn mutation but not Fgf5 mutation caused the double muscle phenotype. Results are insufficient to support this conclusion. Firstly, authors investigated heterozygous FGF5 sheep and not homozygous mutants. Secondly, FGF5 has previously been shown to stimulate expansion of connective tissue fibroblasts and to inhibit skeletal muscle development during limb embryonic development (Clase et al. 2000). Of note, Mstn is also expressed during embryonic development. A combined knockout could therefore entail synergistic effects and cause muscle hyperplasia that is not found in individual knockout, a hypothesis that was not addressed by the authors.

      Thank you very much for your critical review, which is very valuable for improving the quality of our manuscript. We have given the number of animals studied in all figure legends. Given the lack of MSTN and FGF5 single gene edited sheep, both homozygous and heterozygous, and especially of MSTN single gene edited sheep, we have weakened the view that MSTN mutations rather than FGF5 mutations lead to the “double-muscle” phenotype in the conclusion and discussion. As you have mentioned, our current data are indeed insufficient to support this conclusion. In addition, considering the expression of MSTN and FGF5 in embryonic development and their regulation of skeletal muscle development, we examined the expression of MSTN and FGF5 during individual development after MSTN_Del273C mutation with FGF5 knockout (Figure S2A). However, these results are limited by the availability of animals at embryonic stages, especially single gene edited embryos. We greatly appreciate your very meaningful and valuable comments on the possible synergistic effects of the combined knockout. We will prepare MSTN and FGF5 single gene edited sheep to further explore possible synergistic effects in a follow-up study.

      The authors generated and studied an F1 generation of mutant sheep with heterozyogous mutation in Mstn and Fgf5. In Mstn+/-;Fgf5+/-, gluteus medius muscle was found to be larger compared to wt sheep, whereas other muscles were smaller, and overall meat quantity did not change. Morphometric studies revealed a similar muscle fiber hypotrophy and muscle hyperplasia as in the Mstn-/-;Fgf5-/- gluteus muscle.

      Thank you for your comments. We found that the proportion of gluteus medius in MF+/- sheep was larger than that in WT sheep, and in addition, the proportion of hind leg meat also significantly increased (Table S7). Morphological analysis shows that MF+/- sheep exhibited a myofiber hyperplasia phenotype similar to MF-/- sheep.

      In the next part of results, authors investigated the presence of myostatin protein in homozygous Mstn muscle using immunohistochemistry and found no differences compared to wt, however, positive and negative controls are missing. They also determined Mstn transcription and protein quantity using WB in heterozygous Mstn muscle and found no difference. The authors did not provide data to explain why the herein generated Mstn mutation causes muscle fiber hypotrophy, whereas most work on myostatin abrogation demonstrated fiber hypertrophy.

      Thank you very much for your constructive comments. Because the immunohistochemistry study lacked the necessary positive and negative controls, we decided to delete the immunohistochemistry data from the manuscript, which also streamlines it. In the current study, although mutations in MSTN led to a decrease in the cross-sectional area of individual fibers, the number of muscle fibers per unit area was increased, and the final result was an increase in muscle volume and a “double-muscle” phenotype, as well as an increase in the proportion of gluteus medius to carcass (Figure 2K) and the proportion of hind leg meat (Table S7). Importantly, there was no significant difference in the proportion of centrally nucleated myofibres between WT and MF+/- sheep (Figure S2D), and the elevated expression levels of the muscle fiber hypotrophy and muscle atrophy marker genes MTM1, IGF1, SMN1, and GAA are more beneficial for muscle health. Therefore, we maintain that this is not muscle fiber hypotrophy. As for the muscle fiber hypertrophy phenotype demonstrated by most myostatin abrogation studies, we analyzed the possible reasons in the discussion: the effect of MSTN mutation on muscle fiber phenotype may depend on the mutation site.

      Authors then isolated myoblasts from hind limbs of 3-month-old sheep fetuses and cultured in presence of 20% fetal bovine serum before switching to differentiation medium containing 2% horse serum. The cultures showed increased proliferation of Mstn+/-;Fgf5+/- myoblasts as well as downregulation of genes associated with muscle differentiation as well as reduced fusion index. No experiments were performed to assure whether the myostatin and FGF5 pathways were inhibited. No control experiments using supplementation with recombinant proteins and using growth factor depleted culture supplements were performed. As FGF5 and myostatin are secreted factors, evidence is missing whether this led to conditioning of the culture medium. Of note, previous work in mice demonstrated that the double muscle phenotype developed independent of satellite cells activity (Amthor et al. 2009).

      We greatly appreciate your valuable suggestions. In addition to detecting the MSTN pathway at the cellular level, we also assayed the expression of MSTN receptors and the downstream Smad and Jun families in the gluteus medius, and found that MSTN_Del273C mutation with FGF5 knockout led to upregulation of two receptors, while the expression of the downstream Smad and Jun families was also inhibited to varying degrees (Figure S4A). Considering possible serum regulation, we also added data on serum MSTN regulation. Given that the phenotype of MSTN gene editing is mutation site dependent, we directly cultured skeletal muscle satellite cells using serum from WT and MF+/- sheep. We found that serum from MF+/- sheep promoted the proliferation of skeletal muscle satellite cells (Figure S4D). MSTN_Del273C mutation with FGF5 knockout promoted FOSL1 expression when WT sheep serum was used (Figure S4E), which was similar to the results of FBS culture and HS induction. The serum from MF+/- sheep strongly stimulated FOSL1 expression and the inhibition of MyoD1 (Figure S4F). These results indicate that serum regulation cannot be ignored after MSTN_Del273C mutation with FGF5 knockout.

      Authors then performed RNA seq from Mstn+/-;Fgf5+/- muscle and found a number of differentially expressed genes, but none has been previously reported being involved in the myostatin signaling pathway, so the authors chose to only focus on FOSL1 and associated genes. Authors then demonstrated that Pdpn and Ankrd2 were upregulated during myogenic differentiation, whereas FOSL1 was downregulated. Moreover, Fosl1 transcription was upregulated in myoblasts and myotubes from Mstn+/-;Fgf5+/- muscle. Authors showed an interaction between Fosl1 and Myod1. Moreover, authors demonstrated that Fosl1 directly binds to the Myod1 promoter. Authors also found decreased p38 MAPK protein levels in proliferating myoblasts from Mstn+/-;Fgf5+/- muscle and increased p38 MAPK in differentiating myotubes.

      In the revised version, we have streamlined this section by removing content such as PDPN, ANKRD2, and p38 MAPK, aiming to focus on the MEK-ERK-FOSL1 axis. Meanwhile, we further confirmed the regulatory effect of FOSL1 on MyoD1 by dual luciferase assay.

      Furthermore, gain-of-function by overexpressing FOSL1 promoted cell proliferation and inhibited differentiation, and tert-butylhydroquinone, an indirect activator of FOSL1 also inhibited myogenic differentiation. The findings do not support the idea that FOSL1 is not involved, but neither do they strongly support the involvement of FOSL1. The observations made by the authors could be co-incidental and not causative in nature.

      We greatly appreciate the valuable suggestions provided by the reviewers, which are of great significance for improving our manuscript. Following the reviewers’ suggestions, we added FOSL1 loss-of-function experiments and found that interfering with FOSL1 inhibits the proliferation and promotes the differentiation of skeletal muscle satellite cells, which is the opposite of the results of FOSL1 overexpression (Figure 6). We also used the inhibitor PD98059 to inhibit the ERK pathway and thereby indirectly inhibit the activity of FOSL1, and the results showed that inhibition of FOSL1 activity also promoted myogenic differentiation (Figure 7F-G). These results further support the important role of FOSL1.

      The manuscript by Chen et al. demonstrated successful gene editing in sheep embryos to obtain biallelic mutation of Mstn and FGF5. The resulting double muscle phenotype resulted from fiber hypotrophy and hyperplasia, which contradicts findings in the literature. Chen et al. generated F1 heterozygous offsprings, in which Mstn transcription and translation did not change. Myoblasts from these animals showed increased proliferation and decreased differentiation, which authors interpreted as the underlying cellular mechanism of the double muscle phenotype. However, no work on muscle development in these animals is presented. Important in vitro control experiments are missing. Chen and collaborators found Fosl1 as a differentially expressed gene in Mstn+/-;Fgf5+/- muscle. Fosl1 drives myoblast proliferation and has direct regulatory effect on the Myod1 promoter. The cellular and molecular mechanism of Fosl1 during myogenesis is novel and solid evidence. However, data remain inadequate to conclude whether Fosl1 indeed acts downstream of myostatin.

      We greatly appreciate the reviewers' insightful comments and very constructive suggestions, which were very helpful for further improving our data. In our study, although the mutation in MSTN resulted in a decrease in the cross-sectional area of individual muscle fibers, the number of muscle fibers per unit area increased, which ultimately resulted in an increase in muscle size and the development of a "double-muscle" phenotype. Therefore, we maintain that this is not a manifestation of muscle fiber dystrophy; the expression of several marker genes for muscle fiber dystrophy and the proportion of centrally nucleated muscle fibers also support this view (Figure S2E-F). In addition, findings such as the reduced cross-sectional area of individual muscle fibers contradict the literature on muscle fiber hypertrophy, which may be due to phenotypic differences caused by mutations at different sites of MSTN, and may also be species-related. For example, Belgian Blue cattle with a natural mutation in the MSTN gene have an increased number of myofibers and a reduced myofiber cross-sectional area [1], whereas knockdown of the MSTN gene in mice leads to an increase in the cross-sectional area of muscle fibers without affecting the number of muscle fibers [2,3]; we describe this further in the discussion. It should be noted that possible complementary regulation by FGF5 cannot be ruled out either, which makes the problem extraordinarily complex. We plan to produce single mutant sheep with segregation of the MSTN and FGF5 genes in subsequent studies to address this question fully.

      Regarding the muscle development of gene-edited animals, because of the constraints of working with large animals and the limited number of edited individuals, we have not comprehensively evaluated muscle development in vivo to further clarify the potential cellular mechanisms of the muscle phenotype; we evaluated only the expression of MSTN and FGF5 at 3 months of individual development and the expression of MSTN at 12 months of age (Figure S2A). To determine whether FOSL1 indeed acts downstream of MSTN, we added data on the expression levels of FOSL1 under serum regulation to support our conclusions (Figure S4D-F).

      [1] Wegner J, Albrecht E, Fiedler I, Teuscher F, Papstein HJ, Ender K. Growth- and breed-related changes of muscle fiber characteristics in cattle[J]. Journal of Animal Science, 2000,78:1485-1496.

      [2] Nishi M, Yasue A, Nishimatu S, Nohno T, Yamaoka T, Itakura M, Moriyama K, Ohuchi H, Noji S. A missense mutant myostatin causes hyperplasia without hypertrophy in the mouse muscle[J]. Biochemical and Biophysical Research Communications, 2002,293:247-251.

      [3] Zhu X, Hadhazy M, Wehling M, Tidball JG, McNally EM. Dominant negative myostatin produces hypertrophy without hyperplasia in muscle[J]. FEBS Letters, 2000,474:71-75.

      As the significant findings are minimal, the amount of text provided, figures and tables are disproportionally excessive. A large number of different molecular techniques are employed to try and decipher the mechanism(s) that result in the observed phenotype = double muscling. The authors focus on the MEK-ERK-FOSL1 pathway and suggest this is the key pathway/mechanism resulting in the phenotype observed in MSTNDel273 sheep. However, they provide very little solid evidence to support this notion.

      Thank you for your review. We have substantially streamlined the manuscript, removed some irrelevant information, and moved all nonessential figures and tables to the supplementary information. Meanwhile, we have added new data to further support that the MSTN_Del273C mutation generates a muscle phenotype through the MEK-ERK-FOSL1 pathway.

      The manuscript is very long, complicated and difficult to read, given the minimum amount of significant information that is provided. It requires major rewriting to be published. Further, it misses information in material methods, on the generation of animals, on histological techniques and morphometric studies. There is no information provided on the sex of the animals produced and then analyzed. There are also a number of editorial mistakes e.g. the authors refer to tables S1-S4 in the materials and methods and results section, but there is no table S1-S4 provided.

      Thank you for your review. We have greatly streamlined and significantly revised the manuscript. We have also added detailed information on animal generation and on the histological and morphological studies to the Materials and Methods, as well as information on the gene-edited animals produced, including sex and age. Finally, we reviewed the entire manuscript and corrected omissions and oversights, such as the missing Tables S1-S4.

      Recommendations for the authors:

      Suggestions to improve the paper (see also public review):

      - Include the method part of generating the gene edited animals.

      We thank the editor and reviewers for pointing out our negligence. We have provided detailed information on the generation method of gene-edited sheep in the Materials and Methods; the sheep were produced by injecting MSTN sgRNA, FGF5 sgRNA, and Cas9 mRNA into embryos at different ratios.

      - Increase number of Mstn-/-;Fgf5-/- experimental animals allowing for acquisition of statistically relevant data. This is very important as the muscle phenotype of the F1 generation is not obvious. Authors should provide data that the Mstn mutation indeed invalidates myostatin signaling. They should provide data on myostatin protein Mstn transcription as well on myostatin target genes in Mstn-/-;Fgf5-/- sheep.

      Many thanks to the editor and reviewers for their constructive suggestions. Using MF-/- sheep to validate the transcription and target gene data of myostatin would indeed be the best strategy. However, we generated only one MF-/- sheep, which seriously limits the implementation of this strategy and may also make statistical analysis based on MF-/- sheep unreliable. Considering these factors, our current study mainly focuses on heterozygous MF+/- sheep. We plan to generate single gene homozygous mutant sheep with segregation of the MSTN and FGF5 genes in subsequent studies to address this issue fully.

      - They should also provide data on myostatin target genes in muscles from heterozygous animals.

      Thank you for your very informative suggestions. We have quantified the mRNA expression levels of the receptors and downstream target genes of MSTN in the gluteus medius of heterozygous MF+/- sheep. Compared with WT sheep, the mRNA expression levels of the type I receptor (ACVR1) and type II receptors (ACVR2A, ACVR2B) were highly significantly increased in the muscle of MF+/- sheep (Figure S4A); there was no significant change in mRNA expression levels in the Smad family (Figure S4B), whereas the mRNA expression levels of JunB of the Jun family, a downstream target gene of MSTN, were significantly downregulated (Figure S4C). These results suggest that the effect of MSTN_Del273C with FGF5 knockout may not be limited to MEK-ERK-FOSL1. Again, we would like to thank the editor and reviewers for their constructive suggestions, which provide a new direction for us to further deepen our insight into mutations of the MSTN gene.

      - The morphometric results on fiber CSA seem wrong. By looking at the fiber sizes and size bar in Figure 2 H would bring to far higher estimated CSA. There must be a systematic error in using the morphometric algorithm.

      Thank you very much for your careful review. There were indeed some errors in the morphological analysis of the MF-/- sheep longissimus dorsi and gluteus medius. After checking, we found that the muscle fiber size was much lower than the data in previously published sheep reports because an incorrect scale bar had been used. We therefore re-scanned the tissue slices and used the correct scale bar to recount the cross-sectional area of muscle fibers and the number of muscle fiber cells per unit area. With this correction, the average cross-sectional area of muscle fibers in WT sheep was similar to the previous report.
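
      To show why a scale-bar error has such a large effect on cross-sectional area, here is a minimal sketch under stated assumptions. The function name and the calibration values are hypothetical and are not taken from the manuscript. Because area is two-dimensional, a linear calibration error by a factor k changes every measured area by a factor of k squared.

```python
def corrected_area(measured_area_um2, assumed_um_per_px, true_um_per_px):
    """Rescale an area that was measured with the wrong pixel calibration.

    A linear calibration error of factor k = true / assumed scales every
    length by k and therefore every area by k ** 2.
    """
    if assumed_um_per_px <= 0 or true_um_per_px <= 0:
        raise ValueError("calibration values must be positive")
    linear_factor = true_um_per_px / assumed_um_per_px
    return measured_area_um2 * linear_factor ** 2


# Hypothetical example (NOT the authors' values): the software assumed
# 0.25 um per pixel, but the true calibration was 0.5 um per pixel.
area_um2 = corrected_area(450.0, assumed_um_per_px=0.25, true_um_per_px=0.5)
print(area_um2)
```

      In this hypothetical case a twofold linear error produces a fourfold error in area, which is the order of magnitude of the discrepancy the reviewer noticed.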

      - The labeling of the ordinate of Fig. 2I is not readable (x1000 µm2, or x100 µm2?). Authors should make sure that they look at the same muscle part, as fiber sizes can highly vary depending on exact anatomical situation. In small laboratory animals, entire muscle cross sections are usually analyzed to prevent such bias. This may prove difficult in large animals, however, small muscles could easily be identified and cross sections of entire muscles be analyzed. As myostatin KO concerns all skeletal muscles, authors could consider muscle such as FDB or extraocular muscles.

      Thank you for your careful review and suggestions. The vertical axis of Figure 2I is in units of ×1000 μm2, and each data point represents the measured area of one muscle fiber. Because muscle fiber size varies considerably, we plotted the measured areas of all individual muscle fibers, and the mean of the scatter plot was used as the average area of all muscle fibers. We did this to display the distribution of muscle fiber sizes more intuitively.
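
      The summary described above (a mean taken over all individual fiber areas, alongside their distribution) can be sketched as follows. This is an illustrative sketch only: the function name, the bin edges, and the area values are hypothetical and do not come from the manuscript.

```python
from statistics import mean


def summarize_fiber_areas(areas_um2, edges_um2):
    """Return the mean fiber area and the fraction of fibers in each size bin.

    Bins are [0, e1), [e1, e2), ..., [en, inf) for the sorted edges e1..en.
    """
    if not areas_um2:
        raise ValueError("at least one fiber area is required")
    edges = sorted(edges_um2)
    counts = [0] * (len(edges) + 1)
    for area in areas_um2:
        # The bin index is the number of edges the area has reached or passed.
        counts[sum(area >= edge for edge in edges)] += 1
    total = len(areas_um2)
    return mean(areas_um2), [count / total for count in counts]


# Hypothetical per-fiber areas in um^2 (NOT measured data).
areas = [900.0, 1500.0, 1800.0, 2100.0, 2700.0]
average_area, bin_fractions = summarize_fiber_areas(areas, edges_um2=[1000.0, 2000.0])
print(average_area, bin_fractions)
```

      Reporting the bin fractions next to the mean makes a shift toward smaller fibers visible even when the means of two groups are close.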

      - The material of methods of muscle histology and morphometric studies must be included.

      Thank you for your suggestions. We have added the methods for the muscle histology and morphology studies, as well as the statistical methods for the cross-sectional area and number of muscle fibers, to the Materials and Methods.

      - In figures, numbers of experimental animals be given throughout, as well as number of technical repeats. The authors need to provide some minimal data on how the genetically engineered sheep were produced, in addition to how many, the sex etc.....and which of these were analyzed to obtain the data. It is impossible to know when reading this manuscript whether data involving, for example gene seq, westerns, microscopic images etc involves one sheep or some compilation of data.

      Thank you very much for your constructive suggestions, which are of great guiding significance for improving the quality of our manuscript. We have clearly stated the number of experimental animals and the number of biological replicates in all figure legends. We have also provided detailed information on the generation method of gene-edited sheep in the Materials and Methods; the sheep were produced by injecting MSTN sgRNA, FGF5 sgRNA, and Cas9 mRNA into embryos at different ratios.

      - As authors work on Mstn;Fgf5 double KO animals, they should explore whether Fgf5 is expressed in developing sheep muscle, and whether combined KO entails a synergistic effect on muscle development.

      We detected the expression of FGF5 in muscle tissue of WT and MF+/- sheep at 3 months of individual development; FGF5 expression in MF+/- sheep was significantly reduced compared to WT sheep (Figure S2A). We greatly appreciate your very meaningful and valuable comments on the possible synergistic effects of the combined knockout. Because our current study lacks sheep with single gene knockout of MSTN or FGF5, especially homozygous mutants, we will prepare MSTN and FGF5 single gene edited sheep to further explore possible synergistic effects in a follow-up study.

      - The authors should address the question of why their mstn mutation causes fiber hypotrophy, whereas most other work reported the opposite. Why would herein generated mutation act differently? Does mutated myostatin gain a different biological effect? Does it bind to different receptors?

      Thank you very much for your valuable comment. Regarding the possibility of muscle fiber dystrophy in MSTN_Del273C mutation with FGF5 knockout sheep, we have performed a statistical analysis of the proportion of centrally nucleated muscle fibers in MF+/- sheep, which can characterize the occurrence of muscle dystrophy to some extent. The results showed no significant difference in the proportion of centrally nucleated muscle fibers between WT and MF+/- sheep (Figure S2E). We also analyzed the mRNA expression levels of the genes MTM1, DMD, IGF1, SMN1, and GAA, which are related to muscle fiber dystrophy and muscle atrophy. Although the levels of MTM1, IGF1, SMN1, and GAA were significantly increased (Figure S2F), this elevation did not lead to muscle fiber dystrophy or muscle atrophy; instead, it was beneficial for muscle formation. Therefore, we suggest that the phenomenon produced by MSTN_Del273C mutation with FGF5 knockout may not be muscle fiber dystrophy, as MSTN_Del273C mutation with FGF5 knockout significantly promoted the proliferation of sheep skeletal muscle satellite cells (Figure 3A-F). More importantly, MSTN_Del273C mutation with FGF5 knockout improves the muscle phenotype of sheep, including the "double-muscle" phenotype of the rump (Figure 2A), the proportion of gluteus medius to the carcass (Figure 2K), and the proportion of hind leg meat (Table S7). In addition, we analyze in detail in the discussion why the current mutation produces a phenotype different from that reported in other work; we suggest that this may be due to the different mutation sites. Whether mutated myostatin acquires different biological effects and whether it binds to different receptors are indeed very thought-provoking questions, which we plan to investigate in homozygous MSTN and FGF5 mutant sheep.

      - Concerning the in vitro work, authors need to demonstrate whether Mstn and/or FGF5 signaling pathways are altered in myoblasts/myotubes. As both are secreted factors, authors need to show that serum conditioning is changing in myoblast cultures. Authors should perform cultures in which these factors are entirely suppressed and thus the signaling pathways shut down. They could use growth factor depleted supplements and/or add myostatin and FGF5 inhibitors to the serum. They need to determine first the individual effect of myostatin and FGF5 and then challenge the combined effect. They also should perform the inverse experiment and supplement cultures with recombinant factors, both as an individual approach and as a combined approach.

      We greatly appreciate your valuable suggestions. In addition to detecting the MSTN pathway at the cellular level, we also assayed the expression of MSTN receptors and the downstream Smad and Jun families in the gluteus medius, and found that the MSTN Del273C mutation with FGF5 knockout led to upregulation of two receptors, while the expression of the downstream Smad and Jun families was inhibited to varying degrees (Figure S4A). Considering possible serum regulation, we also added data on serum MSTN regulation. We previously tested inhibitors of MSTN and FGF5 but did not observe any effect; we suggest this may be due to the nonspecificity of the inhibitors, as there are no sheep-specific MSTN and FGF5 inhibitors. Given that the phenotype of MSTN gene editing is mutation site dependent, we directly cultured skeletal muscle satellite cells using serum from WT and MF+/- sheep. We found that serum from MF+/- sheep promoted the proliferation of skeletal muscle satellite cells (Figure S4D). The MSTN Del273C mutation with FGF5 knockout promoted FOSL1 expression in cells cultured with WT sheep serum (Figure S4E), which was similar to the results of FBS culture and HS induction. The serum from MF+/- sheep strongly stimulated FOSL1 expression and inhibited MyoD1 (Figure S4F). These results indicate that serum regulation cannot be ignored after the MSTN Del273C mutation with FGF5 knockout.

      - With above suggested additional experiments, authors would also be able to demonstrate, whether Fosl1 is indeed triggered in response to myostatin and/or FGF5 signaling.

      To determine whether FOSL1 indeed acts downstream of MSTN, we added data on FOSL1 expression under serum regulation to support our conclusions. We found that serum from MF+/- sheep strongly stimulated FOSL1 expression and inhibited MyoD1 (Figure S4F).

      - Authors used the t-test in several analyses despite low sample numbers, which as such violates the assumption of equal variance. Non-parametric tests should be used in this case.

      Thank you very much for your valuable comments. We apologize for the previous incorrect use of statistical methods. In the revised version, we have re-analyzed all data. Before performing Student's t-test, we first evaluated the assumptions of normal distribution and equal variance. Two-tailed Student's t-tests were used only for data that conformed to normal distribution and homogeneity of variance; otherwise, Welch's corrected t-tests were performed.
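
      The difference between the two tests named above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' analysis code: the function names and the sample values are hypothetical, and the normality and equal-variance checks mentioned in the response are not reproduced here. It shows that the pooled (Student) and Welch statistics coincide when group sizes and variances are equal, and that Welch's degrees of freedom shrink when the variances differ.

```python
from math import sqrt
from statistics import mean, variance


def student_t(x, y):
    """Pooled-variance (Student) t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))
    return t, nx + ny - 2


def welch_t(x, y):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny  # squared standard errors
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Hypothetical measurements, three animals per group.
equal_a, equal_b = [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]      # same variance
unequal_a, unequal_b = [1.0, 2.0, 3.0], [0.0, 10.0, 20.0]  # very different variance

print(student_t(equal_a, equal_b), welch_t(equal_a, equal_b))
print(student_t(unequal_a, unequal_b), welch_t(unequal_a, unequal_b))
```

      With only three values per group, the reduced degrees of freedom in the unequal-variance case make the Welch test noticeably more conservative, which is the practical reason for switching tests when homogeneity of variance fails.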

      - Authors should state in the legends which statistical test was used.

      Thank you for your suggestion. We have clearly stated the statistical testing method used in all figure legends, which is indeed necessary and important.

      In general, this manuscript should be dramatically scaled back in terms of content, eliminating unnecessary text, figures and tables that do not play a significant role in the findings that were significant. There is some interesting information and data here that can add to the overall base of knowledge surrounding the production of genetically engineered livestock in which myostatin has been targeted for mutation. However, the authors need to focus on their findings that were significant and strongly supported by the data and statistical analysis. Some discussion of findings that support their ideas/hypothesis, but are not statistically significant is fine. But it should not make up the majority of the manuscript which is the case here.

      Thank you for your valuable suggestions, which are essential for improving the quality of our manuscript. We have greatly streamlined and significantly revised the manuscript, removing unnecessary text, figures, and tables.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the Authors):

      Arpin is a negative regulator of Arp2/3 activity. Here the authors investigated the role of arpin in vascular permeability using appropriate cultured human and murine endothelial monolayers and successfully developed arpin KO mice. The results clearly show arpin is expressed in blood vessels (not clear about lymphatics but given leaky vessels, one wonders). The data show that arpin is important for vessel barrier function yet its genetic loss still leads to viable animals in the C57Blk strain albeit with leaky blood vessels. The data are well presented and controls are included. However, the evidence that arpin loss/knockdown causes increased actin functions independent of Arp2/3 is based on pharmacological data and is indirect. Authors conclude ROCK1 activity is elevated and the cause of lost barrier function by arpin reduction. I do have one suggestion for the authors that involves a new study in these animals, which could strengthen their proposed mechanism that the vascular defects are independent of Arp2/3 activity and rather involve ROCK1 but not ZIPK.

      (1) If arpin is working via ROCK1, as the authors infer, perhaps treatment of arpin-/- mice with ROCK1 inhibitor(s) would attenuate vessel permeability while HS38 treatment would not. This type of study would strengthen the conclusion that ROCK1, but not ZIPK, was involved. CK666, if active in mouse cells, could also be tested.

      To analyze vascular permeability in vivo, we performed Miles assays in arpin+/+ and arpin-/- mice using the inhibitors of ROCK1 (Y27632) and ZIPK (HS38). Both Y27632 and HS38 reduced the permeability caused by absence of arpin (new Figure 8E), thus confirming what we observed before in HUVEC (shown in old Figure 7). CK666 did not change the permeability in arpin-/- mice, thus confirming the conclusion that arpin does not regulate vascular permeability via Arp2/3 but rather via ROCK1/ZIPK-mediated stress fiber formation (page 13).

      (2) Fig 5. Data demonstrate that arpin regulates actin filament formation and permeability in HUVEC, but this does not demonstrate that it occurs in an Arp2/3-independent manner. If I understand your data, this is indirect evidence. One needs more information to reach this conclusion. Can authors measure Arp2/3 directly and then test whether arpin knockdown and CK666 have the same capacity to reduce Arp2/3 activity in vitro?

      Arp2/3 activity cannot be measured directly. The commonly used approach is therefore Arp2/3 inhibition via CK666. Our new in vivo permeability assays (see answer above) together with our HUVEC data in Figure 5 clearly show that CK666 does not have the same effect as arpin knock-down, and neither does CK666 rescue the effects of arpin deficiency in vitro and in vivo. Together, these findings clearly suggest that arpin does not regulate endothelial permeability via Arp2/3.

      Minor issues:

      Fig 2, 3 or other Figs: In presented western blots, all proteins should include appropriate mw labels.

      Thank you. Molecular weights have been added to all Western blots.

      Fig 2. Suggest that like your arpin analysis, amounts of AP1AP and PICK1 at baseline and TNF-treatment by blotting should be included. A minor point is yellow color for labels does not stand out and should be changed to another color - as the authors used in Fig 2C.

      We have included Western blots and quantifications for PICK1 in Figure S1A and S1C. An antibody against AP1AP was unfortunately not available.

      The yellow color has been changed to purple for better visibility.

      Fig 2C. The arpin loss at junctions and actin filaments (Figure 2C) is very minor even though it reached statistical significance. It really is not an obvious loss from your 3 color overlay.

      Thank you. It is indeed hard to see. We have now included magnifications in Figure 2C that better show the loss of arpin at junctions.

      Fig 8, text 303-310 shows in vivo evidence of lung congestion and edema. There also appear to be inflammatory cells present in the images. If these are inflammatory cells, it raises the question of whether these mice have an abnormal complete blood cell count (CBC). Suggest adding CBC data for arpin-/- vs control arpin+/+ mice in Fig 8.

      The pathologist observed the presence of lymphocytes and macrophages, indicating the possibility of a (low level) chronic inflammation in arpin-deficient lungs. However, we now also performed hemograms of the mice (new Table S2) that showed no significant difference in the blood cell count of arpin-/- and arpin+/+ mice. Thus, the presence of lymphocytes and macrophages cannot be explained simply by higher leukocyte counts (page 14).

      Line 289, pg 13, Fig 8: Lung levels of arpin are not shown in Fig 8B. Authors must mean another fig?

      Sorry. Arpin protein levels in lungs are shown in figure 8C. This has been corrected on page 13.

      Reviewer #2 (Recommendations For The Authors):

      This is a solid piece of work that adds a small amount of additional factual information to our understanding of cell-cell junctions. The experimental work is of good quality and is sufficient to support the aims of the paper. I think the value of the work is to add this small amount of new knowledge to the archive. I do not believe that further experimental work would add to the paper - it's done. But this doesn't have the impact or completeness for this journal. It belongs in a for-the-record journal.

      We appreciate your overall positive evaluation and your comments that our study represents a solid piece of work with good quality experimental work. However, we are not sure what you mean by “it belongs in a for-the-record journal”. Anyway, we agree that our study does not reveal a complete mechanism of how arpin regulates actin stress fibers, but we respectfully disagree that our study only adds a “small amount of additional factual information”. We may not have been very clear about it, but we present in this study several new discoveries and although some are descriptive in nature that does not make them trivial or less important. We provide for the first time experimental evidence that: 1) arpin is expressed in endothelial cells in vitro and in vivo, and downregulated during inflammation; 2) presence of arpin is required for proper endothelial permeability regulation and junction architecture; 3) arpin exerts these functions in an Arp2/3-independent manner; 4) arpin controls actomyosin contractility in a ROCK1- and ZIPK-dependent fashion; 5) arpin knock-out mice are viable and breed and develop normally but show histological characteristics of a vascular phenotype and increased vascular permeability that can be rescued by inhibition of ROCK1 and ZIPK. The fact that arpin fulfills its functions in endothelial cells independently of the Arp2/3 complex is of special relevance as previously the only known function of arpin was the inhibition of the Arp2/3 complex. Thus, we believe that our study adds a significant amount of new information to the literature. Thank you very much.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Summary Responses: Besides the WT allele, equivalent to the mouse TMEM173 gene, the human TMEM173 gene has two common alleles: the HAQ and AQ alleles carried by billions of people. The main conclusions and interpretation, summarized in the Title and Abstract, are i) Different from the WT TMEM173 allele, the HAQ or AQ alleles are resistant to STING activation-induced cell death; ii) STING residue 293 is critical for cell death; iii) HAQ, AQ alleles are dominant to the SAVI allele; iv) One copy of the AQ allele rescues the SAVI disease in mice. We propose that STING research and STING-targeting immunotherapy should consider human TMEM173 heterogeneity. These interpretations and conclusions were based on Data and Logic. We welcome alternative, logical interpretations and collaborations to advance the human TMEM173 research.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript by Aybar-Torres et al investigated the effect of common human STING1 variants on STING-mediated T cell phenotypes in mice. The authors previously made knock-in mice expressing human STING1 alleles HAQ or AQ, and here they established a new knock-in line Q293. The authors stimulated cells isolated from these mice with STING agonists and found that all three human mutant alleles resist cell death, leading to the conclusion that R293 residue is essential for STING-mediated cell death (there are several caveats with this conclusion, more below). The authors also bred HAQ and AQ alleles to the mouse Sting1-N153S SAVI mouse and observed varying levels of rescue of disease phenotypes with the AQ allele showing more complete rescue than the HAQ allele. The Q293 allele was not tested in the SAVI model. They conclude that the human common variants such as HAQ and AQ have a dominant negative effect over the gain-of-function SAVI mutants.

      Strengths:

      The authors and Dr. Jin's group previously made important observations of common human STING1 variants, and these knock-in mouse models are essential for understanding the physiological function of these alleles.

      Weaknesses:

      However, although some of the observations reported here are interesting, the data collectively does not support a unified model. The authors seem to be drawing two sets of conclusions from in vitro and in vivo experiments, and neither mechanism is clear. Several experiments need better controls, and these knock-in mice need more comprehensive functional characterization.

      (1) In Figure 1, the authors are trying to show that STING agonist-induced splenocyte cell death is blocked by HAQ, AQ and Q alleles. The conclusion at line 134 should be splenocytes, not lymphocytes. Most experiments in this figure were done with a mixed population that may involve cell-to-cell communication. Although TBK1-dependence is likely, a single inhibitor treatment of a mixed population is not sufficient to reach this conclusion.

      We greatly appreciate Reviewer 1's insights. We changed “lymphocytes” to “splenocytes” (line 133) as suggested. We respectfully disagree with Reviewer 1’s comments on TBK1. First, we used two different TBK1 inhibitors: BX795 and GSK8612. Second, because BX795 also inhibits PDK1, we used a PDK1 inhibitor, GSK2334470. Third, both BX795 and GSK8612 completely inhibited diABZI-induced splenocyte cell death (Figure 1B) (lines 128 – 133). The logical conclusion is “TBK1 activation is required for STING-mediated mouse spleen cell death ex vivo” (line 117).

      Our discovery that the common human TMEM173 alleles are resistant to STING activation-induced cell death is a substantial finding. It further strengthens the argument that the HAQ and AQ alleles are functionally distinct from the WT allele 1-3. We wish to underscore the crucial message of this study, that 'STING research and STING-targeting immunotherapy should consider TMEM173 heterogeneity in humans' (line 37), which has been largely overlooked in current STING clinical trials 4.

      Regarding STING-Cell death, as we stated in the Introduction (lines 65-77): i) STING-mediated cell death is cell type-dependent 5-7 and type I IFNs-independent 5,7,8. ii) The in vivo biological significance of STING-mediated cell death is not clear 7,8. iii) The mechanisms of STING-Cell death remain controversial. Multiple cell death pathways, i.e., apoptosis, necroptosis, pyroptosis, ferroptosis, and PANoptosis, are proposed 7,9,10. SAVI/HAQ and SAVI/AQ prevented lymphopenia and alleviated SAVI disease in mice. Thus, the manuscript provides some answers to the biological significance of STING-cell death in vivo, which is new. Regarding the molecular mechanism, splenocytes from Q293/Q293 mice are resistant to STING cell death. The logical conclusion is that amino acid 293 is critical for STING cell death (line 29).

      Extensive studies, beyond the scope of this manuscript, are needed on how aa293 and TBK1 mediate STING-Cell death to resolve the controversies in the STING-cell death field (e.g. apoptosis, necroptosis, pyroptosis, ferroptosis, and PANoptosis).

      (2) The Q293 knock-in mouse needs to be characterized and compared to HAQ and AQ. Is this mutant expressed in tissues? Does this mutant still produce IFN and other STING activities? Is the protein expression level altered on Western blot? Is the mutant protein trafficking affected? In the authors' previous publications and some of the Western blots here, expression levels of each of these human STING1 proteins in mice are drastically different. HAQ and AQ also have different effects on metabolism (pmid: 36261171), which could complicate interpretation of the T cell phenotypes.

      These are very important questions that require rigorous investigations that are beyond the scope of this manuscript. This manuscript, titled “The common TMEM173 HAQ, AQ alleles rescue CD4 T cellpenia, restore T-regs, and prevent SAVI (N153S) inflammatory disease in mice”, does not focus on Q293 mice. We have been investigating the common human TMEM173 alleles since 2011, from the discovery 11, to mouse models 1,3, a human clinical trial 2, and human genetics studies 3. This manuscript is another step towards understanding these common human TMEM173 alleles, with the new discovery that the HAQ and AQ alleles are resistant to STING cell death.

      (3) HAQ/WT and AQ/WT splenocytes are protected from STING agonist-induced cell death equally well (Figure 1G). HAQ/SAVI shows less rescue compared to AQ/SAVI. These are interesting observations, but mechanism is unclear and not clearly discussed. E.g., how does AQ protect disease pathology better than HAQ (that contains AQ)? Does Q293 allele also fully rescue SAVI?

      In this manuscript, Figure 6 shows AQ/SAVI had more T-regs than HAQ/SAVI (lines 251 – 261). In our previous publication on HAQ, AQ knockin mice, we showed that AQ T-regs have more IL-10 than HAQ T-regs 3. Thus, increased IL-10+ Tregs in AQ mice may contribute to an improved phenotype in AQ/SAVI compared to HAQ/SAVI. However, we are not excluding other contributions (e.g. metabolic difference) (lines 332-335). We are exploring these possibilities.  

      (4) Figure 2 feels out of place. First of all, why are the authors using human explant lung tissues? PBMCs should be a better source for lymphocytes. In untreated conditions, both CD4 and B cells show ~30% dying cells, but CD8 cells show 0% dying cells. This raises technical concerns about the CD8 T cell property or gating strategy because in the mouse experiment (Figure 1A) all primary lymphocytes show ~30% cell death at steady-state. Second, in Figure 2C, this type of partial effect needs multiple human donors to confirm. Third, the reconstitution of THP1 cells seems out of place. STING-mediated cell death mechanisms in myeloid and lymphoid cells are likely different. If the authors want to demonstrate cell death in myeloid cells using THP1, then these reconstituted cell lines need to be better validated: expression, IFN signaling, etc. The parental THP1 cells are HAQ/HAQ; how does that compare to the reconstitutions? There are published studies showing THP1-STING-KO cells reconstituted with human variants do not respond to STING agonists as expected. The authors need to be scientifically rigorous in validation and cautious in their interpretations.

      Figure 2 is necessary because it reveals the difference between mouse and human STING cell death, which is critical to understand STING in human health and diseases (lines 160-161). Figure 2A-2B showed that STING activation killed human CD4 T cells, but not human CD8 T cells or B cells. This observation is different from Figure 1A, where STING activation killed mouse CD4, CD8 T cells, and CD19 B cells, revealing the species-specific STING cell death responses. Regarding human CD8 T cells, as we stated in the Discussion (lines 323-325), human CD8 T cells (PBMC) are not as susceptible as the CD4 T cells to STING-induced cell death 8. We used lung lymphocytes, which showed similar observations (Figure 2A). For Figure 2C, we used 2 WT/HAQ and 3 WT/WT individuals (lines 738-739). We generated HAQ, AQ THP-1 cells in STING-KO THP-1 cells (Invivogen, cat no. thpd-kostg) (lines 380-387).

      A recent study found that a new STING agonist SHR1032 induces cell death in STING-KO THP-1 cells expressing WT(R232) human STING 10 (line 182). SHR1032 suppressed THP1-STING-WT(R232) cell growth at GI50: 23 nM while in the parental THP1-STING-HAQ cells, the GI50 of SHR1032 was >103 nM 10. Cytarabine was used as an internal control where SHR1032 killed more robustly than cytarabine in the THP1-STING-WT(R232) cells but much less efficiently than cytarabine in the THP-1-STING-HAQ cells 10. 

      Our manuscript rigorously uses mouse splenocytes, human lung lymphocytes, THP-1 reconstituted with HAQ, AQ, and HAQ/SAVI, AQ/SAVI mice, to demonstrate that the common human HAQ, AQ alleles are resistant to STING cell death in vitro and in vivo.

      We agree with Reviewer 1 that STING-mediated cell death mechanisms in myeloid and lymphoid cells may be different and likely contribute to the different mechanisms proposed in STING cell death research 7,9,10. Our study focuses on the in vivo STING-mediated T cellpenia.

      (5) Figure 2G, H, I are confusing. AQ is more active in producing IFN signaling than HAQ and Q is the least active. How to explain this?

      We stated in the Introduction that “AQ responds to CDNs and produce type I IFNs in vivo and in vitro 3,12,13” (lines 92-93). We reported that the AQ knock-in mice responded to STING activation 3. We previously showed that there was a negative natural selection on the AQ allele in individuals outside of Africa 3. 28% of Africans are WT/AQ but only 0.6% of East Asians are WT/AQ 3. In contrast, the HAQ allele was positively selected in non-Africans 3. Investigation to understand the mechanisms and biological significance of these naturally selected human TMEM173 alleles has been ongoing in the lab.

      (6) The overall model is unclear. If HAQ, AQ and Q are loss-of-function alleles and Q is the key residue for STING-mediated cell death, then why AQ is the most active in producing IFN signaling and AQ/SAVI rescues disease most completely? If these human variants act as dominant negatives, which would be consistent with the WT/het data, then how do you explain AQ is more dominant negative than HAQ?

      In this manuscript, Figure 6 shows AQ/SAVI had more T-regs than HAQ/SAVI (lines 251 – 261). In our previous publication on HAQ, AQ knockin mice, we showed that AQ T-regs have more IL-10 and mitochondria activity than HAQ T-regs 3. Nevertheless, we are not excluding other contributions (e.g. metabolic difference) by the AQ allele (lines 332-335). Last, we used modern human evolution to discover the dominance of these common human STING alleles. In modern humans outside Africa, HAQ was positively selected while AQ was negatively selected 3. However, AQ is likely dominant to HAQ because there are no HAQ/AQ individuals outside Africa. The genetic dominance of common human TMEM173 alleles is a new concept. More investigation is ongoing.

      (7) As a general note, SAVI disease phenotypes involve multiple cell types. Lymphocyte cell death is only one of them. The authors' characterization of SAVI pathology is limited and did not analyze immunopathology of the lung.

      Both radioresistant parenchymal and/or stromal cells and hematopoietic cells influence SAVI pathology in mice 14,15. Nevertheless, the lack of CD4 T cells, including the anti-inflammatory T-regs, likely contributes to the inflammation in SAVI mice and patients 16. We characterized lung function, lung inflammation (Figure 4), lung neutrophils, and inflammatory monocyte infiltration (Figure S5) (lines 232-235).

      (8) Line 281, the discussion on HIV T cell death mechanism is not relevant and over-stretching. This study did not evaluate viral infection in T cells at all. The original finding of HAQ/HAQ enrichment in HIV/AIDS was 2/11 in LTNP vs 0/11 in control, arguably not the strongest statistics.

      Several publications have linked STING to HIV pathogenesis 17-22 (line 271). CD4 T cellpenia is a hallmark of AIDS. The manuscript studies STING activation-induced T cellpenia in vivo. It is not stretching to ask, for example, whether preventing STING T cell death (e.g. HAQ, AQ alleles) can restore CD4 T cell counts and improve care for AIDS patients.
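
      As a rough check on the counts quoted in comment (8), 2 of 11 versus 0 of 11, the arithmetic can be sketched with a two-sided Fisher exact test. The choice of test is an assumption made here for illustration; neither the reviewer nor the authors name one, and the function name is hypothetical. The sketch uses only the Python standard library and exact integer arithmetic.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def weight(k):
        # Number of tables with k in the top-left cell, all margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k)

    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    observed = weight(a)
    # Sum every table that is no more probable than the observed one.
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return Fraction(extreme, comb(n, row1))


# Counts from comment (8): 2 HAQ/HAQ of 11 in one group, 0 of 11 in the other.
p = fisher_exact_two_sided(2, 9, 0, 11)
print(p, round(float(p), 3))  # prints: 10/21 0.476
```

      Under this assumed test the quoted counts alone do not separate the two groups, which is consistent with the reviewer's remark about the strength of the statistics; it says nothing about the other publications the authors cite.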

      Reviewer #2 (Public Review):

      Aybar-Torres and colleagues utilize common human STING alleles to dissect the mechanism of SAVI inflammatory disease. The authors demonstrate that these common alleles alleviate SAVI pathology in mice, and perhaps more importantly use the differing functionality of these alleles to provide insight into requirements of SAVI disease induction. Their findings suggest that it is residue A230 and/or Q293 that are required for SAVI induction, while the ability to induce an interferon-dependent inflammatory response is not. This is nicely exemplified by the AQ/SAVI mice that have an intact inflammatory response to STING activation, yet minimal disease progression. As both mutants seem to be resistant to STING-dependent cell death, this manuscript also alludes to the importance of STING-dependent cell death, rather than STING-dependent inflammation, in the progression of SAVI pathology. While I have some concerns, I believe this manuscript makes some important connections between STING pathology mouse models and human genetics that would contribute to the field.

      Some points to consider:

      (1) While the CD4+ T cell counts from HAQ/SAVI and AQ/SAVI mice suggest that these T cells are protected from STING-dependent cell death, an assay that explores this more directly would strengthen the manuscript. This is also supported by Fig 2C, but I believe a strength of this manuscript is the comparison between the two alleles. Therefore, if possible, I would recommend the isolation of T cells from these mice and direct stimulation with diABZI or other STING agonist with a cell death readout.

      Please see the new Figure S3 for cell death by diABZI, DMXAA in Splenocytes from WT/WT, WT/HAQ, HAQ/SAVI, AQ/SAVI mice. The HAQ/SAVI and AQ/SAVI splenocytes showed similar partial resistance to STING activation-induced cell death (lines 214-216).

      (2) Related to the above point - further exemplifying that the Q293 locus is essential to disease, even in human cells, would also strengthen the paper. It seems that CD4 T cell loss is a major component of human SAVI. While not completely necessary, repeating the THP1 cell death experiments from Fig 2 with a human T cell line would round out the study nicely.

      We examined HAQ, AQ mouse splenocytes, HAQ human lung lymphocytes, THP-1 reconstituted with HAQ, AQ, and HAQ/SAVI, AQ/SAVI mice, to demonstrate that the common human HAQ, AQ alleles are resistant to STING cell death in vitro and in vivo. Additional human T cell line work does not add too much. We hope to conduct more human PBMC or lung lymphocytes STING cell death experiments from HAQ, AQ individuals as we continue the human STING alleles investigation.

      (3) While I found the myeloid cell counts and BMDM data interesting, I think some more context is needed to fully loop this data into the story. Is myeloid cell expansion exemplified by SAVI patients? Do we know if myeloid cells are the major contributors to the inflammation these patients experience? Why should the SAVI community care about the Q293 locus in myeloid cells?

      This is likely a misunderstanding. We use BMDM for the purpose of comparing STING signaling (TBK1, IRF3, NFkB, STING activation) by WT/SAVI, HAQ/SAVI, AQ/SAVI. Ideally, we would like to compare STING signaling in CD4 T cells from WT/SAVI to HAQ/SAVI, AQ/SAVI mice. However, WT/SAVI has no CD4 T cells. Doing so, we are making the assumption that the basic STING signaling (TBK1, IRF3, NFkB, STING activation) is conserved between T cells and macrophages.

      (4) The functional assays in Figure 4 are exciting and really connect the alleles to disease progression. To strengthen the manuscript and connect all the data, I would recommend additional readouts from these mice that address the inflammatory phenotype shown in vitro in Figure 5. For example, measuring cytokines from these mice via ELISA or perhaps even Western blots looking for NFkB or STING activation would be supportive of the story. This would also allow for some tissue specificity. I believe looking for evidence of inflammation and STING activation in the lungs of these mice, for example, would further connect the data to human SAVI pathology.

      Reviewer 2 suggests looking for evidence of inflammation and STING activation in the lungs of HAQ/SAVI, AQ/SAVI. We would like to elaborate further. First, anti-inflammatory treatments, e.g. steroids, DMARDs, IVIG, Etanercept (TNF), rituximab, Nifedipine, amlodipine, et al., all failed in SAVI patients 23. JAK inhibitors on SAVI had mixed outcomes (lines 55-58). Second, Figure S5 examined lung neutrophils and inflammatory monocyte infiltration. Interestingly, while AQ/SAVI mice had a better lung function than HAQ/SAVI mice (Figure 4D, 4E vs 4H, 4I), HAQ/SAVI and AQ/SAVI lungs had comparable neutrophils and inflammatory monocyte infiltration (Figure S5). Last, SAVI is classified as type I interferonopathy 23, but the lung diseases of SAVI are mainly independent of type I IFNs 24-27. The AQ allele suppresses SAVI in vivo.  Understanding the mechanisms by which AQ rescues SAVI may lead to curative care for SAVI patients.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      One suggestion is to streamline this study by focusing on STING-mediated cell death only in CD4 T cells. The authors can use in vitro PBMC isolated human T cells, ex vivo T cells from the knock-in mice, and in vivo T cells from the SAVI breeding. The current manuscript includes myeloid cell death, Tregs, complex SAVI disease pathology, which is too confusing and too complex to explain with the varying effect from the three human STING1 variants.

      We sincerely appreciate Reviewer 1’s suggestion. The goal of our human STING alleles research has always been translational, i.e. improving human health. Even as a monogenic disease, the SAVI pathology is still complex. For example, although thought of as a type I interferonopathy, SAVI is largely independent of type I IFNs. Similarly, STING-activation-induced cell death, while contributing to SAVI, is not the whole story, as the Reviewer pointed out in Comments 3, 6, and 7. HAQ/SAVI mice still died early and had lung dysfunction (Figure 4). In contrast, AQ/SAVI mice had restored lifespan and lung function. Figure 6 shows that T-regs differ between AQ/SAVI and HAQ/SAVI mice. In addition, AQ mice had more IL-10+ T-regs than HAQ mice 3. Therefore, we are excited about developing AQ-based curative therapy for SAVI patients (preventing cell death and inducing immune tolerance). Again, we thank the Reviewer for the suggestion. Additional research is ongoing.

      Reviewer #2 (Recommendations For The Authors):

      Minor points

      (1) Generation of THP1 cells with the human STING alleles is missing from methods.

      We added the protocol to the Methods (lines 380-387). The THP-1 KO line stably expressing WT STING was first described by Weikang Tao’s group 10.

      (2) Some abbreviations are not expanded (CDA).

      CDA is expanded as cyclic di-AMP (e.g. line 375).

      References.

      (1) Patel, S. et al. The Common R71H-G230A-R293Q Human TMEM173 Is a Null Allele. J Immunol 198, 776-787 (2017).

      (2) Sebastian, M. et al. Obesity and STING1 genotype associate with 23-valent pneumococcal vaccination efficacy. JCI Insight 5 (2020).

      (3) Mansouri, S. et al. MPYS Modulates Fatty Acid Metabolism and Immune Tolerance at Homeostasis Independent of Type I IFNs. J Immunol 209, 2114-2132 (2022).

      (4) Sivick, K. E. et al. Comment on "The Common R71H-G230A-R293Q Human TMEM173 Is a Null Allele". J Immunol 198, 4183-4185 (2017).

      (5) Gulen, M. F. et al. Signalling strength determines proapoptotic functions of STING. Nat Commun 8, 427 (2017).

      (6) Kabelitz, D. et al. Signal strength of STING activation determines cytokine plasticity and cell death in human monocytes. Sci Rep 12, 17827 (2022).

      (7) Murthy, A. M. V., Robinson, N. & Kumar, S. Crosstalk between cGAS-STING signaling and cell death. Cell Death Differ 27, 2989-3003 (2020).

      (8) Kuhl, N. et al. STING agonism turns human T cells into interferon-producing cells but impedes their functionality. EMBO Rep 24, e55536 (2023).

      (9) Li, C., Liu, J., Hou, W., Kang, R. & Tang, D. STING1 Promotes Ferroptosis Through MFN1/2-Dependent Mitochondrial Fusion. Front Cell Dev Biol 9, 698679 (2021).

      (10) Song, C. et al. SHR1032, a novel STING agonist, stimulates anti-tumor immunity and directly induces AML apoptosis. Sci Rep 12, 8579 (2022).

      (11) Jin, L. et al. Identification and characterization of a loss-of-function human MPYS variant. Genes Immun 12, 263-269 (2011).

      (12) Yi, G. et al. Single nucleotide polymorphisms of human STING can affect innate immune response to cyclic dinucleotides. PLoS One 8, e77846 (2013).

      (13) Patel, S. et al. Response to Comment on "The Common R71H-G230A-R293Q Human TMEM173 Is a Null Allele". J Immunol 198, 4185-4188 (2017).

      (14) Gao, K. M. et al. Endothelial cell expression of a STING gain-of-function mutation initiates pulmonary lymphocytic infiltration. Cell Rep 43, 114114 (2024).

      (15) Gao, K. M., Motwani, M., Tedder, T., Marshak-Rothstein, A. & Fitzgerald, K. A. Radioresistant cells initiate lymphocyte-dependent lung inflammation and IFNgamma-dependent mortality in STING gain-of-function mice. Proc Natl Acad Sci U S A 119, e2202327119 (2022).

      (16) Hu, W. et al. Regulatory T cells function in established systemic inflammation and reverse fatal autoimmunity. Nat Immunol 22, 1163-1174 (2021).

      (17) Monroe, K. M. et al. IFI16 DNA sensor is required for death of lymphoid CD4 T cells abortively infected with HIV. Science 343, 428-432 (2014).

      (18) Doitsh, G. et al. Cell death by pyroptosis drives CD4 T-cell depletion in HIV-1 infection. Nature 505, 509-514 (2014).

      (19) Jakobsen, M. R., Olagnier, D. & Hiscott, J. Innate immune sensing of HIV-1 infection. Curr Opin HIV AIDS 10, 96-102 (2015).

      (20) Silvin, A. & Manel, N. Innate immune sensing of HIV infection. Curr Opin Immunol 32, 54-60 (2015).

      (21) Altfeld, M. & Gale, M., Jr. Innate immunity against HIV-1 infection. Nat Immunol 16, 554-562 (2015).

      (22) Krapp, C., Jonsson, K. & Jakobsen, M. R. STING dependent sensing - Does HIV actually care? Cytokine Growth Factor Rev 40, 68-76 (2018).

      (23) Liu, Y. et al. Activated STING in a vascular and pulmonary syndrome. N Engl J Med 371, 507-518 (2014).

      (24) Luksch, H. et al. STING-associated lung disease in mice relies on T cells but not type I interferon. J Allergy Clin Immunol 144, 254-266 e258 (2019).

      (25) Stinson, W. A. et al. The IFN-gamma receptor promotes immune dysregulation and disease in STING gain-of-function mice. JCI Insight 7 (2022).

      (26) Warner, J. D. et al. STING-associated vasculopathy develops independently of IRF3 in mice. J Exp Med 214, 3279-3292 (2017).

      (27) Fremond, M. L. et al. Overview of STING-Associated Vasculopathy with Onset in Infancy (SAVI) Among 21 Patients. J Allergy Clin Immunol Pract 9, 803-818 e811 (2021).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this work, the authors provide a comprehensive description of transcriptional regulation in Pseudomonas syringae by investigating the binding characteristics of various transcription factors. They uncover the hierarchical network structure of the transcriptome by identifying top-, middle-, and bottom-level transcription factors that govern the flow of information in the network. Additionally, they assess the functional variability and conservation of transcription factors across different strains of P. syringae by studying DNA-binding characteristics. These findings notably expand our current knowledge of the P. syringae transcriptome.

      The findings associated with crosstalk between transcription factors and pathways, and the diversity of transcription factor functions across strains provide valuable insights into the transcriptional regulatory network of P. syringae. However, these results are at times underwhelming as their significance is unclear. This study would benefit from a discussion of the implications of transcription factor crosstalk on the functioning of the organism as a whole. Additionally, the implications of variability in transcription factor functions on the phenotype of the strains studied would further this analysis.

      Overall, this manuscript serves as a key resource for researchers studying the transcriptional regulatory network of P. syringae.

      Thank you for your positive comments.

      Reviewer #2 (Public Review):

      Summary:

      The phytopathogenic bacterium Pseudomonas syringae is comprised of many pathovars with different host plant species and has been used as a model organism to study bacterial pathogenesis in plants. Transcriptional regulation is key to plant infection and adaptation to host environments by this bacterium. However, researchers have focused on a limited number of transcription factors (TFs) that regulate virulence-related pathways. Thus, a comprehensive, systems-level understanding of regulatory interactions between transcription factors in P. syringae has not been achieved.

      This study by Sun et al performed ChIP-seq analysis of 170 out of 301 TFs in P. syringae pv. syringae 1448A and used this unique dataset to infer transcriptional regulatory networks in this bacterium. The network analyses revealed hierarchical interactions between TFs, various network motifs, and co-regulation of target genes by TF pairs, which collectively mediate information flow. As discussed, the structure and properties of the P. syringae transcriptional regulatory networks are somewhat different from those identified in humans, yeast, and E. coli, highlighting the significance of this study. Further, the authors made use of the P. syringae transcriptional regulatory networks to find TFs of unknown functions to be involved in virulence-related pathways. For some of these TFs, their target specificity and biological functions, such as motility and biofilm formation, were experimentally validated. Of particular interest is the finding that despite conservation of TFs between P. syringae pv. syringae 1448A, P. syringae pv. tomato DC3000, P. syringae pv. syringae B728a, and P. syringae pv. actinidiae C48, some of the conserved TFs show different repertoires of target genes in these four P. syringae strains.

      Thank you for your positive comments.

      Strengths:

      This study presents a systems-level analysis of transcriptional regulatory networks in relation to P. syringae virulence and metabolism, and highlights differences in transcriptional regulatory landscapes of conserved TFs between different P. syringae strains, and develops a user-friendly database for mining the ChIP-seq data generated in this study. These findings and resources will be valuable to researchers in the fields of systems biology, bacteriology, and plant-microbe interactions.

      Thank you for your positive comments.

      Weaknesses:

      No major weaknesses were found, but some of the results may need to be interpreted with caution. ChIP-seq was performed with bacterial strains overexpressing TFs. This may cause artificial binding of TFs to promoters which may not occur when TFs are expressed at physiological levels. Another caution is applied to the interpretation of the biological functions of TFs. The biological roles of the tested TFs are based on in vitro experiments. Thus, functional relevance of the tested TFs during plant infection and/or survival under natural environmental conditions remains to be demonstrated.

      Thank you for your comments; we agree with the reviewer. To rule out artificial binding of TFs, we performed EMSAs to verify the identified targets. The EMSA results confirmed the binding peaks that we analyzed.

      To verify the biological functions of the TFs, we also performed in vivo motility and biofilm production assays (Figures 3b-d). To further examine these functions, we performed a plant infection assay for the TF PSPPH2193 under natural environmental conditions (bean leaves). As shown in Figures S6c and g, both the motility and the virulence of the ∆PSPPH2193 strain were significantly reduced compared with the WT strain. These results show that the TF PSPPH2193 positively regulates the pathogenicity of P. syringae by modulating bacterial motility.

      Reviewer #3 (Public Review):

      Summary:

      This study aims to understand gene regulation of the plant bacterial pathogen Pseudomonas syringae. Although the function of some TFs has been characterized in this strain, a global picture of the gene regulatory network remains elusive. The authors conducted a large-scale ChIP-seq analysis, covering 170 out of 301 TFs of this strain, and revealed gene regulatory hierarchy with functional validation of some previously uncharacterized TFs.

      Thank you for your positive comments.

      Strengths:

      - This study provides one of the largest ChIP-seq datasets for a single bacterial strain, covering more than half of its TFs. This impressive resource enabled comprehensive systems-level analysis of the TF hierarchy.

      - This study identified novel gene regulation and function with validations through biochemical and genetic experiments.

      - The authors attempted on broad analyses including comparisons between different bacterial strains, providing further insights into the diversity and conservation of gene regulatory mechanisms.

      Thank you for your positive comments.

      Weaknesses:

      (1) Some conclusions are not backed by quantitative or statistical analyses, and they are sometimes overinterpreted.

      Thank you for your comments. We used a hypergeometric test in this analysis. Although only one gene was enriched in some pathways, the adjusted p-value was less than 0.05. We added the details to the revised manuscript.

      (2) Some figures and analyses are not well explained, and I was not able to understand them.

      Thank you for your comments, and we are sorry for the confusion. We defined ‘indirect interaction’ as ‘co-association’ and ‘cooperativity’ as ‘if the common target of two TFs is from a TF’. We added the definitions of ‘indirect interaction’ and ‘cooperativity’ to the revised legend.

      For Figure S3a, the low co-association scores and large peak numbers of these top-level TFs indicate that top-level TFs tend to regulate target genes on their own rather than co-regulate them with other top-level TFs. PSPPH4700 is one example of such a TF. We revised the sentence to ‘For example, the top-level TF PSPPH4700 yielded over 1,700 peaks but cooperated with only 24 top-level TFs with low co-association scores about 0.05 (Supplementary Table 2b).’.

      We analyzed the high co-association scores of 125 TFs across the three levels and further determined their co-association patterns. To identify co-association tendencies among these 125 TFs, we classified the co-association patterns into 4 clusters. Bottom-level TFs tend to co-regulate target genes with other TFs. We revised the sentence in the revised manuscript.

      For Figure 2b, in C1, C2 and C4, many bottom-level TFs showed co-association patterns with other TFs, especially other bottom-level TFs (shown in C4). To explore the regulatory pattern in C3, we compared the peak locations in the target genes of MexT with those of the TFs in C3: seven top-level TFs (PSPPH1435, PSPPH1758, PSPPH2193, PSPPH2454, PSPPH4638, PSPPH4998 and PSPPH3411), three middle-level TFs (PSPPH1100, PSPPH5132 and PSPPH5144) and four bottom-level TFs (PSPPH0700, PSPPH2300, PSPPH2444 and PSPPH2580). MexT showed higher co-association scores (above 60) with more of the top-level TFs. We therefore concluded that MexT has closer co-association relationships with top-level TFs. We added this statement to the revised manuscript.

      For Figure 1a, the hierarchical network contained different numbers of TFs at the three levels (54 top-level TFs, 62 middle-level TFs and 147 bottom-level TFs), which indicates that more than half of the TFs (the bottom-level TFs) tend to be regulated by other TFs and then bind directly to target genes. This finding shows a downward direction of transcriptional regulation in P. syringae. We revised the statement in the revised manuscript.

      (3) The Method section lacks depth, especially in data analyses. It is strongly recommended that the authors share their analysis codes so that others can reproduce the analyses.

      Thank you for your comments. We defined the intergenic region upstream of each TF sequence as the promoter region. As the pHM1 plasmid carries its own constitutive promoter (lacZ promoter), we amplified the TF-coding sequence and cloned it into the site downstream of that promoter, so that TF expression was driven by the plasmid promoter. Psph 1448A was used for our main ChIP-seq experiments. We added these details to the revised manuscript.

      For Figure S3, we performed GO analysis on genes that were co-bound by TF pairs. We added the details in the revised manuscript.

      We shared our analysis codes on the website (https://github.com/dengxinb2315/PS-PATRnet-code) in the Data Availability.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      (1) The specific strain of Pseudomonas syringae used in the study outside of the evolutionary analysis should be specified in the abstract and main text.

      Thank you for your suggestion. We revised the statements in the abstract and main text to specify the strains.

      (2) The language used throughout the manuscript should be revised for clarity, conciseness, and readability.

      Thank you for your suggestion. The language throughout the manuscript has been revised by a scientific editor who is a native speaker of English.

      (2) Line 688: Replace "80C" with "-80C".

      Thank you for your correction. We revised ‘80℃’ to ‘-80℃’. Please see Line 713.

      (3) Line 172 - 173: The abbreviations TT, MM, BB, TM, TB, and MB need to be expanded in the main text before their use.

      Thank you for your suggestion. We expanded the abbreviations TT, MM, BB, TM, TB, and MB in the manuscript. Please see Lines 172-174.

      Reviewer #2 (Recommendations For The Authors):

      Major points

      (1) The name of the P. syringae strains used in each experiment/analysis should be explicitly stated (most experiments were carried out with P. syringae strain 1448A). This should also be applied to the introduction where many papers on P. syringae are cited without clear indication of strain names. I think this amendment is essential because target genes and thus biological functions of TFs could be different between P. syringae strains, as shown in the present study.

      Thank you for your suggestion. We now specify the P. syringae strains in the citations throughout the manuscript.

      (2) How many TFs were analyzed throughout the study? Most sentences including line 22 in the abstract say 170, but I also found some say 270 (for example, line 106 and line 149). The legend of Figure 1 says 262. More detailed information is required regarding the datasets used for each analysis.

      Thank you for your suggestion. In this study, 170 TFs were analyzed by ChIP-seq; in our previous study, 100 TFs were analyzed by HT-SELEX. The hierarchical analysis integrated the ChIP-seq and HT-SELEX data and therefore included 270 TFs. As 8 TFs did not show hierarchical characteristics, the legend of Figure 1 states 262 TFs. We added the data sources to the revised manuscript. Please see Lines 104, 147, 160 and 1082.

      (3) Figure 1b: Please define "indirect interaction" and "cooperativity" in the legend as well as in the text. I only found the definition of "direct interaction".

      Sorry for the missing information. We defined ‘indirect interaction’ and ‘cooperativity’ as ‘co-association’ and ‘if the common target of two TFs is from a TF’, respectively. We added the definitions of ‘indirect interaction’ and ‘cooperativity’ to the revised legend. Please see Lines 174-176 and 1084-1086.

      (4) I found it very interesting that conserved TFs show different repertoires of target genes in different P. syringae strains. This suggests the rewiring of transcriptional regulatory networks in P. syringae strains, but the underlying mechanism is not explored in the current manuscript. It can be easily tested whether these conserved TFs bind to similar or different motifs by motif enrichment analysis. If they bind to similar motifs, it is possible that the promoter sequences of their target genes have diversified. Addressing or at least discussing these points would provide molecular insights into the diversification of the transcriptional regulatory networks in P. syringae. Similarly, functional enrichment analysis of target genes can be used to test whether the conserved TFs regulate different biological processes.

      Thank you for your suggestion. We added motif analysis and functional enrichment analysis of the target genes of two TFs (PSPPH3122 and PSPPH4127) in different P. syringae strains. We found two different motifs (AGACN4GATCAA and CGGACGN3GATCA) in the 1448A and DC3000 strains, respectively. We also performed GO analysis and found functions specific to PSPPH3122 in Psph 1448A compared with the Pst DC3000 and Pss B728a strains, including recombinase activity and DNA recombination. For PSPPH4127, we found four different motifs in the four P. syringae strains. GO analysis showed its association with recombinase activity in the Psph 1448A strain, and with RNA binding, structural constituent of ribosome, translation and ribosome in the Pss B728a strain. These results indicate the high functional diversity of TFs in P. syringae. We added these points to the Results section and added Figures S9-S10 to the revised manuscript. Please see Lines 497-509.
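
      As an illustration of how consensus motifs such as those reported above could be scanned against a promoter sequence, the following minimal Python sketch converts a motif string into a regular expression. It assumes that `N4` denotes four consecutive unspecified nucleotides (the usual reading of this notation) and searches the forward strand only. The function names `motif_to_regex` and `find_motif` and the example sequence are hypothetical and are not taken from the manuscript or its code repository.

```python
import re


def motif_to_regex(motif):
    """Turn a consensus such as 'AGACN4GATCAA' into a compiled regex.

    Assumption: 'N' followed by a number k means k unspecified bases;
    a bare 'N' means one unspecified base.
    """
    def expand(match):
        count = match.group(1) or "1"
        return "[ACGT]{%s}" % count

    return re.compile(re.sub(r"N(\d*)", expand, motif))


def find_motif(motif, sequence):
    """Return 0-based start positions of non-overlapping forward-strand matches."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Hypothetical promoter fragment containing one match to the first motif.
example_sequence = "ttAGACacgtGATCAAtt"
positions = find_motif("AGACN4GATCAA", example_sequence)
```

      A complete scan would also search the reverse complement and would usually work from a position weight matrix rather than an exact consensus; the sketch only shows how the spacer notation can be read.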

      (5) Related to point 4, it would be quite useful if a list of orthologous genes of 1448A TFs in the other tested P. syringae strains were provided. Such information may also enhance the utility of the database developed in this study.

      Thank you for your suggestion. We added the list of orthologous genes of the 301 Psph 1448A TFs in the other tested P. syringae strains to Supplementary Table 5. Please see Line 467 and Supplementary Table 5.

      (6) Lines 243-246: It is unclear how these functional enrichment analyses were performed. Did you use target genes regulated by individual TFs or those coregulated by pairs of TFs? Please add more information for the sake of readers.

      Thank you for your suggestion. We performed the functional enrichment analyses with a hypergeometric test (BH-adjusted p < 0.05), using the target genes regulated by individual TFs. We added the details to the Results section. Please see Lines 248-252, 270, 1194-1195, 1199-1200 and 1205-1206.
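
      For readers unfamiliar with the test named above, the following is a minimal sketch of a hypergeometric enrichment test followed by Benjamini-Hochberg (BH) adjustment, written in plain Python. It is not the authors' analysis code; the function names and all gene counts are hypothetical and chosen only to show that a single target gene can be significant when the pathway itself is very small.

```python
from math import comb


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k) for X ~ Hypergeometric.

    population: genes in the genome; annotated: genes in the pathway;
    drawn: target genes of the TF; k: targets that fall in the pathway.
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical case: 5000 genes, a 2-gene pathway, 20 targets, 1 hit.
p_single_hit = hypergeom_sf(1, 5000, 2, 20)  # approximately 0.008
```

      Whether such a single-gene hit remains below 0.05 after adjustment depends on how many pathways are tested together, which is why the adjusted value rather than the raw one is reported.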

      Minor points

      (1) Lines 167-168: I may not understand correctly, but you might want to say "downward-pointing edges" instead of "upward-pointing edges".

      Thank you for the correction. We revised ‘upward-pointing edges’ to ‘downward-pointing edges’. Please see Line 166.

      (2) Line 174: "physical interactions" should be amended to "direct interactions".

      Thank you for the correction. We revised ‘physical interactions’ to ‘direct interactions’. Please see Line 177.

      (3) Line 224: Could you please explain why bacterial growth in plant tissues is considered an example of "multi-stability"?

      Thank you for your suggestion. We are sorry for the incorrect statement. We intended ‘plant intercellular spaces’ as an example of a ‘multi-stability’ scenario. We revised the sentence to ‘These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as plant intercellular spaces’. Please see Lines 224-226.

      (4) Line 254-257: Here, the definition of "tether binding" is introduced, but it is not very clear to me. In my understanding, tethered binding is an indirect binding of a TF to a target gene through protein-protein interaction with other TF that directly binds to the promoter of the target gene.

      Thank you for your suggestion, and we agree with you. We referred to the paper published in 2012 (Wang et al., 2012) and revised the statement of ‘tether binding’ to ‘This finding suggested that these TFs indirectly regulated target genes through protein-protein interaction with other TFs that directly binds to the promoters of target genes, a phenomenon defined as tethered binding’. Please see Lines 259-262.

      (5) Lines 341-343: Figure 3b shows qRT-PCR of hopAE1, not hrpR.

      Thank you for your correction. We revised ‘hrpR’ to ‘hopAE1’. Please see Line 349.

      (6) Lines 500 and Figure 6b: It is hard to see edges from module 12 to others. So, it would be better to provide numeric information (number of TFs and target genes) in the text.

      Thank you for your suggestion. Module 12 includes 22 TFs and 318 target genes. We added the statement of numeric information about Module 12 in the revised manuscript. Please see Lines 536-537.

      (7) Line 519: Figure S4b is not the EMSA data for PSPPH3798. Should it be Figure S4e?

      Thank you for your correction. We revised it to ‘Figure S4e’. Please see Line 545.

      (8) Line 522: Figure S6b is not relevant to the statement here.

      Thank you for your correction. We deleted ‘Figure S6b’ here. Please see Line 547.

      (9) Line 593: prokaryotic transcriptional regulatory networks -> eukaryotic transcriptional regulatory networks?

      Thank you for your correction. We revised ‘prokaryotic transcriptional regulatory networks’ to ‘eukaryotic transcriptional regulatory networks’. Please see Line 618.

      (10) Figure S3 requires images of higher resolution. Especially, values for the color codes are not readable or very hard to see.

      Thank you for your suggestion. To make the images clearer, we enlarged them, changed the color codes, and divided the figure into three figures. Please see the revised Figures S3-S5 and the corresponding figure legends at Lines 1191-1206.

      Reviewer #3 (Recommendations For The Authors):

      (1) Some conclusions are not backed by quantitative or statistical analyses, and they are sometimes overinterpreted.

      L221: "Taken together, the simplest and most effective submodule M1 and the coregulatory submodule M13 played crucial roles in the transcriptional regulation of TFs in P. syringae."

      The authors did not provide any evidence supporting the functional importance of any of these submodules. M13 is most enriched within the locked loop, but its size is much smaller than simple loops. What evidence supports the importance of this particular submodule?

      Thank you for your suggestion. In the eukaryote Saccharomyces cerevisiae and the prokaryote Escherichia coli, which have the best-characterized transcriptional regulatory networks, the feed-forward loop (called M13 in this article) appears many times in the networks and performs different biological functions. M1 appeared more frequently than the other modules by an order of magnitude. We revised the sentence to ‘Taken together, the most numerous but simplest submodule M1 played a crucial role in the transcriptional regulation of TFs in P. syringae.’ Please see Lines 222-224.
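
      To make the motif terminology concrete, the sketch below counts feed-forward loops (X regulates Y, Y regulates Z, and X also regulates Z) and auto-regulators (self-loops) in a directed TF-to-TF edge list. It is a minimal illustration under stated assumptions, not the authors' pipeline: the edge list, the TF names and the function names are hypothetical, and the mapping of the feed-forward loop to the label M13 follows the response above.

```python
def feed_forward_loops(edges):
    """Return sorted (x, y, z) triples with edges x->y, y->z and x->z."""
    targets = {}
    for src, dst in edges:
        if src != dst:  # self-loops are handled separately
            targets.setdefault(src, set()).add(dst)

    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            for z in targets.get(y, ()):
                # z must be a third node that x also regulates directly
                if z != x and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


def auto_regulators(edges):
    """Return the sorted TFs that bind their own promoter (self-loops)."""
    return sorted({src for src, dst in edges if src == dst})


# Hypothetical toy network: one feed-forward loop and one auto-regulator.
toy_edges = [
    ("tfA", "tfB"),
    ("tfB", "tfC"),
    ("tfA", "tfC"),
    ("tfC", "tfC"),
    ("tfD", "tfA"),
]
ffl = feed_forward_loops(toy_edges)
auto = auto_regulators(toy_edges)
```

      Judging whether a motif is over-represented would additionally require comparing these counts with those from randomized networks of matching degree, which the sketch does not attempt.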

      L223: "...we found 92 auto-regulators...These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as in plant intercellular spaces where bacteria grow (Figure 1d)(Alon, 2007). These regulators are regarded as bistable switches that further influence the expression of downstream genes."

      Are these claims supported by any evidence?

      Thank you for your suggestion. We referred to the following articles:

      (1) Alon. Nature Reviews Genetics. 2007 (Alon, 2007).

      A transcription factor that represses the transcription of its own gene is considered a negative autoregulator. Negative autoregulators account for half of the repressors in E. coli and occur in many eukaryotes. By suppressing its own expression, the repressor controls the concentration of its product, which accelerates the return of the cell to steady state.

      (2) Becskei et al. Nature. 2000; Rosenfeld et al. Journal of Molecular Biology. 2002 (Becskei & Serrano, 2000; Rosenfeld, Elowitz, & Alon, 2002).

      Fluorescence assays confirmed that the negative autoregulatory module (negative autoregulator TetR) took less time to reach the log phase than the unregulated group and reduced cell-to-cell fluctuations in the steady-state level of the transcription factor. Some negative autoregulators were shown in these studies, such as LexA, CysB and SrlA-D.

      In our research, we also identified many autoregulators including CysB and LexA2 (annotated as LexA repressor). We revised the sentence to ‘In addition, we found 92 auto-regulators in our hierarchy network. These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as plant intercellular spaces (Figure 1d) (Alon, 2007). For example, LexA and CysB as negative autoregulators were indicated to reduce cell-to-cell fluctuations in the steady-state level of the transcription factor (Becskei & Serrano, 2000; Rosenfeld et al. 2002).’. Please see Lines 224-229.

      L265: "This finding indicated that the bottom-level TFs, which were more easily regulated, tended to cooperate with downstream genes and other intra-level TFs."

      Could the authors provide more explanation to reach this conclusion from the data? Analyzing the number of highly co-accessing TFs does not sufficiently support this conclusion. The clustering of TFs (C1-C4) is incomplete, and each TF level (Top/Middle/Bottom) contains different numbers of TFs. Since the authors calculated all-by-all co-association scores for these 125 TFs, they can group these scores into 6 possible combinations (TT, TM, TB, MM, MB, BB) and show the distribution of co-association scores.

      Thank you for your suggestion. Our claim is that bottom-level TFs tend to regulate target genes in cooperation with other TFs. To further support this claim, we used the results in Figure 1B to calculate the proportion of interactions involving bottom-level TFs among all TF-pair interactions and among direct interactions; these proportions were 43% and 49%, respectively. In contrast, the interactions of top-level and middle-level TFs accounted for only 20% and 28%, respectively. We revised the statement to ‘Based on the analysis in Figure 1B, we found that the proportions of bottom-level TF interaction in all the TF pair interactions and direct interaction were 43% and 49%. These results indicated that the bottom-level TFs tended to regulate downstream genes through cooperating with other level TFs.’ in the revised manuscript. Please see Lines 269-272.

      As not every TF showed co-association with other TFs, we collected only the 125 TFs with co-association scores. Regarding the number of TFs at each level, we divided the TFs into three levels according to hierarchy height: heights from -1 to -0.3 represent the bottom level, heights from -0.3 to 0.3 represent the middle level, and heights from 0.3 to 1 represent the top level. Each level was equally divided by height scores. We suggest that the different numbers of TFs at the three levels reflect a characteristic of transcriptional regulation in P. syringae.
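
      The level assignment described above can be written as a small threshold function. The following Python sketch is illustrative only: the response does not state which level the boundary values -0.3 and 0.3 belong to, so assigning both to the middle level is an assumption, and the function name and example heights are hypothetical.

```python
def hierarchy_level(height):
    """Map a hierarchy height in [-1, 1] to 'bottom', 'middle' or 'top'.

    Assumption: the boundary values -0.3 and 0.3 are counted as middle.
    """
    if not -1.0 <= height <= 1.0:
        raise ValueError("hierarchy height must lie between -1 and 1")
    if height < -0.3:
        return "bottom"
    if height <= 0.3:
        return "middle"
    return "top"


# Hypothetical heights for three TFs.
example_levels = {
    name: hierarchy_level(h)
    for name, h in {"tfA": 0.8, "tfB": 0.0, "tfC": -0.6}.items()
}
```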

      Thank you for your suggestion. As the co-association patterns were determined by co-association scores of the same TFs, we first grouped the co-association scores into 3 possible TF pairs (TT, MM, and BB, in Figures S3a, S4a and S5a). Our results indicated that higher co-association scores tended to occur among bottom-level TFs. We revised the statement in the revised manuscript. Please see Lines 244-252.
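
      The grouping of pairwise co-association scores by hierarchy level, including the cross-level combinations (TM, TB, MB) raised by the reviewer, could be sketched as follows. This is a hedged illustration and not the authors' code: the level codes `T`, `M` and `B`, the TF names and the scores are hypothetical placeholders.

```python
from collections import defaultdict

# Fixed ordering so that a (bottom, top) pair is always labelled "TB".
LEVEL_ORDER = {"T": 0, "M": 1, "B": 2}


def pair_label(level_a, level_b):
    """Return one of TT, TM, TB, MM, MB, BB for two level codes."""
    return "".join(sorted((level_a, level_b), key=LEVEL_ORDER.__getitem__))


def group_scores(levels, scores):
    """Group co-association scores by the level pair of the two TFs.

    levels: dict mapping TF name -> 'T', 'M' or 'B'
    scores: dict mapping (tf_a, tf_b) -> co-association score
    """
    grouped = defaultdict(list)
    for (tf_a, tf_b), score in scores.items():
        grouped[pair_label(levels[tf_a], levels[tf_b])].append(score)
    return dict(grouped)


# Hypothetical input: four TFs and four pairwise scores.
toy_levels = {"tfA": "T", "tfB": "B", "tfC": "M", "tfD": "B"}
toy_scores = {
    ("tfB", "tfA"): 0.05,
    ("tfB", "tfD"): 0.70,
    ("tfC", "tfB"): 0.40,
    ("tfA", "tfC"): 0.10,
}
toy_grouped = group_scores(toy_levels, toy_scores)
```

      The per-group lists can then be summarized or plotted as distributions, so that unequal numbers of TFs per level are visible alongside the scores.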

      (2) Some figures and analyses are not well explained, and I was not able to understand them.

      Figure 1b: The terms "direct," "indirect," and "cooperativity" require further clarification as their definitions in the text (L169-183) are unclear to me. This ambiguity hampers the evaluation of the authors' discussion regarding TF-TF interactions (L561-584), an important theme of this study. The figure includes concepts discussed in later sections (e.g., cooperativity), making it difficult to understand. A diagram explaining these concepts would be highly helpful for readers to understand.

      Sorry for the missing information. We defined ‘indirect interaction’ as ‘co-association’ and ‘cooperativity’ as ‘if the common target of two TFs is from a TF’. We added the definitions of ‘indirect interaction’ and ‘cooperativity’ to the revised manuscript and legend. Please see Lines 174-176 and 1085-1087.

      L253: "Notably, we found that TFs at the top level, without cooperating TFs, exhibited a large number of binding peaks (Figure S3a)."

      I could not understand this sentence. Did the authors mean that top-level TFs with a large number of peaks showed a low level of co-association? If so, does this data suggest that these TFs do not tend to cooperate with other TFs? I was confused by the discussion in L253-L261.

      Thank you for your comment; we agree with you. The low co-association scores and large peak numbers of these top-level TFs indicate that top-level TFs tend to regulate target genes on their own rather than co-regulate them with other top-level TFs.

      In L253-256, PSPPH4700 is an example of a top-level TF with low co-association scores and a large number of peaks that tends to regulate target genes on its own rather than co-regulate them with other top-level TFs. We revised the sentence to ‘For example, the top-level TF PSPPH4700 yielded over 1,700 peaks, but cooperated with only 24 top-level TFs with low co-association scores about 0.05 (Supplementary Table 2b).’.

      In L257-261, we analyzed the high co-association scores of 125 TFs across the three levels and further determined their co-association patterns. To identify co-association tendencies among these 125 TFs, we classified the co-association patterns into 4 clusters. Bottom-level TFs tend to co-regulate target genes with other TFs. We revised the sentence. Please see Lines 262-264, 265-266 and 269-272.

      L287: "The analysis of the peak locations of MexT demonstrated that MexT showed closer co-association relationships with top-level TFs (Figure 2b)."

      I could not reach this conclusion by seeing Figure 2b. Additional explanation and/or data visualization would be appreciated.

      Thank you for your suggestion. In C1, C2 and C4, many bottom-level TFs showed a co-association pattern with other TFs, especially with other bottom-level TFs (shown in C4). To explore the regulatory pattern in C3, the peak locations in target genes of MexT were compared with those of the TFs in C3. Seven top-level TFs (PSPPH1435, PSPPH1758, PSPPH2193, PSPPH2454, PSPPH4638, PSPPH4998 and PSPPH3411), three middle-level TFs (PSPPH1100, PSPPH5132 and PSPPH5144) and four bottom-level TFs (PSPPH0700, PSPPH2300, PSPPH2444 and PSPPH2580) were compared with MexT. MexT showed higher co-association scores (above 60) with more top-level TFs. Therefore, we concluded that MexT had closer co-association relationships with top-level TFs. We added this statement to the revised manuscript. Please see Lines 291-296.

      Figure 6cd: What kind of enrichment analysis did the authors perform? Was any statistical test used? The figure only shows the number of genes, and sometimes the number is only 1 for a functional category. Can it be considered as significant enrichment?

      Thank you for your comment. We used a hypergeometric test in this analysis. Although only one gene was enriched in some pathways, the adjusted p-value was less than 0.05. We added the details to the revised manuscript. Please see Lines 533-534.
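To illustrate how a single gene can still yield a small p-value in a hypergeometric enrichment test, here is a minimal sketch. It is not the authors' code: the function name and all gene counts are hypothetical, and the multiple-testing adjustment mentioned in the response is not shown.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, pathway_genes, target_genes, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    total_genes   : number of annotated genes in the genome (background)
    pathway_genes : number of background genes annotated to the pathway
    target_genes  : number of genes in the tested target set
    overlap       : number of target genes that fall in the pathway
    """
    max_overlap = min(pathway_genes, target_genes)
    denominator = comb(total_genes, target_genes)
    numerator = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, target_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return numerator / denominator


# Hypothetical numbers: a 5000-gene background, a pathway with only 2 genes,
# and a target set of 50 genes, 1 of which belongs to the pathway.
p_value = hypergeom_enrichment_p(5000, 2, 50, 1)
print(f"unadjusted p = {p_value:.4f}")
```

With these assumed counts the unadjusted p-value is about 0.02, because the pathway itself is very small. Whether such a hit survives correction depends on how many pathways are tested.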

      L169: "The hierarchical network revealed a downward information flow, suggesting the prioritization of collaboration between different hierarchy levels." Can the authors please explain the logic behind this statement in more detail?

      Thank you for your comment. The hierarchical network contained different numbers of TFs at the three levels (54 top-level TFs, 62 middle-level TFs and 147 bottom-level TFs), which indicated that more than half of the TFs (the bottom-level TFs) tend to be regulated by other TFs and then bind directly to target genes. This finding showed a downward direction of transcriptional regulation in P. syringae. We revised the statement in the revised manuscript. Please see Lines 167-170.

      (3) The Method section lacks depth, especially on data analyses.

      How did the authors define promoter regions of each gene? How were operons treated in their analyses? Was P. syringae 1448A used for their main ChIP-seq?

      Thank you for your comment. We defined the intergenic region before each TF sequence as the promoter region.

      As pHM1 plasmid carries its own constitutive promoter (lacZ promoter), we amplified the TF-coding sequence and cloned into the site following the promoter. The TF protein expression was activated by the promoter of plasmid.

      P. syringae 1448A was used for our main ChIP-seq. We added the details in the revised manuscript. Please see Lines 705 and 727-730.

      Figure S3: I am not sure how the GO analyses were done. For example, in the case of the top-level TF PSPPH4700, did the authors perform GO analysis on genes that are co-bound by PSPPH4700 and any other top-level TFs?

      Thank you for your comment and we agree with you. We performed GO analysis on genes that were co-bound by TF pairs in the same level. We added the details in the revised manuscript. Please see Lines 248-252.

      The analysis presented in Figure 6a needs more explanation of the methodology employed by the authors.

      Thank you for your comment. We added more details for the analysis in Figure 6a. Please see Lines 514-522.

      It is strongly recommended that the authors share their analysis codes so that others can reproduce the analyses.

      Thank you for your comment. We shared our analysis codes on the website (https://github.com/dengxinb2315/PS-PATRnet-code) in the Data Availability. Please see Lines 800-801.

      (4) Other:

      Figure 3: I suggest putting additional panel labels to facilitate the interpretation of the figure.

      Thank you for your suggestion. We added detailed labels to the revised Figures 3 and 4.

      I spotted several potential errors:

      L106: 170 TFs?

      Thank you for your comment, and we are sorry for the missing details. For the hierarchical network, we integrated the DNA-binding data of 170 TFs in this study and 100 TFs in our previous SELEX research. We added the details in the revised manuscript. Please see Lines 104, 147 and 159-160.

      L592: P. syringae not E. coli?

      Thank you for your comment. Here we discussed the hierarchical characteristics in E. coli. We revised the statement in the revised manuscript. Please see Line 618.

      L593: eukaryotic not prokaryotic?

      Thank you for your correction. Here we discussed the feedforward loops in our study. We revised the statement in the revised manuscript. Please see Line 618.

      References

      Alon, U. (2007). Network motifs: theory and experimental approaches. Nature Reviews Genetics, 8(6), 450-461.

      Becskei, A., & Serrano, L. (2000). Engineering stability in gene networks by autoregulation. Nature, 405(6786), 590-593.

      Rosenfeld, N., Elowitz, M. B., & Alon, U. (2002). Negative autoregulation speeds the response times of transcription networks. Journal of molecular biology, 323(5), 785-793.

      Wang, J., Zhuang, J., Iyer, S., Lin, X., Whitfield, T. W., Greven, M. C., . . . Cheng, Y. (2012). Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Genome research, 22(9), 1798-1812.

    1. Author response:

      Reviewer #1:

      This review evaluates the SCellBOW framework, which applies phenotype algebra to obtain vectors from cancer subclusters or user-defined subclusters.

      Strengths:

      SCellBOW employs an innovative application of NLP-inspired techniques to analyze scRNA-seq data, facilitating the identification and visualization of phenotypically divergent cell subpopulations. The framework demonstrates robustness in accurately representing various cell types across multiple datasets, highlighting its versatility and utility in different biological contexts. By simulating the impact of specific malignant subpopulations on disease prognosis, SCellBOW provides valuable insights into the relative risk and aggressiveness of cancer subpopulations, which is crucial for personalized therapeutic strategies. The identification of a previously unknown and aggressive AR−/NElow subpopulation in metastatic prostate cancer underscores the potential of SCellBOW in uncovering clinically significant findings.

      Major concerns:

      The reliance on bulk RNA-seq data as a reference raises concerns about potentially misleading results due to the presence of RNA expression from immune cells in the TME. It is unclear if SCellBOW adequately addresses this issue, which could affect the accuracy of the cancer subcluster vectors.

      To address the concern about potentially misleading results due to the TME when using bulk RNA-seq data as a reference:

      a. We account for systematic biases between the single-cell and bulk transcriptomics readouts by creating pseudo-bulk profiles for single-cell clusters, enabling more accurate comparisons.

      b. We encode expressions into word vectors and co-embed them together. By doing this, we mitigate any possibility of systematic differences in the embedding.

      c. It is imperative that we subject both single-cell and bulk data through the same treatments because otherwise, it will be difficult to perform algebraic operations on them.

      d. We rely on tumor bulk transcriptomics data from TCGA due to its high sample size and patient meta-data such as information pertaining to patient survival.

      We will discuss this in the revised manuscript.

      The method of extracting vectors in phenotype algebra appears to be a straightforward subtraction operation. This simplicity might limit its efficiency in excluding associations with phenotypes from specific subpopulations, potentially leading to inaccurate interpretations of the data.

      Vector algebra operations are not done in the gene expression space (i.e., on gene expression vectors associated with tumor samples). Rather, we process the single-cell and bulk expression profiles through multiple steps (pseudo-bulk vector generation for single-cell clusters, mapping gene expression values to word frequencies as better understood by the Doc2vec neural networks, etc.) to ensure that their embeddings are consistent and capture intricate phenotypic information. We have demonstrated this through rigorous validation of the clusters yielded on various types of healthy and diseased samples. Furthermore, we have demonstrated the consistency of the vector algebra operations on known cancer subtypes in breast cancer, glioblastoma, and prostate cancer.

      We will discuss this in the revised manuscript.
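The steps named in this response (pseudo-bulk generation, mapping expression to word frequencies, and algebra on embeddings) can be sketched roughly as follows. This is an illustrative sketch only, not SCellBOW's implementation: the function names, gene names, the token-repetition encoding, and the two-dimensional embedding vectors are all assumptions, and the embedding step itself (Doc2vec in the response) is not reproduced.

```python
def pseudo_bulk(cells):
    """Sum per-gene counts over all cells of one cluster (hypothetical helper)."""
    n_genes = len(cells[0])
    return [sum(cell[g] for cell in cells) for g in range(n_genes)]


def to_document(gene_names, profile):
    """Encode an expression profile as a bag of words.

    Assumption: each gene name is repeated in proportion to its (integer)
    expression value, so that expression becomes word frequency.
    """
    document = []
    for gene, value in zip(gene_names, profile):
        document.extend([gene] * int(round(value)))
    return document


def phenotype_subtract(bulk_embedding, cluster_embedding):
    """Element-wise subtraction performed on embeddings, not on raw expression."""
    return [b - c for b, c in zip(bulk_embedding, cluster_embedding)]


# Two hypothetical cells from one cluster, three hypothetical genes.
genes = ["GENE_A", "GENE_B", "GENE_C"]
cluster_cells = [[2, 0, 1], [1, 0, 3]]

profile = pseudo_bulk(cluster_cells)    # [3, 0, 4]
document = to_document(genes, profile)  # 3 x GENE_A, 4 x GENE_C

# Placeholder embeddings standing in for the output of a trained model.
bulk_vec = [1.0, 0.5]
cluster_vec = [0.25, 0.5]
residual = phenotype_subtract(bulk_vec, cluster_vec)
print(profile, len(document), residual)
```

The point of the sketch is only that bulk and single-cell profiles pass through the same encoding before any subtraction, which is the consistency argument made in the response.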

      The review would benefit from additional validation studies to assess the effectiveness of SCellBOW in distinguishing between cancerous and non-cancerous signals, particularly in heterogeneous tumor environments.

      In our study, we are primarily interested in signals from malignant cells. However, we may consider scRNA-seq data with stromal cells and test whether SCellBOW can identify the influence of different stromal cell types on cancer aggressiveness.

      Further clarification on how SCellBOW handles mixed-cell populations within bulk RNA-seq data would strengthen the evaluation of its applicability and reliability in diverse research settings.

      We will elaborate on our discussion in the Result as well as Discussion sections.

      Reviewer #2:

      The authors developed a novel tool, SCellBOW, to perform cell clustering and infer survival risks on individual cancer cell clusters from the single-cell RNA seq dataset. The key ideas/techniques used in the tool include transfer learning, bag of words (BOW), and phenotype algebra which is similar to word algebra from natural language processing (NLP). Comparisons with existing methods demonstrated that SCellBOW provides superior clustering results and exhibits robust performance across a wide range of datasets. Importantly, a distinguishing feature of SCellBOW compared to other tools is its ability to assign risk scores to specific cancer cell clusters. Using SCellBOW, the authors identified a new group of prostate cancer cells characterized by a highly aggressive and dedifferentiated phenotype.

      Strengths:

      The application of natural language processing (NLP) to single-cell RNA sequencing (scRNA-seq) datasets is both smart and insightful. Encoding gene expression levels as word frequencies is a creative way to apply text analysis techniques to biological data. When combined with transfer learning, this approach enhances our ability to describe the heterogeneity of different cells, offering a novel method for understanding the biological behavior of individual cells and surpassing the capabilities of existing cell clustering methods. Moreover, the ability of the package to predict risk, particularly within cancer datasets, significantly expands the potential applications.

      Major concerns:

      Given the promising nature of this tool, it would be beneficial for the authors to test the risk-stratification functionality on other types of tumors with high heterogeneity, such as liver and pancreatic cancers, which currently lack clinically relevant and well-recognized stratification methods. Additionally, it would be worthwhile to investigate how the tool could be applied to spatial transcriptomics by analyzing cell embeddings from different layers within these tissues.

      (1) Our selection of glioblastoma and breast cancer for this study was primarily driven by the focus on extensively studied and well-defined cancer types. To demonstrate the effectiveness of our model, we tested it on advanced prostate cancer, which currently lacks clinically relevant and well-recognized stratification methods. This application to metastatic prostate cancer serves as a proof of concept, illustrating our model's potential to provide valuable insights into cancer types where established stratification approaches are limited or absent. However, as suggested by the Reviewer, we will try to incorporate results for liver cancer, subject to the availability of adequate data for model building.

      (2) Regarding the application of our tool to spatial transcriptomics, we have already analyzed data from Digital Spatial Profiling (DSP). The article is already quite complex and involved, and we are afraid the inclusion of spatial transcriptomics may amount to a significant extension of the method. To this end, although we will discuss the future possibilities, we will skip the method validity check on spatial transcriptomics data.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      (1) The reviewers asked to clarify the BTH assay: The fused T25 and T18 domains must be in the cytoplasm to complement successfully. The authors stated that the N terminus of Aeg1 traverses the membrane once, which means that the T25-Aeg1 will have T25 in the periplasm. However, T18C vector fusion with other division proteins will have T18C of ZipA in the periplasm (ZipA's N terminus is on the periplasmic side of the inner membrane) while that of FtsN in the cytoplasm (FtsN's N terminus is in the cytoplasm). As such, it isn't easy to understand why T25-Aeg1 showed positive results for both ZipA and FtsN. Note that FtsL, FtsB, and FtsI all have the same topology as FtsN but showed negative results. It is possible that these fusion proteins do not fold correctly, and hence, the results cannot be interpreted directly. The authors did not address this concern but only cited that BTH is a commonly used assay for protein-protein interactions.

      In response to the editor's comments and the concerns raised by the reviewer, we have performed two sets of Aeg1-T25 fusion experiments to determine whether the Aeg1 topology impacts protein interactions measured by bacterial two-hybrid (BTH) assays. In the first set of experiments, we fused the T25 domain to the N-terminus of Aeg1 and still observed strong binding of Aeg1 to ZipA and FtsN, respectively. Similar results were obtained from the second set of experiments in which the T25 domain was fused to the C-terminus of Aeg1.

      These results indicate that the precise topology of Aeg1 does not significantly impact its ability to engage these binding partners. Aeg1 is predicted to harbor a single transmembrane domain; however, the precise location of this transmembrane segment differs in predictions made by different algorithms. The SMART Web site (1) predicted the transmembrane region to be located at the N-terminus of Aeg1 (7-29 aa). In contrast, Phobius, based on HMM (2, 3), suggested that the transmembrane segment is situated more centrally within the Aeg1 protein (134-151 aa), and further proposed that the N-terminus may function as a signal peptide. This latter prediction also provides a potential explanation for the larger-than-expected molecular weight of the Aeg1 truncation mutant observed in the Western blot shown in Fig 1C. The removal of the putative signal peptide may have altered the protein structure, affecting its electrophoretic mobility. As a result, we are more inclined to favor the topology model for Aeg1 predicted by Phobius.

      (2) It is still difficult to identify the midcell localization patterns of Aeg1 and other division proteins from microscopy images (Fig. 4C and Fig. 5A). In Fig 4C, only ZipA and Aeg1 formed clear, regular band-like colocalization patterns. Others formed irregular co-localized puncta along the cell length, different from the expected midcell localization patterns. Cells also appeared to be much longer than WT cells, suggesting cell division defects. The most likely reason for these aberrant localization patterns and filamentous cells is that GFP/mCherry-fusions of these division proteins are not functional and become dominant negative, interfering with proper cell division. The authors need to test the functionality of these fusion proteins before they can be used for imaging. (The authors also mislabeled Hoechst and the division protein GFP panels labels in this figure.)

      Thank you for raising this important point. To examine the functionality of the fluorescence protein fusion constructs, we have painstakingly performed conditional knockout of the genes of interest (zipA, ftsB, ftsL, and ftsN) in A. baumannii strains inducibly expressing the corresponding fusion protein. We found that these fluorescence protein fusions were able to fully rescue the growth of the mutant lacking the corresponding fts gene (Figure 4-figure supplement 1). Concurrently, we have also successfully knocked out the aeg1 gene under conditions of in trans expression of an mCherry-Aeg1 fusion protein, which was able to effectively rescue the growth defects of the Δaeg1 mutant (Figure 4-figure supplement 1). We then introduced the functional fluorescence protein fusions into wild-type cells and observed the co-localization of Aeg1 with the relevant Fts proteins. The results showed that Aeg1 indeed co-localized with ZipA, FtsB, FtsL, and FtsN (Fig. 4E, red arrows), but occasional non-co-localization was also observed (Fig. 4E, white arrows).

      We have utilized the functional fluorescence protein fusion constructs to analyze the localization of relevant Aeg1-interacting proteins in the Δaeg1 strain upon Aeg1 depletion. Our results showed that the depletion of Aeg1 indeed impacted the midcell localization of several Aeg1-interacting Fts proteins.

      References

      (1) Letunic I, Khedkar S, Bork P. SMART: recent updates, new developments and status in 2020. Nucleic Acids Research. 2021;49:D458-D60. doi: 10.1093/nar/gkaa937

      (2) Käll L, Krogh A, Sonnhammer EL. A combined transmembrane topology and signal peptide prediction method. Journal of Molecular Biology. 2004;338:1027-36. doi: 10.1016/j.jmb.2004.03.016

      (3) Käll L, Krogh A, Sonnhammer EL. Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server. Nucleic Acids Research. 2007;35:W429-32. doi: 10.1093/nar/gkm256

    1. Author response:

      Reviewer #1:

      (1) Clarification of Novelty and Contribution:

      - We agree that the novelty of our study could have been better articulated. We will more clearly define the specific gaps in knowledge our study addresses. We will also clarify the novelty in our analysis of the correlational structure of gene expression under stress.

      (2) Methodological Details:

      - We acknowledge the need for additional detail in the methods section regarding the estimation of G, E, and GxE effects. We will expand this section to include the software used (R), the specific ANOVA models applied, and how significance was determined. We will also clarify which effects were treated as fixed or random effects.

      (3) Terminology Consistency:

      - We will thoroughly review the manuscript to ensure consistent use of selection-related terminology. This will involve distinguishing between quantitative genetics terms (e.g., directional, stabilizing) and molecular evolution terms (e.g., positive, purifying) to avoid any confusion.

      (4) Bias in Conditional Neutrality and Antagonistic Pleiotropy:

      - We appreciate the suggestion to clarify the discussion around conditional neutrality (CN) and antagonistic pleiotropy (AP). We will elaborate on the inherent bias in detecting CN and AP and specify how we adjusted P-value thresholds. Additionally, we will try to refine the discussion to address the concerns raised about the comparison of gene expression and local adaptation, incorporating relevant literature.

      Reviewer #2:

      (1) Sensitivity of Fitness Proxy:

      - We acknowledge the limitations of using the total filled grain number as a fitness proxy. We will include a discussion on the potential sensitivity of our results to this choice.

      (2) Cis- and trans-eQTL Contributions:

      - We appreciate the suggestion to report effect sizes in addition to the frequency of cis- and trans-eQTLs. We will incorporate this into our analysis and discuss whether our conclusions regarding the predominance of trans-eQTLs in expression variation hold when considering effect sizes.

      (3) Cis-Trans Relationship Analysis:

      - Since we wanted to estimate compensating vs. reinforcing effects, the analysis essentially entails identifying genes whose cis- and trans-effects have opposing directions. To obtain the total trans-effect, we took the mean effect of the trans-eQTLs. This mean was used only to identify the compensating/reinforcing genes; although taking the mean diminishes the effect of small trans-eQTLs, this metric was not used in downstream analyses.
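A minimal sketch of the classification described above, assuming signed effect sizes are available per gene. The function name, the labels, and the example effect sizes are hypothetical and are not taken from the manuscript.

```python
from statistics import mean


def classify_cis_trans(cis_effect, trans_effects):
    """Label a gene by comparing the sign of its cis-effect with the sign of
    the mean trans-effect (the 'total trans-effect' in the response)."""
    if not trans_effects:
        return "no_trans"
    total_trans = mean(trans_effects)
    if cis_effect == 0 or total_trans == 0:
        return "undetermined"
    opposing = (cis_effect > 0) != (total_trans > 0)
    return "compensating" if opposing else "reinforcing"


# Hypothetical genes: (cis-effect, list of trans-eQTL effects).
examples = {
    "gene_1": (0.8, [-0.5, -0.1]),  # mean trans is negative: opposes cis
    "gene_2": (0.8, [0.4, -0.1]),   # small opposing trans-eQTL is averaged out
}
for gene, (cis, trans) in examples.items():
    print(gene, classify_cis_trans(cis, trans))
```

The second example shows the limitation acknowledged in the response: a small trans-eQTL acting against the cis-effect does not change the label when a larger trans-eQTL dominates the mean.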

      Reviewer #3:

      (1) Integration of Analyses:

      - We acknowledge that the manuscript currently presents some analyses in a somewhat independent manner. Although it would be ideal to have a central hypothesis/message, our study is meant to broadly outline the various responses and fitness effects of salinity stress on rice. Throughout the manuscript, we have also included comparisons between our findings and those of our previous studies on drought stress to highlight any consistent themes or novel insights.

      (2) X-by-Environment Effects:

      - We do plan to consider fitting models that explicitly incorporate X-by-environment interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.

      (3) Gene Grouping Methods:

      - We will try to discuss the pros and cons of using PCA versus gene co-expression network analysis (e.g., WGCNA) for grouping genes. We will also explore applying WGCNA in our analysis to see if it offers any additional insights or clarity.

      Reviewer #4:

      (1) Selection Analysis Across Environments:

      - We do plan to consider fitting models that explicitly incorporate G×E interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.

      (2) Gene Expression Trade-Offs Terminology:

      - We will revise our terminology to better reflect the nature of the trade-offs observed, and explore variation in covariance between phenotype and fitness between the two environments.

      (3) Biological Processes and Decoherence:

      - We will explore applying WGCNA in our analysis to see if it offers any additional insights or clarity.

      (4) Underutilization of Organismal Traits:

      - We did perform GWAS for all the traits measured in both environments, but did not find any significant hits. We will examine whether selection on co-expression modules is correlated with the traits, and may incorporate this in our manuscript depending on the results.

      (5) Detailed eQTL Analysis:

      - We will expand our eQTL analysis to include detailed statistics at the molecular trait level, including the phenotypic variance explained by cis- and trans-eQTLs and how these vary by environment.

      Although we focus on salinity conditions in our cis-trans compensation analysis in the main results, we have provided comparisons for all our eQTL analyses between normal and salinity conditions in the main text (with figures as supplementary).

      We are confident that these revisions will significantly strengthen our manuscript and address the concerns raised by the reviewers. We look forward to submitting a revised version that better communicates the significance and robustness of our findings.

      Thank you again for your valuable feedback.

    1. Author response:

      eLife assessment

      The authors present a potentially useful approach of broad interest arguing that anterior cingulate cortex (ACC) tracks option values in decisions involving delayed rewards. The authors introduce the idea of a resource-based cognitive effort signal in ACC ensembles and link ACC theta oscillations to a resistance-based strategy. The evidence supporting these new ideas is incomplete and would benefit from additional detail and more rigorous analyses and computational methods.

      The reviewers have provided several excellent suggestions and pointed out important shortcomings of our manuscript. We are grateful for their efforts. To address these concerns, we are planning a major revision to the manuscript. In the revision, our goal is to address each of the reviewers’ concerns and codify the evidence for resistance- and resource-based control signals in the rat anterior cingulate cortex. We have provided a nonexhaustive list of the points we plan to address in the point-by-point responses below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Young (2.5 mo [adolescent]) rats were tasked to either press one lever for immediate reward or another for delayed reward.

      Please note that at the time of testing and training the rats were > 4 months old.

      The task had a complex structure in which (1) the number of pellets provided on the immediate reward lever changed as a function of the decisions made, (2) rats were prevented from pressing the same lever three times in a row. Importantly, this task is very different from most intertemporal choice tasks which adjust delay (to the delayed lever), whereas this task held the delay constant and adjusted the number of 20 mg sucrose pellets provided on the immediate value lever.

      Several studies parametrically vary the immediate lever (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183). While most versions of the task will yield qualitatively similar estimates of discounting, the adjusting-amount version is preferred as it provides the most consistent estimates (PMID: 22445576). More specifically, this version of the task avoids the contrast effects that result from changing the delay during the session (PMID: 23963529, 24780379, 19730365, 35661751), which complicate value estimates.
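For readers unfamiliar with how a discounting estimate relates to an adjusting-amount task, the following sketch compares two commonly used discounting functions at the 4 s and 8 s delays. Under either model, the immediate amount at which the animal is indifferent would equal the discounted value of the delayed reward. This is a sketch under stated assumptions, not the analysis used in the manuscript: the delayed amount (6 pellets) and the discount rate k are hypothetical.

```python
import math


def hyperbolic_value(amount, delay_s, k):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_s)


def exponential_value(amount, delay_s, k):
    """Exponential discounting: V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay_s)


# Hypothetical values: 6 pellets on the delayed lever, k = 0.25 per second.
delayed_amount = 6.0
k = 0.25
for delay in (4.0, 8.0):
    v_hyp = hyperbolic_value(delayed_amount, delay, k)
    v_exp = exponential_value(delayed_amount, delay, k)
    print(f"delay {delay:.0f} s: hyperbolic {v_hyp:.2f}, exponential {v_exp:.2f}")
```

With the same k, the exponential form loses value faster between 4 s and 8 s than the hyperbolic form, so the choice of discounting function changes how different the two delays appear to the model.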

      Analyses are based on separating sessions into groups, but group membership includes arbitrary requirements and many sessions have been dropped from the analyses.

      We are in discussions about how to address this valid concern. This includes simply splitting the data by delay. This approach, however, has conceptual problems that we will also lay out in a full revision.  

      Computational modeling is based on an overly simple reinforcement learning model, as evidenced by fit parameters pegging to the extremes.

      We apologize for not doing a better job of explaining the advantages of this type of model for the present purposes. Nevertheless, given the clear lack of enthusiasm, we felt it was better to simply update the model as suggested by the Reviewers. The straightforward modifications have now been implemented and we are currently in discussion about how the new results fit into the larger narrative.

      The neural analysis is overly complex and does not contain the necessary statistics to assess the validity of their claims.

      We plan to streamline the existing analysis and add statistics, where required, to address this concern.

      Strengths:

      The task is interesting.

      Thank you for the positive comment.

      Weaknesses:

      Behavior:

      The basic behavioral results from this task are not presented. For example, "each recording session consisted of 40 choice trials or 45 minutes". What was the distribution of choices over sessions? Did that change between rats? Did that change between delays? Were there any sequence effects? (I recommend looking at reaction times.) Were there any effects of pressing a lever twice vs after a forced trial?

      Animals tend to make more immediate choices as the delay is extended, which is reflected in Figure 1. We will add more detail and additional statistics to address these questions. 

      This task has a very complicated sequential structure that I think I would be hard pressed to follow if I were performing this task.

      Human tasks implement a similar task structure (PMID: 26779747). Please note the response above that outlines the benefits of using this task.

      Before diving into the complex analyses assuming reinforcement learning paradigms or cognitive control, I would have liked to have understood the basic behaviors the rats were taking. For example, what was the typical rate of lever pressing? If the rats are pressing 40 times in 45 minutes, does waiting 8s make a large difference?

      This is a good suggestion. However, rats do not like waiting for rewards, even with small delays. Going from the 4 sec to the 8 sec delay results in more immediate choices, indicating that the rats will forgo waiting for a smaller reinforcer at the 8 sec delay as compared to the 4 sec delay.

      For that matter, the reaction time from lever appearance to lever pressing would be very interesting (and important). Are they making a choice as soon as the levers appear? Are they leaning towards the delay side, but then give in and choose the immediate lever? What are the reaction time hazard distributions?

      These are excellent suggestions. We are looking into implementing them.

      It is not clear that the animals on this task were actually using cognitive control strategies on this task. One cannot assume from the task that cognitive control is key. The authors only consider a very limited number of potential behaviors (an overly simple RL model). On this task, there are a lot of potential behavioral strategies: "win-stay/lose-shift", "perseveration", "alternation", even "random choices" should be considered.

      The strategies the Reviewer mentioned are descriptors of the actual choices the rats made. For example, perseveration means the rat is choosing one of the levers at an excessively high rate whereas alternation means it is choosing the two levers more or less equally, independent of payouts. But the question we are interested in is why? We are arguing that the type of cognitive control determines the choice behavior but cognitive control is an internal variable that guides behavior, rather than simply a descriptor of the behavior. For example, the animal opts to perseverate on the delayed lever because the cognitive control required to track ival is too high. We then searched the neural data for signatures of the two types of cognitive control.

      The delay lever was assigned to the "non-preferred side". How did side bias affect the decisions made?

      The side bias clearly does not impact performance as the animals prefer the delay lever at shorter delays, which works against this bias.

      The analyses based on "group" are unjustified. The authors compare the proportion of delayed to immediate lever press choices on the non-forced trials and then did k-means clustering on this distribution. But the distribution itself was not shown, so it is unclear whether the "groups" were actually different. They used k=3, but do not describe how this arbitrary number was chosen. (Is 3 the optimal number of clusters to describe this distribution?) Moreover, they removed three group 1 sessions with an 8s delay and two group 2 sessions with a 4s delay, making all the group 1 sessions 4s delay sessions and all group 2 sessions 8s delay sessions. They then ignore group 3 completely. These analyses seem arbitrary and unnecessarily complex. I think they need to analyze the data by delay. (How do rats handle 4s delay sessions? How do rats handle 6s delay sessions? How do rats handle 8s delay sessions?). If they decide to analyze the data by strategy, then they should identify specific strategies, model those strategies, and do model comparison to identify the best explanatory strategy. Importantly, the groups were session-based, not rat based, suggesting that rats used different strategies based on the delay to the delayed lever.

      These are excellent points and, as stated above, we are in the process of revisiting the group assignments in an effort to allay these criticisms.

      The reinforcement learning model used was overly simple. In particular, the RL model assumes that the subjects understand the task structure, but we know that even humans have trouble following complex task structures. Moreover, we know that rodent decision-making depends on much more complex strategies (model-based decisions, multi-state decisions, rate-based decisions, etc). There are lots of other ways to encode these decision variables, such as softmax with an inverse temperature rather than epsilon-greedy. The RL model was stated as a given and not justified. As one critical example, the RL model fit to the data assumed a constant exponential discounting function, but it is well-established that all animals, including rodents, use hyperbolic discounting in intertemporal choice tasks. Presumably this changes dramatically the effect of 4s and 8s. As evidence that the RL model is incomplete, the parameters found for the two groups were extreme. (Alpha=1 implies no history and only reacting to the most recent event. Epsilon=0.4 in an epsilon-greedy algorithm is a 40% chance of responding randomly.)

      Please see our response above. We agree that the approach was not justified, but we do not agree that it is invalid. Simply stated, a softmax approach gives the best fit to the choice behavior, whereas our epsilon-greedy approach attempted to reproduce the choice behavior using a naïve agent that progressively learns the values of the two levers on a choice-by-choice basis. The epsilon-greedy approach can therefore tell us whether it is possible to reproduce the choice behavior by an agent that is only tracking ival. Given our discovery of an ival-tracking signal in ACC, we believed that this was a critical point (although admittedly we did a poor job of communicating it). However, we also appreciate that important insights can be gained by fitting a model to the data as suggested. In fact, we had implemented this approach initially and are currently reconsidering what it can tell us in light of the Reviewers' comments.
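
      The difference between the two decision rules discussed above can be illustrated with a minimal sketch. This is not the model fitted in the manuscript: the function names and the inverse temperature beta are hypothetical, and epsilon = 0.4 is taken from the Reviewer's comment. Under epsilon-greedy the choice probabilities ignore the size of the value difference, whereas under softmax they change in a graded way.

```python
import math

def epsilon_greedy_probs(q_values, epsilon):
    """Choice probabilities under an epsilon-greedy rule.

    With probability epsilon the agent picks uniformly at random;
    otherwise it picks the highest-valued option (ties go to the first).
    """
    n = len(q_values)
    best = max(range(n), key=lambda i: q_values[i])
    return [(1 - epsilon) * (i == best) + epsilon / n for i in range(n)]

def softmax_probs(q_values, beta):
    """Choice probabilities under softmax with inverse temperature beta."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical lever values: [immediate, delayed]
small_gap = [1.0, 1.1]
large_gap = [1.0, 5.0]

# Epsilon-greedy gives the same probabilities for both value gaps.
print(epsilon_greedy_probs(small_gap, 0.4), epsilon_greedy_probs(large_gap, 0.4))
# Softmax favours the delayed lever more strongly when the gap is larger.
print(softmax_probs(small_gap, 2.0), softmax_probs(large_gap, 2.0))
```

      Because epsilon-greedy depends only on which option currently has the higher value, it cannot capture a graded relationship between value difference and choice, which is the property the softmax rule adds.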

      The authors do add a "dbias" (which is a preference for the delayed lever) term to the RL model, but note that it has to be maximal in the 4s condition to reproduce group 2 behavior, which means they are not doing reinforcement learning anymore, just choosing the delayed lever.

      Exactly. The model results indicated that a naïve agent that relied only on ival tracking would not behave in this manner. It was therefore unlikely that the G1 animals were using an ival-tracking strategy, even though a strong ival-tracking signal was present in ACC.

      Neurophysiology:

      The neurophysiology figures are unclear and mostly uninterpretable; they do not show variability, statistics or conclusive results.

      While the reviewer is justified in criticizing the clarity of the figures, the statement that “they do not show variability, statistics or conclusive results” is demonstrably false. Each of the figures presented in the manuscript, except Figure 3, is accompanied by statistics and measures of variability. This comment is hyperbolic and not justified.

      Figure 3 was an attempt to show raw neural data to better demonstrate how robust the ivalue tracking signal is.

      As with the behavior, I would have liked to have seen more traditional neurophysiological analyses first. What do the cells respond to? How do the manifolds change aligned to the lever presses? Are those different between lever presses?

      We provide several figures describing how neurons change firing rates in response to varying reward. We are unsure what the reviewer means by “traditional analysis”, especially since this is immediately followed by a request for an assessment of neural manifolds. That said, we are developing ways to make the analysis more intuitive and, hopefully, more “traditional”.

      Are there changes in cellular information (both at the individual and ensemble level) over time in the session?

      We provide several analyses of how firing rate changes over trials in relation to ival over time in the session.

      How do cellular responses differ during that delay while both levers are out, but the rats are not choosing the immediate lever?

      It is not clear to us how this analysis addresses our hypothesis regarding control signals in ACC.

      Figure 3, for example, claims that some of the principal components tracked the number of pellets on the immediate lever ("ival"), but they are just two curves. No statistics, controls, or justification for this is shown. BTW, on Figure 3, what is the event at 200s?

      Figure 3 will be folded into one of the other figures that contains the summary statistics.

      I'm confused. On Figure 4, the number of trials seems to go up to 50, but in the methods, they say that rats received 40 trials or 45 minutes of experience.

      This analysis included forced trials. The maximum per session is 40 choice trials. We will clarify this in the revised manuscript.

      At the end of page 14, the authors state that the strength of the correlation did not differ by group and that this was "predicted" by the RL modeling, but this statement is nonsensical, given that the RL modeling did not fit the data well, depended on extreme values. Moreover, this claim is dependent on "not statistically detectable", which is, of course, not interpretable as "not different".

      We plan to revisit this analysis and the RL model.

      There is an interesting result on page 16 that the increases in theta power were observed before a delayed lever press but not an immediate lever press, and then that the theta power declined after an immediate lever press.

      Thank you for the positive comment.

      These data are separated by session group (again group 1 is a subset of the 4s sessions, group 2 is a subset of the 8s sessions, and group 3 is ignored). I would much rather see these data analyzed by delay itself or by some sort of strategy fit across delays.

      Provisional analysis indicates that the results hold up over delays, rather than the groupings in the paper. We will address this in a full revision of the manuscript.

      That being said, I don't see how this description shows up in Figure 6. What does Figure 6 look like if you just separate the sessions by delay?

      We are unclear what the reviewer means by “this description”.

      Discussion:

      Finally, it is unclear to what extent this task actually gets at the questions originally laid out in the goals and returned to in the discussion. The idea of cognitive effort is interesting, but there is no data presented that this task is cognitive at all. The idea of a resourced cognitive effort and a resistance cognitive effort is interesting, but presumably the way one overcomes resistance is through resource-limited components, so it is unclear that these two cognitive effort strategies are different.

      We view the strong evidence for ival tracking presented herein as a potentially critical component of resource-based cognitive effort. We hope to clarify how this task engaged cognitive effort.

      The authors state that "ival-tracking" (neurons and ensembles that presumably track the number of pellets being delivered on the immediate lever - a fancy name for "expectations") "taps into a resourced-based form of cognitive effort", but no evidence is actually provided that keeping track of the expectation of reward on the immediate lever depends on attention or mnemonic resources. They also state that a "dLP-biased strategy" (waiting out the delay) is a "resistance-based form of cognitive effort" but no evidence is made that going to the delayed side takes effort.

      There is a well-developed literature that rats and mice do not like waiting for delayed reinforcers. We contend that enduring something you don’t like takes effort.

      The authors talk about theta synchrony, but never actually measure theta synchrony, particularly across structures such as amygdala or ventral hippocampus. The authors try to connect this to "the unpleasantness of the delay", but provide no measures of pleasantness or unpleasantness. They have no evidence that waiting out an 8s delay is unpleasant.

      We will better clarify how our measure of Theta power relates to synchrony. There is a well-developed literature that rats and mice do not like waiting for delayed reinforcers.

      The authors hypothesize that the "ival-tracking signal" (the expectation of number of pellets on the immediate lever) "could simply reflect the emotional or autonomic response". Aside from the fact that no evidence for this is provided, if this were to be true, then, in what sense would any of these signals be related to cognitive control?

      This is proposed as an alternative explanation to the ivalue signal. We provide this as a possibility, never a conclusion. We will clarify this in the revised text. 

      Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the neuronal signals that underlie resistance vs resource-based models of cognitive effort. The authors use a delayed discounting task and computational models to explore these ideas. The authors find that the ACC strongly tracks value and time, which is consistent with prior work. Novel contributions include quantification of a resource-based control signal among ACC ensembles, and linking ACC theta oscillations to a resistance-based strategy.

      Strengths:

      The experiments and analyses are well done and have the potential to generate an elegant explanatory framework for ACC neuronal activity. The inclusion of local-field potential / spike-field analyses is particularly important because these can be measured in humans.

      Thank you for the endorsement of our work.

      Weaknesses:

      I had questions that might help me understand the task and details of neuronal analyses.

      (1) The abstract, discussion, and introduction set up an opposition between resource and resistance based forms of cognitive effort. It's clear that the authors find evidence for each (ACC ensembles = resource, theta=resistance?) but I'm not sure where the data fall on this dichotomy.

      a. An overall very simple schematic early in the paper (prior to the MCML model? or even the behavior) may help illustrate the main point.

      b. In the intro, results, and discussion, it may help to relate each point to this dichotomy.

      c. What would resource-based signals look like? What would resistance based signals look like? Is the main point that resistance-based strategies dominate when delays are short, but resource-based strategies dominate when delays are long?

      d. I wonder if these strategies can be illustrated? Could these two measures (dLP vs ival tracking) be plotted on separate axes or extremes, and behavior, neuronal data, LFP, and spectral relationships be shown on these axes? I think Figure 2 is working towards this. Could these be shown for each delay length? This way, as the evidence from behavior, model, single neurons, ensembles, and theta is presented, it can be related to this framework, and the reader can organize the findings.

      These are excellent suggestions, and we intend to implement each of them, where possible.

      (2) The task is not clear to me.

      a. I wonder if a task schematic and a flow chart of training would help readers.

      Yes, excellent idea, we intend to include this.

      b. This task appears to be relatively new. Has it been used before in rats (Oberlin and Grahame is a mouse study)? Some history / context might help orient readers.

      Indeed, this task has been used in rats in several prior studies. Please see the following references (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183).

      c. How many total sessions were completed with ascending delays? Was there criteria for surgeries? How many total recording sessions per animal (of the 54?)

      Please note that the delay does not change within a session. There were no criteria for surgery. In addition, we will update Table 1 to make the number of recording sessions clearer.

      d. How many trials completed per session (40 trials OR 45 minutes)? Where are there errors? These details are important for interpreting Figure 1.

      Every animal in this data set completed 40 trials. We will update the task description to clarify this issue. There are no errors in this task; rather, the task is designed to measure the tendency to make an impulsive choice (smaller reward now). We will clarify this issue in the revision of the manuscript.

      (3) Figure 1 is unclear to me.

      a. Delayed vs immediate lever presses are being plotted - but I am not sure what is red, and what is blue. I might suggest plotting each animal.

      We will clarify the colors and look into schemes to graph the data set.

      b. How many animals and sessions go into each data point?

      This information is in Table 1, but this could be clearer, and we will update the manuscript.

      c. Table 1 (which might be better referenced in the paper) refers to rats by session. Is it true that some rats (2 and 8) were not analyzed for the bulk of the paper? Some rats appear to switch strategies, and some stay in one strategy. How many neurons come from each rat?

      Table 1 is accurate, and we can add the number of neurons from each animal.

      d. Task basics - RT, choice, accuracy, video stills - might help readers understand what is going into these plots

      e. Does the animal move differently (i.e., RTs) in G1 vs. G2?

      We will look into ways to incorporate this information.

      (4) I wasn't sure how clustered G1 vs. G2 vs G3 are. To make this argument, the raw data (or some axis of it) might help.

      a. This is particularly important because G3 appears to be a mix of G1 and G2, although upon inspection, I'm not sure how different they really are

      b. Was there some objective clustering criteria that defined the clusters?

      c. Why discuss G3 at all? Can these sessions be removed from analysis?

      These are all excellent suggestions and points. We plan to revisit the strategy to assign sessions to groups, which we hope will address each of these points.

      (5) The same applies to neuronal analyses in Fig 3 and 4

      a. What does a single neuron peri-event raster look like? I would include several of these.

      b. What does PC1, 2 and 3 look like for G1, G2, and G3?

      c. Certain PCs are selected, but I'm not sure how they were selected - was there a criteria used? How was the correlation between PCA and ival selected? What about PCs that don't correlate with ival?

      d. If the authors are using PCA, then scree plots and PETHs might be useful, as well as comparisons to PCs from time-shuffled / randomized data.

      We will make several updates to enhance the clarity of the neural data analysis, including adding more representative examples. We feel the need to balance the inclusion of representative examples with group statistics, given the concerns raised by R1.

      (6) I had questions about the spectral analysis

      a. Theta has many definitions - why did the authors use 6-12 Hz? Does it come from the hippocampal literature, and is this the best definition of theta? What about other bands: delta (1-4 Hz), theta (4-7 Hz), and beta (13-30 Hz)? These bands are of particular importance because they have been associated with errors, dopamine, and are abnormal in schizophrenia and Parkinson's disease.

      This designation comes mainly from the hippocampal and ACC literature in rodents. In addition, this range best captured the peak in the power spectrum in our data. Note that we focus our analysis on theta given the literature regarding theta in the ACC as a correlate of cognitive control (references in manuscript). We did interrogate other bands as a sanity check and the results were mostly limited to theta. Given the scope of our manuscript and the concerns raised regarding complexity, we are concerned that adding frequency analyses beyond theta would obscure the take-home message. However, we think this is worthy, and we will determine if this can be done in a brief, clear, and effective manner.

      b. Power spectra and time-frequency analyses may justify the authors focus. I would show these (y-axis - frequency, x-axis - time, z-axis, power).

      This is an excellent suggestion that we look forward to incorporating. 

      (7) PC3 as an autocorrelation doesn't seem to be the right way to infer theta entrainment or spike-field relationships, as PCA can be vulnerable to phantom oscillations, and coherence can be transient. It is also difficult to compare to traditional measures of phase-locking. Why not simply use spike-field coherence? This is particularly important with reference to the human literature, which the authors invoke.

      Excellent suggestion. We will look into the phantom oscillation issue. Note that PCA provided a way to classify neurons that exhibited peaks in the autocorrelation at theta frequencies. While spike-field coherence is a rigorous tool, it addresses a slightly different question (LFP entrainment). Notwithstanding, we plan to address this issue.  

      Reviewer #3 (Public Review):

      Summary:

      The study investigated decision making in rats choosing between small immediate rewards and larger delayed rewards, in a task design where the size of the immediate rewards decreased when this option was chosen and increased when it was not chosen. The authors conceptualise this task as involving two different types of cognitive effort; 'resistance-based' effort putatively needed to resist the smaller immediate reward, and 'resource-based' effort needed to track the changing value of the immediate reward option. They argue based on analyses of the behaviour, and computational modelling, that rats use different strategies in different sessions, with one strategy in which they consistently choose the delayed reward option irrespective of the current immediate reward size, and another strategy in which they preferentially choose the immediate reward option when the immediate reward size is large, and the delayed reward option when the immediate reward size is small. The authors recorded neural activity in anterior cingulate cortex (ACC) and argue that ACC neurons track the value of the immediate reward option irrespective of the strategy the rats are using. They further argue that the strategy the rats are using modulates their estimated value of the immediate reward option, and that oscillatory activity in the 6-12Hz theta band occurs when subjects use the 'resistance-based' strategy of choosing the delayed option irrespective of the current value of the immediate reward option. If solid, these findings will be of interest to researchers working on cognitive control and ACC's involvement in decision making. However, there are some issues with the experiment design, reporting, modelling and analysis which currently preclude high confidence in the validity of the conclusions.

      Strengths:

      The behavioural task used is interesting and the recording methods should enable the collection of good quality single unit and LFP electrophysiology data. The authors recorded from a sizable sample of subjects for this type of study. The approach of splitting the data into sessions where subjects used different strategies and then examining the neural correlates of each is in principle interesting, though I have some reservations about the strength of evidence for the existence of multiple strategies.

      Thank you for the positive comments.

      Weaknesses:

      The dataset is very unbalanced in terms of both the number of sessions contributed by each subject, and their distribution across the different putative behavioural strategies (see table 1), with some subjects contributing 9 or 10 sessions and others only one session, and it is not clear from the text why this is the case. Further, only 3 subjects contribute any sessions to one of the behavioural strategies, while 7 contribute data to the other such that apparent differences in brain activity between the two strategies could in fact reflect differences between subjects, which could arise due to e.g. differences in electrode placement. To firm up the conclusion that neural activity is different in sessions where different strategies are thought to be employed, it would be important to account for potential cross-subject variation in the data. The current statistical methods don't do this as they all assume fixed effects (e.g. using trials or neurons as the experimental unit and ignoring which subject the neuron/trial came from).

      This is an important issue that we plan to address with additional analysis in the manuscript update.
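
      One simple way to treat the subject, rather than the trial or neuron, as the experimental unit is to average within each rat before comparing strategies. The sketch below assumes a hypothetical list of (subject, strategy, value) records and only illustrates the idea; it is not the analysis that will appear in the revision, and a mixed-effects model with subject as a random effect would be a more complete alternative.

```python
from collections import defaultdict
from statistics import mean

def per_subject_means(observations):
    """Average a measurement within each (subject, strategy) pair.

    observations: iterable of (subject_id, strategy, value) tuples, where
    each tuple is one neuron or trial. The returned dictionary has one
    entry per subject and strategy, so that subjects contributing many
    sessions do not dominate a later between-strategy comparison.
    """
    by_key = defaultdict(list)
    for subject, strategy, value in observations:
        by_key[(subject, strategy)].append(value)
    return {key: mean(values) for key, values in by_key.items()}

# Hypothetical records: rat "A" contributes many G1 observations, rat "B" only one.
records = [
    ("A", "G1", 0.9), ("A", "G1", 1.1), ("A", "G1", 1.0), ("A", "G1", 1.0),
    ("B", "G1", 0.2),
    ("B", "G2", 0.4), ("C", "G2", 0.6),
]
print(per_subject_means(records))
```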

      It is not obvious that the differences in behaviour between the sessions characterised as using the 'G1' and 'G2' strategies actually imply the use of different strategies, because the behavioural task was different in these sessions, with a shorter wait (4 seconds vs 8 seconds) for the delayed reward in the G1 strategy sessions where the subjects consistently preferred the delayed reward irrespective of the current immediate reward size. Therefore the differences in behaviour could be driven by difference in the task (i.e. external world) rather than a difference in strategy (internal to the subject). It seems plausible that the higher value of the delayed reward option when the delay is shorter could account for the high probability of choosing this option irrespective of the current value of the immediate reward option, without appealing to the subjects using a different strategy.

      Further, even if the differences in behaviour do reflect different behavioural strategies, it is not obvious that these correspond to allocation of different types of cognitive effort. For example, subjects' failure to modify their choice probabilities to track the changing value of the immediate reward option might be due simply to valuing the delayed reward option higher, rather than not allocating cognitive effort to tracking immediate option value (indeed this is suggested by the neural data). Conversely, if the rats assign higher value to the delayed reward option in the G1 sessions, it is not obvious that choosing it requires overcoming 'resistance' through cognitive effort.

      The RL modelling used to characterise the subject's behavioural strategies made some unusual and arguably implausible assumptions:

      i) The goal of the agent was to maximise the value of the immediate reward option (ival), rather than the standard assumption in RL modelling that the goal is to maximise long-run (e.g. temporally discounted) reward. It is not obvious why the rats should be expected to care about maximising the value of only one of their two choice options rather than distributing their choices to try and maximise long run reward.

      ii) The modelling assumed that the subject's choice could occur in 7 different states, defined by the history of their recent choices, such that every successive choice was made in a different state from the previous choice. This is a highly unusual assumption (most modelling of 2AFC tasks assumes all choices occur in the same state), as it causes learning on one trial not to generalise to the next trial, but only to other future trials where the recent choice history is the same.

      iii) The value update was non-standard in that rather than using the trial outcome (i.e. the amount of reward obtained) as the update target, it instead appeared to use some function of the value of the immediate reward option (it was not clear to me from the methods exactly how the fival and fqmax terms in the equation are calculated) irrespective of whether the immediate reward option was actually chosen.

      iv) The model used an e-greedy decision rule such that the probability of choosing the highest value option did not depend on the magnitude of the value difference between the two options. Typically, behavioural modelling uses a softmax decision rule to capture a graded relationship between choice probability and value difference.

      v) Unlike typical RL modelling where the learned value differences drive changes in subjects' choice preferences from trial to trial, to capture sensitivity to the value of the immediately rewarding option the authors had to add in a bias term which depended directly on this value (not mediated by any trial-to-trial learning). It is not clear how the rat is supposed to know the current trial ival if not by learning over previous trials, nor what purpose the learning component of the model serves if not to track the value of the immediate reward option.

      Given the task design, a more standard modelling approach would be to treat each choice as occurring in the same state, with the (temporally discounted) value of the outcomes obtained on each trial updating the value of the chosen option, and choice probabilities driven in a graded way (e.g. softmax) by the estimated value difference between the options. It would be useful to explicitly perform model comparison (e.g. using cross-validated log-likelihood with fitted parameters) of the authors proposed model against more standard modelling approaches to test whether their assumptions are justified. It would also be useful to use logistic regression to evaluate how the history of choices and outcomes on recent trials affects the current trial choice, and compare these granular aspects of the choice data with simulated data from the model.

      Each of the issues outlined above with the RL model is very important. We are currently re-evaluating the RL modeling approach in light of these comments. Please see comments to R1 regarding the model as they are relevant for this as well.
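
      Two of the modelling points raised by the Reviewers lend themselves to a short illustration: how exponential and hyperbolic discounting differ at the 4 s and 8 s delays, and why a learning rate of alpha = 1 discards all history. The sketch is generic rather than a reimplementation of our model; the delayed reward of 6 pellets and the discount rate k = 0.25 are hypothetical values chosen only for the example.

```python
import math

def hyperbolic_value(amount, delay, k):
    """Hyperbolically discounted value: amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)

def exponential_value(amount, delay, k):
    """Exponentially discounted value: amount * exp(-k * delay)."""
    return amount * math.exp(-k * delay)

def delta_rule_update(q, outcome, alpha):
    """Move the value estimate q toward the outcome by a fraction alpha."""
    return q + alpha * (outcome - q)

AMOUNT = 6.0  # hypothetical delayed reward (pellets)
K = 0.25      # hypothetical discount rate (per second)

for delay in (4.0, 8.0):
    print(delay, hyperbolic_value(AMOUNT, delay, K), exponential_value(AMOUNT, delay, K))

# With alpha = 1 the estimate equals the most recent outcome,
# regardless of what was learned before.
print(delta_rule_update(q=0.3, outcome=5.0, alpha=1.0))
print(delta_rule_update(q=0.3, outcome=5.0, alpha=0.2))
```

      Under these assumed values, doubling the delay reduces the hyperbolic value less steeply than the exponential value, which is why the choice of discounting function matters when comparing 4 s and 8 s sessions.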

      There were also some issues with the analyses of neural data which preclude strong confidence in their conclusions:

      Figure 4I makes the striking claim that ACC neurons track the value of the immediately rewarding option equally accurately in sessions where two putative behavioural strategies were used, despite the behaviour being insensitive to this variable in the G1 strategy sessions. The analysis quantifies the strength of correlation between a component of the activity extracted using a decoding analysis and the value of the immediate reward option. However, as far as I could see this analysis was not done in a cross-validated manner (i.e. evaluating the correlation strength on test data that was not used for either training the MCML model or selecting which component to use for the correlation). As such, the chance level correlation will certainly be greater than 0, and it is not clear whether the observed correlations are greater than expected by chance.

      This is an astute observation and we plan to address this concern. We agree that cross-validation may provide an appropriate tool here.
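
      The Reviewer's concern can be reproduced with a small simulation. In the sketch below all data are pure noise, so no component truly tracks the target variable; the trial count, number of candidate components, and function names are hypothetical. Selecting the best-correlated component and evaluating it on the same trials yields a correlation well above zero, whereas evaluating the selected component on held-out trials returns it to the level expected by chance.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def selection_bias_demo(n_trials=40, n_components=50, n_repeats=100, seed=0):
    """Mean |r| of the selected component on training vs held-out trials."""
    rng = random.Random(seed)
    in_sample, held_out = [], []
    for _ in range(n_repeats):
        # Noise target and noise components, twice n_trials long so that
        # the first half can be used for selection and the second for testing.
        target = [rng.gauss(0, 1) for _ in range(2 * n_trials)]
        comps = [[rng.gauss(0, 1) for _ in range(2 * n_trials)]
                 for _ in range(n_components)]
        train_r = [pearson(c[:n_trials], target[:n_trials]) for c in comps]
        best = max(range(n_components), key=lambda i: abs(train_r[i]))
        test_r = pearson(comps[best][n_trials:], target[n_trials:])
        in_sample.append(abs(train_r[best]))
        held_out.append(abs(test_r))
    return sum(in_sample) / n_repeats, sum(held_out) / n_repeats

mean_in_sample, mean_held_out = selection_bias_demo()
print(mean_in_sample, mean_held_out)
```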

      An additional caveat with the claim that ACC is tracking the value of the immediate reward option is that this value likely correlates with other behavioural variables, notably the current choice and recent choice history, that may be encoded in ACC. Encoding analyses (e.g. using linear regression to predict neural activity from behavioural variables) could allow quantification of the variance in ACC activity uniquely explained by option values after controlling for possible influence of other variables such as choice history (e.g. using a coefficient of partial determination).

      This is also an excellent point that we plan to address in the manuscript update.

      Figure 5 argues that there are systematic differences in how ACC neurons represent the value of the immediate option (ival) in the G1 and G2 strategy sessions. This is interesting if true, but it appears possible that the effect is an artefact of the different distribution of option values between the two session types. Specifically, due to the way that ival is updated based on the subjects' choices, in G1 sessions where the subjects are mostly choosing the delayed option, ival will on average be higher than in G2 sessions where they are choosing the immediate option more often. The relative number of high, medium and low ival trials in the G1 and G2 sessions will therefore be different, which could drive systematic differences in the regression fit in the absence of real differences in the activity-value relationship. I have created an ipython notebook illustrating this, available at: https://notebooksharing.space/view/a3c4504aebe7ad3f075aafaabaf93102f2a28f8c189ab9176d4807cf1565f4e3. To verify that this is not driving the effect it would be important to balance the number of trials at each ival level across sessions (e.g. by subsampling trials) before running the regression.

      Excellent point and thank you for the notebook. We explored a similar approach previously but did not pursue it to completion. We will re-investigate this issue.
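
      The balancing step suggested by the Reviewer could be sketched as follows. This is a generic illustration and not the code in the Reviewer's notebook: the function name, the (ival, firing rate) trial format, and the example values are all hypothetical. For each ival level present in both session types, the same number of trials is drawn at random from each, so that the two groups enter the regression with identical ival distributions.

```python
import random
from collections import Counter, defaultdict

def balance_by_level(trials_a, trials_b, level_of, seed=0):
    """Subsample two trial lists so each level is equally represented.

    level_of maps a trial to its level (here, the ival on that trial).
    Levels that occur in only one of the two lists are dropped.
    """
    rng = random.Random(seed)

    def group(trials):
        groups = defaultdict(list)
        for trial in trials:
            groups[level_of(trial)].append(trial)
        return groups

    groups_a, groups_b = group(trials_a), group(trials_b)
    out_a, out_b = [], []
    for level in sorted(set(groups_a) & set(groups_b)):
        n = min(len(groups_a[level]), len(groups_b[level]))
        out_a.extend(rng.sample(groups_a[level], n))
        out_b.extend(rng.sample(groups_b[level], n))
    return out_a, out_b

# Hypothetical trials as (ival, firing_rate); G1 is skewed toward high ival.
g1 = [(5, 1.0), (5, 1.2), (5, 0.9), (4, 0.8), (3, 0.7)]
g2 = [(5, 1.1), (4, 0.9), (4, 1.0), (3, 0.6), (3, 0.5), (2, 0.4)]

bal_g1, bal_g2 = balance_by_level(g1, g2, level_of=lambda t: t[0])
print(Counter(t[0] for t in bal_g1), Counter(t[0] for t in bal_g2))
```

      Because the subsample is random, repeating the procedure with different seeds and averaging the resulting regression estimates would reduce dependence on any single draw.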

    1. Author response:

      Reviewer #3 (Public Review):

      (1) Conditions on growth and interaction rates for feasibility and stability. The authors approach this using a mean field approximation, and it is important to note that there is no particular temperature dependence assumed here: as far as it goes, this analysis is completely general for arbitrary Lotka-Volterra interactions.

      However, the starting point for the authors' mean field analysis is the statement that "it is not possible to meaningfully link the structure of species interactions to the exact closed-form analytical solution for [equilibria] 𝑥^*_𝑖 in the Lotka-Volterra model."

      I may be misunderstanding, but I don't agree with this statement. The time-independent equilibrium solution with all species present (i.e. at non-zero abundances) takes the form

      x^* = A^{-1}r

      where A is the inverse of the community matrix, and r is the vector of growth rates. The exceptions to this would be when one or more species has abundance = 0, or A is not invertible. I don't think the authors intended to tackle either of these cases, but maybe I am misunderstanding that.

      So to me, the difficulty here is not in writing a closed-form solution for the equilibrium x^*, it is in writing the inverse matrix as a nice function of the entries of the matrix A itself, which is where the authors want to get to. In this light, it looks to me like the condition for feasibility (i.e. that all x^* are positive, which is necessary for an ecologically-interpretable solution) is maybe an approximation for the inverse of A---perhaps valid when off-diagonal entries are small. A weakness then for me was in understanding the range of validity of this approximation, and whether it still holds when off-diagonal entries of A (i.e. inter-specific interactions) are arbitrarily large. I could not tell from the simulation runs whether this full range of off-diagonal values was tested.

      We thank the reviewer for pointing this out and we agree that the language used is imprecise. The GLV model is solvable using the matrix inversion method but as they note, this does not give an interpretable expression in terms of the system parameters. This is important as we aim to build understanding of how these parameters (which in turn depend on temperature) affect the richness in communities. We have made this clearer in lines 372-379.

      With regard to the validity of the approximation, we have significantly increased the detail of the method in the manuscript, including the assumptions it makes (lines 384-393). In general, the method assumes that any individual interaction has a weak effect on abundance. This will fail when the variation in interactions becomes too strong but should be robust to changes in the average interaction strength across the community.
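
      The matrix-inversion solution discussed above can be computed directly for a small community, which makes the feasibility condition concrete. The sketch assumes the sign convention dx_i/dt = x_i (r_i - sum_j a_ij x_j), so that the interior equilibrium satisfies A x^* = r; the two-species matrices and growth rates are hypothetical and chosen only to show one feasible and one infeasible case.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [[float(v) for v in row] + [float(bi)] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("interaction matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x

def is_feasible(x_star):
    """An equilibrium is feasible when every abundance is positive."""
    return all(x > 0 for x in x_star)

r = [1.0, 1.0]                      # hypothetical growth rates
weak = [[1.0, 0.5], [0.5, 1.0]]     # weak inter-specific interactions
strong = [[1.0, 1.5], [0.5, 1.0]]   # one strong inter-specific interaction

x_weak = solve_linear(weak, r)
x_strong = solve_linear(strong, r)
print(x_weak, is_feasible(x_weak))
print(x_strong, is_feasible(x_strong))
```

      The solution exists in both cases, but it only has an ecological interpretation when all abundances are positive, and the numerical answer does not by itself show how feasibility depends on the interaction parameters.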

      As a secondary issue here, it would have been helpful to understand whether the authors' feasible solutions are always stable to small perturbations. In general, I would expect this to be an additional criterion needed to understand diversity, though as the authors point out there are certain broad classes of solutions where feasibility implies stability.

      As the reviewer notes, previous work using the GLV model by ? has shown that feasibility almost surely implies stability in the GLV. Thus we expect that our richness estimates derived from feasibility will closely resemble those from stability. We have amended the main text to make this argument clear on lines 321-335.

      (2) I did not follow the precise rationale for selecting the temperature dependence of growth rate and interaction rates, or how the latter could be tested with empirical data, though I do think that in principle this could be a valuable way to understand the role of temperature dependence in the Lotka-Volterra equations.

      First, as the authors note, "the temperature dependence of resource supply will undoubtedly be an important factor in microbial communities"

      Even though resources aren't explicitly modeled here, this suggests to me that at some temperatures, resource supply will be sufficiently low for some species that their growth rates will become negative. For example, if temperature dependence is such that the limiting resource for a given species becomes too low to balance its maintenance costs (and hence mortality rate), it seems that the net growth rate will be negative. The alternative would be that temperature affects resource availability, but never such that a limiting resource leads to a negative growth rate when a taxon is rare.

      On the other hand, the functional form for the distribution of growth rates (eq 3) seems to imply that growth rates are always positive. I could imagine that this is a good description of microbial populations in a setting where the resource supply rate is controlled independently of temperature, but it wasn't clear how generally this would hold.

      We thank the reviewer for their comment. The assumption of positive growth rates is indeed a feature of the Boltzmann-Arrhenius model of temperature dependence. We use the Boltzmann-Arrhenius model due to the dependence of growth on metabolic rate. As metabolic rate is ultimately determined by biochemical kinetics, its temperature dependence is well described by the Boltzmann-Arrhenius model. In addition to this reasoning, there is a wealth of empirical evidence supporting the use of the Boltzmann-Arrhenius model to describe the temperature dependence of growth rate in microbes.

      Ultimately, the temperature dependence of resource supply is not something we can directly consider in our model. As such, we have to assume that resource supply is sufficient to maintain positive growth rates in the community. Note that this assumption only requires that resource supply is sufficient to maintain positive growth rates (i.e. the maximal growth rate of species in isolation), not that resource supply is sufficient to maintain growth in the presence of intra- and interspecific competition. We have updated the manuscript in lines 156-159 to make these assumptions clearer.
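
      To illustrate why a Boltzmann-Arrhenius form implies strictly positive rates, here is a minimal sketch. The manuscript's equation is not reproduced in this response, so the parameterization below (a rate normalized to a reference temperature) is an assumed, commonly used form, and the parameter values are hypothetical.

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_arrhenius(temp_k, b0, e_act, t_ref=293.15):
    """Rate at temperature temp_k (kelvin).

    Assumed form: B(T) = b0 * exp(-E / k * (1/T - 1/T_ref)), where b0 is
    the rate at the reference temperature t_ref and e_act is the
    activation energy in eV. The exponential is positive for any finite
    argument, so the rate is positive whenever b0 > 0.
    """
    return b0 * math.exp(-e_act / BOLTZMANN_EV * (1.0 / temp_k - 1.0 / t_ref))


# Hypothetical parameters: b0 = 2.0 (arbitrary units), E = 0.65 eV.
rates = {t: boltzmann_arrhenius(t, b0=2.0, e_act=0.65)
         for t in (283.15, 293.15, 303.15)}
print(rates)
```

      Under this form a growth rate can become small at low temperature but can never change sign, which is the feature the reviewer highlights.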

      Secondly, while I understand that the growth rate in the exponential phase for a single population can be measured to high precision in the lab as a function of temperature, the assumption for the form of the interaction rates' dependence on temperature seems very hard to test using empirical data. In the section starting L193, the authors seem to fit the model parameters using growth rate dependence on temperature, but then assume that it is reasonable to "use the same thermal response for growth rates and interactions". I did not follow this, and I think a weakness here is in not providing clear evidence that the functional form assumed in Equation (4) actually holds.

      The reviewer is correct: it is very difficult to measure interaction coefficients experimentally, and to our knowledge there is little to no data available on their empirical temperature responses. As a best guess, we use the observed variation in thermal physiology parameters for growth rate as a proxy, assuming that interactions must also depend on the metabolic rates of the interacting species (see also response to comment 8).

    1. Author response:

      Reviewer #3 (Public Review):

      The paper by Rai and colleagues examines the transcriptional response of Candida glabrata, a common human fungal pathogen, during interaction with macrophages. They use RNA PolII profiling to identify not just the total transcripts but instead focus on the actively transcribing genes. By examining the profile over time, they identify particular transcripts that are enriched at each timepoint, and build a hierarchical model for how a transcription factor, Xbp1, may regulate this response. Due to technical difficulties in identifying direct targets of Xbp1 during infection, the authors then turn to the targets of Xbp1 during cellular quiescence.

      The authors have generated a large and potentially impactful dataset, examining the responses of C. glabrata during an important host-pathogen interface. However, the conclusions that the authors make are not well supported by the data. The ChIP-seq is interesting, but the authors make conclusions about the biological processes that are differentially regulated without testing them experimentally. Because Candida glabrata has a significant percent of the genome without GO term annotation, the GO term enrichment analysis is less useful than in a model organism. To support these claims, the authors should test the specific phenotypes, and validate that the transcriptional signature is observed at the protein level.

      Additionally, the authors should also include images of the infections, along with measurements of phagocytosis, to show that the time points are appropriate. At 30 minutes, are C. glabrata actually internalized or just associated? This may explain the difference in adherence genes at the early timepoint. For example, in Lines 123-132, the authors could measure the timing of ROS production by macrophages to determine when these attacks are deployed, instead of speculating based on the increased transcription of DNA damage response genes. Potentially, other factors could be influencing the expression of these proteins. At the late stage of infection, the authors should measure whether the C. glabrata cells are proliferating, or if they have escaped the macrophage, as other fungi can during infection. This may explain some of the increase in transcription of genes related to proliferation.

      An additional limitation to the interpretation of the data is that the authors should put their work in the context of the existing literature on C. albicans temporal adaptation to macrophages, including recent work from Munoz (doi: 10.1038/s41467-019-09599-8), Tucey (doi: 10.1016/j.cmet.2018.03.019), and Tierney (doi: 10.3389/fmicb.2012.00085), among others.

      When comparing the transcriptional profile between WT and xbp1 mutant, it is not clear whether the authors compared the strains under non-stress conditions. The authors should include an analysis of the wild-type to xbp1 mutants in the absence of macrophage stress, as the authors' claims of precocious transcription may be a function of overall decreased transcriptional repression, even in the absence of the macrophage stress. The different cut-offs used to call peaks in the two strain backgrounds are also somewhat concerning; it is not clear to me whether that will obscure the transcriptional signature of each of the strains. Additionally, the authors go on to show that the xbp1 mutant has a significant proliferation defect in macrophages, so potentially this could confound the PolII binding sites if the cells are dying.

      In the section on hierarchical analysis of transcription factors, at least one epistasis experiment should have been performed to validate the functional interaction between Xbp1 and a particular transcription factor. If the authors propose a specific motif, they should test this experimentally through EMSA assays to fully test that the motif is functional.

      The jump from macrophages to quiescent culture is also not well justified. If the transcriptional program is so dynamic during a timecourse of macrophage infection, it is hard to translate the findings from a quiescent culture to this host environment.

      Overall, there is a strong beginning and the focus on active transcription in the macrophage is an exciting approach. However, the conclusions need additional experimental evidence.

      We thank this reviewer for the critical analysis of our manuscript and the comments.

      We fully agree that the jump from macrophages to quiescent culture is not well justified. We have successfully performed CgXbp1 ChIP-seq during macrophage infection and have rewritten the manuscript according to the new results. With the CgXbp1 ChIP-seq data during macrophage infection added, we have removed the data related to quiescence to focus the paper on the macrophage response. Because of this, we have also removed the DNA binding motif analysis from this work and will report the findings in a separate manuscript comparing CgXbp1 binding between macrophage response and quiescence.

      As mentioned above, the RNAPII ChIP-seq time course experiment compared RNAP occupancies at different times during infection to the first infection time point. We did not calculate occupancies relative to the data in the absence of stress (e.g. before infection), because Xbp1 was expressed at a low level and induced by stresses. Hence its role under no-stress conditions is expected to be smaller than inside macrophages. In addition, up-regulation of its target genes depends on the presence of their transcriptional activators under the experimental conditions, which is going to be very different in normal growth media (RPMI or YPD; i.e. before infection) versus inside macrophages. Hence, comparing to normal growth media would not show the real CgXbp1 effects and/or the CgXbp1 effect might be different. In fact, this can be seen from the new RNAseq analysis of wildtype and Cgxbp1∆ C. glabrata cells in the presence and absence of fluconazole (which is added to the revised manuscript to study CgXbp1’s role in fluconazole resistance). The result shows that CgXbp1 (which was expressed at a low level) has a very small effect on global expression and the up-regulated genes are mainly related to transmembrane transport. More importantly, the effect of the Cgxbp1∆ mutant on TCA cycle and amino acid biosynthesis genes’ expression during macrophage infection is not observed when the mutant is grown under normal growth conditions (YPD without fluconazole). Therefore, the results show that CgXbp1 has condition-specific effects on global gene expression, which are also dependent on the transcriptional activators present in the cell.

      The result of the new RNAseq analysis of wildtype and Cgxbp1∆ C. glabrata cells in the absence of fluconazole is described in lines 329-339 as follows: “On the other hand, 135 genes were differentially expressed in the Cgxbp1∆ mutant during normal exponential growth (i.e. no fluconazole treatment) (Figure 6c) with up-regulated genes highly enriched with the “transmembrane transport” function and down-regulated genes associated with different metabolic processes (e.g. carbohydrate, glycogen and trehalose) (e.g. carbon metabolism, nucleotide metabolism, and transmembrane transport, etc.) (Supplementary Table 12). Interestingly, the TCA cycle and amino acid biosynthesis genes, whose expressions were accelerated in the Cgxbp1∆ mutant during macrophage infection (Figure 3C, 3D), were not affected by the loss of CgXbp1 function under normal growth conditions (i.e. in YPD media without fluconazole) (Supplementary Figure 11, Supplementary Table 11), suggesting that the overall (direct and indirect) effects of CgXbp1 are condition-specific.”

      Regarding the comment about RNAPII binding being affected by dying cells, our observation of reduced proliferation does not mean that the cells were dying: we did observe an increase in cell numbers over time (i.e. the cells were proliferating), but the rate of proliferation was slower in the Cgxbp1∆ mutant compared to wildtype. Presumably, the reduced proliferation and/or growth within macrophages is due to poorer adaptation in and compromised response to macrophages.

      We have also discussed our findings in the context of the suggested (and other) literature in various parts of the Discussion.

      Reviewer #4 (Public Review):

      Macrophages are the first line of defense against invading pathogens. C. glabrata must interact with these cells as do all pathogens seeking to establish an infection. Here, a ChIP-seq approach is used to measure RNA polymerase II levels across Cg genes in a macrophage infection assay. Differential gene expression is analyzed with increasing time of infection. These differentially expressed genes are compared at the promoter level to identify potential transcription factors that may be involved in their regulation. A factor called CgXbp1 on the basis of its similarity to the S. cerevisiae Xbp1 protein is characterized. ChIP-seq is done on CgXbp1 using in vitro grown cells and a potential binding site is identified. Evidence is provided that CgXbp1 affects virulence in a Galleria system and that this factor might impact azole resistance.

      As the authors point out, candidiasis associated with C. glabrata has dramatically increased in the recent past. Understanding the unique aspects of this Candida species would be a great value in trying to unravel the basis of the increasing fungal disease caused by C. glabrata. The use of ChIP-seq analysis to assess the time-dependent association of RNA polymerase II with Cg genes is a nice approach. Identification of CgXbp1 as a potential participant in the control of this gene expression program is also interesting. Unfortunately, this work suffers by comparison to a significant amount of previous effort that renders the progress detailed here incremental at best.

      I agree that their ChIP-seq time course of RNA polymerase II distribution across the Cg genome is both elegant and an improvement on previous microarray experiments. However, these microarray experiments were carried out 14 years ago and while the current work is certainly at higher resolution, little more can be gleaned from the current work. The authors argue that standard transcriptional analysis is compromised by transcript stability effects. I would suggest that, while no approach is without issues, quite a bit has been learned from approaches like RNA-seq and there are recent developments to this technique that allow for a focus on newly synthesized mRNA (thiouridine labeling).

      The CgXbp1 characterization relies heavily on work from S. cerevisiae. This is disappointing as conservation of functional links between C. glabrata and S. cerevisiae is not always predictable.

      The effects caused by loss of CgXBP1 on virulence (Figure 4) may be statistically significant but are modest. No comparison is shown for another gene that has already been accepted to have a role in virulence to allow determination of the biological importance of this effect.

      The phenotypic effects of the loss of XBP1 on azole resistance look rather odd (Figure 6). The appearance of fluconazole resistant colonies in the xbp1 null strain occurs at a very low frequency and seems to resemble the appearance of rho0 cells in the population. The vast majority of xbp1 null cells do not exhibit increased growth compared to wild-type in the presence of fluconazole.

      Irrespective of the precise explanation, more analysis should be performed to confirm that CgXbp1 is negatively regulating the genes suggested in Figure 6A to be responsible for the increased fluconazole resistance.

      Additionally, the entire analysis of CgXbp1 is based on ChIP-seq performed using cells grown under very different conditions than the RNA polymerase II study. Evidence should be provided that the presumptive CgXbp1 target genes actually impact the expression profiles established earlier.

      We thank this reviewer for the critical analysis of our manuscript. We have done the following to address the comments. As a result, the manuscript is significantly improved.

      • ChIP-seq data for Xbp1 in macrophages have been successfully generated, and the results are now presented in Figure 2C-2F and lines 182-227 of the revised manuscript. With this addition, we have removed the ChIP-seq data related to quiescence from the revised manuscript and rewritten the manuscript to focus on the role of Xbp1 in macrophages.

      • We agree that the conservation of functional links between C. glabrata and S. cerevisiae is not always predictable. That is why we did not rely solely on the S. cerevisiae network for inferring Xbp1’s functions, and instead took several different approaches (e.g. ChIP-seq of Xbp1 and characterization of the Cgxbp1∆ mutant) to delineate its functions.

      • We also agree that the virulence effect is modest, but it is, nevertheless, an effect that may contribute to the overall virulence of C. glabrata. Since virulence is a pleiotropic trait involving many genes and every gene affects different aspects of the complex process, we feel that it is not fair to penalize a given gene based on its (weaker) effect relative to another gene. Therefore, we respectfully disagree that another gene should be included for benchmarking the effect.

      • We have measured C. glabrata cell numbers in a time course experiment. The result (presented in Figure 4A) showed that there was an increase in cell number at the end of the macrophage infection time course experiment (e.g. 8 hr). We have highlighted this information on lines 278-283.

      • Additional analysis of the fluconazole resistance phenotype of the Cgxbp1∆ mutant has been added, including standard MIC assays. The results are presented in Figure 5C-5E.

      • As suggested and to understand the role of CgXbp1 on fluconazole resistance, we have now carried out RNAseq analysis of WT and the Cgxbp1∆ mutant in the presence and absence of fluconazole. The genes differentially controlled in the Cgxbp1∆ mutant have been identified and a proposed model on how CgXbp1 affects fluconazole resistance is added to Figure 7 in the revised manuscript.

    1. Author response:

      Reviewer #1 (Public Review):

      The authors conducted cross-species comparisons between the human brain and the macaque brain to disentangle the specific characteristics of structural development of the human brain. Although previous studies had revealed similarities and differences in brain anatomy between the two species by spatially aligning the brains, the authors made the comparison along the chronological axis by establishing models that predict chronological age from brain structural features. The rationale is clear given that brain development occurs over time in both. More interestingly, the model trained on macaque data was better able to predict the age of humans than the human-trained model was at predicting macaque age. This revealed a brain cross-species age gap (BCAP) that quantified the discrepancy in brain development between the two species, and the authors even found this BCAP measure was associated with performance on behavioral tests in humans. Overall, this study provides important and novel insights into the unique characteristics of human brain development. The authors have employed a rigorous scientific approach, reflecting diligent efforts to scrutinize the patterns of brain age models across species. The clarity of the rationale, the interpretability of the methods, and the quality of the presentation all contribute to the strength of this work.

      We are grateful for your helpful and thorough review and for being so positive about our manuscript. Following your recommendations, we have added more analytic details that have strengthened our paper. We would like to thank you for your input.

      Reviewer #2 (Public Review):

      In the current study, Li et al. developed a novel approach that aligns chronological age to a cross-species brain age prediction model to investigate the evolutionary effect. This method revealed some interesting findings, like the brain-age gap of the macaque model in predicting human age will increase as chronological age increases, suggesting an evolutionary alignment between the macaque brain and the human brain in the early stage of development. This study exhibits ample novelty and research significance. However, I still have some concerns regarding the reliability of the current findings.

      We thank you for the positive and appreciative feedback on our work and the insightful comments, which we have addressed below.

      Question 1: Although the authors named their new method a "cross-species" model, the current study only focused on the prediction between humans and macaques. It would be better to discuss whether their method can also generalize to cross-species examination of other species (e.g., C. elegans), which may provide more comprehensive evolutionary insights. Also, other future directions with their new method are worth discussing.

      We appreciate your insightful comment regarding the generalizability of our model to other species. As you note, we performed only a human-macaque cross-species study and did not include other species. We focused on human and macaque because the macaque is considered one of the closest primates to humans apart from the chimpanzee and is thus considered the best model for studying human brain evolution. However, our proposed method has limitations that restrict its generalizability to other species, e.g., C. elegans. First, our model was trained using MRI data, which limits its applicability to species for which such data are unavailable. This technological requirement is a barrier to broader cross-species application. Second, our current model is based on homologous brain atlases that are available for both humans and macaques. The lack of comparable atlases for other species further restricts the model's generalizability. We have discussed this limitation in the revised manuscript and outlined potential future directions to overcome these challenges. This includes discussing the need for developing comparable imaging techniques and standardized brain atlases across a wider range of species to enhance the model's applicability and broaden our understanding of cross-species neurodevelopmental patterns.

      On page 15, lines 11-18

      “However, the existing limitation should be noted regarding the generalizability of our proposed approach for cross-species brain comparison. Our current model relies on homologous brain atlases, and the lack of comparable atlases for other species restricts its broader applicability. To address this limitation, future research should focus on developing prediction models that do not depend on atlases. For instance, 3D convolutional neural networks could be trained directly on raw MRI data for age prediction. These deep learning models may offer greater flexibility for cross-species applications once the training within species is complete. Such advancements would significantly enhance the model's adaptability and expand its potential for comparative neuroscience studies across a wider range of species.”

      Question 2: Algorithm of the prediction model. In the method section, the authors only described how they chose features, but gave no description of the algorithm (e.g., support vector regression) they used. Please add relevant descriptions to the methods.

      Thank you for your comment. We apologize for not providing sufficient details about the model training process in our initial submission. In our study, we used a linear regression model for prediction. We have provided more details regarding the algorithm of the prediction model in our response to Reviewer #1. For your convenience, we have attached them below.

      For details on the algorithm of prediction model:

      “A linear regression model was adopted for intra- and inter-species age prediction. The linear regression model was built in the following three main steps.

      1) Feature selection: the final features were extracted in two steps. The first step was preliminary extraction. All the human or macaque participants were divided into 10 folds, with 9 folds used for model training and 1 fold for model testing. The preliminary features were chosen by calculating Pearson’s correlation coefficients between all the 260 features and the actual ages of the subjects in the 9 training folds and keeping the features significantly associated with age (p < 0.01). This process was repeated 100 times. Since the preliminary features obtained were not exactly the same each time, we further analyzed them using two methods to determine the final features: common features and minimum mean absolute error (min MAE). Common features are the preliminary features that were selected in all 100 repetitions of preliminary model training. The min MAE features are the preliminary features with the smallest MAE value across the 100 model tests for predicting age. After the above feature selections, we obtained two sets of features: 62 macaque features and 225 human features (common features), and 117 macaque features and 239 human features (min MAE). In addition, to further exclude the influence of the unequal number of features in human and macaque, we also selected the first 62 features in human and macaque to test the model prediction performance.

      2) Model construction: we constructed the linear age prediction model using 10-fold cross-validation based on the selected features for human and macaque separately. The linear model parameters were obtained using the training set data and applied to the test set for prediction. The above process was also repeated 100 times.

      3) Prediction: with the above results, we obtained the optimal linear prediction models for human and macaque. Next, we performed intra-species and inter-species brain age prediction, i.e., human model predicted human age, human model predicted macaque age, macaque model predicted macaque age and macaque model predicted human age. Three sets of features (62 macaque features and 225 human features; 117 macaque features and 239 human features; 62 macaque features and 62 human features) were used to test the prediction models for cross-validation and to exclude effects of the different number of features in human and macaque. In the main text, we showed the results of brain age prediction and the brain developmental and evolutionary analyses based on common features; the results obtained using the other two types of features are shown in the supplementary materials. The prediction performance was evaluated by calculating the Pearson’s correlation and MAE between actual ages and predicted ages.”
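
      The "common features" selection and the MAE metric described in the quoted passage can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the data are synthetic, each repetition is assumed to hold out one random fold and select on the remaining nine, the p-value uses the Fisher z normal approximation in place of an exact test, and the regression fit itself is omitted. All function and variable names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value from the Fisher z approximation (an assumption;
    the quoted text does not say how the p-value was computed)."""
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def common_features(X, ages, n_repeats=100, n_folds=10, alpha=0.01, seed=0):
    """Indices of features selected (p < alpha) in every repetition."""
    rng = random.Random(seed)
    n_subjects, n_features = len(X), len(X[0])
    common = set(range(n_features))
    for _ in range(n_repeats):
        order = list(range(n_subjects))
        rng.shuffle(order)
        # Hold out the first fold; select features on the other nine.
        train = order[n_subjects // n_folds:]
        y = [ages[i] for i in train]
        selected = set()
        for j in range(n_features):
            column = [X[i][j] for i in train]
            if approx_p_value(pearson_r(column, y), len(train)) < alpha:
                selected.add(j)
        common &= selected
    return sorted(common)


def mean_absolute_error(actual, predicted):
    """Mean absolute difference between actual and predicted ages."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic data: features 0-2 track age, features 3-4 are pure noise.
data_rng = random.Random(1)
ages = [data_rng.uniform(8.0, 22.0) for _ in range(200)]
X = [[age + data_rng.gauss(0, 1) for _ in range(3)]
     + [data_rng.gauss(0, 1) for _ in range(2)] for age in ages]

selected = common_features(X, ages, n_repeats=20)
print(selected)
```

      Taking the intersection across repetitions is a conservative choice: a feature survives only if it is significantly associated with age in every training subset.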

      Question 3: Sex difference. The sex difference results are strange to me. For example, in the second row of Figure Supplement 3A, different models show different correlation patterns, but why their Pearson's r is all equal to 0.3939? If they are only typo errors, please correct them. The authors claimed that they found no sex difference. However, the results in Figure Supplement 3 show that, the female seems to have poorer performance in predicting macaque age from the human model. Moreover, accumulated studies have reported sex differences in developing brains (Hines, 2011; Kurth et al., 2021). I think it is also worth discussing why sex differences can't be found in the evolutionary effect.

      Reference:

      Hines, M. (2011). Gender development and the human brain. Annual review of neuroscience, 34, 69-88.

      Kurth, F., Gaser, C., & Luders, E. (2021). Development of sex differences in the human brain. Cognitive Neuroscience, 12(3-4), 155-162.

      It is recommended that the authors explore different prediction models for different species. Maybe macaques are suitable for linear prediction models, and humans are suitable for nonlinear prediction models.

      Thank you for pointing out the typos and for your comments on sex differences. The Pearson’s r values in Figure Supplement 3A contained typos, which we have corrected in the updated Figure 2-figure supplement 3. For details, please see the updated Figure 2-figure supplement 3 and the following figure.

      Regarding gender effects, we acknowledge your point about the importance of gender differences in understanding brain evolution and development. In our study, however, our primary goal was to develop a robust age prediction model by maximizing the number of training samples. To mitigate gender-related effects in our main results, we incorporated gender information as a covariate in the ComBat harmonization process. We conducted the supplementary analysis, in which the data were separated by gender, only to demonstrate the stability of our proposed cross-species age prediction model, not to investigate gender differences. Although our results demonstrated that gender-specific models could still significantly predict chronological age, we refrained from emphasizing these models' performance in gender-specific species comparisons because the predicted gender differences are difficult to interpret. For cross-species prediction, it is not convincing to conclude that a higher Pearson’s r value between actual age and predicted age reflects conserved evolution in males or females. In addition, we adopted the same prediction model for human and macaque, rather than different ones, to establish a model that is comparable between species. Generally speaking, a nonlinear model could obtain better prediction accuracy than a linear model. If different species used different models, cross-species prediction would be unfair. Importantly, our study aimed to develop a new index based on the same prediction models to quantify differences in brain evolution, i.e., the brain cross-species age gap (BCAP), instead of relying on traditional statistical analyses. Different prediction models for different species may introduce bias caused by the prediction methods and thus affect the accuracy of BCAP. Thus, for cross-species prediction in this study, we adopted the linear model with the best intra-species prediction performance.

      Our main goal in this study was to set up a stable cross-species prediction model, and the models built using either male or female subjects performed well in cross-species prediction. However, as you point out, how to characterize evolutionary gender differences without bias using machine learning approaches needs further investigation, since there are many reports of gender differences in the developing human brain. In fact, whether macaque brains have the same gender differences as humans is an interesting scientific question worth studying. Thus, we have included a discussion on how to use machine learning methods to study evolutionary gender differences in our revised manuscript.

      On page 15, lines 18-23 and page 16, lines 1-4

      “Many studies have reported sex differences in developing human brains (Hines, 2011; Kurth, Gaser, & Luders, 2021); however, whether macaque brains have similar sex differences as humans is still unknown. We used a machine learning method for cross-species prediction to quantify brain evolution, and the established prediction models are stable even when only using male or female data, which may indicate that the proposed cross-species prediction model has no evolutionary sex difference. Although a stable prediction model can be established in either male or female participants for cross-species prediction, this does not mean that there are no evolutionary sex differences, because a quantitative comparative analysis is lacking. In the future, we need to develop a more objective, quantifiable and stable index for studying sex differences using machine learning methods to further identify sex differences in the evolved brain.”

      Reviewer #3 (Public Review):

      The authors identified a series of WM and GM features that correlated with age in human and macaque structural imaging data. The data was gathered from the HCP and WA studies, which was parcellated in order to yield a set of features. Features that correlated with age were used to train predictive intra and inter-species models of human and macaque age. Interestingly, while each model accurately predicted the corresponding species age, using the macaque model to predict human age was more accurate than the inverse (using the human model to predict macaque age). In addition, the prediction error of the macaque model in predicting human age increased with age, whereas the prediction error of the human model predicting macaque age decreased with age.

      After elaboration of the predictive models, the authors classified the features for prediction into human-specific, macaque-specific and common to human and macaque, where they most notably found that macaque-only and common human-macaque areas were located mainly in gray matter, with only a few human-specific features found in gray matter. Furthermore, the authors found significant correlations between BCAP and picture vocabulary (positive correlation) test and visual sensitivity (negative correlation) test. Several white matter tracts (AF, OR, SLFII) were also identified showing a correlation with BCAP.

      Thank you for providing this excellent summary. We appreciate your thorough review and concise overview of our work.

      STRENGTHS AND WEAKNESSES

      The paper brings an interesting perspective on the evolutionary trajectories of human and non-human primate brain structure, and its relation to behavior and cognition. Overall, the methods are robust and support the theoretical background of the paper. However, the overall clarity of the paper could be improved. There are many convoluted sentences and there seems to be both repetition across the different sections and unclear or missing information. For example, the Introduction does not clearly state the research questions, rather just briefly mentions research gaps existing in the literature and follows by describing the experimental method. It would be desirable to clearly state the theoretical background and research questions and leave out details on methodology. In addition, the results section repeats a lot of what is already stated in the methods. This could be further simplified and make the paper much easier to read.

      In the discussion, authors mention that "findings about cortex expansion are inconsistent and even contradictory", a more convincing argument could be made by elaborating on why the cortex expansion index is inadequate and how BCAP is more accurate.

      Thank you for highlighting the interesting aspects of our work. We apologize for the lack of clarity in certain parts of our manuscript. Following your valuable suggestions, we have revised the manuscript to reduce unnecessary repetition and to state our research question more clearly in the Introduction. Specifically, unlike previous comparative neuroscience analyses of human and macaque evolution, this study embeds a chronological axis into the cross-species evolutionary analysis. It constructs a linear prediction model of brain age for humans and macaques and quantitatively describes the degree of evolution. The brain structure-based cross-species age prediction model and the cross-species brain age differences proposed in this study further remove the inherent developmental effects of humans and macaques from cross-species evolutionary comparisons, providing new perspectives and approaches for studying cross-species development. We have also simplified the Results section to remove repetition. Regarding the comparison between the cortex expansion index and BCAP, we would like to emphasize that the cortex expansion index was derived without fully considering cross-species alignment along the chronological axis. Specifically, this index does not correspond to a specific developmental stage, but rather focuses on a direct comparison between the two species. In contrast, BCAP addresses this limitation by utilizing a prediction model to establish alignment (or misalignment) between species at the individual level. Therefore, BCAP may serve as a more flexible and nuanced tool for cross-species brain comparison.

      STUDY AIMS AND STRENGTH OF CONCLUSIONS

      Overall, the methods are robust and support the theoretical background of the paper, but it would be good to state the specific research questions -even if exploratory in nature- more specifically. Nevertheless, the results provide support for the research aims.

      Thank you for the excellent suggestion. We have revised our Introduction to state the specific research question, as mentioned above.

      IMPACT OF THE WORK AND UTILITY OF METHODS AND DATA TO THE COMMUNITY

      This study is a good first step in providing a new insight into the neurodevelopmental trajectories of humans and non-human primates besides the existing cortical expansion theories.

      Thank you for your encouraging comment.

      ADDITIONAL CONTEXT:

      It should be clearly stated both in the abstract and methods that the data used for the experiment came from public databases.

      Thank you for your suggestion. We have added this information to both the Abstract and the Methods. For details, please see page 2, line 9 in the Abstract section; page 16, lines 10-11 and page 17, lines 6-10 in the Materials and Method section.

    1. Author response:

      Reviewer #1 (Public Review):

      Using structural analysis, Bonchuk and colleagues demonstrate that the TTK-like BTB/POZs of insects form stable hexameric assemblies composed of trimers of POZ dimers, a configuration observed consistently across both homomultimers and heteromultimers, which are known to be formed by TTK-like BTB/POZ domains. The structural data is comprehensive, unambiguous, and further supported by theoretical fold prediction analyses. In particular the judicious complementation of experiments and fold prediction is commendable. This study now adds an important cog that might help generalize the general principles of the evolution of multimerization in members of this fold family.

      I strongly feel that enhancing the inclusivity of the discussion would strengthen the paper. Below, I suggest some additional points for consideration for the same.

      Major points.

      1) It would be valuable to discuss alternative multimer assembly interfaces, considering the diverse ways POZs can multimerize. For instance, the Potassium channel POZ domains form tetramers. A comparison of their inter-subunit interface with that of TTK and non-TTK POZs could provide insightful contrasts.

      Thanks for the suggestion. We added this important comparison, as well as a comparison with recently published structures of filament-forming BTB domains.

      2) The so-called TTK motif, despite its unique sequence signature, essentially corresponds to the N-terminal extension observed in other "non-TTK" proteins such as Miz-1. Given Miz-1's structure, it becomes evident that the utilization of the N-terminal extension for dimerization is shared with the TTK family, suggesting a common evolutionary origin in metazoan transcription factors. Early phylogenetic trees (e.g. in PMID: 9917379) support the grouping of the TTK-like POZs with other animal Transcription factors containing POZ domains such as those with Kelch repeats further suggesting that the extension might be ancestral. Structural investigations by modeling prominent examples or comparing known structures of similar POZ domains, could support this inference. Control comparisons with POZ domains from fungi, plants and amoebozoans like Dictyostelium could offer additional insights.

      We performed AlphaFold2-Multimer modeling of dimers of all BTB domains from the most ancestral metazoan clades, Placozoa and Porifera, along with BTBs from Choanoflagellates, the unicellular eukaryotes closest to the first metazoans. We evaluated the presence of the N-terminal beta-sheet. KLHL-BTBs are present in all eukaryotes and are likely predecessors of ZBTB domains. According to AlphaFold modeling of dimers, all KLHL-BTB domains of plants and basal metazoans have the alpha1 helix, but most of these domains do not possess the additional N-terminal beta-strand (beta1) characteristic of ZBTB domains. We found only one KLHL-BTB (Uniprot ID: AA9VCT1_MONBE) with such an N-terminal extension in the Choanoflagellate proteome, one in the Dictyostelium proteome (Q54F31_DICDI), and 7 (out of 43 BTB domains in total) and 13 (out of 81) such domains in the Trichoplax and Amphimedon proteomes, respectively. There was no significant similarity of the beta1 element at the level of primary sequence. However, most of these domains bear a 3-box/BACK extension and represent typical KLHL-BTBs, which are members of E3 ubiquitin-ligase complexes; they are often associated with a protein-protein interaction MATH domain or WD40 repeats. We found only one protein in the Trichoplax proteome with a beta1 strand and devoid of 3-box/BACK (B3RQ74_TRIAD), thus resembling the ZBTB topology. Thus, BTB domains of this subtype likely emerged early in Metazoan evolution. At this point ZBTBs were not yet associated with zinc-fingers. According to our survey, the actual fusion of the ZBTB domain with zinc-finger domains occurred in the evolution of early bilaterian organisms, since proteins with such domain architecture are not found in Radiata but are present in basal Protostomia and Deuterostomia clades. The TTK-type sequence is characteristic only of Arthropoda and emerged early in their evolution. We added all these data to the article.

      3) Exploring the ancestral presence of the aforementioned extension in metazoan transcription factors could serve as a foundation for understanding the evolutionary pathway of hexamerization. This analysis could shed light on exposed structural regions that had the potential to interact post-dimerization with the N-terminal extension and also might provide insights into the evolution of multimer interfaces, as observed in the Potassium channel.

      We added this important comparison, as well as a comparison with recent structures of filament-forming BTB domains.

      4) Considering the role of conserved residues in the multimer interface is crucial. Reference to conserved residues involved in multimer formation, such as discussed in PMID: 9917379, would enrich the discussion.

      We updated our description of multimer interface with respect to conservation of residues.

      Reviewer #2 (Public Review):

      BTB domains are protein-protein interaction domains found in diverse eukaryotic proteins, including transcription factors. It was previously known that many of the Drosophila transcription factor BTB domains are of the TTK-type - these are defined as having a highly-conserved motif, FxLRWN, at their N-terminus, and they thereby differ from the mammalian BTB domains. Whereas the well-characterised mammalian BTB domains are dimeric, several Drosophila TTK-BTB domains notably form multimers and function as chromosome architectural proteins. The aims of this work were (i) to determine the structural basis of multimerisation of the Drosophila TTK-BTB domains, (ii) to determine how different Drosophila TTK-BTB domains interact with each other, and (iii) to investigate the evolution of this subtype of BTB domain.

      The work significantly advances our understanding of the biology of BTB domains. The conclusions of the paper are mostly well-supported, although some aspects need clarification:

      Hexameric organisation of the TTK-type BTB domains:

      Using cryo-EM, the authors showed that the CG6765 TTK-type BTB domain forms a hexameric assembly in which three "classic" BTB dimers interact via a beta-sheet interface involving the B3 strand. This is particularly interesting, as this region of the BTB domain has recently been implicated in protein-protein interactions in a mammalian BTB-transcription factor, MIZ1. SEC-MALS analysis indicated that the LOLA TTK-type BTB domain is also hexameric, and SAXS data was consistent with a hexameric assembly of the CG6765- and LOLA BTB domains.

      The data regarding the hexameric organisation is convincing. However, interpreting the role of specific regions of the BTB domain is difficult because the description of the molecular contacts lacks depth.

      Heteromeric interactions between TTK-type BTB domains:

      The authors use yeast two-hybrid assays to study heteromeric interactions between various Drosophila TTK-type BTB domains. Such assays are notorious for producing false positives, and this needs to be mentioned. Although the authors suggest that the heteromeric interactions are mediated via the newly identified B3 interaction interface, there is no evidence to support this, since mutation of B3 yielded insoluble proteins.

      We are aware that Y2H can give false-positive results in cases where the BTB domain fused to the DNA-binding domain can activate reporter genes. Therefore, all tested BTB domains were examined for their ability to activate transcription. Furthermore, in our study, assays with non-TTK-type BTB domains, which showed almost no interactions, provide an additional negative control. We have added a corresponding disclaimer to the text. We agree that our data do not explain the basis for heteromeric interactions. Designing mutations in the B3 beta-sheet proved complicated, and using biochemical methods to study the principles of heteromer assembly also does not seem feasible, since most TTK-type BTBs tend to form aggregates and are difficult to express and purify. Most importantly, even if heteromer assembly through B3 were demonstrated in a few tested pairs, this could not be generalized to all pairs, as some may still use a different mechanism. We used AlphaFold to predict possible mechanisms of heteromer assembly. AlphaFold suggested that both the B3 and the conventional dimerization interfaces can be used for heteromeric interactions, with a preference for one over the other in different pairs. Thus, the presence of two potential heteromerization interfaces most likely extends the heteromerization capability of these domains. We changed the text accordingly.

      Evolution of the TTK-type BTB domains:

      The authors carried out a bioinformatics analysis of BTB proteins and showed that most of the Drosophila BTB transcription factors (24 out of 28) are of the TTK-type. They investigated how the TTK-type BTB domains emerged during evolution, and showed that these are only found in Arthropoda, and underwent lineage-specific expansion in the modern phylogenetic groups of insects. These findings are well-supported by the evidence.

    1. Author response:

      Reviewer #1 - Public Review

      This report describes work aiming to delineate multi-modal MRI correlates of psychopathology from a large cohort of children of 9-11 years from the ABCD cohort. While uni-modal characterisations have been made, the authors rightly argue that multi-modal approaches in imaging are vital to comprehensively and robustly capture modes of large-scale brain variation that may be associated with pathology. The primary analysis integrates structural and resting-state functional data, while post-hoc analyses on subsamples incorporate task and diffusion data. Five latent components (LCs) are identified, with the first three, corresponding to p-factor, internal/externalising, and neurodevelopmental Michelini Factors, described in detail. In addition, associations of these components with primary and secondary RSFC functional gradients were identified, and LCs were validated in a replication sample via assessment of correlations of loadings.

      1.1) This work is clearly novel and a comprehensive study of associations within this dataset. Multi-modal analyses are challenging to perform, but this work is methodologically rigorous, with careful implementation of discovery and replication assessments, and primary and exploratory analyses. The ABCD dataset is large, and behavioural and MRI protocols seem appropriate and extensive enough for this study. The study lays out comprehensive associations between MRI brain measures and behaviour that appear to recapitulate the established hierarchical structure of psychopathology.

      We thank Reviewer 1 for appreciating our methods and findings, and we address their suggestions below:

      1.2) The work does have weaknesses, some of them acknowledged. There is limited focus on the strength of observed associations. While the latent component loadings seem reliably reproducible in the behavourial domain, this is considerably less the case in the imaging modalities. A considerable proportion of statistical results focuses on spatial associations in loadings between modalities - it seems likely that these reflect intrinsic correlations between modalities, rather than associations specific to any latent component.

      We appreciate the Reviewer’s comment, and have minimized the reporting of correlations between the loadings from the different modalities in the revised Results (specifically subsections on LC1, LC2, and LC3). We now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      For completeness, we report the intrinsic correlations between the different modalities in Supplementary file 1c (P.19):

      “Lastly, although the current work aimed to reduce intrinsic correlations between variables within a given modality through running a PCA before the PLS approach, intrinsic correlations between measures and modalities may potentially be a remaining factor influencing the PLS solution. We, thus, provided an additional overview of the intrinsic correlations between the different neuroimaging data modalities in the supporting results (Supplementary file 1c).”

      1.3) Assessment of associations with functional gradients is similarly a little hard to interpret. Thus, it is hard to judge the implications for our understanding of the neurophysiological basis of psychopathology and the ability of MRI to provide clinical tools for, say, stratification.

      We now provide additional context, including a rising body of theoretical and empirical work, that outlines the value of functional gradients and cortical hierarchies in the understanding of brain development and psychopathology. Please see P.26.

      “Initially demonstrated at the level of intrinsic functional connectivity (Margulies et al., 2016), follow up work confirmed a similar cortical patterning using microarchitectural in-vivo MRI indices related to cortical myelination (Burt et al., 2018; Huntenburg et al., 2017; Paquola et al., 2019), post-mortem cytoarchitecture (Goulas et al., 2018; Paquola et al., 2020, 2019), or post-mortem microarray gene expression (Burt et al., 2018). Spatiotemporal patterns in the formation and maturation of large-scale networks have been found to follow a similar sensory-to-association axis; moreover, there is the emerging view that this framework may offer key insights into brain plasticity and susceptibility to psychopathology (Sydnor et al., 2021). In particular, the increased vulnerability of transmodal association cortices in late childhood and early adolescence has been suggested to relate to prolonged maturation and potential for plastic reconfigurations of these systems (Paquola et al., 2019; Park et al., 2022b). Between mid-childhood and early adolescence, heteromodal association systems such as the default network become progressively more integrated among distant regions, while being more differentiated from spatially adjacent systems, paralleling the development of cognitive control, as well as increasingly abstract and logical thinking. [...] This suggests that neurodevelopmental difficulties might be related to alterations in various processes underpinned by sensory and association regions, as well as the macroscale balance and hierarchy of these systems, in line with previous findings in several neurodevelopmental conditions, including autism, schizophrenia, as well as epilepsy, showing a decreased differentiation between the two anchors of this gradient (Hong et al., 2019). In future work, it will be important to evaluate these tools for diagnostics and population stratification. 
In particular, the compact and low dimensional perspective of gradients may prove beneficial in terms of biomarker reliability as well as phenotypic prediction, as previously demonstrated using typically developing cohorts (Hong et al. 2020). On the other hand, it will be of interest to explore to what extent alterations in connectivity along sensory-to-transmodal hierarchies provide sufficient graduality to differentiate between specific psychopathologies, or whether they, as the current work suggests, mainly reflect risk for general psychopathology and atypical development.”

      1.4) The observation of a recapitulation of psychopathology hierarchy may be somewhat undermined by the relatively modest strength of the components in the imaging domain.

      We thank the Reviewer for this comment, and have now expressed this limitation in the revised Discussion, P.23.

      “The p factor, internalizing, externalizing, and neurodevelopmental dimensions were each associated with distinct morphological and intrinsic functional connectivity signatures, although these relationships varied in strength.”

      1.5) The task fMRI was assessed with a fairly basic functional connectivity approach, not using task timings to more specifically extract network responses.

      In the revised Discussion on P.24, we acknowledge that more in-depth analyses of task-based fMRI may have offered additional insights into state-dependent changes in functional architecture.

      “While the current work derived main imaging signatures from resting-state fMRI as well as grey matter morphometry, we could nevertheless demonstrate associations to white matter architecture (derived from diffusion MRI tractography) and recover similar dimensions when using task-based fMRI connectivity. Despite subtle variations in the strength of observed associations, the latter finding provided additional support that the different behavioral dimensions of psychopathology more generally relate to alterations in functional connectivity. Given that task-based fMRI data offers numerous avenues for analytical exploration, our findings may motivate follow-up work assessing associations to network- and gradient-based response strength and timing with respect to external stimuli across different functional states.”

      1.6) Overall, the authors achieve their aim to provide a detailed multimodal characterisation of MRI correlations of psychopathology. Code and data are available and well organised and should provide a valuable resource for researchers wanting to understand MRI-based neural correlates of psycho-pathology-related behavioural traits in this important age group. It is largely a descriptive study, with comparisons to previous uni-modal work, but without particularly strong testing of neuroscience hypotheses.

      We thank the Reviewer for recognizing the detail and rigor of our data-driven study and the extensive code and data documentation.

      Reviewer #2 - Public Review

      In "Multi-modal Neural Correlates of Childhood Psychopathology" Krebets et al. integrate multi-modal neuroimaging data using machine learning to delineate dissociable links to diverse dimensions of psychopathology in the ABCD sample. This paper had numerous strengths including a superb use of a large resource dataset, appropriate analyses, beautiful visualizations, clear writing, and highly interpretable results from a data-driven analysis. Overall, I think it would certainly be of interest to a general readership. That being said, I do have several comments for the authors to consider.

      We thank Dr Satterthwaite for the positive evaluation and helpful comments.

      2.1) Out-of-sample testing: while the permutation testing procedure for the PLS is entirely appropriate, without out-of-sample testing the reported effect sizes are likely inflated.

      As discussed in the editorial summary of essential revisions, we agree that out-of-sample prediction indeed provides stronger estimates of generalizability. We assess this by applying the PCA coefficients derived from the discovery cohort imaging data to the replication cohort imaging data. The resulting PCA scores and behavioral data were then z-scored using the mean and standard deviation of the replication cohort. The SVD weights derived from the discovery cohort were applied to the normalized replication cohort data to derive imaging and behavioral composite scores, which were used to recover the contribution of each imaging and behavioral variable to the LCs (i.e., loadings). Out-of-sample replicability of imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings was generally high across LCs 1-5. This analysis is reported in the revised manuscript (P.18).

      “Generalizability of reported findings was also assessed by directly applying PCA coefficients and latent components weights from the PLS analysis performed in the discovery cohort to the replication sample data. Out-of-sample prediction was overall high across LCs1-5 for both imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings.”
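
      The out-of-sample procedure described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the cohorts are synthetic, the number of latent factors, features, and retained principal components are hypothetical, and loadings are taken to be Pearson correlations between each original variable and the composite scores.

```python
import numpy as np

def zscore(a):
    """Column-wise z-score using the sample's own mean and SD."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def column_corr(a, b):
    """Pearson correlation of every column of a with every column of b."""
    return zscore(a).T @ zscore(b) / (a.shape[0] - 1)

def simulate(rng, n, img_w, beh_w):
    """Synthetic cohort: shared latent factors drive imaging and behavior."""
    latent = rng.standard_normal((n, img_w.shape[0]))
    imaging = latent @ img_w + rng.standard_normal((n, img_w.shape[1]))
    behavior = latent @ beh_w + rng.standard_normal((n, beh_w.shape[1]))
    return imaging, behavior

def loadings(imaging, behavior, pca_coef, img_weights, beh_weights):
    """Project a cohort with fixed PCA coefficients and PLS weights, then
    correlate each original variable with the resulting composite scores."""
    img_scores = zscore(imaging @ pca_coef) @ img_weights
    beh_scores = zscore(behavior) @ beh_weights
    return column_corr(imaging, img_scores), column_corr(behavior, beh_scores)

rng = np.random.default_rng(0)
img_w = rng.standard_normal((2, 30))  # hypothetical: 2 latent factors, 30 imaging features
beh_w = rng.standard_normal((2, 8))   # hypothetical: 8 behavioral items
X_disc, Y_disc = simulate(rng, 1000, img_w, beh_w)
X_rep, Y_rep = simulate(rng, 1000, img_w, beh_w)

# Discovery cohort: PCA on imaging, then SVD of the behavior-imaging
# cross-covariance matrix to obtain the PLS weights.
n_pcs = 10
_, _, vt = np.linalg.svd(X_disc - X_disc.mean(axis=0), full_matrices=False)
pca_coef = vt[:n_pcs].T
cross_cov = zscore(Y_disc).T @ zscore(X_disc @ pca_coef) / (len(X_disc) - 1)
u, s, wt = np.linalg.svd(cross_cov, full_matrices=False)
beh_weights, img_weights = u, wt.T

# Same coefficients and weights applied to both cohorts; the replication
# cohort is z-scored with its own mean and SD inside loadings().
img_disc, beh_disc = loadings(X_disc, Y_disc, pca_coef, img_weights, beh_weights)
img_rep, beh_rep = loadings(X_rep, Y_rep, pca_coef, img_weights, beh_weights)

# Replicability: correlate discovery and replication loadings per component.
n_lc = 2
img_r = [np.corrcoef(img_disc[:, k], img_rep[:, k])[0, 1] for k in range(n_lc)]
beh_r = [np.corrcoef(beh_disc[:, k], beh_rep[:, k])[0, 1] for k in range(n_lc)]
print(img_r, beh_r)
```

      Because the replication scores are z-scored afterwards, centering the replication imaging data before projection would not change the result. No sign alignment is needed either, since the discovery weights are reused rather than re-estimated.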

      2.2) Site/family structure: it was unclear how site/family structure were handled as covariates.

      Only unrelated participants were included in discovery and replication samples (see P.6). The site variable was regressed out of the imaging and behavioral data prior to the PLS analysis using the residuals from a multiple linear model which also included age, age², sex, and ethnicity. This is now clarified on P.29:

      “Prior to the PLS analysis, effects of age, age², sex, site, and ethnicity were regressed out from the behavioral and imaging data using a multiple linear regression to ensure that the LCs would not be driven by possible confounders (Kebets et al., 2021, 2019; Xia et al., 2018). The imaging and behavioral residuals of this procedure were input to the PLS analysis.”
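
      A minimal sketch of this confound regression is given below, assuming ordinary least squares with an intercept and dummy coding for the categorical covariates. The coding scheme, the number of sites and ethnicity categories, and all variable names are assumptions; the response does not specify them.

```python
import numpy as np

def one_hot(labels):
    """Dummy-code a categorical variable, dropping the first level as reference."""
    levels = np.unique(labels)
    return (labels[:, None] == levels[None, 1:]).astype(float)

def residualize(data, confounds):
    """Regress the confounds (plus an intercept) out of every column of data."""
    design = np.column_stack([np.ones(len(confounds)), confounds])
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta

rng = np.random.default_rng(0)
n = 300
age = rng.uniform(9.0, 11.0, n)      # years
sex = rng.integers(0, 2, n)          # binary code (assumption)
site = rng.integers(0, 4, n)         # 4 hypothetical acquisition sites
ethnicity = rng.integers(0, 3, n)    # 3 hypothetical categories

confounds = np.column_stack(
    [age, age**2, sex, one_hot(site), one_hot(ethnicity)]
)

# Synthetic imaging features with an age effect that should be removed.
imaging = 0.5 * age[:, None] + rng.standard_normal((n, 6))
imaging_residuals = residualize(imaging, confounds)

# The residuals, not the raw features, would then enter the PLS analysis.
print(np.abs(confounds.T @ imaging_residuals).max())
```

      The same function would be applied to the behavioral data. By construction the residuals are orthogonal to every confound column, which is what prevents the latent components from being driven by these variables.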

      2.3) Anatomical features: I was a bit surprised to see volume, surface area, and thickness all evaluated - and that there were several comments on the correspondence between the SA and volume in the results section. Given that cortical volume is simply a product of SA and CT (and mainly driven by SA), this result may be pre-required.

      As suggested, we reduced the reporting of correlations between the loadings from the different modalities in the revised Results (specifically subsections on LC1, LC2, and LC3). Instead, we now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      We also reran the PLS analysis while only including thickness and surface area as our structural metrics, to account for potential redundancy of these measures with volume. This analysis and associated findings are reported on P.36 and P.19:

      “As cortical volume is a result of both thickness and surface area, we repeated our main PLS analysis while excluding cortical volume from our imaging metrics and report the consistency of these findings with our main model.”

      “Third, to account for redundancy within structural imaging metrics included in our main PLS model (i.e., cortical volume is a result of both thickness and surface area), we also repeated our main analysis while excluding cortical volume from our imaging metrics. Findings were very similar to those in our main analysis, with an average absolute correlation of 0.898±0.114 across imaging composite scores of LCs 1-5.”
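
      The consistency metric quoted above (an average absolute correlation across composite scores) can be sketched as follows. The data are synthetic, and reading the "±" value as a standard deviation is an assumption. Absolute values are used because the sign of an SVD-derived component is arbitrary, so the same component can come out flipped in a re-run model.

```python
import numpy as np

def compare_composite_scores(scores_a, scores_b):
    """Correlate matching components of two models; return the mean and SD of |r|
    together with the signed correlations."""
    r = np.array([
        np.corrcoef(scores_a[:, k], scores_b[:, k])[0, 1]
        for k in range(scores_a.shape[1])
    ])
    abs_r = np.abs(r)
    return abs_r.mean(), abs_r.std(ddof=1), r

rng = np.random.default_rng(0)
main_scores = rng.standard_normal((400, 5))   # hypothetical scores for LCs 1-5
signs = np.array([1, -1, 1, 1, -1])           # two components flip sign in the re-run
alt_scores = main_scores * signs + 0.1 * rng.standard_normal((400, 5))

mean_abs_r, sd_abs_r, r = compare_composite_scores(main_scores, alt_scores)
print(mean_abs_r, sd_abs_r)
```

      Without the absolute value, the two sign-flipped components would pull the mean toward zero even though both models recover the same dimensions.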

      2.4) Ethnicity: the rationale for regressing ethnicity from the data was unclear and may conflict with current best practices.

      We thank the Reviewer for this comment. In light of recent discussions on including this covariate in large datasets such as ABCD (e.g., Saragosa-Harris et al., 2022), we elaborate on our rationale for including this variable in our model in the revised manuscript on P.30:

      “Of note, the inclusion of ethnicity as a covariate in imaging studies has been recently called into question. In the present study, we included this variable in our main model as a proxy for social inequalities relating to race and ethnicity alongside biological factors (age, sex) with documented effects on brain organization and neurodevelopmental symptomatology queried in the CBCL.”

      We also assess the replicability of our analyses when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models. We report resulting correlations in the revised manuscript (P.37, 19, and 27):

      “We also assessed the replicability of our findings when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models.”

      “Moreover, repeating the PLS analysis while excluding this variable as a model covariate yielded overall similar imaging and behavioral composites scores across LCs to our original analysis. Across LCs 1-5, the average absolute correlations reached r=0.636±0.248 for imaging composite scores, and r=0.715±0.269 for behavioral composite scores. Removing these covariates seemed to exert stronger effects on LC3 and LC4 for both imaging and behavior, as lower correlations across models were specifically observed for these components.”

      “Although we could consider some socio-demographic variables and proxies of social inequalities relating to race and ethnicity as covariates in our main model, the relationship of these social factors to structural and functional brain phenotypes remains to be established with more targeted analyses.”

      2.5) Data quality: the authors did an admirable job in controlling for data quality in the analyses of functional connectivity data. However, it is unclear if a comparable measure of data quality was used for the T1/dMRI analyses. This likely will result in inflated effect sizes in some cases; it has the potential to reduce sensitivity to real effects.

      We agree that data quality was not accounted for in our analysis of T1w- and diffusion-derived metrics. We have now accounted for T1w image quality by adding manual quality control ratings to the regressors applied to all structural imaging metrics prior to performing the PLS analysis, and report the consistency of this new model with the original findings. See P.36, P.19:

      “We also considered manual quality control ratings as a measure of T1w scan quality. This metric was included as a covariate in a multiple linear regression model accounting for potential confounds in the structural imaging data, in addition to age, age², sex, site, ethnicity, ICV, and total surface area. Downstream PLS results were then benchmarked against those obtained from our main model.”

      “Considering scan quality in T1w-derived metrics (from manual quality control ratings) yielded similar results to our main analysis, with an average correlation of 0.986±0.014 across imaging composite scores.”

      As for diffusion imaging, we also regressed out effects of head motion in addition to age, age², sex, site, and ethnicity from FA and MD measures and reported the consistency with our original results (P.36, P.19):

      “We tested another model which additionally included head motion parameters as regressors in our analyses of FA and MD measures, and assessed the consistency of findings from both models.”

      “Additionally considering head motion parameters from diffusion imaging metrics in our model yielded results consistent with those in our main analyses (mean r=0.891, S.D.=0.103; r=0.733-0.998).”
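
      The consistency checks quoted in this response compare composite scores from an alternative model with those from the main model, one latent component at a time. A minimal sketch of that summary follows, assuming that each component's scores are stored as a list per model and that the sample standard deviation is reported; the function names and toy values are hypothetical. Absolute correlations are used, as in the quoted text, which makes the comparison insensitive to a sign flip of a component between models.

```python
# Hypothetical sketch: summarize agreement between composite scores from two
# models as the absolute Pearson correlation per latent component (LC).
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def summarize_consistency(main_scores, alt_scores):
    """Return per-LC absolute correlations, their mean, and their sample SD."""
    abs_r = [abs(pearson_r(m, a)) for m, a in zip(main_scores, alt_scores)]
    return abs_r, mean(abs_r), stdev(abs_r)


# Toy data (assumed): two LCs, five participants each.
main_scores = [[0.1, 0.4, -0.2, 0.9, -0.5], [1.0, -1.0, 0.5, 0.0, -0.5]]
# The first LC is sign-flipped and rescaled; the second is a noisy copy.
alt_scores = [[-0.2, -0.8, 0.4, -1.8, 1.0], [0.9, -0.7, 0.6, 0.1, -0.6]]

abs_r, mean_r, sd_r = summarize_consistency(main_scores, alt_scores)
```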

      Reviewer #3 - Public Review

      In this study, the authors utilized the Adolescent Brain Cognitive Development dataset to investigate the relationship between structural and functional brain network patterns and dimensions of psychopathology. They identified multiple components, including a general psychopathology (p) factor that exhibited a strong association with multimodal imaging features. The connectivity signatures associated with the p factor and neurodevelopmental dimensions aligned with the sensory-to-transmodal axis of cortical organization, which is linked to complex cognition and psychopathology risk. The findings were consistent across two separate subsamples and remained robust when accounting for variations in analytical parameters, thus contributing to a better understanding of the biological mechanisms underlying psychopathology dimensions and offering potential brain-based vulnerability markers.

      3.1) An intriguing aspect of this study is the integration of multiple neuroimaging modalities, combining structural and functional measures, to comprehensively assess the covariance with various symptom combinations. This approach provides a multidimensional understanding of the risk patterns associated with mental illness development.

      We thank the Reviewer for acknowledging the multimodal approach, and for the constructive suggestions.

      3.2) The paper delves deeper into established behavioral latent variables such as the p factor, internalizing, externalizing, and neurodevelopmental dimensions, revealing their distinct associations with morphological and intrinsic functional connectivity signatures. This sheds light on the neurobiological underpinnings of these dimensions.

      We are happy to hear the Reviewer appreciates the gain in understanding neural underpinnings of dimensions of psychopathology resulting from the current work.

      3.3) The robustness of the findings is a notable strength, as they were validated in a separate replication sample and remained consistent even when accounting for different parameter variations in the analysis methodology. This reinforces the generalizability and reliability of the results.

      We appreciate that the Reviewer found our robustness and generalizability assessment convincing.

      3.4) Based on their findings, the authors suggest that the observed variations in resting-state functional connectivity may indicate shared neurobiological substrates specific to certain symptoms. However, it should be noted that differences in resting-state connectivity between groups can stem from various factors, as highlighted in the existing literature. For instance, discrepancies in the interpretation of instructions during the resting state scan can influence the results. Hence, while their findings may indicate biological distinctions, they could also reflect differences in behavior.

      For the ABCD dataset, resting-state fMRI scans were based on eyes open and passive viewing of a crosshair, and are thus homogenized. We acknowledge, however, that there may still be state-to-state fluctuations contributing to the findings, and this is now discussed in the revised Discussion, on P.28. Note, however, that prior literature has generally also suggested rather modest impacts of cognitive and daily variation on resting-state functional networks, compared to much more dominating inter-individual and inter-group factors.

      “Finally, while prior research has shown that resting-state fMRI networks may be affected by differences in instructions and study paradigm (e.g., with respect to eyes open vs closed) (Agcaoglu et al., 2019), the resting-state fMRI paradigm is homogenized in the ABCD study to be passive viewing of a centrally presented fixation cross. It is nevertheless possible that there were slight variations in compliance and instructions that contributed to differences in associated functional architecture. Notably, however, there is a mounting literature based on high-definition fMRI acquisitions suggesting that functional networks are mainly dominated by common organizational principles and stable individual features, with substantially more modest contributions from task-state variability (Gratton et al. 2018). These findings, thus, suggest that resting-state fMRI markers can serve as powerful phenotypes of psychiatric conditions, and potential biomarkers (Abraham et al., 2017; Gratton et al., 2020; Parkes et al., 2020).”

      3.5) The authors conducted several analyses to investigate the relationship between imaging loadings associated with latent components and the principal functional gradient. They found several associations between principal gradient scores and both within- and between-network resting-state functional connectivity (RSFC) loadings. Assessing the analysis presented here proves challenging due to the nature of relating loadings, which are partly based on the RSFC, to gradients derived from RSFC. Consequently, a certain level of correlation between these two variables would be expected, making it difficult to determine the significance of the authors' findings. It would be more intriguing if a direct correlation between the composite scores reflecting behavior and the gradients were to yield statistically significant results.

      We thank the Reviewer for the comment, and agree that investigating gradient-behavior relationships could offer additional insights into the neural basis of psychiatric symptomatology. However, the current analysis pipeline precludes this direct comparison, because the gradient analysis is performed on a region-by-region basis across the span of the cortical gradient, whereas the behavioral loadings are provided for each CBCL item and not for cortical regions.

      The Reviewer also raises concerns of potential circularity in our analysis, as we compared imaging loadings, which are partially based on RSFC, and gradient values generated from the same RSFC data. In response to this comment, we cross-validated our findings using an RSFC gradient derived from an independent dataset (HCP), showing highly consistent findings to those presented in the manuscript. This correlation is now reported in the Results section P.15.

      “A similar pattern of findings was observed when cross-validating between- and within-network RSFC loadings to a RSFC gradient derived from an independent dataset (HCP), with strongest correlations seen for between-network RSFC loadings for LC1 and LC3 (LC1: r=0.50, p_spin<0.001; LC3: r=0.37, p_spin<0.001).”

      We furthermore note similar correlations between imaging loadings and T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). These findings are now detailed in the revised Results, P.15-16:

      “Of note, we obtain similar correlations when using T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). Specifically, we observed the strongest association between this microstructural marker of the cortical hierarchy and between-network RSFC loadings related to LC1 (r=-0.43, p_spin<0.001).”

      3.6) Lastly, regarding the interpretation of the first identified latent component, I have some reservations. Upon examining the loadings, it appears that LC1 primarily reflects impulse control issues rather than representing a comprehensive p-factor. Furthermore, it is worth noting that within the field, there is an ongoing debate concerning the interpretation and utilization of the p-factor. An insightful publication on this topic is "The p factor is the sum of its parts, for now" (Fried et al, 2021), which explains that the p-factor emerges as a result of a positive manifold, but it does not necessarily provide insights into the underlying mechanisms that generated the data.

      We thank the Reviewer for this comment, and have added greater nuance to the discussion of the association with the p factor. We furthermore discuss some of the ongoing debate about the use of the p factor, and cite the recommended publication on P.27.

      “Other factors have also been suggested to impact the development of psychopathology, such as executive functioning deficits, earlier pubertal timing, negative life events (Brieant et al., 2021), maternal depression, or psychological factors (e.g., low effortful control, high neuroticism, negative affectivity). Inclusion of such data could also help to add further mechanistic insights into the rather synoptic proxy measure of the p factor itself (Fried et al., 2021), and to potentially assess shared and unique effects of the p factor vis-à-vis highly correlated measures of impulse control.”

    1. Author response:

      Reviewer #2 (Public Review):

      This is, to my knowledge, the most scalable method for phylogenetic placement that uses likelihoods. The tool has an interesting and innovative means of using gaps, which I haven’t seen before. In the validation the authors demonstrate superior performance to existing tools for taxonomic annotation (though there are questions about the setup of the validation as described below).

      The program is written in C with no library dependencies. This is great. However, I wasn’t able to try out the software because the linking failed on Debian 11, and the binary artifact made by the GitHub Actions pipeline was too recent for my GLIBC/kernel. It’d be nice to provide a binary for people stuck on older kernels (our cluster is still on Ubuntu 18.04). Also, would it be hard to publish your .zipped binaries as packages?

      We have provided a binary (and zipped package) that supports Ubuntu 18.04 in GitHub Actions (https://github.com/lpipes/tronko/actions/runs/9947708087). This should facilitate the use of our software on older systems like yours. We were not able to test the binary, however, since GitHub did not seem to find any nodes with Ubuntu 18.04. It is important to note that Ubuntu 18.04 is deprecated. The latest version of Ubuntu is 24.04, and we recommend that users upgrade to newer, supported versions of their operating systems to benefit from the latest security updates and features.

      Thank you for publishing your source files for the validation on zenodo. Please provide a script that would enable the user to rerun the analysis using those files, either on zenodo or on GitHub somewhere.

      We have posted all datasets as well as scripts to Zenodo.

      The validations need further attention as follows.

      The authors have chosen data sets that are not well aligned with real-world use cases for this software, and as a result, its applicability is difficult to determine. First, the leave-one-species-out experiment made use of COI gene sequences representing 253 species from the order Charadriiformes, which includes bird species such as gulls and terns. What is the reasoning for selecting this data set given the objective of demonstrating the utility of Tronko for large scale community profiling experiments which by their nature tend to include microorganisms as subjects? If the authors are interested in evaluating COI (or another gene target) as a marker for characterizing the composition of eukaryotic populations, is the heterogeneity and species distribution of bird species within order Charadriiformes comparable to what one would expect in populations of organisms that might actually be the target of a metagenomic analysis?

      Our reasoning for selecting Charadriiformes is that these species are often mistaken for one another and there is a heavy reliance on COI for their species identification. This choice allows us to demonstrate Tronko’s ability to handle difficult and realistic identification challenges. Additionally, we aimed to simulate a challenging dataset to effectively differentiate between the methods used, showcasing Tronko’s robustness. Including more distantly related bird species would have simplified the identification process, which would not serve our objective of demonstrating the utility of Tronko for distinguishing closely related species. It is also important to note that all methods used the exact same reference database, which is not always the case in other comparative studies of species assignment.

      Furthermore, while our study uses bird species, the principles and techniques applied are broadly applicable to other taxa, including microorganisms. By selecting a dataset known for its identification difficulties, we underscore Tronko’s potential utility in a wide range of taxonomic profiling scenarios, including those involving high heterogeneity and closely related species, such as in microbial communities.

      Second, it appears that experiments evaluating performance for 16S were limited to reclassification of sequencing data from mock communities described in two publications, Schirmer (2015, 49 bacteria and 10 archaea, all environmental), and Gohl (2016; 20 bacteria - this is the widely used commercial mock community from BEI, all well-known human pathogens or commensals). The authors performed a comparison with kraken2, metaphlan2, and MEGAN using both the default database for each as well as the same database used for Tronko (kudos for including the latter). This pair of experiments provide a reasonable high-level indication of Tronko’s performance relative to other tools, but the total number of organisms is very limited, and particularly limited with respect to the human microbiome. It is also important to point out that these mock communities are composed primarily of type strains and provide limited species-level heterogeneity. The performance of these classification tools on type strains may not be representative of what one would find in natural samples. Thus, the leave-one-individual-out and leave-one-species-out experiments would have been more useful and informative had they been applied to extended 16S data sets representing more ecologically realistic populations.

      We thank the reviewer for this comment and we have included both an additional bacterial mock community dataset from Lluch et al. (2015) and an additional leave-one-species-out experiment. We describe how this leave-one-species-out dataset was constructed in our previous response to ’Essential Revisions’ #1. We also added Figure 5, S5, and S6.
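
      For readers unfamiliar with the design, a leave-one-species-out evaluation can be sketched generically as follows. This is an illustration under stated assumptions, not the construction used in the manuscript: the record layout, the function name, and the toy species labels are hypothetical. Each species in turn is removed from the reference set, and its sequences become the queries, so a classifier can at best assign them to a related taxon at a higher rank.

```python
# Hypothetical sketch of a leave-one-species-out split over a labelled
# reference set. Records are (sequence_id, species) pairs; the data are toy.
from collections import defaultdict


def leave_one_species_out(records):
    """Yield (held_out_species, reference_ids, query_ids) for each species."""
    by_species = defaultdict(list)
    for seq_id, species in records:
        by_species[species].append(seq_id)
    for held_out in sorted(by_species):
        queries = list(by_species[held_out])
        # The reference keeps every sequence that is not from the held-out species.
        reference = [seq_id for seq_id, species in records if species != held_out]
        yield held_out, reference, queries


records = [
    ("seq1", "Larus argentatus"),
    ("seq2", "Larus argentatus"),
    ("seq3", "Larus fuscus"),
    ("seq4", "Sterna hirundo"),
]

splits = list(leave_one_species_out(records))
```

      A leave-one-individual-out split would follow the same pattern with single sequences held out instead of whole species, so that the correct species remains represented in the reference.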

      Finally, the authors should describe the composition of the databases used for classification as well as the strategy (and toolchain) used to select reference sequences. What databases were the reference sequences drawn from and by what criteria? Were the reference databases designed to reflect the composition of the mock communities (and if so, are they limited to species in those communities, or are additional related species included), or have the authors constructed general purpose reference databases? How many representatives of each species were included (on average), and were there efforts to represent a diversity of strains for each species? The methods should include a section detailing the construction of the data sets: as illustrated in this very study, the choice of reference database influences the quality of classification results, and the authors should explain the process and design considerations for database construction.

      To construct our databases, we used CRUX (Curd et al., 2018). This is described in the Methods section under ’Custom 16S and COI Tronko-build reference database construction’. All leave-out tests used down-sampled versions of these two databases. It is beyond the scope of the manuscript to discuss how CRUX works. Additionally, we added the following text:

      To compare the new method (Tronko) to previous methods, we constructed reference databases for COI and 16S for common amplicon primer sets using CRUX (See Methods for exact primers used).

    1. Author response:

      Reviewer #1 (Public Review):

      In this manuscript, Perez-Lopez et al. examine the function of the chemokine CCL28, which is expressed highly in mucosal tissues during infection, but its role during infection is poorly understood. They find that CCL28 promotes neutrophil accumulation in the intestines of mice infected with Salmonella and in the lungs of mice infected with Acinetobacter. They find that Ccl28-/- mice are highly susceptible to Salmonella infection, and highly resistant and protected from lethality following Acinetobacter infection. They find that neutrophils express the CCL28 receptors CCR3 and CCR10. CCR3 was pre-formed and intracellular and translocated to the cell surface following phagocytosis or inflammatory stimuli. They also find that CCL28 stimulation of CCR3 promoted neutrophil antimicrobial activity, ROS production, and NET formation, using a combination of primary mouse and human neutrophils for their studies. Overall, the authors' findings provide new and fundamental insight into the role of the CCL28:CCR3 chemokine:chemokine receptor pair in regulating neutrophil recruitment and effector function during infection with the intestinal pathogen Salmonella Typhimurium and the lung pathogen Acinetobacter baumannii.

      We would like to thank the reviewer for their positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      Reviewer #2 (Public Review):

      In this manuscript by Perez-Lopez et al., the authors investigate the role of the chemokine CCL28 during bacterial infections in mucosal tissues. This is a well-written study with exciting results. They show a role for CCL28 in promoting neutrophil accumulation to the guts of Salmonella-infected mice and to the lung of mice infected with Acinetobacter. Interestingly, the functional consequences of CCL28 deficiency differ between infections with the two different pathogens, with CCL28-deficiency increasing susceptibility to Salmonella, but increasing resistance to Acinetobacter. The underlying mechanistic reasons for this suggest roles for CCL28 in enhanced neutrophil antimicrobial activity, production of reactive oxygen species, and formation of extracellular traps. However, additional experiments are required to shore up these mechanisms, including addressing the role of other CCL28-dependent cell types and further characterization of neutrophils from CCL28-deficient mice.

      We would like to thank the reviewer for the positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      Reviewer #3 (Public Review):

      The manuscript by Perez-Lopez and colleagues uses a combination of in vivo studies using knockout mice and elegant in vitro studies to explore the role of the chemokine CCL28 during bacterial infection on mucosal surfaces. Using the streptomycin model of Salmonella Typhimurium (S. Tm) infection, the authors demonstrate that CCL28 is required for neutrophil influx in the intestinal mucosa to control pathogen burden both locally and systemically. Interestingly, CCL28 plays the opposite role in a model lung infection by Acinetobacter baumannii, as Ccl28-/- mice are protected from Acinetobacter infection. Authors suggest that the mechanism by which CCL28 plays a role during bacterial infection is due to its role in modulating neutrophil recruitment and function.

      We would like to thank the reviewer for the positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      The major strengths of the manuscript are:

      The novelty of the findings that are described in the manuscript. The role of the chemokine CCL28 in modulating neutrophil function and recruitment in mucosal surfaces is intriguing and novel.

      Authors use Ccl28-/- mice in their studies, a mouse strain that has only recently been available. To assess the impact of CCL28 on mucosal surfaces during pathogen-induced inflammation, the authors choose not one but two models of bacterial infection (S. Tm and A. baumannii). This approach increases the rigor and impact of the data presented.

      Authors combine the elegant in vivo studies using Ccl28 -/- with in vitro experiments that explore the mechanisms by which CCL28 affects neutrophil function.

      The major weaknesses of the manuscript in its present form are:

      Authors use different time points in the S. Tm model to characterize the influx of immune cells and pathology. They do not provide a clear justification as to why distinct time points were chosen for their analysis.

      The reviewer raises a good point. As discussed in the detailed response to the reviewers, we have now generated extensive results at different time points and included these in the revised manuscript.

      Authors provide puzzling data that Ccl28-/- mice have the same numbers of CCR3- and CCR10-expressing neutrophils in the mucosa during infection. It is unclear why the lack of CCL28 expression would not affect the recruitment of neutrophils that express the receptors (CCR3 and CCR10) for this chemokine. Thus, these results need to be better explained.

      As discussed in the detailed response to the reviewers, we clarified that Ccl28-/- mice have reduced numbers of neutrophils in the mucosa during infection, but the percentage of CCR3+ and CCR10+ neutrophils does not change. We provide additional discussion of this point in the manuscript and in the response to the reviewers.

      The in vitro studies focus primarily on characterizing how CCL28 affects the function of neutrophils in response to S. Tm infection. There is a lack of data to demonstrate whether Acinetobacter affects CCR3 and CCR10 expression and recruitment to the cell surface and whether CCL28 plays any role in this process.

      We agree and have performed additional studies with Acinetobacter and CCL28, which we discuss in greater detail below in the response to the reviewers.

    1. Author response:

      We appreciate the time of the reviewers and their detailed comments, which will help to improve the manuscript.

      We are sorry that at least one reviewer seems to have had the impression that we have conflated issues about gonadal and non-gonadal sex phenotypes. This referee suggests that we should use Sharpe et al. (2023) to develop our concepts. However, what is discussed in Sharpe et al. was already the guiding principle for our study (although we were not aware of this paper beforehand). In our paper, we introduce the gonadal binary sex (which is self-evidently also the basis for creating the dataset in the first place, because we needed to separate males from females) and then go on to the question of (adult) sex phenotypes for the rest of the paper. The gonadal data are included only as a comparison for contrasting the patterns in the non-gonadal tissues.

      Our study presents the largest systematic dataset so far on the evolution of sex-biased gene expression. It is also the first that explores the patterns of individual variation in sex-biased gene expression, and the SBI is an entirely new procedure to directly visualize these variance patterns in an intuitive way (note that the relative position of the distributions along the X-axis is indeed not relevant). The results are actually quite nuanced (e.g. the rather dynamic changes seen in mouse kidney and liver comparisons) and certainly go beyond what would have been predictable based on the current literature.

      Also, we should like to point out that our study contradicts recent conclusions, published in high-profile journals, which had suggested that a substantial set of sex-biased genes has conserved functions between humans and mice and that mice can therefore be informative for gender-specific medicine studies. Our data suggest that only a very small set of genes are conserved in their sex-biased expression. These are epigenetic regulator genes and it will therefore be interesting in the future to focus on their roles in generating the differences between sexual phenotypes in given species.

      We will be happy to use the referee comments to clarify all of these points in a revised version. But we do not think that our "evidence is incomplete" and that there are several "overstated key conclusions". We have used all canonical statistical analyses that are typically used in papers of sex-biased gene expression, as acknowledged by reviewers 1 and 2. The additional statistical analyses that are requested are not within the scope of such papers, but could be subject to separate general studies, independent of the sex-bias analysis (e.g. the role of highly expressed genes versus low expressed genes, or the analysis of the fraction of neutrally evolving loci).

      Finally, it is unclear why the overall rating of the paper is at the lowest possible category ("useful study"), given that it adds a substantial amount of data and new insights into the exploration of the non-binary nature of sexual phenotypes.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Revisions Round 1

      Reviewer #1

      We thank the reviewer for their careful reading of our manuscript and have taken all of their grammatical corrections into account.

      Reviewer #2 (Public Review): 

      Weaknesses: 

      The paper contains multiple instances of non-scientific language, as indicated below. It would also benefit from additional details on the cryo-EM structure determination in the Methods and inclusion of commonly accepted requirements for cryo-EM structures, like examples of 2D class averages, raw micrographs, and FSC curves (between half-maps as well as between rigid-body fitted (or refined) atomic models of the different polymorphs and their corresponding maps). In addition, cryo-EM maps for the control experiments F1 and F2 should be presented in Figure 9.

      We have tried to correct the non-scientific language and have included the suggested data on the cryo-EM analyses, including new Figures 11-17. We did not collect data on the sample used for the seeds in the cross-seeding experiments because we had already confirmed in multiple datasets that the conditions in F1 and F2 reproducibly produce fibrils of Type 1 and Type 3, respectively. We have now analyzed cryo-EM data for 6 more samples at pH 7.0 and found that several kinds of polymorphs (Types 1A, 1M, 2A, 2B and 5) are accessible at this pH; however, the Type 3 polymorphs are not formed at pH 7.0 under the conditions that we used for aggregation.

      Reviewer #2 (Recommendations For The Authors): 

      - remove unscientific language: "it seems that there are about as many unique atomic-resolution structures of these aggregates as there are publications describing them"   

      We have rephrased this sentence.

      - for same reason, remove "Obviously, " 

      Done

      - What does this mean? “polymorph-unspecific” 

      Rephrased as non-polymorph-specific

      - What does this mean? "shallow amyloid energy hypersurface"  

      By “shallow hypersurface” we mean that the minimum of the multi-dimensional function that describes the energy of the amyloid is not so deep that subtle changes to the environment will not favor another fold/energy minimum. We have left the sentence because while it may not be perfect, it is concise and seems to get the point across.

      - "The results also confirm the possibility of producing disease-relevant structure in vitro." -> This is incorrect as no disease-relevant structure was replicated in this work. Use another word like “suggest”.

      We have changed to “suggest” as suggested.

      - Remove "historically" 

      Done

      - Rephrase “It has long been understood that all amyloids contain a common structural scaffold” 

      Changed to “It has long been established that all amyloids contain a common structural scaffold..” 

      - "Amyloid polymorphs whose differences lie in both their tertiary structure (the arrangement of the beta-strands) and the quaternary structure (protofilament-protofilament assembly) have been found to display distinct biological activities [8]" -> I don't think this is true, different biological activities of amyloids have never been linked to their distinct structures.

      We have added 5 new references (8-12) to support this sentence.

      - Reference 10 is a comment on reference 9; it should be removed. Instead, as for alpha-synuclein, all papers describing the tau structures should be included.  

      We have removed the reference, but feel that the addition of all Tau structure references is not merited in this manuscript since we are not comparing them.

      - Rephrase: "is not always 100% faithful"

      Removed “100%”

      - What is pseudo-C2 symmetry? Do the authors mean pseudo 2_1 symmetry (ie a 2-start helical symmetry)?

      Thank you for pointing this out. We did indeed mean pseudo-2_1 helical symmetry.

      - Re-phrase: "alpha-Syn's chameleon-like behavior" 

      We have removed this phrase.

      - "In the case of alpha-Syn, the secondary nucleation mechanism is based on the interaction of the positively charged N-terminal region of monomeric alpha-Syn and the disordered, negatively charged C-terminal region of the alpha-Syn amyloid fibrils [54]" -> I would say the mechanisms of secondary nucleation are not that well understood yet, so one may want to tune this down a bit. 

      We have changed this to “mechanism has been proposed to be”

      - The paragraphs describing experiments by others are better suited for a Discussion rather than a Results section. Perhaps re-organize this part? 

      We have left the text intact as we are using a Results and Discussion format.

      - A lot of information about Image processing seems to be missing: what steps were performed after initial model generation? 

      We have added more details in the methods section on the EM data processing and model analysis.

      - Figure 1: Where is Type 4 on the pH scale?

      We have adjusted the Fig 1 legend to clarify that the pH scale is only applicable to the structures presented in this manuscript.

      - Figure 2: This might be better incorporated as a subpanel of Figure 1.

      We agree that this figure is somewhat of a loner on its own and we only added it in order to avoid confusion with the somewhat inconsistent naming scheme used for the Type 1B structure. However, we prefer to leave it as a separate figure so that it does not dilute the impact of Figure 1.

      - Figure 3: What is the extra density at the bottom of Type 3B from pH 5.8 samples 1 and 2. pH 5.8 + 50mM NaCl (but not pH 5.8 + 100 mM NaCl)? Could this be an indication of a local minimum and the pH 5.8 + 100 mM NaCl structure is correct? Or is this a real difference between 0/50mM NaCl and 100 mM NaCl? 

      We did not see the extra density to which the reviewer is referring; however, the images used in this panel are based on the output of 3D classification, which is more likely to produce artifacts than a 3D refinement. With this in mind, we did not see any significant differences in the refined structures and therefore only deposited the better-quality map and model for each of the polymorph types.

      - Figure 3: To what extent is Type 3B of pH 6.5 still a mixture of different types? The density looks poor. In general, in the absence of more details about the cryo-EM maps, it is hard to assess the quality of the structures presented.  

      In order to improve the quality of the images in this panel, a more complete separation of the particles from each polymorph was achieved via the filament subset selection tool in RELION 5. In each case, an unbiased initial model could be created from the 2D classes via the relion_helix_inimodel2D program, further supporting the coexistence of 4 polymorphs in the pH 6.5 sample. The particles were individually refined to produce the respective maps that are now used in this figure.

      - Many references are incorrect, containing "Preprint at (20xx)" statements.

      This has been corrected.  

      Reviewer #3 (Public Review): 

      Weaknesses: 

      (1) The authors reveal that both Type 1 monofilament fibril polymorph (reminiscent of JOS-like polymorph) and Type 5 polymorph (akin to tissue-amplified-like polymorph) can both form under the same condition. Additionally, this condition also fosters the formation of flat ribbon-like fibril across different batches. Notably, at pH 5.8, variations in experimental groups yield disparate abundance ratios between polymorph 3B and 3C, indicating a degree of instability in fibrillar formation. The variability would potentially pose challenges for replicability in subsequent research. In light of these situations, I propose the following recommendations: 

      (a) An explicit elucidation of the factors contributing to these divergent outcomes under similar experimental conditions is warranted. This should include an exploration of whether variations in purified protein batches are contributing factors to the observed heterogeneity.

      We are in complete agreement that understanding the factors that lead to polymorph variability is of utmost importance (and was the impetus for the manuscript itself). However, the number of variables to explore is overwhelming and we will continue to investigate this in our future research. Regarding the variability between batches of purified protein, we also think that this could be a factor in the polymorph variability observed for otherwise “identical” aggregation conditions, particularly at pH 7 where the largest variety of polymorphs has been observed. However, even variation between identical replicates (samples created from the same protein solution and simply aggregated simultaneously in separate tubes) can lead to different outcomes (see datasets 15 and 16 in the revised Table 1), suggesting that there are stochastic processes that can determine the outcome of an individual aggregation experiment. While our data still indicate that Type 1, 2 and 3 polymorphs are strongly selected by pH, the selection between interface variants 3B vs. 3C and 2A vs. 2B might also be affected by protein purity. Our standard purification protocol produces a single band by Coomassie-stained SDS-PAGE; however, minor truncations and other impurities below a few percent would go undetected and, given the proposed roles of the N and C-termini in secondary nucleation, could have a large effect on polymorph selection and seeding. In line with the reviewer’s comments we now include a batch number for each EM dataset. While no new conclusions can be drawn from the inclusion of this additional data, we feel that it is important to acknowledge the possible role of batch-to-batch variability. 

      (b) To enhance the robustness of the conclusions, additional replicates of the experiments under the same condition should be conducted, ideally a minimum of three times.  

      The pH 5.8 conditions that yield Type 3 fibrils have already been repeated several times in the original manuscript. Since the pH 7.4 conditions produce the most common a-Syn polymorph (Type 1A) and were produced twice in this manuscript (once as an unseeded and once as a cross-seeded fibrilization), we decided to focus on the intermediate condition where the most variability had been seen (pH 7.0). The revised Table 1 now has 6 new datasets (11-16) representing 6 independent aggregations at pH 7.0 starting from two different protein purification batches. The result is that we now produce the Type 2A/B polymorphs in three samples, and in two of these samples we once again observed the Type 1M polymorph. The other samples produced Type 1A or non-twisted fibrils.

      (c) Further investigation into whether different polymorphs formed under the same buffer condition could lead to distinct toxicological and pathology effects would be a valuable addition to the study.  

      The correlation of toxicity with structure would in principle be interesting. However the Type 1 and Type 3 polymorphs formed at pH 5.8 and 7.4 are not likely to be biologically relevant. The pH 7 polymorphs (Type 5 and 1M) would be more interesting because they form under the same conditions and might be related to some disease relevant structures. Still, it is rare that a single polymorph appears at 7.0 (the Type 5 represented only 10-20% of the fibrils in the sample and the Type 1M also had unidentified double-filament fibrils in the sample). We plan to pursue this line of research and hope to include it in a future publication.

      (2) The cross-seeding study presented in the manuscript demonstrates the pivotal role of pH conditions in dictating conformation. However, an intriguing aspect that emerges is the potential role of seed concentration in determining the resultant product structure. This raises a critical question: at what specific seed concentration does the determining factor for polymorph selection shift from pH condition to seed concentration? A methodological robust approach to address this should be conducted through a series of experiments across a range of seed concentrations. Such an approach could delineate a clear boundary at which seed concentration begins to predominantly dictate the conformation, as opposed to pH conditions. Incorporating this aspect into the study would not only clarify the interplay between seed concentration and pH conditions, but also add a fascinating dimension to the understanding of polymorph selection mechanisms.

      A more complete analysis of the mechanisms of aggregation, including the effect of seed concentration and the resulting polymorph specificity of the process, are all very important for our understanding of the aggregation pathways of alpha-synuclein and are currently the topic of ongoing investigations in our lab.

      Furthermore, the study prompts additional queries regarding the behavior of cross-seeding production under the same pH conditions when employing seeds of distinct conformation. Evidence from various studies, such as those involving E46K and G51D cross-seeding, suggests that seed structure plays a crucial role in dictating polymorph selection. A key question is whether these products consistently mirror the structure of their respective seeds. 

      We thank the reviewer for reminding us to cite these studies as a clear example of polymorph selection by cross-seeding. Unfortunately, it is not 100% clear from the G51D cross seeding manuscript (https://doi.org/10.1038/s41467-021-26433-2) what conditions were used in the cross-seeding since different conditions were used for the seedless wild-type and mutant aggregations… however it appears that the wildtype without seeds was Tris pH 7.5 (although at 37C the pH could have dropped to 7ish) and the cross-seeded wild-type was in Phosphate buffer at pH 7.0. In the E46K cross-seeding manuscript, it appears that pH 7.5 Tris was used for all fibrilizations (https://doi.org/10.1073/pnas.2012435118).  In any event, both results point to the fact that at pH 7.0-7.5 under low-seed conditions (0.5%) the Type 4 polymorph can propagate in a seed specific manner.

      (3) In the Results section of "The buffer environment can dictate polymorph during seeded nucleation", the authors reference previous cell biological and biochemical assays to support the polymorph-specific seeding of MSA and PD patients under the same buffer conditions. This discussion is juxtaposed with recent research that compares the in vivo biological activities of hPFF, ampLB as well as LB, particularly in terms of seeding activity and pathology. Notably, this research suggests that ampLB, rather than hPFF, can accurately model the key aspects of Lewy Body Diseases (LBD) (refer to: https://doi.org/10.1038/s41467-023-42705-5). The critical issue here is the need to reconcile the phenomena observed in vitro with those in in-vivo or in-cell models. Given the low seed concentration reported in these studies, it is imperative for the authors to provide a more detailed explanation as to why the possible similar conformation could lead to divergent pathologies, including differences in cell-type preference and seeding capability.  

      We thank the reviewer for bringing this recent report to our attention. The findings that ampLB and hPFF have different PK digestion patterns and that only the former is able to model key aspects of Lewy Body disease are in support of the seed-specific nature of some types of alpha-synuclein aggregation. We have added this to the discussion regarding the significant role that seed type and seed conditions likely play in polymorph selection.

      (4) In the Method section of "Image processing", the authors describe the helical reconstruction procedure, without mentioning much detail about the 3D reconstruction and refinement process. For the benefit of reproducibility and to facilitate a deeper understanding among readers, the authors should enrich this part to include more comprehensive information, akin to the level of detail found in similar studies (refer to: https://doi.org/10.1038/nature23002).

      As also suggested by reviewer #2, we have now added more comprehensive information on the 3D reconstruction and refinement process.

      (5) The abbreviation of amino acids should be unified. In the Results section "On the structural heterogeneity of Type 1 polymorphs", the amino acids are denoted using three-letter abbreviation. Conversely, in the same section under "On the structural heterogeneity of Type 2 and 3 structures", amino acids are abbreviated using the one-letter format. For clarity and consistency, it is essential that a standardized format for amino acid abbreviations be adopted throughout the manuscript.

      That makes perfect sense and has been corrected.

      Reviewing Editor: 

      After discussion among the reviewers, it was decided that point 2 in Reviewer #3's Public Review (about the experiments with different concentrations of seeds) would probably lie outside the scope of a reasonable revision for this work. 

      We agree as stated above and will continue to work on this important point.

      Revisions Round 2

      Reviewer #2 (Public Review): 

      I do worry that the FSC values of model-vs-map appear to be higher than expected from the corresponding FSCs between the half-maps (e.g. see Fig 13). The implication of this observation is that the atomic models may have been overfitted in the maps, which would have led to a deterioration of their geometry. A table with rmsd on bond lengths, angles, etc would probably show this. In addition, to check for overfitting, the atomic model for each data set could be refined in one of the half-maps, and then that same model could be used to calculate 2 FSC model-vs-map curves: one against the half-map it was refined in and one against the other half-map. Deviations between these two curves are an indication of overfitting. 

      Thank you for the recommendations for model validation.  We have added the suggested statistics to Table 2 and performed the suggested model fitting to one of the half-maps and plotted 3 FSC model-vs-map curves: one for each half-map versus the model fit against only one half map and one for the model fit against the full map. We feel that the degree of overfitting is reasonable and does not  significantly impact the quality of the models. 
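
      The half-map overfitting check described above can be sketched numerically. The block below is a minimal illustration, not the authors' actual procedure (which would normally rely on the cryo-EM package's own tools): it assumes the model has already been converted to a density map on the same cubic grid as the half-maps, and the function names `fsc_curve` and `overfitting_gap` are hypothetical.

```python
import numpy as np

def fsc_curve(vol_a, vol_b):
    """Fourier shell correlation between two cubic volumes of equal size.

    Returns one value per integer-radius shell, from the origin up to
    (but not including) the Nyquist shell.
    """
    assert vol_a.shape == vol_b.shape and len(set(vol_a.shape)) == 1
    n = vol_a.shape[0]
    fa = np.fft.fftshift(np.fft.fftn(vol_a))
    fb = np.fft.fftshift(np.fft.fftn(vol_b))
    # Shell index = rounded distance of each voxel from the Fourier origin.
    grid = np.indices(vol_a.shape) - n // 2
    radius = np.sqrt((grid ** 2).sum(axis=0)).round().astype(int)
    fsc = np.zeros(n // 2)
    for r in range(n // 2):
        shell = radius == r
        num = np.real(np.sum(fa[shell] * np.conj(fb[shell])))
        den = np.sqrt(np.sum(np.abs(fa[shell]) ** 2) *
                      np.sum(np.abs(fb[shell]) ** 2))
        fsc[r] = num / den if den > 0 else 0.0
    return fsc

def overfitting_gap(model_map, half_map_work, half_map_free):
    """FSC(model, half-map used for refinement) minus FSC(model, other half-map).

    Values well above zero at high resolution suggest the model has been
    fitted to noise in the half-map it was refined against.
    """
    return fsc_curve(model_map, half_map_work) - fsc_curve(model_map, half_map_free)
```

      A gap that stays close to zero across shells is consistent with little overfitting; what counts as acceptable remains a judgment call.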

      In addition, the sudden drop in the FSC curves in Figure 16 shows that something unexpected has happened to this refinement. Are the authors sure that only the procedures outlined in the Methods were used to create these curves? The unexpected nature of the FSC curve for this type (2A) raises doubts about the correctness of the reconstruction. 

      We thank the reviewer for the attention to detail. We should have caught this mistake. It turns out that in the last round of 3D refinement, the two half-maps became shifted with respect to each other in the z direction. We realigned the two maps using Chimera and then re-ran the postprocessing. The new maps have been deposited in EMD-50850. This mistake motivated us to inspect all of the maps and we found the same problem had occurred in the Type 3B maps. This was not noticed by the reviewer because we accidentally plotted the FSC curves from postprocessing from one refinement round before the one deposited in the EMD. We performed the same half-map shifting procedure for the Type 3B data and performed a final round of real-space refinement to produce new maps and models that have been deposited as EMD-50888 and 9FYP (superseding the previous entries).
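
      A relative shift between two half-maps of the kind described here can, in principle, be detected by cross-correlation. The following is only a sketch under stated assumptions: the realignment in this response was done in Chimera, not with this code; the maps are assumed to be NumPy arrays on the same grid with z as axis 0; only whole-voxel, periodic shifts are considered; and `estimate_shift` is a hypothetical helper name.

```python
import numpy as np

def estimate_shift(half1, half2):
    """Integer voxel shift s such that half1 is approximately np.roll(half2, s).

    Uses FFT-based circular cross-correlation and returns a signed shift
    per axis (axis 0 is assumed to be z).
    """
    cc = np.fft.ifftn(np.fft.fftn(half1) * np.conj(np.fft.fftn(half2))).real
    peak = np.unravel_index(np.argmax(cc), cc.shape)
    # Convert wrapped indices to signed shifts, e.g. n-2 becomes -2.
    return tuple(int(i) if i <= s // 2 else int(i) - s
                 for i, s in zip(peak, cc.shape))

def realign(half1, half2):
    """Roll half2 so that it overlays half1, using the estimated shift."""
    return np.roll(half2, estimate_shift(half1, half2), axis=(0, 1, 2))
```

      For helical filaments the density repeats along the axis, so the correlation peak may be ambiguous by multiples of the helical rise; a non-zero result is a prompt to inspect the maps, not a definitive measurement.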

      Reviewer #3 (Public Review): 

      There are two minor points I recommend the authors to address: 

      (1) In the response to Weakness 1, point (3), the authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      We aim to be as transparent as possible and this information was included in the main text however we did not label the percentage of Type 5 fibrils in Figure 4 because that would make the other percentages ambiguous.  The percentages in Figure 4 represent the ratio of helical segments used for each type of refined structure in the dataset (always adding up to 100%), not the percent of all fibrils in the dataset.  That is, there are sometimes untwisted or unidentifiable fibrils in datasets and these were not accounted for in the listed percentages. We have added a sentence to the Figure 4 legend to explain to what the percentages refer.
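
      The distinction between the two kinds of percentage can be made concrete with a small sketch. The segment counts used in the usage note are invented purely for illustration and are not taken from any dataset in the manuscript; `refined_percentages` and `percent_of_all` are hypothetical names.

```python
def refined_percentages(segments_per_type):
    """Share of helical segments per refined polymorph; sums to 100."""
    total = sum(segments_per_type.values())
    return {name: 100.0 * n / total for name, n in segments_per_type.items()}

def percent_of_all(segments_per_type, n_unclassified):
    """Share of each polymorph among all segments, including untwisted
    or unidentifiable fibrils that were never refined."""
    total = sum(segments_per_type.values()) + n_unclassified
    return {name: 100.0 * n / total for name, n in segments_per_type.items()}
```

      With made-up counts of 300 and 100 segments for two refined types plus 600 unclassified segments, the first function gives 75% and 25%, whereas the second gives 30% and 10%, which is why labeling one kind of percentage in a figure that shows the other would be ambiguous.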

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Thank you for reminding us to add the scale bars. This is now done for the 2D classes in Figures 11-17.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): 

      A critical look at the maps and models of the various structures at this stage may prevent the authors from entering suboptimal structures into the databases.  

      We agree. Thank you for suggesting this.

      Reviewer #3 (Recommendations For The Authors): 

      The authors have responded adequately to these critiques in the revised version of the manuscript. There are two minor points. 

      (1) The authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Answered in public comments

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Little is known about the local circuit mechanisms in the preoptic area (POA) that regulate body temperature. This carefully executed study investigates the role of GABAergic interneurons in the POA that express neurotensin (NTS). The principal finding is that GABA-release from these cells inhibits neighboring neurons, including warm-activated PACAP neurons, thereby promoting hyperthermia, whereas NTS released from these cells has the opposite effect, causing a delayed activation and hypothermia. This is shown through an elegant series of experiments that include slice recordings alongside matched in vivo functional manipulations. The roles of the two neurotransmitters are distinguished using a cell-type-specific knockout of Vgat as well as pharmacology to block GABA and NTS receptors. Overall, this is an excellent study that is noteworthy for revealing local circuit mechanisms in the POA that control body temperature and also for highlighting how amino acid neurotransmitters and neuropeptides released from the same cell can have opposing physiologic effects. I have only minor suggestions for revision.

      Reviewer #2 (Public Review):

      Summary:

      The study has demonstrated how two neurotransmitters and neuromodulators from the same neurons can be regulated and utilized in thermoregulation.

      The study utilized electrophysiological methods to examine the characteristics and thermoregulation of Neurotensin (Nts)-expressing neurons in the medial preoptic area (MPO). It was discovered that GABA and Nts may be co-released by neurons in MPO when communicating with their target neurons.

      Strengths:

      The study has leveraged optogenetic, chemogenetic, knockout, and pharmacological inhibitors to investigate the release process of Nts and GABA in controlling body temperature.

      The findings are relevant to those interested in the various functions of specific neuron populations and their distinct regulatory mechanisms on neurotransmitter/neuromodulator activities.

      Weaknesses:

      Key points for consideration include:

      (1) The co-release of GABA and Nts is primarily inferred rather than directly proven. Providing more direct evidence for the release of GABA and the co-release of GABA and Nts would strengthen the argument. Further in vitro analysis could strengthen the conclusion regarding this co-releasing process.

      Measurement of Nts concentrations in various brain regions during thermoregulatory responses is part of a future study.

      (2) The differences between optogenetic and chemogenetic methods were not thoroughly investigated. A comparison of in vitro results and direct observation of release patterns could clarify the mechanisms of GABA release alone or in conjunction with Nts under different stimulation techniques.

      A comparison of chemogenetic and optogenetic stimulation methods is not within the scope of this study.

      (3) Neuronal transcripts were mainly identified through PCR, and alternative methods like single-cell sequencing could be explored.

      Single cell transcriptomics of preoptic neurotensinergic neurons will be part of a different study.

      (4) In Figure 6, the impact of GABA released from Nts neurons in MPO on CBT regulation appears to vary with ambient temperatures, requiring a more detailed explanation for better comprehension.

      The different possible roles of GABA in different thermoregulatory circumstances are discussed on lines 555-581.

      (5) The model should emphasize the key findings of the study.

      The model is presented in Fig 8.

      Reviewer #3 (Public Review):

      Summary:

      Understanding the central neural circuits regulating body temperature is critical for improving health outcomes in many disease conditions and in combating heat stress in an ever-warming environment. The authors present important and detailed new data that characterizes a specific population of POA neurons with a relationship to thermoregulation. The new insights provided in this manuscript are exactly what is needed to assemble a neural network model of the central thermoregulatory circuitry that will contribute significantly to our understanding of regulating the critical homeostatic variable of body temperature. These experiments were conducted with the expertise of an investigator with career-long experience in intracellular recordings from POA neurons. They were interpreted conservatively in the appropriate context of current literature.

      The Introduction begins with "Homeotherms, including mammals, maintain core body temperature (CBT) within a narrow range", but this ignores the frequent hypothermic episodes of torpor that mice undergo triggered by cold exposure. Although the author does mention torpor briefly in the Discussion, since these experiments were carried out exclusively in mice, greater consideration (albeit speculative) of the potential for a role of MPO Nts neurons in torpor initiation or recovery is warranted. This is especially the case since some 'torpor neurons' have been characterized as PACAP-expressing and a population of PACAP neurons represent the target of MPO Nts neurons.

      Additional discussion of a possible role of neurotensinergic neurons in the initiation or recovery from torpor is included (lines 593-597).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary:

      The authors profile gene expression, chromatin accessibility, and chromosomal architecture (by Hi-C) in activated CD4 T cells and use this information to link non-coding variants associated with autoimmune diseases with putative target genes. They find over 1000 genes physically linked with autoimmune disease loci in these cells, many of which are upregulated upon T cell activation. Focusing on IL2, they dissect the regulatory architecture of this locus, including the allelic effects of GWAS variants. They also intersect their variant-to-gene lists with data from CRISPR screens for genes involved in CD4 T cell activation and expression of inflammatory genes, finding enrichments for regulators. Finally, they showed that pharmacological inhibition of some of these genes impacts T-cell activation. 

      This is a solid study that follows a well-established canvas for variant-to-gene prioritisation using 3D genomics, applying it to activated T cells. The authors go some way in validating the lists of candidate genes, as well as exploring the regulatory architecture of a candidate GWAS locus. Jointly with data from previous studies performing variant-to-gene assignment in activated CD4 T cells (and other immune cells), this work provides a useful additional resource for interpreting autoimmune disease-associated genetic variation. 

      Suggestions for improvement:

      Autoimmune disease variants were already linked with genes in CD28-stimulated CD4 T cells using chromosome conformation capture, specifically Promoter CHi-C and the COGS pipeline (Javierre et al., Cell 2016; Burren et al., Genome Biol 2017; Yang et al., Nat Comms 2020). The authors cite these papers and present a comparative analysis of their variant-to-gene assignments (in addition to scRNA-seq eQTL-based assignments). Furthermore, they find that the Burren analysis yields a higher enrichment for gold standard genes. 

      The obvious question that the authors don't venture into is why the results are quite different. In principle, this could be due to the differences between: 

      (a) the cell stimulation procedure 

      (b) the GWAS datasets used 

      (c)  the types of assay (Hi-C vs Capture Hi-C) 

      (d) approaches for defining gene-linked regions (loops vs neighbourhoods) 

      (e) how the GWAS signals at gene-linked regions are aggregated (e.g., the flavours of COGS in Javierre and Burren vs the authors' approach)

      Re (a), I'm not sure the authors make it explicitly clear in the main text that the Capture Hi-C-based studies also use *stimulated* CD4 T cells, particularly in the section "Comparative predictive power...". So the cells used are pretty much the same, and the differences likely arise from points (b) to (e).

      It would be useful for the community to understand more clearly what is driving these differences, ideally with some added data. Could the authors, for example, take the PCHi-C data from Javierre/Burren and use their GWAS data and variant-to-gene assignment algorithms? 

      We greatly appreciate the referee’s expert assessment of our work and its value to the field, and we are glad that the referee was enthused by our comparison of the predictive power of the various V2G approaches. A point not emphasized enough in the original version of the manuscript is that we actually did harmonize the various datasets in the way the referee suggests for the precision/recall analysis. We took the contact maps presented from each paper, mapped genes using the same set of GWAS SNPs, and defined all gene-linked regions using our loop calling approach. This has been clarified in the revised version of the manuscript. We have now included a more thoughtful discussion of the possible sources of discrepancy between the different studies included in the comparison, and our thoughts on the potential sources raised by the referee are outlined below:

      (a) The modes of stimulation used are similar between studies, but timepoints and donors did vary, and ours was the only study that sorted naïve CD4+ T cells before stimulation. These aspects could represent a source of variability. 

      (b) The GWAS is not a source of variability because we re-ran the raw data from all the orthogonal studies through our V2G pipeline using the same GWAS as in the current manuscript. 

      (c) The use of HiC vs. Capture HiC is a likely source of variability. The Capture-HiC datasets included in our comparison are lower resolution (i.e. HindIII) but focus higher sequencing depth at promoters compared to our HiC datasets – i.e., Capture-HiC may mis-call loops to the wrong promoters due to lower resolution as we have shown in our previous study [Su, Human Genetics, 2021], and will miss distal SNP interactions at promoters not included in the capture set. While HiC is unbiased in this regard, HiC will fail to call some SNP-promoter loops called by CaptureHiC because the sequencing power is not specifically focused at promoters. 

      (d) For studies using neighborhood approaches, we re-ran the raw data through our loop calling algorithm to connect distal SNP to gene promoters, and regarding (e) above, we ran the raw data through our V2G pipeline to allow a better comparison.

      In addition, given that the authors use Hi-C, a popular method for V2G prioritisation for this type of data is currently ABC (Nasser et al, Nature 2021). Could the authors provide a comparative analysis with respect to the V2G assignments in the paper and, if they see it appropriate, also run ABC-based GWAS integration on their own Hi-C data?

      This is an excellent suggestion, which we have followed in the revised version of our manuscript. It should be noted (and we do so in the text of the revision) that there is an important caveat to bringing in the ABC model. Chromosome conformation-based approaches are biologically constrained (i.e., informed) by the natural structure of chromatin in the nucleus that controls how gene transcription is regulated in cis, and it does so in a way that brings value to GWAS data. However, the ABC model further constrains the input data by imposing non-biological filters that allow the algorithm to be applied, but impose artifactual limitations that may negatively impact interpretation and discovery. In addition to filtering out pseudogenes, bidirectional RNA, antisense RNAs, and small RNAs, the ABC model gene set eliminates genes ubiquitously expressed across tissues (based on the assumption that these genes are driven primarily by elements adjacent to their promoters) and only allows annotation of one promoter per gene, even though the median number of promoters per gene in the human genome is three. In contrast, our chromatin-based V2G removes pseudogenes, but includes lincRNA and small RNAs, and includes all alternative transcription start sites annotated by gencode. 

      To apply the ABC GWAS gene nomination model to our CD4+ T cell chromatin-based V2G data, we used our ATAC-seq data and publicly available CD4+ T cell H3K27ac ChIP-seq data as input, and integrated this with GWAS and the average ENCODE-derived HiC dataset from the original ABC paper. The activity-by-contact model nominated 650 genes, compared to 1836 genes when using our cell type-matched HiC data and analysis pipeline. Only 357 of these genes were nominated by both approaches; 1479 genes nominated by our approach were not nominated by ABC, while 293 genes not implicated by our approach were newly implicated by ABC. To determine how the ABC-constrained approach performs against the HIEI gold standard set, we subjected all datasets used for the comparison depicted in the new Figure 5D to the same promoter filter used by the ABC model as part of the precision-recall re-analysis. First, we found that applying the restricted ABC model promoter annotation to all datasets did not have a large effect on recall; however, the precision of several of the datasets was affected. For example, using the restricted promoter set reduced the precision of our (Pahl) V2G approach and inflated the precision of the nearest gene to SNP metric. Second, the new precision-recall analysis shows that the ABC score-based approach is only half as sensitive at predicting HIEI genes as the chromatin-based V2G approaches. This indicates that constraining GWAS data with cell type- and state-specific 3D chromatin-based data brings more GWAS target gene predictive power than application of the multi-tissue-averaged HiC used by the ABC model. We thank the reviewer for helpful suggestions that have improved the quality of our study.
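
      The overlap figures quoted in this response are internally consistent, as a short check shows. This is only an arithmetic sketch using the three counts stated above (1836, 650 and 357); the helper name `overlap_breakdown` is hypothetical and no gene lists are reproduced here.

```python
def overlap_breakdown(n_a, n_b, n_shared):
    """Split two nomination sets into A-only, B-only and shared counts."""
    if n_shared > min(n_a, n_b):
        raise ValueError("shared count cannot exceed either set size")
    return {
        "only_a": n_a - n_shared,
        "only_b": n_b - n_shared,
        "shared": n_shared,
        "union": n_a + n_b - n_shared,
    }

# Counts as stated in the response: chromatin-based V2G vs. ABC.
breakdown = overlap_breakdown(n_a=1836, n_b=650, n_shared=357)
print(breakdown)
```

      The two "only" counts reproduce the 1479 and 293 genes reported above; the size of the union follows from the same three numbers.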

      Reviewer #2 (Public Review): 

      Summary:

      There is significant interest in characterizing the mechanisms by which genetic mutations linked to autoimmunity perturb immune processes. Pahl et al. collect information on dynamic accessible regions, genes, and 3D contacts in primary CD4+ T cell samples that have been stimulated ex vivo. The study includes a variety of analyses characterizing these dynamic changes. With TF footprinting they propose factors linked to active regulatory elements. They compare the performance of their variant mapping pipeline that uses their data versus existing datasets. Most compelling there was a deep dive into additional study of regulatory elements nearby the IL2 gene. Finally, they perform a pharmacological screen targeting several genes they suggest are involved in T cell proliferation. 

      Strengths:

      The work done characterizing elements at the IL2 locus is impressive. 

      Weaknesses:

      Missing critical context to evaluate claims. There are extensive studies performed on resting and activated immune cell states (CD4+ T cells and other cell types) and some at multiple time points or concentrations of stimuli that collect ATAC-seq and/or RNA-seq that have been ignored by this study. How do conclusions from previous studies compare to what the authors conclude here? It is impossible to evaluate the claims without this additional context. These are a few studies I am familiar with (the authors should perform a more comprehensive search to be sure they're not ignoring existing observations) that would be important to compare/contrast conclusions:

      - Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424-431 (2018). 

      - Calderon, D., Nguyen, M.L.T., Mezger, A. et al. Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat Genet 51, 1494-1505 (2019). 

      - Gate, R.E., Cheng, C.S., Aiden, A.P. et al. Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat Genet 50, 1140-1150 (2018).

      - Glinos, D.A., Soskic, B., Williams, C. et al. Genomic profiling of T-cell activation suggests increased sensitivity of memory T cells to CD28 costimulation. Genes Immun 21, 390-408 (2020).

      - Gutierrez-Arcelus, M., Baglaenko, Y., Arora, J. et al. Allele-specific expression changes dynamically during T cell activation in HLA and other autoimmune loci. Nat Genet 52, 247-253 (2020). 

      - Kim-Hellmuth, S. et al. Genetic regulatory effects modified by immune activation contribute to autoimmune disease associations. Nat. Commun. 8, 266 (2017).

      - Ye, C. J. et al. Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665 (2014). 

      - As a general point, I appreciate it when each claim includes a corresponding effect size and p-value, which helps me evaluate the strength of significance of supporting evidence. 

      We greatly appreciate the referee’s expert assessment of our work and emphasis on the value of our functional follow-up studies. Our precision-recall analyses were not meant to represent an exhaustive comparison of all prior GWAS gene nomination studies, although we agree that this could (and should) be done as part of a separate study in a future manuscript. Instead, we focused on gene nomination studies that 1) analyzed resting and activated human CD4+ T cells, 2) whose experimental design was most comparable to our own studies, and 3) had raw data readily available in the appropriate formats to allow re-analysis and harmonization before comparison. This is a point we did not make sufficiently clear in the original version of the manuscript, but have clarified in the revision. 

      Based on this rationale, we agree that the studies by Gate et al. and Ye et al. should be included in our comparative precision-recall analysis, and we have done so in the revised manuscript. The Gate study reported ATAC-seq peak co-accessibility, caQTL, eQTL, and HiC data, and we now include the resulting gene nominations from these datasets in the precision-recall analysis. These datasets performed poorly with respect to nomination of HIEI genes, likely due to small sample numbers and low sequencing depth compared to the other eQTL and chromatin capture-based studies. The eQTL reported by Ye et al. nominated 15 genes for autoimmune traits, two of which were in the ‘truth’ HIEI set (IL7R and IL2RB). This resulted in low predictive power but a high precision due to the low number of nominated genes compared to the other V2G datasets. As suggested by referee 1, we have also subjected our data to the ‘activity-by-contact’ (ABC) algorithm and have included this dataset in the comparison as well. Please see Figure 5 in the revised manuscript.
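      To make the precision-recall terms above concrete, the following is a minimal sketch of how a nominated gene set can be scored against a truth set. It is not the pipeline used in the manuscript. The function name `precision_recall`, the placeholder gene names, and the truth-set size of 100 are all assumptions made for illustration; only the counts of 15 nominated genes and two overlapping genes (IL7R and IL2RB) come from the text.

```python
def precision_recall(nominated, truth):
    """Return (precision, recall) of a nominated gene set against a truth set."""
    nominated, truth = set(nominated), set(truth)
    hits = nominated & truth
    # Precision: fraction of nominated genes that are in the truth set.
    precision = len(hits) / len(nominated) if nominated else 0.0
    # Recall: fraction of truth-set genes that were nominated.
    recall = len(hits) / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical inputs: 15 nominated genes, two of which are in the truth set.
# All names other than IL7R and IL2RB are placeholders, and the truth-set
# size (100 genes) is invented purely for this example.
nominated = ["IL7R", "IL2RB"] + [f"NOMINATED_{i}" for i in range(13)]
truth = ["IL7R", "IL2RB"] + [f"TRUTH_{i}" for i in range(98)]

precision, recall = precision_recall(nominated, truth)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

      Because precision is divided by the number of nominated genes while recall is divided by the size of the truth set, a short nomination list can score comparatively well on precision even when it recovers only a small part of the truth set.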

      We have elected not to include data from the other studies suggested by the referee for the following reasons: The stimulation paradigm used in the Glinos study is very different from that used in other studies. Also, this study and the study by Calderon did not nominate genes. The studies by Alasoo et al. and Kim-Hellmuth et al. analyzed macrophages, which are not a comparable cell type to CD4+ T cells. The allele-specific eQTL study by Gutierrez-Arcelus et al. included the relevant cell type and activation states, but included a relatively small number of samples (24) and variants (561), and the raw data in dbGaP do not readily allow for re-analysis and harmonization with the other studies. We thank the reviewer for helpful suggestions that have improved the quality of our study.

      Reviewer #3 (Public Review): 

      Summary:

      This paper used RNAseq, ATACseq, and Hi-C to assess gene expression, chromatin accessibility, and chromatin physical associations for native CD4+ T cells as they respond to stimulation through TCR and CD28. With these data in hand, the authors linked 423 GWAS signals to their respective target genes, where most of these were not in the proximal promoter, but rather in distal enhancers. The IL-2 gene was used as an example to identify new distal cis-regulatory regions required for optimal IL-2 gene transcription. These distal elements interact with the proximal IL2 promoter region. When the distal enhancer contained an autoimmune SNP, it affected IL-2 gene transcription. The authors also identified genetic risk variants that were associated with genes upon activation. Some of these regulate proliferation and cytokine production, but others are novel.

      Strengths:

      This paper provides a wealth of data related to gene expression after CD4 T cells are activated through the TCR and CD28. An important strength of this paper is that these data were intensively analyzed to uncover autoimmune disease SNPs in cis-acting regions. Many of these could be assigned to likely target genes even though they often are in distal enhancers. These findings help to provide a better understanding concerning the mechanism by which GWAS risk elements impact gene expression. 

      Another strength of this study was the proof-of-principle studies examining the IL-2 gene. Not only were new cis-acting enhancers discovered, but they were functionally shown to be important in regulating IL-2 expression, including susceptibility to colitis. Their importance was also established with respect to such distal enhancers harboring disease-relevant SNPs, which were shown to affect IL-2 transcription. 

      The data from this study were also mined against past CRISPR screens that identified genes that control aspects of CD4 T cell activation. From these comparisons, novel genes were identified that function during T cell activation. 

      Weaknesses:

      A weakness of this study is that few individuals were analyzed, i.e., RNAseq and ATACseq (n=3) and HiC (n=2). Thus, the authors may have underestimated potentially relevant risk associations by their chromatin capture-based methodology. This might account for the low overlap of their data with the eQTL-based approach or the HIEI truth set. 

      Impact:

      This study indicates that defining distal chromatin interacting regions helps to identify distal genetic elements, including relevant variants, that contribute to gene activation. 

      We greatly appreciate the referee’s expert assessment of our work and emphasis on the value of our functional follow-up studies. We have ensured that all sample sizes, effect sizes, p values and FDR statistics are included in the figures and figure legends. We agree that including more donors for the HiC studies would increase the number of implicated variants and genes; however, all the chromatin-based V2G approaches described in our manuscript use relatively small sample sizes, yet implicate more variants and genes than the comparable eQTL studies. That is, the low overlap is not driven by a paucity of GWAS-chromatin-based associations. An alternative explanation for the low overlap between GWAS-chromatin-based approaches and eQTL approaches was recently proposed by Pritchard and colleagues, who reported that GWAS and eQTL studies systematically implicate different types of variants (Mostafavi et al., Nature Genetics 2023). Among other differences, eQTL tend to implicate nearby genes while GWAS variants implicate distant genes, and our results support this contention. We referred to this study in the original version of the manuscript, but have included a more extensive discussion of potential explanations in the revised version. We thank the reviewer for helpful suggestions that have improved the quality of our study.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Ever-improving techniques allow the detailed capture of brain morphology and function to the point where individual brain anatomy becomes an important factor. This study investigated detailed sulcal morphology in the parieto-occipital junction. Using cutting-edge methods, it provides important insights into local anatomy, individual variability, and local brain function. The presented work advances the field and will stimulate future research into this important area.

      Strengths:

      Detailed, very thorough methodology. Multiple raters mapped detailed sulci in a large cohort. The identified sulcal features and their functional and behavioural relevance are then studied using various complementary methods. The results provide compelling evidence for the importance of the described sulcal features and their proposed relationship to cortical brain function.

      We thank the Reviewer for highlighting the strengths of our methods and findings.

      Weaknesses:

      A detailed description/depiction of the various sulcal patterns is missing.

      We agree that adding these details for the newly described sulci is necessary and have now done so. These details are included in the Results (Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001).”

      And in four new Supplementary Tables.

      A possible relationship between sulcal morphology and individual demographics might provide more insight into anatomical variability.

      We have conducted additional analyses to relate sulcal incidence to demographic features (age and gender). These results are included on Pages 5-6:

      “Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).”

      The unique dataset offers an opportunity to provide insights into laterality effects that should be explored.

      We included hemisphere as a factor in all models for this exact reason. Throughout the paper, we have edited the text to ensure that these laterality effects are more apparent to readers.

      Further, we have a Supplementary Results section on hemispheric effects regarding the slocs-v, cSTS3, and lTOS:

      “Hemispheric asymmetries in morphological, architectural, and functional features with regards to the slocs-v, cSTS3, and lTOS comparison

      We observed a sulcus x metric x hemisphere interaction on the morphological and architectural features of the slocs-v (F(4.20, 289.81) = 4.16, η2 = 0.01, p = .002; the cSTS3 is discussed in the next section). Post hoc tests showed that this interaction was driven by the slocs-v being cortically thinner in the left than the right hemisphere (p < .001; Fig. 2a).

      There was also a sulcus x network x hemisphere interaction on the functional connectivity profiles (using functional connectivity parcellations from (Kong et al., 2019)) of the slocs-v and lTOS (F(32, 2144) = 3.99, η2 = 0.06, p < .001; the cSTS3 is discussed in the next section). Post hoc tests showed that this interaction was driven by three effects: (i) the slocs-v overlapped more with the Default C subnetwork in the left than the right hemisphere (p = .013), (ii) the lTOS overlapped more with Visual A subnetwork in the right than the left hemisphere (p = .002), and (iii) the lTOS overlapped more with the Visual B subnetwork in the left than the right hemisphere (p = .002; Fig. 2b).”

      As well as the other STS rami on morphology:

      “It is also worth noting that there was a sulcus x metric x hemisphere interaction (F(4, 284.12) = 6.60, η2 = 0.08, p < .001). Post hoc tests showed that: (i) the cSTS3 was smaller (p < .001) and thinner (p = .025) in the left than the right hemisphere (Supplementary Fig. 8a), (ii) the cSTS2 was shallower (p = .004) and thicker (p < .001) in the right than left hemisphere (Supplementary Fig. 8a), and (iii) the cSTS1 was shallower (p < .001), smaller (p = .002), thinner (p = .001), and less myelinated (p < .001) in the left than the right hemisphere (Supplementary Fig. 8a).”

      And functional connectivity of the STS rami:

      “There was also a sulcus x network x hemisphere interaction (F(32, 2208) = 12.26, η2 = 0.15, p < .001). Post hoc tests showed differences for each cSTS component. Here, the cSTS1 overlapped more with the Auditory network (p < .001), less with the Control B subnetwork (p < .001), more with the Control C subnetwork (p < .001), less with the Default B subnetwork (p < .001), more with the Default C subnetwork (p < .001), more with the Ventral Attention B subnetwork (p < .001), and more with the Visual A subnetwork (p = .024) in the right than in the left hemisphere (Supplementary Fig. 8b). In addition, the cSTS2 overlapped more with the Control B subnetwork (p < .001), more with the Control C subnetwork (p < .001), less with the Default B subnetwork (p < .001), and less with the Temporal-Parietal network (p = .011) in the right than in the left hemisphere (Supplementary Fig. 8b). Finally, the cSTS3 overlapped more with the Control B subnetwork (p = .002), less with the Default B subnetwork (p = .014), more with the Default C subnetwork (p = .022), less with the Ventral Attention B subnetwork (p = .029) in the right than in the left hemisphere (Supplementary Fig. 8b).”

      Reviewer #2 (Public Review):

      Summary: After manually labeling 144 human adult hemispheres in the lateral parieto-occipital junction (LPOJ), the authors 1) propose a nomenclature for 4 previously unnamed highly variable sulci located between the temporal and parietal or occipital lobes, 2) focus on one of these newly named sulci, namely the ventral supralateral occipital sulcus (slocs-v) and compare it to neighboring sulci to demonstrate its specificity (in terms of depth, surface area, gray matter thickness, myelination, and connectivity), 3) relate the morphology of a subgroup of sulci from the region including the slocs-v to the performance in a spatial orientation task, demonstrating behavioral and morphological specificity. In addition to these results, the authors propose an extended reflection on the relationship between these newly named landmarks and previous anatomical studies, a reflection about the slocs-v related to functional and cytoarchitectonic parcellations as well as anatomic connectivity and an insight about potential anatomical mechanisms relating sulcation and behavior.

      Strengths:

      - To my knowledge, this is the first study addressing the variable tertiary sulci located between the superior temporal sulcus (STS) and intraparietal sulcus (IPS).

      - This is a very comprehensive study addressing altogether anatomical, architectural, functional and cognitive aspects.

      - The definition of highly variable yet highly reproducible sulci such as the slocs-v feeds the community with new anatomo-functional landmarks (which is emphasized by the provision of a probability map in supp. mat., which in my opinion should be proposed in the main body).

      - The comparison of different features between the slocs-v and similar sulci is useful to demonstrate their difference.

      - The detailed comparison of the present study with state of the art contextualizes and strengthens the novel findings.

      - The functional study complements the anatomical description and points towards cognitive specificity related to a subset of sulci from the LPOJ

      - The discussion offers a proposition of theoretical interpretation of the findings

      - The data and code are mostly available online (raw data made available upon request).

      We thank the Reviewer for highlighting the strengths of our methods, analyses, and applications of our findings.

      Weaknesses:

      - While three independent raters labeled all hemispheres, one single expert finalized the decision. Because no information is reported on the inter-rater variability, this somehow equates to a single expert labeling the whole cohort, which could result in biased labellings and therefore affect the reproducibility of the new labels.

      To expedite the process of defining thousands of sulci at the individual level in multiple regions, our group does not use an approach amenable to calculating inter-rater agreement. Our method consists of a two-tiered procedure. Here, authors YT and TG defined sulci, which were then checked by a trained expert (EHW). These were then checked again by the senior author (KSW). We emphasize that this process has produced reproducible anatomical results in other regions such as posteromedial cortex (Willbrand et al., 2023 Science Advances; Willbrand et al., 2023 Communications Biology; Maboudian et al., 2024 The Journal of Neuroscience), ventral temporal cortex (Weiner et al., 2014 NeuroImage; Miller et al., 2020 Scientific Reports; Parker et al., 2023 Brain Structure and Function), and lateral prefrontal cortex (Miller et al., 2021 The Journal of Neuroscience; Voorhies et al., 2021 Nature Communications; Yao et al., 2022 Cerebral Cortex; Willbrand et al., 2022 Brain Structure and Function; Willbrand et al., 2023 The Journal of Neuroscience) across age groups, species, and clinical populations. Further, in the Supplemental Materials we provide post mortem images showing that these sulci exist outside of cortical reconstructions, supporting this updated sulcal schematic of the lateral parieto-occipital junction. For the present study, by the time the final tier of our method was reached, we emphasize that a very small percentage (~2%) of sulcal definitions were actually modified. We will include an exact percentage in future publications in LPC/LPOJ.

      - 3 out of the 4 newly labeled sulci are only described in the very first part and never reused. This should be emphasized as it is far from obvious at first glance of the article.

      We have edited the Abstract (shown below, on Page 1) and the paper throughout to make clear that the focus is on the slocs-v rather than the other three sulci.

      “After defining thousands of sulci in a young adult cohort, we revised the previous LPC/LPOJ sulcal landscape to include four previously overlooked, small, shallow, and variable sulci. One of these sulci (ventral supralateral occipital sulcus, slocs-v) is present in nearly every hemisphere and is morphologically, architecturally, and functionally dissociable from neighboring sulci. A data-driven, model-based approach, relating sulcal depth to behavior further revealed that the morphology of only a subset of LPC/LPOJ sulci, including the slocs-v, is related to performance on a spatial orientation task.”

      It is worth noting that we have added additional analyses that include the other three newly-characterized sulci in response to Reviewer 1. We first described the relationship between these sulci and demographic features, alongside analyses on the patterning of these sulci, which are included in the Results (Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001). Though we characterize these sulci in this paper for the first time, the location of these four sulci is consistent with the presence of variable “accessory sulci” in this cortical expanse mentioned in prior modern and classic studies (Supplementary Methods). We could also identify these sulci in post-mortem hemispheres (Supplementary Figs. 2, 3), ensuring that these sulci were not an artifact of the cortical reconstruction process.

      Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).  Finally, to help guide future research on these newly- and previously-classified LPC/LPOJ sulci, we generated probabilistic maps of each of these 17 sulci and share them with the field with the publication of this paper (Supplementary Fig. 6; Data availability).”

      - The tone of the article suggests a discovery of these 4 sulci when some of them have already been reported (as rightfully highlighted in the article), though not named nor studied specifically. This is slightly misleading as I interpret the first part of the article as a proposition of nomenclature rather than a discovery of sulci.

      We have toned down our language throughout the paper, emphasizing that this paper updates the sulcal landscape of LPC/LPOJ by taking into account sulci that have not been comprehensively described previously. For example, in the Abstract (Page 1), we now write:

      “After defining thousands of sulci in a young adult cohort, we revised the previous LPC/LPOJ sulcal landscape to include four previously overlooked, small, shallow, and variable sulci. One of these sulci (ventral supralateral occipital sulcus, slocs-v) is present in nearly every hemisphere and is morphologically, architecturally, and functionally dissociable from neighboring sulci. A data-driven, model-based approach, relating sulcal depth to behavior further revealed that the morphology of only a subset of LPC/LPOJ sulci, including the slocs-v, is related to performance on a spatial orientation task.”

      - The article never mentions the concept of merging of sulcal elements and the potential effect it could have on the labeling of the newly named variable sulci.

      We emphasize that we use multiple surfaces (pial, inflated, smoothwm) to help distinguish intersecting sulci from one another. We include extra text in the Methods (Page 21):

      “We defined LPC/LPOJ sulci for each participant based on the most recent schematics of sulcal patterning by Petrides (2019) as well as pial, inflated, and smoothed white matter (smoothwm) FreeSurfer cortical surface reconstructions of each individual. In some cases, the precise start or end point of a sulcus can be difficult to determine on a surface (Borne et al., 2020); however, examining consensus across multiple surfaces allowed us to clearly determine each sulcal boundary in each individual.”

      Further, upon quantifying the patterning of these variable sulci, a majority of the time they are independent (described in the Results on Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001).”

      Thus, merging sulcal elements likely had a minimal impact on the present definitions.

      - The definition of the new sulci is solely based on their localization relative to other sulci which are themselves variable (e.g. the 3rd branch of the STS can show different locations and different orientation, potentially affecting the definition of the slocs-v). This is not addressed in the discussion.

      As displayed in our probabilistic maps of these sulci (Supplementary Fig. 6), the cSTS components (2-4) are actually relatively consistent between individuals, and thus, future investigators can utilize these maps to help define these sulci in new hemispheres.
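      For readers unfamiliar with probabilistic maps, the following is a minimal sketch of the general idea, assuming a map is simply the fraction of labeled hemispheres in which each vertex of a common surface falls inside a given sulcus. It is an illustration, not the code used in the paper. The function names `probability_map` and `threshold_map`, the toy labels, and the 0.5 threshold are all hypothetical.

```python
def probability_map(labels):
    """Per-vertex fraction of hemispheres in which the vertex is labeled.

    labels: one list per hemisphere, each holding a 0/1 flag per vertex
    (1 = vertex lies inside the sulcus), all aligned to a common surface.
    """
    if not labels:
        raise ValueError("need at least one labeled hemisphere")
    n_vertices = len(labels[0])
    if any(len(row) != n_vertices for row in labels):
        raise ValueError("all hemispheres must share the same vertex count")
    return [sum(row[v] for row in labels) / len(labels)
            for v in range(n_vertices)]


def threshold_map(prob_map, threshold):
    """Indices of vertices whose probability meets the threshold."""
    return [v for v, p in enumerate(prob_map) if p >= threshold]


# Toy data (hypothetical): four hemispheres, four vertices each.
toy_labels = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
]
pmap = probability_map(toy_labels)
candidates = threshold_map(pmap, 0.5)
print(pmap, candidates)
```

      A thresholded map of this kind could serve as a starting region in which to look for the sulcus in a new hemisphere, with the final boundary still set by manual inspection.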

      Nevertheless, there is, of course, individual variability in the location of these sulci, and we do agree that this point brought up by the Reviewer is important. We have now added text to the Limitations section of the Discussion (Pages 15-16):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci, let alone PTS, without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and hopefully, expedite the identification of these sulci in new participants. This method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg and Finn, 2022).”

      - The new sulci are only defined in terms of localization relative to other sulci, and no other property is described (general length, depth, orientation, shape...), making it hard for a new observer to take labeling decisions in case of conflict.

      To help guide future investigators, we now show these metrics for all sulci in Supplemental Figure 7 to help future groups identify these sulci with the assistance of their general morphology.

      - The very assertive tone of the article conveys the idea that these sulci are identifiable certainly in most cases, when by definition these highly variable tertiary sulci are sometimes very difficult to take decisions on.

      The highly variable nature of three of the four putative tertiary sulci described here (slocs-v, slocs-d, pAngs-v, pAngs-d) is why we focused on the slocs-v (as it is identifiable in nearly all hemispheres). However, we have edited our language throughout the text to also emphasize the variability of these sulci. For example, in the Results (Page 5), we now write:

      “In previous research in small sample sizes, neuroanatomists noticed shallow sulci in this cortical expanse (Supplementary Methods and Supplementary Figs. 1-4 for historical details). In the present study, we fully update this sulcal landscape considering these overlooked indentations. In addition to defining the 13 sulci previously described within the LPC/LPOJ, as well as the posterior superior temporal cortex (Methods) (Petrides, 2019) in individual participants, we could also identify as many as four small and shallow PTS situated within the LPC/LPOJ that were highly variable across individuals and uncharted until now (Supplementary Methods and Supplementary Figs. 1-4). Macroanatomically, we could identify two sulci between the cSTS3 and the IPS-PO/lTOS ventrally and two sulci between the cSTS2 and the pips/IPS dorsally. We focus our analyses on the slocs-v since it was identifiable in nearly every hemisphere.”

      - I am not absolutely convinced with the labeling proposed of a previously reported sulcus, namely the posterior intermediate parietal sulcus.

      In defining previously identified LPC sulci, we followed the previous labeling procedure by Petrides (2019) alongside historical definitions (detailed in Supplementary Figures 1-4). Nevertheless, future deep learning algorithms using these and other data can be used to rectify discrepancies in labeling (e.g., Borne et al., 2020 Medical Image Analysis; Lyu et al., 2021 NeuroImage). We discuss these points in the Limitations section of the Discussion (Pages 16-17):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and hopefully, expedite the identification of these sulci in new participants. This should accelerate the process of subsequent studies confirming the accuracy of our updated schematic of LPC/LPOJ. This manual method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg & Finn, 2022). Though our sample size is comparable to other studies that produced reliable results relating sulcal morphology to brain function and cognition (e.g., Cachia et al., 2021; Garrison et al., 2015; Lopez-Persem et al., 2019; Miller et al., 2021; Roell et al., 2021; Voorhies et al., 2021; Weiner, 2019; Willbrand, Parker, et al., 2022; Willbrand, Voorhies, et al., 2022; Yao et al., 2022), ongoing work that uses deep learning algorithms to automatically define sulci should result in much larger sample sizes in future studies (Borne et al., 2020; Lyu et al., 2021). Finally, the time-consuming manual definitions of primary, secondary, and PTS also limit the cortical expanse explored in each study, thus restricting the present study to LPC/LPOJ.”

      Assuming that the labelling of all sulci reported in the article is reproducible, the different results are convincing and in general, this study achieves its aims in defining more precisely the sulcation of the LPOJ and looking into its functional/cognitive value. This work clearly offers a finer understanding of sulcal pattern in this region, and lacks only little for the new markers to be convincingly demonstrated. An overall coherence of the labelling can still be inferred from the supplementary material which support the results and therefore the conclusions, yet, addressing some of the weaknesses listed above would greatly enhance the impact of this work. This work is important to the understanding of sulcal variability and its implications on functional and cognitive aspects.

      We thank the Reviewer for their positive remarks on the implications of this work.

      Reviewer #3 (Public Review):

      Summary: 72 subjects, and 144 hemispheres, from the Human Connectome Project had their parietal sulci manually traced. This identified the presence of previously undescribed shallow sulci. One of these sulci, the ventral supralateral occipital sulcus (slocs-v), was then demonstrated to have functional specificity in spatial orientation. The discussion furthermore provides an eloquent overview of our understanding of the anatomy of the parietal cortex, situating their new work into the broader field. Finally, this paper stimulates further debate about the relative value of detailed manual anatomy, inherently limited in participant numbers and areas of the brain covered, against fully automated processing that can cover thousands of participants but easily misses the kinds of anatomical details described here.

      Strengths:

      - This is the first paper describing the tertiary sulci of the parietal cortex with this level of detail, identifying novel shallow sulci and mapping them to behaviour and function.

      - It is a very elegantly written paper, situating the current work into the broader field.

      - The combination of detailed anatomy and function and behaviour is superb.

      We thank the Reviewer for their positive remarks on the paper and our findings.

      Weaknesses:

      - The sample is inherently limited both in size and in being restricted to typically developing young adults.

      We emphasize that the sample size is limited due to the arduous nature of manually defining sulci; however, we provide probabilistic maps with the publication of this work to help expedite this process for future investigators. Further, with improved deep learning algorithms, the sample sizes in future neuroanatomical studies should be enhanced. We discuss these points in the Limitations section of the Discussion (Pages 16-17):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci —especially the small, shallow, and variable PTS—requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and hopefully, expedite the identification of these sulci in new participants. This should accelerate the process of subsequent studies confirming the accuracy of our updated schematic of LPC/LPOJ. This manual method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg & Finn, 2022). Though our sample size is comparable to other studies that produced reliable results relating sulcal morphology to brain function and cognition (e.g., Cachia et al., 2021; Garrison et al., 2015; Lopez-Persem et al., 2019; Miller et al., 2021; Roell et al., 2021; Voorhies et al., 2021; Weiner, 2019; Willbrand, Parker, et al., 2022; Willbrand, Voorhies, et al., 2022; Yao et al., 2022), ongoing work that uses deep learning algorithms to automatically define sulci should result in much larger sample sizes in future studies (Borne et al., 2020; Lyu et al., 2021). The time-consuming manual definitions of primary, secondary, and PTS also limit the cortical expanse explored in each study, thus restricting the present study to LPC/LPOJ.”

      - While the paper begins by describing four new sulci, only one is explored further in greater detail.

      Due to the increased variability of three of the four newly-classified sulci, we chose to only focus on the slocs-v given that it was present in nearly all hemispheres. In response to other reviewers, we have conducted additional analyses that also describe these new sulci and potential factors related to their incidence (Page 6):

      “Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).”

      In addition, given that sulcal variability is cognitively (e.g., Amiez et al., 2018 Scientific Reports; Cachia et al., 2021 Frontiers in Neuroanatomy; Garrison et al., 2015 Nature Communications; Willbrand et al., 2022, 2023 Brain Structure & Function), anatomically (e.g., Amiez et al., 2021 Communications Biology; Vogt et al., 1995 Journal of Comparative Neurology), functionally (e.g., Lopez Persem et al., 2019 The Journal of Neuroscience), and translationally (e.g., Yucel et al., 2002 Biological Psychiatry) relevant, future research can investigate these relationships regarding the slocs-d and pAngs components. We have added text to the Limitations section of the Discussion (Pages 17-18) to discuss this:

      “Finally, although we did not focus on the relationship between the other three PTS (slocs-d, pAngs-v, and pAngs-d) to anatomical and functional features of LPC and cognition, given that variability in sulcal incidence is cognitively (Amiez et al., 2018; Cachia et al., 2021; Garrison et al., 2015; Willbrand, Jackson, et al., 2023; Willbrand, Voorhies, et al., 2022), anatomically (Amiez et al., 2021; Vogt et al., 1995), functionally (Lopez-Persem et al., 2019), and translationally (Clark et al., 2010; Le Provost et al., 2003; Meredith et al., 2012; Nakamura et al., 2020; Yücel et al., 2002, 2003) relevant, future work can also examine the relationship between the more variable slocs-d, pAngs-v, and pAngs-d and these features.”

      - There is some tension between calling the discovered sulci new vs acknowledging they have already been reported, but not named.

      We have edited the manuscript throughout to emphasize our primary focus on revising the LPC/LPOJ sulcal landscape to include these often overlooked small, shallow, and variable putative tertiary sulci, rather than using the terms “discovered sulci” and “new.”

      - The anatomy of the sulci, as opposed to their relation to other sulci, could be described in greater detail.

      Beyond the radar plots in the main text which compare specific groupings of sulci, we now show the morphological metrics for all sulci investigated in the present work in Supplemental Figure 7.

      Overall, to summarize, I greatly enjoyed this paper and believe it to be a highly valued contribution to the field.

      We are glad the Reviewer enjoyed reading our paper and thank them for their positive thoughts on the potential impact of this work on the field.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The slocs-v is found in 71 subjects left and right. Is that the same subject?

      No, these are different subjects.

      (2) How were the 72 subjects chosen?

      The subjects were randomly selected from the HCP database as described in the Methods (Page 18):

      “Here, we used 72 randomly-selected participants, balanced for gender (following the terminology of the HCP data dictionary), from the HCP database (50% female, 22-36 years old, and 90% right-handed; there was no effect of handedness on our behavioral tasks; Supplementary materials) that were also analyzed in several prior studies (Hathaway et al., 2023; Miller et al., 2021, 2020; Willbrand et al., 2023b, 2023c, 2022a).”

      (3) Are there effects of laterality on sulcal pattern? Table?

      We now include sulcal pattern results in the Results section and Supplementary Materials; there were no laterality effects on sulcal pattern.

      (4) Depiction/description of common sulcal patterns

      We now include sulcal pattern results in the Results section and Supplementary Materials.

      (5) Is there a relationship between sulcal patterns and demographic features?

      We now include analyses on this in the Results section. There is no relationship between sulcal patterns and demographic features.

      (6) Just for clarity, the sulcal features are studied and extracted in native space?

      Yes, sulcal features are studied and extracted in native space, as described in the Methods section (Page 19):

      “Anatomical T1-weighted (T1-w) MRI scans (0.8 mm voxel resolution) were obtained in native space from the HCP database. Reconstructions of the cortical surfaces of each participant were generated using FreeSurfer (v6.0.0), a software package used for processing and analyzing human brain MRI images (surfer.nmr.mgh.harvard.edu) (Dale et al., 1999; Fischl et al., 1999). All subsequent sulcal labeling and extraction of anatomical metrics were calculated from these native space reconstructions generated through the HCP’s version of the FreeSurfer pipeline (Glasser et al., 2013).”

      (7) The authors use "Gender". Are they referring to biological sex (female/male) or socially defined characteristics (man/woman etc.)?

      The term gender refers to socially defined characteristics, as used by the HCP data dictionary (Methods, Page 18):

      “Here, we used 72 randomly-selected participants, balanced for gender (following the terminology of the HCP data dictionary), from the HCP database (50% female, 22-36 years old, and 90% right-handed; there was no effect of handedness on our behavioral tasks; Supplementary materials) that were also analyzed in several prior studies (Hathaway et al., 2023; Miller et al., 2021, 2020; Willbrand et al., 2023b, 2023c, 2022a).”

      (8) Fig 2. Grey is poorly visible compared to green and blue.

      The shade of gray has been edited to be more distinguishable.

      (9) The relationship between behavior and sulcal features is significant but weak.

      We acknowledge that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of the finding is that multiple sulci identified in the model are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified. We have added text to the Limitations section of the Discussion (Pages 17-18) to emphasize this point:

      “It is also worth noting that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of our findings is that multiple sulci identified in our model-based approach are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified.”

      (10) The Limitation section could be expanded.

      We have added additional text to flesh out the Limitations section of the Discussion (Pages 17-18):

      “It is also worth noting that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of our findings is that multiple sulci identified in our model-based approach are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified. Finally, although we did not focus on the relationship between the other three PTS (slocs-d, pAngs-v, and pAngs-d) to anatomical and functional features of LPC and cognition, given that variability in sulcal incidence is cognitively (Amiez et al., 2018; Cachia et al., 2021; Garrison et al., 2015; Willbrand, Jackson, et al., 2023; Willbrand, Voorhies, et al., 2022), anatomically (Amiez et al., 2021; Vogt et al., 1995), functionally (Lopez-Persem et al., 2019), and translationally (Clark et al., 2010; Le Provost et al., 2003; Meredith et al., 2012; Nakamura et al., 2020; Yücel et al., 2002, 2003) relevant, future work can also examine the relationship between the more variable slocs-d, pAngs-v, and pAngs-d and these features.”

      Reviewer #2 (Recommendations For The Authors):

      First, I would like to thank the authors for their important contribution to the field of sulcal studies and anatomo-functional correlates. My main comments about the work are treated in the public review, and I will only address details in this section. I have detected a number of typos which are harder to report from a document in which lines are not numbered. Could you please submit a numbered document for the next iteration?

      - p2. "hominoid-specific, shallow indentations, or sulci" - can lead to misunderstanding that sulci are hominoid-specific and shallow

      Sentence has been rewritten:

      “Of all the neuroanatomical features to target, recent work shows that morphological features of the shallower, later developing, hominoid-specific indentations of the cerebral cortex (also known as putative tertiary sulci, PTS) are not only functionally and cognitively meaningful, but also are particularly impacted by multiple brain-related disorders and aging (Amiez et al., 2019, 2018; Ammons et al., 2021; Cachia et al., 2021; Fornito et al., 2004; Garrison et al., 2015; Harper et al., 2022; Hathaway et al., 2023; Lopez-Persem et al., 2019; Miller et al., 2021, 2020; Nakamura et al., 2020; Parker et al., 2023; Voorhies et al., 2021; Weiner, 2019; Willbrand et al., 2023b, 2023c, 2022a, 2022b; Yao et al., 2022).”

      - p2. next sentence (starting with "The combination [...]"): not clear that you are addressing tertiary sulci here, maybe introduce the concept beforehand?

      The previous sentence (just above) has been edited to introduce putative tertiary sulci beforehand.

      - p5. error in numbering of sulci relative to Fig1. (5,6,7,8 -> 6,7,8,9)

      Sulcal numbering has been fixed.

      - p5. reference to supp mat -> I would have expected the nomenclature used in Borne et al. 2020 to be discussed alongside the state of the art. How would you relate F.I.P.r.int.1 and F.I.P.r.int.2 to the sulci you describe?

      We thank the Reviewer for bringing up this relevant literature. The F.I.P.r.int.1 and F.I.P.r.int.2 are described as rami of the IPS, whereas the slocs and pAngs are independent, small indentations near the IPS, but not part of the complex. Nevertheless, future work should integrate these two schematics to establish the most comprehensive sulcal map of LPC/LPOJ. We have added text to the Supplementary Methods detailing the differences between the F.I.P.r.int.1 and F.I.P.r.int.2 and the slocs/pAngs:

      “slocs/pAng vs. F.I.P.r.int.1 and F.I.P.r.int.2

      Recent work (Borne et al., 2020; Perrot et al., 2011) identified two intermediate rami of the IPS (F.I.P.r.int.1 and F.I.P.r.int.2) that were not defined in the present investigation. Crucially, the newly classified sulci here (slocs and pAngs) are distinguishable from the two F.I.P.r.int. in that the F.I.P.r.int. are branches coming off the main body of the IPS (Borne et al., 2020; Perrot et al., 2011), whereas the slocs/pAngs are predominantly non-intersecting (“free”) structures that never intersected with the IPS (Supplementary Tables 1-4).”

      - p6. Fig 1.a. labelling discrepancy between line 1 and 2, column 4: the labels 10 and 11 from the inflated hemisphere do not match the labels 10 and 11 in the pial surface. Fig 1.b. swapped label 2 and 3 in the 4th hemisphere

      These aspects of Figure 1 have been edited accordingly.

      - p7. "(iii) the slocs-v was thicker than both the cSTS3 and lTOS" -> the slocs-v showed thicker gray matter?

      The sentence has been adjusted (Page 7):

      “(iii) the slocs-v showed thicker gray matter than both the cSTS3 and lTOS (ps < .001),”

      - p9. Six left hemisphere LPC/LPOJ sulci were related to spatial orientation task performance -> missing

      Fixed (Page 9):

      “Six left hemisphere LPC/LPOJ sulci were related to spatial orientation task performance (Fig. 3a, b).”

      - p14. "Steel and colleagues" -> missing space

      Fixed (Page 14):

      “Furthermore, the slocs-v appears to lie at the junction of scene-perception and place-memory activity (a transition that also consistently co-localizes with the HCP-MMP area PGp) as identified by Steel and colleagues (2021).”

      - p20. Probability maps "we share these maps with the field" -> specify link to data availability

      The link to data availability has been added (Page 21):

      “To aid future studies interested in investigating LPC/LPOJ sulci, we share these maps with the field (Data availability).”

      Reviewer #3 (Recommendations For The Authors):

      No detailed recommendations not already present in the rest of the review.

    1. Author response:

      eLife assessment

      Cav2 voltage-gated calcium channels play key roles in regulating synaptic strength and plasticity. In contrast to mammals, invertebrates like Drosophila encode a single Cav2 channel, raising questions on how diversity in Cav2 is achieved from a single gene. Here, the authors present convincing evidence that two alternatively spliced isoforms of the Cac gene (cacophony, also known as Dmca1A and nightblindA) enable diverse changes in Cav2 expression, localization, and function in synaptic transmission and plasticity. These valuable findings will be of interest to a variety of researchers.

      We suggest replacing “two alternatively spliced isoforms of the Cac gene” with “two alternatively spliced mutually exclusive exon pairs of the Cac gene”.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Bell et al. describes an analysis of the effects of removing one of two mutually exclusive splice exons at two distinct sites in the Drosophila CaV2 calcium channel Cacophony (Cac). The authors perform imaging and electrophysiology, along with some behavioral analysis of larval locomotion, to determine whether these alternatively spliced variants have the potential to diversify Cac function in presynaptic output at larval neuromuscular junctions. The authors provide valuable insights into how alternative splicing at two sites in the calcium channel alters its function.

      Strengths:

      The authors find that both of the second alternatively spliced exons (I-IIA and I-IIB) that are found in the intracellular loop between the 1st and 2nd set of transmembrane domains can support Cac function. However, loss of the I-IIB isoform (predicted to alter potential beta subunit interactions) results in 50% fewer channels at active zones, a decrease in neurotransmitter release, and a reduced ability to support presynaptic homeostatic potentiation. Overall, the study provides new insights into Cac diversity at two alternatively spliced sites within the protein, adding to our understanding of how presynaptic calcium channel function can be regulated by splicing.

      Weaknesses:

      The authors find that one splice isoform (IS4B) in the first S4 voltage sensor is essential for the protein's function in promoting neurotransmitter release, while the other isoform (IS4A) is dispensable. The authors conclude that IS4B is required to localize Cac channels to active zones. However, I find it more likely that IS4B is required for channel stability and that its loss leads to the protein being degraded, rather than having any effect on active zone localization. More analysis would be required to establish that as the mechanism for the unique requirement for IS4B.

      We agree that we need to explain more clearly why IS4B is unlikely to be required for channel stability, but instead likely has a unique function at the presynaptic active zone of fast synapses. We will address this by revising text and by providing additional data. If IS4B were required for evoked release because it supported channel protein stability, then the removal of IS4B should cause protein degradation throughout all sub-neuronal compartments and throughout the CNS, but this is not the case. First, upon removal of IS4B in adult motoneurons (which use cac channels at the presynapse and somatodendritically, Ryglewski et al., 2012) evoked release from axon terminals is abolished (as at the larval NMJ), but somatodendritic cac inward current is present. If IS4B were required for cac channel stability, somatodendritic current should also be abolished. We will add these data to the ms. Second, immunohistochemistry for tagged IS4B channels reveals that these are present not only at presynaptic active zones at the NMJ but also throughout the VNC motor neuropils. Excision of IS4B causes the absence of cac channels from the presynaptic active zones at the NMJ and throughout the VNC neuropils (and accordingly this is lethal). By contrast, tagged IS4A channels (with IS4B excised) are not found at the presynaptic terminals of fast synapses, but instead in other distinct parts of the CNS. We will also provide data to show this. Together, these data are in line with a unique requirement of IS4B at presynaptic active zones (not excluding additional functions of IS4B), whereas IS4A-containing cac isoforms mediate different functions.

      We appreciate the additional reviewer suggestions to the authors that we will address point by point when revising the ms. 

      Reviewer #2 (Public Review):

      This study by Bell et al. focuses on understanding the roles of two alternatively spliced exons in the single Drosophila Cav2 gene cac. The authors generate a series of cac alleles in which one or the other mutually exclusive exon is deleted to determine the functional consequences at the neuromuscular junction. They find that alternative splicing at one exon encoding part of the voltage sensor impacts the activation voltage as well as localization to the active zone. In contrast, splicing at the second exon pair does not impact Cav2 channel localization, but it appears to determine the abundance of the channel at active zones. Together, the authors propose that alternative splicing at the Cac locus enables diversity in Cav2 function through isoform diversity generated at the single Cav2 alpha subunit gene encoded in Drosophila.

      Overall this is an excellent, rigorously validated study that defines unanticipated functions for alternative splicing in Cav2 channels. The authors have generated an important toolkit of mutually exclusive Cac splice isoforms that will be of broad utility for the field, and show convincing evidence for distinct consequences of alternative splicing of this single Cav2 channel at synapses. Importantly, the authors use electrophysiology and quantitative live sptPALM imaging to determine the impacts of Cac alternative splicing on synaptic function. There are some outstanding questions regarding the mechanisms underlying the changes in Cac localization and function, and some additional suggestions are listed below for the authors to consider in strengthening this study. Nonetheless, this is a compelling investigation of alternative splicing in Cav2 channels that should be of interest to many researchers.

      We agree that some additional information on cac isoform localization (in particular for splicing at the IS4 site) will strengthen the manuscript. We will address this by providing additional data and revising text (see responses to reviewers 1 and 3). We are also grateful for the additional reviewer suggestions which we will address point by point when revising the ms.  

      Reviewer #3 (Public Review):

      Summary:

      Bell and colleagues studied how different splice isoforms of voltage-gated CaV2 calcium channels affect channel expression, localization, function, synaptic transmission, and locomotor behavior at the larval Drosophila neuromuscular junction. They reveal that one mutually exclusive exon located in the fourth transmembrane domain encoding the voltage sensor is essential for calcium channel expression, function, active zone localization, and synaptic transmission. Furthermore, a second mutually exclusive exon residing in an intracellular loop containing the binding sites for Caβ and G-protein βγ subunits promotes the expression and synaptic localization of ~50% of CaV2 channels, thereby contributing to ~50% of synaptic transmission. This isoform enhances release probability, as evident from increased short-term depression, is vital for homeostatic potentiation of neurotransmitter release induced by glutamate receptor impairment, and promotes locomotion. The roles of the two other tested isoforms remain less clear.

      Strengths:

      The study is based on solid data that was obtained with a diverse set of approaches. Moreover, it generated valuable transgenic flies that will facilitate future research on the role of calcium channel splice isoforms in neural function.

      Weaknesses:

      (1) Based on the data shown in Figures 2A-C, and 2H, it is difficult to judge the localization of the cac isoforms. Could they analyze cac localization with regard to Brp localization (similar to Figure 3; the term "co-localization" should be avoided for confocal data), as well as cac and Brp fluorescence intensity in the different genotypes for the experiments shown in Figure 2 and 3 (Brp intensity appears lower in the dI-IIA example shown in Figure 3G)? Furthermore, heterozygous dIS4B imaging data (Figure 2C) should be quantified and compared to heterozygous cacsfGFP/+.

      We understand the reviewer’s comment and will do the following to convincingly demonstrate absence of cac from presynaptic active zones upon IS4B excision. First, we will show selective enlargements of IS4A and IS4B with Brp in presynaptic active zones to show distinct cac label in active zones following excision of IS4A but not following excision of IS4B. Second, we will provide Pearson’s co-localization coefficients of Brp with IS4B and with IS4A, respectively. Third, we will reduce the intensity of the green channels in figures 2C and 2H to the same levels as in figures 2A and 2B and the control in 2H to allow a fair comparison of cac intensities following excision of IS4B versus excision of IS4A and control. We had increased intensity to show that following excision of IS4B, no distinct cac label is found in active zones, even at highly exaggerated image brightness. However, we agree with the reviewer that the bright background hampers interpretation and thus will show the same intensity in all images that need to be compared.

      (2) They conclude that I-II splicing is not required for cac localization (p. 13). However, cac channel number is reduced in dI-IIB. Could the channels be mis-localized (e.g., in the soma/axon)? What is their definition of localization? Could cac be also mis-localized in dIS4B? Furthermore, the Western Blots indicate a prominent decrease in cac levels in dIS4B/+ and dI-IIB (Figure 1D). How do the decreased protein levels seen in both genotypes fit to a "localization" defect? Could decreased cac expression levels explain the phenotypes alone?

      We will precisely define channel localization, and we will explain why it is highly unlikely that the absence of IS4B channels as well as the lower number of I-IIA channels are simply a consequence of reduced expression, but instead of splice variant specific channel function and localization. For example, upon excision of IS4B no cac channels are found at the presynaptic active zones and these synapses are thus non-functional. The isoforms containing the mutually exclusive IS4A exon are expressed and mediate other functions (see also response to reviewer 1) but cannot substitute IS4B containing isoforms at the presynapse. In fact, our Western blots are in line with reduced cac expression if all isoforms that mediate evoked release are missing, again indicating that the presynapse specific cac isoforms cannot be replaced by other cac isoforms (see also below, response to (3)). Feedback mechanisms that regulate cac expression in the absence of presynapse specific cac isoforms are beyond the scope of this study.

      (3) Cac-IS4B is required for Cav2 expression, active zone localization, and synaptic transmission. Similarly, loss of cac-I-IIB reduces calcium channel expression and number. Hence, the major phenotype of the tested splice isoforms is the loss of/a reduction in Cav2 channel number. What is the physiological role of these isoforms? Is the idea that channel numbers can be regulated by splicing? Is there any data from other systems relating channel number regulation to splicing (vs. transcription or post-transcriptional regulation)?

      We will provide additional evidence that mutually exclusive splicing at the IS4 site results in cac channels that localize to the presynaptic active zone (IS4B) versus cac channels that localize to other brain parts and/or other subneuronal compartments (see response to reviewer 1). In addition, we already show in figure 2J that IS4B is required for normal cac HVA current, and we can add data showing that IS4A is not essential for cac HVA current. Similarly, for I-II we find it unlikely that differential splicing regulates channel numbers; rather, it confers splice variant specific functions in different brain parts and different sub-neuronal compartments. To substantiate this interpretation, we will add data from developing adult motoneurons showing that excision of I-IIA causes reduced activity-induced calcium influx into dendrites (new data), but it does not reduce channel number at the larval NMJ (figure 4). In our opinion these data are not in line with the idea that splicing regulates cac expression levels and that this, in turn, results in specific defects in distinct neuronal compartments. However, we agree that the lack of isoforms with specific functions results in altered overall cac expression levels as indicated by our Western data. If isoforms normally abundantly expressed throughout most neuropils are missing due to exon excision, we indeed find less cac protein in Westerns. By contrast, the lack of isoforms with little abundance has little effect on cac expression levels. This may be the result of unknown feedback mechanisms which are beyond the scope of this study.

      (4) Although not supported by statistics, and as appreciated by the authors (p. 14), there is a slight increase in PSC amplitude in dIS4A mutants (Figure 2). Similarly, PSC amplitudes appear slightly larger (Figure 3J), and cac fluorescence intensity is slightly higher (Figure 3H) in dI-IIA mutants. Furthermore, cac intensity and PSC amplitude distributions appear larger in dI-IIA mutants (Figures 3H, J), suggesting a correlation between cac levels and release. Can they exclude that IS4A and/or I-IIA negatively regulate release? I suggest increasing the sample size for Canton S to assess whether dIS4A mutant PSCs differ from controls (Figure 2E). Experiments at lower extracellular calcium may help reveal potential increases in PSC amplitude in the two genotypes (but are not required). A potential increase in PSC amplitude in either isoform would be very interesting because it would suggest that cac splicing could negatively regulate release.

      There are several possibilities to explain this, but as none of the effects are statistically significant, we prefer to not investigate this in depth. However, given that we cannot find IS4A at the presynaptic active zone, IS4A is unlikely to have a direct negative effect on release probability. Nonetheless, given that IS4A containing cac isoforms mediate functions in other neuronal compartments it may regulate release indirectly by affecting action potential shape. We will provide data in response to the more detailed suggestions to authors that will provide additional insight.

      (5) They provide compelling evidence that IS4A is required for the amplitude of somatic sustained HVA calcium currents. However, the evidence for effects on biophysical properties and activation voltage (p. 13) is less convincing. Is the phenotype confined to the sustained phase, or are other aspects of the current also affected (Figure 2J)? Could they also show the quantification of further parameters, such as CaV2 peak current density, charge density, as well as inactivation kinetics for the two genotypes? I also suggest plotting peak-normalized HVA current density and conductance (G/Gmax) as a function of Vm. Could a decrease in current density due to decreased channel expression be the only phenotype? How would changes in the sustained phase translate into altered synaptic transmission in response to AP stimulation?

      Most importantly, HVA current is mostly abolished upon excision of IS4B (not IS4A; we think the reviewer accidentally mixed up the genotypes). This indicates that the cac isoforms that mediate evoked release encode HVA channels. However, the somatodendritic current shown in figure 2J that remains upon excision of IS4B is mediated by IS4A-containing cac isoforms. Please note that these never localize to the presynaptic active zone; thus the small inactivating HVA current that remains in figure 2J normally does not mediate evoked release. Therefore, the interpretation is that specifically the HVA current encoded by IS4B cac isoforms is required for synaptic transmission. Reduced cac current density is not the cause of this phenotype because a specific current component is absent.

      We agree with the reviewer that a deeper electrophysiological analysis of cac currents mediated by IS4B-containing isoforms will be instructive. However, a precise analysis of activation and inactivation voltages and kinetics suffers from space-clamp issues in recordings from the soma of such complex neurons (DLM motoneurons of the adult fly). Therefore, we will analyze the currents in a heterologous expression system and present these data to the scientific community as a separate study at a later time point.

      (6) Why was the STED data analysis confined to the same optical section, and not to max. intensity z-projections? How many and which optical sections were considered for each active zone? What were the criteria for choosing the optical sections? Was synapse orientation considered for the nearest neighbor Cac - Brp cluster distance analysis? How do the nearest-neighbor distances compare between "planar" and "side-view" Brp puncta?

      Max. z-projections would be imprecise because they can artificially suggest close proximity of label that is close in x and y but far away in z. Therefore, the analysis was executed in xy-direction of various planes of entire 3D image stacks. We considered active zones of different orientations (Fig. 4C, D). In fact, we searched the entire z-stacks until we found active zones of all orientations shown in figures 4C1-C6 within the same boutons. The same active zone orientations were analyzed for all exon-out mutants with cac localization in active zones. The distance between cac and brp did not change if viewed from the side.
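      The projection argument can be illustrated with a toy calculation. The sketch below is not the analysis pipeline used in the study: the coordinates, the function names, and the `max_dz` tolerance are all invented for illustration. It only shows how collapsing a stack along z can make a distant punctum appear to be the nearest neighbor, whereas restricting the search to the same optical section does not.

```python
import math

# Hypothetical punctum centroids in nanometres: (x, y, z). Values are invented.
cac = (100.0, 100.0, 0.0)
brp_same_section = (160.0, 100.0, 0.0)     # 60 nm away, same optical section
brp_other_section = (110.0, 100.0, 500.0)  # 10 nm away in xy, 500 nm away in z


def xy_distance(a, b):
    """Lateral distance, ignoring z (what a z-projection effectively measures)."""
    return math.dist(a[:2], b[:2])


def nearest_projected(point, candidates):
    """Nearest neighbor after collapsing the whole stack into one plane."""
    return min(candidates, key=lambda c: xy_distance(point, c))


def nearest_same_section(point, candidates, max_dz):
    """Nearest neighbor in xy among candidates within max_dz of the point in z."""
    in_section = [c for c in candidates if abs(c[2] - point[2]) <= max_dz]
    return min(in_section, key=lambda c: xy_distance(point, c))


brp = [brp_same_section, brp_other_section]
projected = nearest_projected(cac, brp)
sectioned = nearest_same_section(cac, brp, max_dz=50.0)
print(xy_distance(cac, projected), xy_distance(cac, sectioned))
```

      In this toy case the projection pairs the cac punctum with a Brp punctum that lies in a different plane, which is the kind of artificial proximity the response describes.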

      (7) Cac clusters localize to the Brp center (e.g., Liu et al., 2011). They conclude that Cav2 localization within Brp is not affected in the cac variants (p. 8). However, their analysis is not informative regarding a potential offset between the central cac cluster and the Brp "ring". Did they/could they analyze cac localization with regard to Brp ring center localization of planar synapses, as well as Brp-ring dimensions?

      In the top views (planar) we did not find any clear offset in cac orientation to brp between genotypes. This study focuses on cac splice isoform specific localization and function. Possible effects of different cac isoforms on Brp-ring dimensions or other aspects of scaffold structure are not central to our study, in particular given that Brp puncta are clearly present even if cac is absent from the synapse (Fig. 2H), indicating that cac is not instructive for the formation of the Brp scaffold.  

      (8) Given the accelerated PSC decay/ decreased half width in dI-IIA (Fig. 5Q), I recommend reporting PSC charge in Figure 3, and PPR charge in Figures 5A-D. The charge-based PPRs of dI-IIA mutants likely resemble WT more closely than the amplitude-based PPR. In addition, miniature PSC decay kinetics should be reported, as they may contribute to altered decay kinetics. How could faster cac inactivation kinetics in response to single AP stimulation result in a decreased PSC half-width? Is there any evidence for an effect of calcium current inactivation on PSC kinetics? On a similar note, is there any evidence that AP waveform changes accelerate PSC kinetics? PSC decay kinetics are mainly determined by GluR decay kinetics/desensitization. The arguments supporting the role of cac splice isoforms in PSC kinetics outlined in the discussion section are not convincing and should be revised.

      We agree that reporting charge in figure 3 will be informative and will do so. We also understand the reviewer’s concern attributing altered PSC kinetics to presynaptic cac channel properties. We will tone down our interpretation in the discussion and list possible alterations in presynaptic AP shape or Cav2 channel kinetics as alternative explanations (not conclusions). Moreover, we will quantify postsynaptic GluRIIA abundance to test whether altered PSC kinetics are caused by altered GluRIIA expression. In our opinion, the latter is more instructive than mini decay kinetic analysis because this depends strongly on the distance of the recording electrode to the actual site of transmission in these large muscle cells.

      (9) Paired-pulse ratios (PPRs): On how many sweeps are the PPRs based? In which sequence were the intervals applied? Are PPR values based on the average of the second over the first PSC amplitudes of all sweeps, or on the PPRs of each sweep and then averaged? The latter calculation may result in spurious facilitation, and thus to the large PPRs seen in dI-IIB mutants (Kim & Alger, 2001; doi: 10.1523/JNEUROSCI.21-24-09608.2001).

      We agree that the PP protocol and analyses have to be described more precisely in the methods, and we will do so. PPR values are based on the PPRs of each sweep and then averaged. We are aware of the study of Kim and Alger 2001, but it does not affect our data interpretation because all genotypes were analyzed identically, and only the I-IIB excision resulted in the large data spread shown in figure 5.
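      The two ways of computing a PPR that the reviewer distinguishes can be written out explicitly. The sketch below uses invented amplitudes and hypothetical function names; it is not the authors' analysis code. It only shows why averaging per-sweep ratios can give a larger value than dividing the mean second amplitude by the mean first amplitude when the first amplitude varies from sweep to sweep.

```python
from statistics import mean


def ppr_mean_of_ratios(first_amps, second_amps):
    """Compute a PPR for each sweep, then average the ratios."""
    return mean(second / first for first, second in zip(first_amps, second_amps))


def ppr_ratio_of_means(first_amps, second_amps):
    """Average each pulse across sweeps, then take a single ratio."""
    return mean(second_amps) / mean(first_amps)


# Invented PSC amplitudes (arbitrary units) for three sweeps.
# The first response fluctuates; the second is constant.
first = [1.0, 2.0, 4.0]
second = [2.0, 2.0, 2.0]

per_sweep = ppr_mean_of_ratios(first, second)   # (2 + 1 + 0.5) / 3
pooled = ppr_ratio_of_means(first, second)      # 2 / (7 / 3)
print(round(per_sweep, 3), round(pooled, 3))
```

      With these numbers the per-sweep average exceeds 1 while the pooled ratio is below 1, because sweeps with a small first response contribute disproportionately large ratios.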

      (10) Could the dI-IIB phenotype be simply explained by a decrease in channel number/ release probability? To test this, I propose investigating PPRs and short-term dynamics during train stimulation at lower extracellular Ca2+ concentration in WT. The Ca2+ concentration could be titrated such that the first PSC amplitude is similar between WT and dI-IIB mutants. This experiment would test if the increased PPR/depression variability is a secondary consequence of a decrease in Ca2+ influx, or specific to the splice isoform.

      In fact, that the decreased PSC amplitude upon I-IIB excision is caused mainly by reduced channel number is precisely our interpretation (see discussion page 14, last paragraph to page 15, first paragraph). In addition, we are grateful for the reviewer’s suggestion to titrate the external calcium such that the first PSC amplitude matches the one in ΔI-IIB to test whether altered short term plasticity is solely a function of altered channel number or whether additional causes, such as altered channel properties, also play into this. We will conduct these experiments and include them in the revised manuscript.

      (11) How were the depression kinetics analyzed? How many trains were used for each cell, and how do the tau values depend on the first PSC amplitude? Time constants in the range of a few (5-10) milliseconds are not informative for train stimulations with a frequency of 1 or 10 Hz (the unit is missing in Figure 5H). Also, the data shown in Figures 5E-K suggest slower time constants than 5-10 ms. Together, are the data indeed consistent with the idea that dI-IIB does not only affect cac channel number, but also PPR/depression variability (p. 9)?

      For each animal, the amplitudes of each PSC were plotted over time and fitted with a single exponential. For depression at 1 and 10 Hz, we used one train per animal, and 5-6 animals per genotype (as reflected in the data points in Figs 5H and 5L). Given that the tau values are highly similar between control and excision of I-IIA, but ΔI-IIA tends to have larger single PSC amplitudes, differences in first PSC amplitude do not seem to skew the data (but see also response to comment 10 above). We thank the reviewer for pointing out that tau values in the range of ms are not informative at 1 and 10 Hz stimulations (Figs 5H and 5L). We mis-labeled (or did not label) the axes. The label should read seconds, not milliseconds. We apologize, and this will be corrected accordingly.

      In sum, pending the outcome of additional important control experiments for GluRIIA abundance (see response to comment 8) and titration of control PSC amplitude for the first pulse of paired pulses in ΔI-IIB (see response to comment 10) we will either modify or further support that interpretation.
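      The single-exponential fit of PSC amplitudes over time described in this response can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' fitting routine, which is not specified here. It assumes that amplitudes decay towards zero with no steady-state offset, so that a straight-line fit to the logarithm of the amplitude suffices; the function name and all numbers are hypothetical. Because the time axis is given in seconds, the fitted tau is returned in seconds as well.

```python
import math


def fit_single_exponential(times_s, amplitudes):
    """Fit A(t) = A0 * exp(-t / tau) by least squares on ln(A).

    Assumes strictly positive amplitudes that decay towards zero.
    Returns (A0, tau) with tau in the same unit as times_s.
    """
    logs = [math.log(a) for a in amplitudes]
    n = len(times_s)
    t_mean = sum(times_s) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -1.0 / slope


# Synthetic 1 Hz train: ten stimuli, one per second (values are invented).
times = [float(i) for i in range(10)]
true_a0, true_tau = 50.0, 4.0
amps = [true_a0 * math.exp(-t / true_tau) for t in times]

a0, tau = fit_single_exponential(times, amps)
print(f"A0 = {a0:.2f}, tau = {tau:.2f} s")
```

      For real trains that depress to a non-zero plateau, a fit with an additional offset term would be needed; that case is not covered by this sketch.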

      (12) The GFP-tagged I-IIA and mEOS4b-tagged I-IIB cac puncta shown in Figure 6N appear larger than the Brp puncta. Endogenously tagged cac puncta are typically smaller than Brp puncta (Gratz et al., 2019). Also, the I-IIA and I-IIB fluorescence sometimes appear to be partially non-overlapping. First, I suggest adding panels that show all three channels merged. Second, could they analyze the area and area overlap of I-IIA and I-IIB with regard to each other and to Brp, and compare it to cac-GFP? Any speculation as to how the different tags could affect localization? Finally, I recommend moving the dI-IIA and dI-IIB localization data shown in Figure 6N to an earlier figure (Figure 1 or Figure 3).

      We will show panels with all three labels merged as suggested by the reviewer. For the size of the puncta: this could be due to different numbers and types of fluorophores on the different antibodies used and thus different point spread, chromatic aberration, different laser and detector intensities, etc. We will re-analyze the data to test whether there are systematic differences in size. We do not want to speculate whether the different tags have any effect on localization precision because of the above-mentioned reasons as well as artificial differences in localization precision that can be suggested by different antibodies. We prefer not to move the figure because we believe it is informative to show our finding that active zones usually contain both splice variants together with the finding that only one splice variant is required for PHP.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this useful study, the authors report the efficacy, hematological effects, and inflammatory response of the BPaL regimen (containing bedaquiline, pretomanid, and linezolid) compared to a variation in which Linezolid is replaced with the preclinical development candidate spectinamide 1599, administered by inhalation in tuberculosis-infected mice. The authors provide convincing evidence that supports the replacement of Linezolid in the current standard of care for drug-resistant tuberculosis. However, a limitation of the work is the lack of control experiments with bedaquiline and pretomanid only, to further dissect the relevant contributions of linezolid and spectinamide in efficacy and adverse effects.

      We acknowledge a limitation in our study due to the lack of groups with monotherapy of bedaquiline and pretomanid; however, similar studies to understand the contribution of bedaquiline and pretomanid to BPaL have already been published (references #4 and #60 in revised manuscript). Our goal was to compare BPaS versus BPaL with the understanding that TB treatment requires multidrug therapy. We omitted monotherapy groups to reduce the complexity of the studies because the multidrug groups require very large numbers of animals with very intensive and complex dosing schedules. Even if B or Pa by themselves have better efficacy than the BPa or BPaL combination, patients will not be treated with only B or Pa because of the very high risk of developing drug resistance to B and/or Pa. If drug resistance develops to B or Pa, the field will lose very effective drugs against TB.

      Although the manuscript is well written overall, a re-formulation of some of the stated hypotheses and conclusions, as well as the addition of text to contextualize translatability, would improve value.

      The manuscript has been edited to address these critiques. Answers to individual critiques are below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript is an extension of previous studies by this group looking at the new drug spectinamide 1599. The authors directly compare therapy with BPaL (bedaquiline, pretomanid, linezolid) to a therapy that substitutes spectinamide for linezolid (BPaS). The Spectinamide is given by aerosol exposure and the BPaS therapy is shown to be as effective as BPaL without adverse effects. The work is rigorously performed and analyses of the immune responses are consistent with curative therapy.

      Strengths:

      (1) This group uses 2 different mouse models to show the effectiveness of the BPaS treatment.

      (2) Impressively the group demonstrates immunological correlates associated with Mtb cure with the BPaS therapy.

      (3) Linezolid is known to inhibit ribosomes and mitochondria whereas spectinamide does not. The authors clearly demonstrate the lack of adverse effects of BPaS compared to BPaL.

      Weaknesses:

      (1) Although this is not a weakness of this paper, a sentence describing how the spectinamide would be administered by aerosolization in humans would be welcomed.

      We already reported on the aerodynamic properties of dry powder spectinamide 1599 within #3 HPMC capsules and its delivery from a RS01 Plastiape inhaler device (reference #59 in revised manuscript).  To address this critique, we added a last paragraph in discussion “It is proposed that human use of spectinamides 1599 will be administered using a dry powder formulation delivered by the RS01 Plastiape dry powder inhaler" (reference #59 in revised manuscript).  

      Reviewer #2 (Public Review):

      Summary:

      Replacing linezolid (L) with the preclinical development candidate spectinamide 1599, administered by inhalation, in the BPaL standard of care regimen achieves similar efficacy, and reduces hematological changes and proinflammatory responses.

      Strengths:

      The authors not only measure efficacy but also quantify histological changes, hematological responses, and immune responses, to provide a comprehensive picture of treatment response and the benefits of the L to S substitution.

      The authors generate all data in two mouse models of TB infection, each reproducing different aspects of human histopathology.

      Extensive supplementary figures ensure transparency. 

      Weaknesses:

      The articulation of objectives and hypotheses could be improved.

      We edited to "The AEs were associated with the long-term administration of the protein synthesis inhibitor linezolid. Spectinamide 1599 (S) is also a protein synthesis inhibitor of Mycobacterium tuberculosis with an excellent safety profile, but which lacks oral bioavailability. Here, we propose to replace L in the BPaL regimen with spectinamide administered via inhalation and we demonstrate that inhaled spectinamide 1599, combined with BPa ––BPaS regimen––has similar efficacy to that of BPaL regimen while simultaneously avoiding the L-associated AEs."

      Reviewer #3 (Public Review):

      Summary:

      In this paper, the authors sought to evaluate whether the novel TB drug candidate, spectinamide 1599 (S), given via inhalation to mouse TB models, and combined with the drugs B (bedaquiline) and Pa (pretomanid), would demonstrate similar efficacy to that of BPaL regimen (where L is linezolid). Because L is associated with adverse events when given to patients long-term, and one of those is associated with myelosuppression (bone marrow toxicity) the authors also sought to assess blood parameters, effects on bone marrow, immune parameters/cell effects following treatment of mice with BPaS and BPaL. They conclude that BPaL and BPaS have equivalent efficacy in both TB models used and that BPaL resulted in weight loss and anemia (whereas BPaS did not) under the conditions tested, as well as effects on bone marrow.

      Strengths:

      The authors used two mouse models of TB that are representative of different aspects of TB in patients (which they describe well), intending to present a fuller picture of the activity of the tested drug combinations. They conducted a large body of work in these infected mice to evaluate efficacy and also to survey a wide range of parameters that could inform the effect of the treatments on bone marrow and on the immune system. The inclusion of BPa controls (in most studies) and also untreated groups led to a large amount of useful data that has been collected for the mouse models per se (untreated) as well as for BPa - in addition to the BPaS and BPaL combinations which are of particular interest to the authors. Many of these findings related to BPa, BPaL, untreated groups, etc corroborate earlier findings and the authors point this out effectively and clearly in their manuscript. To go further, in general, it is a well-written and cited article with an informative introduction.

      Weaknesses:

      The authors performed a large amount of work with the drugs given at the doses and dosing intervals stated, but at present, there is no exposure data available in the paper. It would be of great value to understand the exposures achieved in plasma at least (and in the lung if more relevant for S) in order to better understand how these relate to clinical exposures that are observed at marketed doses for B, Pa, and L as well as to understand the exposure achieved at the doses being evaluated for S. If available as historical data this could be included/cited. Considering the great attempts made to evaluate parameters that are relevant to clinical adverse events, it would add value to understand at what drug exposures effects such as anemia, weight loss, and bone marrow effects are being observed. It would also be of value to add an assessment of whether the weight loss, anemia, or bone marrow effects observed for BPaL are considered adverse, and the extent to which we can translate these effects from mouse to patient (i.e. what are the limitations of these assessments made in a mouse study?). For example, is the small weight loss seen as significant, or is it reversible? Is the magnitude of the changes in blood parameters similar to the parameters seen in patients given L? In addition, it is always challenging to interpret findings for combinations of drugs, so the addition of language to explain this would add value: for example, how confident can we be that the weight loss seen for only the BPaL group is due to L as opposed to a PK interaction leading to an elevated exposure and weight loss due to B or Pa?

      We totally agree with this critique, but the studies suggested by the reviewer are very expensive and logistically/resource intensive. Data reported in this manuscript were used as preliminary data in an R01 application to NIH-NIAID that included the studies proposed above by this reviewer. The authors are glad to report that the application received a fundable score and is currently under consideration for funding by NIH-NIAID. The summary of proposed future studies is included in the last paragraph of the discussion in this revised manuscript.

      Turning to the evaluations of activity in mouse TB models, unfortunately, the evaluations of activity in the BALB/c mouse model as well as the spleens of the Kramnik model resulted in CFU below/at the limit of detection and so, to this reviewer's understanding of the data, comparisons between BPaL and BPaS cannot be made and so the conclusion of equivalent efficacy in BALB/c is not supported with the data shown. There is no BPa control in the BALB/c study, therefore it is not possible to discern whether L or S contributed to the activity of BPaL or BPaS; it is possible that BPa would have shown the same efficacy as the 3 drug combinations. It would be valuable to conduct a study including a BPa control and with a shorter treatment time to allow comparison of BPa, BPaS, and BPaL. 

      We agree with the reviewer that these studies need to be done. Some of them were recently published by our colleague Dr. Lyons (reference #60 in revised manuscript). The studies proposed by the reviewer will be performed under a new award under consideration for funding by the NIH-NIAID; the summary of future studies is included in the last paragraph of the discussion in this revised manuscript.

      In the Kramnik lungs, as the authors rightly note, the studies do not support any contribution of S or L to BPa - i.e. the activity observed for BPa, BPaL, and BPaS did not significantly differ. Although the conclusions note equivalency of BPaL and BPaS, which is correct, it would be helpful to also include BPa in this statement;

      We edited the text and have now included BPa in this statement (line 191), as requested.

      It would be useful to conduct a study dosing for a longer period of time or assessing a relapse endpoint, where it is possible that a contribution of L and/or S may be seen - thus making a stronger argument for S contributing an equivalent efficacy to L. The same is true for the assessment of lesions - unfortunately, there was no BPa control meaning that even where equivalency is seen for BPaL and BPaS, the reader is unable to deduce whether L or S made a contribution to this activity.

      Added in the future plans in the last paragraph of discussion

      “Future studies are already under consideration for funding by NIH-NIAID to understand the pharmacokinetics of mono, binary and ternary combinations of BPaS. These studies also aim to identify the optimal dose level and dosing frequency of each regimen along with their efficacy and relapse-free sterilization potential. Studies are also planned using a model-based pharmacokinetic-pharmacodynamic (PKPD) framework, guided by an existing human BPa PKPD model (reference #61 in revised manuscript), to find allometric human dose levels, dosing frequencies and treatment durations that will inform the experimental design of future clinical studies.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Although this is not a weakness of this paper, a sentence describing how the spectinamide would be administered by aerosolization in humans would be welcomed.

      Last paragraph of discussion was added “It is proposed that human use of spectinamides 1599 will be administered using a dry powder formulation delivered by the RS01 Plastiape dry powder inhaler". We already reported on the aerodynamic properties of dry powder spectinamide 1599 within #3 HPMC capsules and delivered from a RS01 Plastiape inhaler device (reference #59 in revised manuscript)

      Reviewer #2 (Recommendations For The Authors):

      Major comments

      The Abstract lacks focus and could more clearly convey the key messages.

      Edited as requested 

      The two mouse models and why they were chosen need to be described earlier. Currently, it's covered in the first section of the Discussion, but the reader needs to understand the utility of each model in answering the questions at hand before the first results are described, either in the introduction or in the opening section of the results.

      Thank you for the suggestion; we agree. We moved the first paragraph of the discussion to the last paragraph of the Introduction.

      Line 130: Please justify the doses and dosing frequency for S. A reference to a published manuscript could suffice if compelling.

      The dosing and regimens were previously reported by our groups in references 21 and 22 in the revised manuscript:

      (21) Robertson GT, Scherman MS, Bruhn DF, Liu J, Hastings C, McNeil MR, et al. Spectinamides are effective partner agents for the treatment of tuberculosis in multiple mouse infection models. J Antimicrob Chemother. 2017;72(3):770–7.

      (22) Gonzalez-Juarrero M, Lukka PB, Wagh S, Walz A, Arab J, Pearce C, et al. Preclinical Evaluation of Inhalational Spectinamide-1599 Therapy against Tuberculosis. ACS Infect Dis. 2021;7(10):2850–63. 

      Figures 1 E to H: several "ns" are missing, please add them.

      Edited as requested 

      Line 184 to 190: suggest moving the body weight plots to a Supplemental Figure, and at least double the size of the histology images to convey the message of lines 192-203.

      Please include higher magnification insets to illustrate the histopathological findings. In that same section, please add a sentence or two describing the lesion scoring concept/method. It is a nice added feature, not widespread in the field, and deserves a brief description.

      Edited as requested.  We added detailed description for scoring method in M&M under histopathology and lesion scoring

      Line 206: please add an introductory sentence explaining why one would expect S to cause (or not) hematological disruption, and why MCHC and RDW were chosen initially (they are markers of xyz). The first part of Figure 3 legend belongs to the Methods.

      To address this critique, we added in lines #225-226 “The effect of L in the blood profile of humans and mouse has been reported (references #38-42 in revised manuscript) but the same has not been reported for S”. In lines #229-230 we added “Of 20-blood parameters evaluated, two blood parameters were affected during treatment”.

      The first part of Figure 3 legend belongs to the Methods.

      We edited Figure 3 to “During therapy of mice in Figure 1, the blood was collected at 1, 2- and 4-weeks posttreatment. The complete blood count was collected in VETSCAN® HM5 hematology analyzer (Zoetis)”.

      Line 218: please explain why the 4 blood parameters that are shown were selected, out of the 20 parameters surveyed.

      We added an explanation in line 239-240 “out 20-blood parameters evaluated, a total of four blood parameters were affected at 2 and 4-weeks-of treatment”.

      Line 243 and again Line 262 (similar to comment Line 206): please add an introductory paragraph explaining the motivation to conduct this analysis and the objective. Can the authors put the experiment in the context of their hypothesis?

      To address this critique, we added in line #235-237 “The Nix-TB trial associated the long-term administration of L within the BPaL regimen as the causative agent resulting in anemia in patients treated with the BPaL regimen (5).”

      Figure 4C (and the plasma and lung equivalent in the SI). This figure needs adequate labeling of axes: X axis = LOG CFU? Please add tick marks for all plots since log CFU is only shown for the bottom line. Y axes have no units: pg/mL as in B?

      Figure legends were edited to add (Y axis: pg/ml) and (X axis: log10 CFU).

      Line 255-256: please remove "pronounced" and "profound". There is a range of CFU reduction and cytokine reduction, from minor to major. The correlation trend is clear and those words are not needed.

      Edited as requested 

      Line 277-289, Figure 6: given the heterogeneity of a C3HeB/FeJ mouse lung (TB infected), and the very heterogeneous cell population distribution in these lungs (Fig. 6A), the validity of whole lung analysis on 2 or 3 mice (the legend should state what 1, 2 and 3 means, individual mice?) is put into question. "F4/80+ cells were observed significantly higher in BPaS compared to UnRx control": Figure S14 suggests a statistically significant difference, but nothing is said about the other cell type, which appears just as much reduced in BPaS compared to UnRx as F4/80+. Overall, sampling the whole lung for these analyses should be mentioned as a limitation in the Discussion.

      We agree with the reviewer that "visually" it appears as if other populations in addition to F4/80 differ significantly. We reran the two-way ANOVA with Tukey's test, and only the comparison between BPaS and UnRx for F4/80 is significant.

      We edited figure S16 (previously S14) to add ns for every comparison.

      Figure 6A was edited: n = 2 mice for UnRx and n = 3 mice each for BPaL and BPaS.

      Line 355-360: "The BPa and BPaL regimens altered M:E in the C3HeB/FeJ TB model by suppressing myeloid and inducing erythroid lineages" This suggests that altered M:E is not associated with L, putting into question the comparison between BPaS, BPaL, and UnRx. Can the authors comment on how M:E is altered in BPa and not in BPaS?

      Our interpretation of this result was that the addition of S in our BPaS regimen was capable of restoring the M:E ratio altered by BPa and BPaL. This interpretation was included in the main text in lines #263-264 and is now also added to the abstract.

      Line 379: discuss the limitations of working with whole lungs.

      We are sorry, but we do not understand this request. In our studies we always work with whole lungs if the expected course of histopathology/infection among lung lobes is very variable (as is the case for the C3HeB/FeJ TB model).

      Concluding paragraph: "Here we present initial results that are in line with these goals." If such a bold claim is made, there needs to be a discussion on the translatability of the route of administration and the dose of S. Otherwise, please rephrase.

      We added the following last paragraph to discussion:

      To conclude, the TB drug development field is working towards developing shorter and safer therapies with a common goal of developing new multidrug regimens of low pill burden that are accessible to patients, of short duration (ideally 2-3 months) and consist of 3-4 drugs of novel mode-of-action with proven efficacy, safety, and limited toxicity. Here we present initial results for new multidrug regimens containing inhaled spectinamide 1599 that are in line with these goals. It is proposed that human use of spectinamide 1599 will be administered using a dry powder formulation delivered by the RS01 Plastiape dry powder inhaler. We already reported on the aerodynamic properties of dry powder spectinamide 1599 within #3 HPMC capsules and delivered from a RS01 Plastiape inhaler device (reference #59 in revised manuscript). Future studies are already under consideration for funding by NIH-NIAID to understand the pharmacokinetics of mono, binary and ternary combinations of BPaS. These studies also aim to identify the optimal dose level and dosing frequency of each regimen along with their efficacy and relapse-free sterilization potential. Studies are also planned using a model-based pharmacokinetic-pharmacodynamic (PKPD) framework, guided by an existing human BPa PKPD model (references #60 and 61 in revised manuscript), to find allometric human dose levels, dosing frequencies and treatment durations that will inform the experimental design of future clinical studies.

      Minor edits

      Adverse events, not adverse effects (side effects)

      Edited as requested

      BALB/c (not Balb/c, please change throughout).

      Edited as requested

      Line 92: replace 'efficacy' with potency or activity.

      Edited as requested

      "Live" body weight: how is that different from "body weight"? Suggest deleting "live" throughout, or replace with "longitudinally recorded" if that's what is meant, although this is generally implied.

      Edited as requested

      The last line of Figure 2 legend is disconnected. 

      Line 331: delete "human".

      Edited as requested

      Reviewer #3 (Recommendations For The Authors):

      We thank the reviewer for these suggestions. The data presented in this manuscript, with 4 weeks of treatment along with monitoring of the effects of therapy on blood, bone marrow and immunity, have been submitted in an R01 application to NIH-NIAID, which has received a fundable score and is under funding consideration. All the points suggested by the reviewer(s) here are included in the research proposed in the R01 application, including manufacturing and physicochemical characterization of larger-scale batches of dry powder spectinamides and evaluation of their aerodynamic performance for human or animal use, and pharmacokinetic and efficacy studies to determine the optimal dose level and dosing frequency for new multidrug regimens containing spectinamides. These studies include mono, binary and ternary combinations of each multidrug regimen along with their efficacy and relapse-free sterilization potential. These studies will also develop PK/PD simulation-based allometric scaling to aid in human dose projections for inhalation. We hope the reviewer will understand that, all together, these studies will last 4-5 years.

      Although I truly appreciate the great efforts of the authors, I suggest that in order to better evaluate the contribution of S versus L to BPa in these models, repeat studies be run that:

      (a) include BPa groups to allow the contribution of S and L to be assessed.

      Included in the research proposed in the R01 application mentioned above.

      (b) use shorter treatment times in BALB/c to allow comparisons at end of Tx CFU above the LOD.

      We have added new data for 2 weeks of treatment with BPaL and BPaS in BALB/c mice infected with MTb that had been removed from the previous submission of this manuscript.

      (c) use longer treatment times and ideally a relapse endpoint in Kramnik to allow assessment of L and S as contributors to BPa (i.e. give a chance to see better efficacy of BPaL or BPaS versus BPa) and also measure plasma exposures of all drugs (or lung levels if this is the translatable parameter for S) to allow detection of any large DDI and also understand the translation to the clinic. Related to the safety parameters, it would be really great to understand whether or not the observations for BPaL would be labeled adverse in a toxicology study/in a clinical study, and it would be useful to include information on the magnitude of observations seen here versus in the clinic (e.g. for the hematological parameters).

      The research proposed in the R01 application mentioned above includes extensive PK, extended periods of treatment beyond 1 month (2-5 months, as needed until no culturable bacteria are recovered from organs) and, of course, relapse studies.

      Minor point: I suggest rewording "high safety profile" when describing spectinamides in the intro - or perhaps qualify the length of dosing where the drug is well tolerated

      "high safety profile" was replaced by “an acceptable safety profile”

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment  

      This important study builds on a previous publication (with partially overlapping authors), demonstrating that T. brucei has a continuous endomembrane system, which probably facilitates high rates of endocytosis. Using a range of cutting-edge approaches, the authors present compelling evidence that an actomyosin system, with the myosin TbMyo1 as the molecular motor, is localized close to the endosomal system in the bloodstream form (BSF) of Trypanosoma brucei. It shows convincingly that actin is important for the organization and integrity of the endosomal system, and that the trypanosome Myo1 is an active motor that interacts with actin and transiently associates with endosomes, but a role of Myo1 in endomembrane function in vivo was not directly demonstrated. This work should be of interest to cell biologists and microbiologists working on the cytoskeleton, and unicellular eukaryotes.

      We were delighted at the editors’ positive assessment and the reviewers’ rigorous, courteous, and constructive responses to the paper. We agree that a direct functional role for TbMyo1 in endomembrane activity was not demonstrated in the original submission, but have incorporated some new data (see new supplemental Figure S5) using the TbMyo1 RNAi cell line which are consistent with our earlier observations and interpretations.  

      Public Reviews:   

      Reviewer #1 (Public Review):  

      Using a combination of cutting-edge high-resolution approaches (expansion microscopy, SIM, and CLEM) and biochemical approaches (in vitro translocation of actin filaments, cargo uptake assays, and drug treatment), the authors revisit previous results about TbMyo1 and TbACT in the bloodstream form (BSF) of Trypanosoma brucei. They show that a great part of the myosin motor is cytoplasmic but the fraction associated with organelles is in proximity to the endosomal system. In addition, they show that TbMyo1 can move actin filaments in vitro and visualize for the first time this actomyosin system using specific antibodies, a "classical" antibody for TbMyo1, and a chromobody for actin. Finally, using latrunculin A, which sequesters G-actin and prevents F-actin assembly, the authors show the delocalization and eventually the loss of the filamentous actin signal as well as the concomitant loss of the endosomal system integrity. However, they do not assess the localization of TbMyo1 in the same conditions.  

      Overall the work is well conducted and convincing. The conclusions are not over-interpreted and are supported by the experimental results. 

      We are very grateful to Reviewer #1 for their balanced assessment. The reviewer is correct that we did not assess the localisation of TbMyo1 following latrunculin A treatment, but it is worth noting that Spitznagel et al. carried out this exact experiment in the earlier 2010 paper – we have mentioned this in the revised manuscript.  

      Reviewer #2 (Public Review):  

      Summary:  

      The study by Link et al. advances our understanding of the actomyosin system in T. brucei, focusing on the role of TbMyo1, a class I myosin, within the parasite's endosomal system. Using a combination of biochemical fractionation, in vitro motility assays, and advanced imaging techniques such as correlative light and electron microscopy (CLEM), this paper demonstrates that TbMyo1 is dynamically distributed across early and late endosomes, the cytosol, is associated with the cytoskeleton, and a fraction has an unexpected association with glycosomes. Notably, the study shows that TbMyo1 can translocate actin filaments at velocities suggesting an active role in intracellular trafficking, potentially higher than those observed for similar myosins in other cell types. This work not only elucidates the spatial dynamics of TbMyo1 within T. brucei but also suggests its broader involvement in maintaining the complex architecture of the endosomal network, underscoring the critical role of the actomyosin system in a parasite that relies on high rates of endocytosis for immune evasion. 

      Strengths:  

      A key strength of the study is its exceptional rigor and successful integration of a wide array of sophisticated techniques, such as in vitro motility assays, and advanced imaging methods, including correlative light and electron microscopy (CLEM) and immuno-electron microscopy. This combination of approaches underscores the study's comprehensive approach to examining the ultrastructural organization of the trypanosome endomembrane system. The application of functional data using inhibitors, such as latrunculin A for actin depolymerization, further strengthens the study by providing insights into the dynamics and regulatory mechanisms of the endomembrane system. This demonstrates how the actomyosin system contributes to cellular morphology and trafficking processes. Furthermore, the discovery of TbMyo1 localization to glycosomes introduces a novel aspect to the potential roles of myosin I proteins within the cell, particularly in the context of organelles analogous to peroxisomes. This observation not only broadens our understanding of myosin I functionality but also opens up new avenues for research into the cellular biology of trypanosomatids, marking a significant contribution to the field. 

      We are very pleased that the Reviewer felt the work is a significant contribution to the state of the art.  

      Weaknesses:  

      Certain limitations inherent in the study's design and scope render the narrative incomplete and make it challenging to reach definitive conclusions. One significant limitation is the reliance on spatial association data, such as colocalization of TbMyo1 with various cellular components-or the absence thereof-to infer functional relationships. Although these data suggest potential interactions, the authors do not confirm functional or direct physical interactions.  

      While TbMyo1's localization is informative, the authors do not directly demonstrate its biochemical or mechanical activities in vivo, leaving its precise role in cellular processes speculative. Direct assays that manipulate TbMyo1 levels, activity, and/or function, coupled with observations of the outcomes on cellular processes, would provide more definitive evidence of the protein's specific roles in T. brucei. A multifaceted approach, including genetic manipulations, uptake assays, kinetic trafficking experiments, and imaging, would offer a more robust framework for understanding TbMyo1's roles. This comprehensive approach would elucidate not just the "what" and "where" of TbMyo1's function but also the "how" and "why," thereby deepening our mechanistic insights into T. brucei's biology.  

      The reviewer is absolutely correct that the study lacks data on direct or indirect interactions between TbMyo1 and its intracellular partners, and this is an obvious area for future investigation. Given the generally low affinities of motor-cargo interactions, a proximity labelling approach (such has already been successfully used in studies of other myosins) would probably be the best way to proceed. 

      The reviewer is also right to highlight that a detailed mechanistic understanding of TbMyo1 function in vivo is currently lacking. We feel that this would be beyond the scope of the present work, but have included some new data using the TbMyo1 RNAi cell line (Figure S5), which are consistent with our previous findings.  

      Reviewer #3 (Public Review):  

      Summary:  

      In this work, Link and colleagues have investigated the localization and function of the actomyosin system in the parasite Trypanosoma brucei, which represents a highly divergent and streamlined version of this important cytoskeletal pathway. Using a variety of cutting-edge methods, the authors have shown that the T. brucei Myo1 homolog is a dynamic motor that can translocate actin, suggesting that it may not function as a more passive crosslinker. Using expansion microscopy, iEM, and CLEM, the authors show that MyoI localizes to the endosomal pathway, specifically the portion tasked with internalizing and targeting cargo for degradation, not the recycling endosomes. The glycosomes also appear to be associated with MyoI, which was previously not known. An actin chromobody was employed to determine the localization of filamentous actin in cells, which was correlated with the localization of Myo1. Interestingly, the pool of actomyosin was not always closely associated with the flagellar pocket region, suggesting that portions of the endolysosomal system may remain at a distance from the sole site of parasite endocytosis. Lastly, the authors used actin-perturbing drugs to show that disrupting actin causes a collapse of the endosomal system in T. brucei, which they have shown recently does not comprise distinct compartments but instead a single continuous membrane system with subdomains containing distinct Rab markers.  

      Strengths:  

      Overall, the quality of the work is extremely high. It contains a wide variety of methods, including biochemistry, biophysics, and advanced microscopy that are all well-deployed to answer the central question. The data is also well-quantitated to provide additional rigor to the results. The main premise, that actomyosin is essential for the overall structure of the T. brucei endocytic system, is well supported and is of general interest, considering how uniquely configured this pathway is in this divergent eukaryote and how important it is to the elevated rates of endocytosis that are necessary for this parasite to inhabit its host.  

      We are very pleased that the Reviewer formed such a positive impression of the work. 

      Weaknesses:  

      (1) Did the authors observe any negative effects on parasite growth or phenotypes like BigEye upon expression of the actin chromobody?  

      Excellent question! There did appear to be detrimental effects on cell morphology in some cells, and it would definitely be worth doing a time course of induction to determine how quickly chromobody levels reach their maximum. The overnight inductions used here are almost certainly excessive, and shorter induction times would be expected to minimise any detrimental effects. We have noted these points in the Discussion.  

      (2) The Garcia-Salcedo EMBO paper cited included the production of anti-actin polyclonal antibodies that appeared to work quite well. The localization pattern produced by the anti-actin polyclonals looks similar to the chromobody, with perhaps a slightly larger labeling profile that could be due to differences in imaging conditions. I feel that the anti-actin antibody labeling should be expressly mentioned in this manuscript, and perhaps could reflect differences in the F-actin vs total actin pool within cells.  

      Implemented. We have explicitly mentioned the use of the anti-actin antibody in the Garcia-Salcedo paper in the revised Results and Discussion sections.  

      (3) The authors showed that disruption of F-actin with LatA leads to disruption of the endomembrane system, which suggests that the unique configuration of this compartment in T. brucei relies on actin dynamics. What happens under conditions where endocytosis and endocytic traffic are blocked, such as 4 °C? Are there changes to the localization of the actomyosin components? 

      Another excellent question! We did not analyse the localisation of TbMyo1 and actin under temperature block conditions, but this would definitely be a key experiment to do in follow-up work.

      (4) Along these lines, the authors suggest that their LatA treatments were able to disrupt the endosomal pathway without disrupting clathrin-mediated endocytosis at the flagellar pocket. Do they believe that actin is dispensable in this process? That seems like an important point that should be stated clearly or put in greater context.  

      Whether actin plays a direct or indirect role in endocytosis would be another fascinating question for future enquiry, and we do not have the data to do more than speculate on this point. Recent work in mammalian cells (Jin et al., 2022) has suggested that actin is primarily recruited when endocytosis stalls, and it could be that a similar role is at play here. We have noted this point in the Discussion. The observation of clathrin vesicles close to the flagellar pocket membrane and clathrin patches on the flagellar pocket membrane itself in the LatA-treated cells might suggest that some endocytic activity can occur in the absence of filamentous actin. 

      Recommendations for the authors:

      Note from the Reviewing Editor:  

      During discussion, all reviewers agreed that the role of TbMyo1 in vivo in endomembrane function had not been directly demonstrated. This could be done by testing the endocytic trafficking of (for example) fluorophore-conjugated TfR and BSA in the existing Myo1 RNAi line, using wide-field microscopy. Examining the organization of the endosomes/lysosomes by thin-section EM would be even better. The actin signal detected by the chromobody tends to occupy a larger region than the MyoI. It's therefore conceivable that actin filamentation and stabilization via other actin-interacting proteins create the continuous endosomal structure, while MyoI is necessary for transport or other related processes. 

      These are all excellent points and very good suggestions. We have now incorporated new data (supplemental Figure S5) that includes BSA uptake assays in the TbMyo1 RNAi cell line and electron microscopy imaging after TbMyo1 depletion – the results are consistent with our earlier observations.   

      Reviewer #1 (Recommendations For The Authors):  

      -  Figure S2E. This panel is supposed to show the downregulation of TbMyo1 in the PCF compared to BSF but there is no loading control to support this claim. This is important because the authors mention in lines 381-383 that this finding conflicts with the previous study (Spitznagel et al., 2010). The authors also indicate in the figure legend that there is 50% less signal but there is no explanation about this quantification.   

      Good point. Equal numbers of cells were loaded in each lane, but we did not have an antibody against a protein known to be expressed at the same level in both PCF and BSF cells to use as a loading control. Using a total protein stain would have been similarly unhelpful in this context, as the proteomes of PCF and BSF cells are dissimilar. The quantification was made by direct measurement after background subtraction, but without normalisation owing to the lack of a loading control. This makes the conclusion somewhat tentative, but given the large difference in signal observed between the two samples (and the fact that this is consistent with the proteomic data obtained by Tinti and Ferguson) we feel that the conclusion is valid. We have clarified these points in the figure legend and Discussion.  

      -  It is mentioned in the discussion, as unpublished observations, that the predicted FYVE motif of TbMyo1 can bind specifically PI(3)P lipids. This is a very interesting point that would be new and would strengthen the suggested association with the endosomal system mainly based on imaging data. 

      We agree that this is – potentially – a very exciting observation and it is an obvious direction for future enquiry.  

      The data are preliminary at this stage and will form the basis of a future publication. Given that the predicted FYVE domain of TbMyo1 and the known lipid-binding activity of other class I myosins make this activity not wholly unexpected, we feel that it is acceptable at this stage to highlight these preliminary findings.  

      -  The authors use the correlation coefficient to estimate the colocalization (lines 223-226). Although they clearly explain the difference between the correlation coefficient and the co-occurrence of two signals, I wonder if it would not be clearer for the audience to have quantification of the overlapping signals. Also, it is not mentioned on which images the correlation coefficient was measured. It seems that it is from widefield images (Figures 3E and 6E), and likely from SIM images for Figure 3C but the resolution is different. Are widefield images sufficient to assess these measurements? 

      With hindsight, and given the different topological locations of TbMyo1 and the cargo proteins (cytosolic and lumenal, respectively) it would probably have been wiser to measure co-occurrence rather than correlation, but we would prefer not to repeat the entire analysis at this stage. The correlations were measured from widefield images using the procedure described in the Materials & Methods. These are obviously lower resolution than confocal or SIM images would be, but are still of value, we believe. One further point – upon re-examination of some of the TbMyo1 transferrin (Tf) and BSA data, we noticed that there are many pixels with a value of 0 for Tf/BSA and a nonzero value for TbMyo1 and vice-versa. The incidence of zero-versus-nonzero values in the two channels will have lowered the correlation coefficient, and in this sense, the correlation coefficients are giving us a hint of what the immuno-EM images later confirm: that the TbMyo1 and cargo are present in the same locations, but in different proportions. We have added this point to the discussion.  
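      The distinction between correlation and co-occurrence drawn above can be illustrated with a minimal sketch. It assumes Manders-style coefficients as the co-occurrence measure (the response does not name one), and the pixel intensities are invented for illustration; they are not data from the study. The point is only that two channels can occupy largely the same pixels while their intensities are uncorrelated or even anticorrelated.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length intensity lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def manders(xs, ys):
    """Fraction of the total xs intensity located in pixels where ys is nonzero."""
    return sum(x for x, y in zip(xs, ys) if y > 0) / sum(xs)

# Hypothetical background-subtracted pixel intensities (illustrative only).
# Several pixels are zero in one channel and nonzero in the other.
myo1 = [0, 5, 9, 2, 0, 7, 3, 0]
cargo = [4, 0, 1, 8, 0, 2, 9, 6]

r = pearson(myo1, cargo)    # intensity correlation across pixels
m1 = manders(myo1, cargo)   # share of myo1 signal in cargo-positive pixels
m2 = manders(cargo, myo1)   # share of cargo signal in myo1-positive pixels

print(f"Pearson r = {r:.2f}, M1 = {m1:.2f}, M2 = {m2:.2f}")
```

      In this toy case most of the signal in each channel lies in pixels where the other channel is also present, yet the correlation coefficient is negative, which matches the situation described above: the same locations, but different proportions.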

      -  It would be good to know if the loss of the endosomal system integrity (using EBI) is the same upon TbMyo1 depletion as in the latrunculin A treated parasites. 

      We agree! We have now included new data (Figure S5) that suggests endosomal system morphology is altered upon TbMyo1 depletion. We would predict that the effect upon TbMyo1 depletion is slower or less dramatic than upon LatA treatment (as LatA affects both actin and TbMyo1, given that TbMyo1 depends upon actin for its localisation).

      -  Conversely, it would be of interest to see how the localization of TbMyo1 changes upon latrunculin A treatment.

      This experiment was done in 2010 by Spitznagel et al., who observed a delocalisation of the TbMyo1 signal after LatA treatment. We have noted this in the Results and Discussion.

      Minor corrections:  

      -  Line 374: Figure S1 should be Figure S2. 

      Implemented (many thanks!).  

      -  Panel E of Figure S2 refers to TbMyo1 and should therefore be included in Figure S1 and not S2. 

      We would prefer not to implement this suggestion. We did struggle over the placing of this panel for exactly this reason, but as the samples were obtained as part of the experiments described in Figure S2, we felt that its placement here worked best in terms of the narrative of the manuscript.    

      -  Figure S2F: the population of TbMyo21 +Tet seems lost after 48 h although the authors mention that there is no growth defect. 

      Good eyes! We have re-added the panel, which shows that there was no growth defect in the tetracycline-treated population.  

      Reviewer #2 (Recommendations For The Authors):  

      Fig 1 vs. Figure 3: The biochemical fractionation experiments have been well-controlled, showing that 40% of TbMyo1 is found in both the cytosolic and cytoskeletal fractions, with only 20% in the organelle-associated fraction. The conclusion is supported by the experimental design, which includes controls to rule out cross-contamination between fractions. However, does this contrast with the widefield microscopy experiments, where the vast majority of the signal is in endocytic compartments and nowhere else? 

      This is a good point. There are three factors that probably explain this. First, given that the actin cytoskeleton is associated with the endosomal system, a large proportion of the material partitioning into the cytoskeleton (P2) fraction is probably localised to the endosomal system (a fun experiment would be to repeat the fractionation with addition of ATP to the extraction buffer to make the myosin dissociate and see whether more appeared in the SN2 fraction as a result). Second, the 40% of the TbMyo1 that is cytosolic is distributed throughout the entire cellular volume, whereas the material localised to the endosomes is concentrated in a much smaller space and therefore produces a stronger signal. Third, the widefield microscopy images have had brightness and contrast adjusted in order to reduce “background” signal, though this will also include cytosolic molecules. We hope these explanations are satisfactory, but would welcome any additional thoughts from either the reviewer or the community.  

      The section title 'TbMyo1 translocates filamentous actin at 130 nm/s' could mislead readers by not specifying that the findings are from an in vitro experiment with a recombinant protein, which may not fully reflect the cell's complex context. Although this detail is noted in the figure legend, incorporating it into the main text and considering a title revision would ensure clarity and accuracy.  

      Good point. Implemented – we have amended the section title to “TbMyo1 translocates filamentous actin at 130 nm/s in vitro” and the figure legend title to “TbMyo1 translocates filamentous actin in vitro”.  

      The discussion of the translocation experiment could be better phrased addressing certain limitations. The in vitro conditions might not fully capture the complexity and dynamic nature of cellular environments where multiple regulatory mechanisms, interacting partners, and cellular compartments come into play. 

      Good point, implemented. We have added a note on this to the Discussion.  

      It is puzzling that RNAi, which is widely used in T. brucei, was not used to further investigate the functional roles of TbMyo1 in Trypanosoma brucei, given that the authors already had the cell line and used it to validate the specificity of the anti-TbMyo1 antibody. RNAi could have been employed to knock down TbMyo1 expression and observe the resultant effects on actin filament dynamics and organization within the cell. This would have directly tested TbMyo1's contribution to actin translocation observed in the in vitro experiments. 

      It would obviously be interesting to carry out an in-depth characterisation of the phenotype following TbMyo1 depletion and whether this has an effect on actin dynamics. We have now included additional data (supplemental Figure S5) using the TbMyo1 RNAi cells and the results are consistent with our earlier observations and interpretations. It is worth noting too that at least for electron microscopy studies of intracellular morphology, the slower onset of an RNAi phenotype and the asynchronous replication of T. brucei populations make observation of direct (early) effects of depletion challenging – hence the preferential use of LatA here to depolymerise actin and trigger a faster phenotype.  

      I found that several declarative statements within the main text may not be fully supported by the overall evidence. I suggest modifications to present a more balanced view:

      Line 227: "The results here suggest that although the TbMyo1 distribution overlaps with that of endocytic cargo, the signals are not strongly correlated." This conclusion about the lack of strong correlation might mislead readers about the functional relationship between TbMyo1 and endocytic cargo, as colocalization does not directly imply functional interaction. 

      We would prefer not to alter this statement. It was our intention to phrase this cautiously, as we have not directly investigated the functional interplay between TbMyo1 and endocytic cargo and the subsequent sentence directs the reader to the Discussion for more consideration of this issue.    

      Line 397: "This relatively high velocity might indicate that TbMyo1 is participating in intracellular trafficking of BSF T. brucei and functioning as an active motor rather than a static tether." The statement directly infers TbMyo1's functional role from in vitro motility assay velocities without in vivo corroboration.

      We have amended the sentence in the Discussion to make it clear that it is speculative.  

      The hypothesis that cytosolic TbMyo1 adopts an auto-inhibited "foldback" configuration, drawn by analogy with findings from other studies, is intriguing. Yet, direct evidence linking this configuration to TbMyo1's function in T. brucei is absent from the data presented. 

      We have amended the sentence in the Discussion to make it clear that it is speculative. Future in vitro experiments will test this hypothesis directly.  

      The suggestion that a large cytosolic fraction of TbMyo1 indicates dynamic behavior, high turnover on organelles, and a low duty ratio is plausible but remains speculative without direct experimental evidence. Measurements of TbMyo1 turnover rates or duty ratios in T. brucei through kinetic studies would substantiate this claim with the necessary evidence.  

      We have amended the sentence in the Discussion to make it clear that it is speculative, and deleted the reference to a possible low duty ratio. Again, future in vitro experiments will measure the duty ratio of TbMyo1 using stopped-flow. 

      Reviewer #3 (Recommendations For The Authors):  

      Lines 171-172: The authors mention that MyoI could be functioning as a motor rather than a tether. The differences in myosin function have not been introduced prior to this. I would recommend explaining these differences and what it could mean for the function of the motor in the introduction to help a non-expert audience.

      Good point. Implemented.  

      Line 94-95: This phenotype only holds for the bloodstream form; procyclic forms are quite resistant to actin RNAi and MyoI RNAi. I would clarify. 

      Good point. Implemented.  

      Line 142-146: did the authors attempt to knock out the Myo21? 

      Good point. No, this was not attempted. Given the extremely low expression levels of TbMyo21 in the BSF cells we would not expect a strong phenotype, but this assumption would be worth testing. 

      Figure 3D: is there a reason why the authors chose to show the single-channel images in monochrome in this case?  

      Not especially. These panels are the only ones that show a significant overlap in the signals between the two channels (unlike the colabelling experiments with ER, Golgi), so greyscale images were used because of their higher contrast. 

      Line 397-398: I'm struggling a bit to understand how MyoI could be involved in intracellular trafficking in the endosomal compartments if the idea is that we have a continuous membrane. Some more detail as to the authors' thinking here would be useful. 

      Implemented. We have noted that this statement is speculative, and emphasised that being an active motor does not automatically mean that it is involved in intracellular traffic; it could instead be involved in manipulating endosomal membranes. We have noted too that the close proximity between TbMyo1 and the lysosome (Figures 3-5) could be important in this regard. The lysosome is not contiguous with the endosomal system, and it is possible that TbMyo1 is working as a motor to transport material (class II clathrin-coated vesicles) from the endosomal system to the lysosome.  

      Line 493-496: Does this mean that endocytosis from the FP does not require actin? This would be hard to explain considering the phenotypes observed in the original actin RNAi work. Is the BigEye phenotype observed in BSF actin RNAi and Myo1 RNAi cells due to some indirect effect? 

      It seems possible that actin is not directly or essentially involved in endocytosis, and the characterisation of the actin RNAi phenotype would be worth revisiting in this respect – we have noted this in the Discussion. Although RNAi of actin was lethal, the phenotype appears less penetrant than that seen following depletion of the essential endocytic cofactor clathrin (based on the descriptions in Garcia-Salcedo et al., 2004 and Allen et al., 2003). BigEye phenotypes occur in BSF cells whenever there is some perturbation of endomembrane trafficking and are not necessarily a direct consequence of depletion – this is why careful investigation of early timepoints following RNAi induction is critical.

    1. Author response:

      We are very appreciative of the reviewers’ assessment that we used “solid and creative” methods to provide a “convincing demonstration” of “compelling theoretical results” on a “crucial but less-explored issue” in cognitive neuroscience. We are also grateful for their thoughtful suggestions for analyses and for pointing out areas where our analysis descriptions need more clarity. While we will respond to all comments in a future response and revision, here we provide information and clarification on a few central points.

      Localization of semantic content:

      Regarding our semantic analysis, one reviewer rightly pointed out that items with a high degree of semantic association, as captured by word2vec, tend to occur in the same images, and they expressed concern that this could drive our similarity results. We wish to clarify here (and will revise the manuscript accordingly) that we excluded all pairs of co-occurring items in our word2vec semantic analysis in order to avoid this issue. Thus, our results cannot be driven by the number of images within which items co-occurred. We also agree with the reviewer who stated that “semantic information” is a nebulous term in the cognitive neurosciences, and it appears to have led to some confusion as to the nature of our claims. We take a broad view of this term, with the perspective that visual features (e.g., color, shape) can contribute to semantic content rather than necessarily competing with it. In our work, we use word2vec to identify neural representations that reflect the kind of semantic content present in word embedding models—but the conclusions we draw do not depend on these representations being devoid of visual content. That is, we do not use word2vec to examine semantic versus visual representations, but rather to narrow down the set of representations to be considered in subsequent analyses. While there are a range of legitimate views on what should be considered a “semantic” representation, our broad view, which is inclusive of visual content, along with our strategy for localizing semantic content are both standardly used in the visual neuroscience literature. 
Prior work in this literature has compared the ability of word2vec and low-level visual models to predict neural responses to natural images and found that the brain regions in which activity is accurately predicted by the models are considerably distinct: whereas a low-level visual model best predicts activity in V1, V2, and V4, word2vec performs better in more anterior regions, including in visual areas such as lateral occipital cortex (Güçlü & van Gerven, 2015, arXiv). This suggests that our effects are unlikely to be explained by overlap in the kinds of low-level visual features mentioned by the reviewers. However, the semantic content we localize and the representation of high-level visual features may indeed overlap, and this is compatible with our claims. We will do more in our revision to be explicit about our intended meaning in our use of the word “semantic” and how our approach relates to and builds on prior work in this literature.
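      The exclusion of co-occurring item pairs described above can be illustrated with a minimal sketch. It is not the analysis code used in the study: the toy embeddings, the toy neural patterns, the image-to-item sets, and the choice of Pearson correlation are all assumptions made purely for illustration.

```python
import numpy as np

def cosine_similarity_matrix(vectors):
    # Pairwise cosine similarity between row vectors (one row per item).
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T

def non_cooccurring_pairs(item_sets_per_image, n_items):
    # Boolean mask over unique item pairs (upper triangle) that never
    # appear together in any image.
    cooccur = np.zeros((n_items, n_items), dtype=bool)
    for items in item_sets_per_image:
        idx = np.array(sorted(items), dtype=int)
        cooccur[np.ix_(idx, idx)] = True
    upper = np.triu(np.ones((n_items, n_items), dtype=bool), k=1)
    return upper & ~cooccur

def semantic_neural_correlation(embeddings, neural_patterns, item_sets_per_image):
    # Correlate embedding similarity with neural pattern similarity,
    # restricted to pairs of items that never co-occur in an image.
    keep = non_cooccurring_pairs(item_sets_per_image, embeddings.shape[0])
    semantic = cosine_similarity_matrix(embeddings)[keep]
    neural = np.corrcoef(neural_patterns)[keep]
    return np.corrcoef(semantic, neural)[0, 1], keep

# Hypothetical toy data: 5 items, random "embeddings" and "neural patterns".
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 8))
neural_patterns = rng.normal(size=(5, 20))
images = [{0, 1}, {1, 2, 3}]  # items present in each hypothetical image

r, keep_mask = semantic_neural_correlation(embeddings, neural_patterns, images)
n_pairs = int(keep_mask.sum())
print(f"pairs kept: {n_pairs}, correlation: {r:.3f}")
```

      Because the mask removes every pair that shares at least one image, the resulting correlation cannot be driven by how often two items appear together.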

      Long-term representational drift:

      We want to clarify our claims regarding the representational drift analysis. One reviewer stated that, while we show evidence of representational drift, we “provide no evidence suggesting that this long-term neural representational drift reflects a drift in semantic representation.” Another reviewer said: “The inference is that this [drift] is due to an updating of knowledge about the associations each item has had with other items,” and that our finding that semantic structure remains stable within these regions seems “to contradict the claims about semantic plasticity.” The claim we intended to make, which will be unpacked more clearly in our revision, is that the neural representations underlying semantic content drift over time, even if the semantic content itself is unchanging. In other words, we do not claim that our across-session drift analyses show changes in knowledge about object associations. Indeed, one of the reasons that representational drift has recently captured the attention of neuroscientists is that the neural representations underlying certain behaviors or cognitive content appear to drift over time even when the behaviors or cognitive content remain fixed. The relational structure of the neural representations can remain stable, even if the particular neurons recruited to represent each stimulus change over time (see, e.g., the T-maze in Rule, O’Leary, & Harvey, 2019, Curr Opin Neurobiol). Here we are translating these ideas, which were developed using animal models and/or primarily focused on low-level vision, to the semantic system in humans. The neural representations we identify in our paper capture semantic information because they share a similarity structure with word2vec, and the level of similarity to word2vec remains stable over time. 
Thus, our findings provide a simple demonstration of long-term representational drift in the human semantic system akin to that reported in animals—drift in the neural semantic representations of items even as the relations between these item representations appear stable.

      Signal-to-noise variability across the MTL:

      A reviewer raised the possibility that differences between our ROIs could be driven by variability in signal-to-noise ratio (SNR) across regions, particularly within the medial temporal lobe (MTL). We looked at noise ceiling SNR brain maps for each participant, which reflect the reliability of neural responses across repetitions of the same image. Preliminary analyses indicate that SNR differences do not account for our object encoding, semantic content, representational drift, or short-term plasticity measures across the MTL.

    1. Author response:

      The following is the authors’ response to the original reviews.

      (1) Please provide more background about Rpgrip1l in the introduction, particularly past studies of the mammalian homolog of Rpgrip1l, if any. Is there any human disease associated with Rpgrip1l? Do these patients have a scoliosis phenotype? 

      • We have added more background on the human ciliopathies caused by RPGRIP1L mutations and on their occasional association with early onset scoliosis (lines 45-54 page 2 in the introduction, see cited references). 

      (2) The allele is a large deficiency of most of the coding region of rpgrip1l, can you give details in the Supplementary data of how you show this by genotyping? It would be good to explain that this mutation is most likely behaving as a null, if you have RNAseq data that supports this please note that. Otherwise, it may be incorrect to assume it is a null allele as your shorthand nomenclature states. If you do not have stronger evidence that the deficiency allele is behaving as a null allele, then please think about using an allele nomenclature as outlined at ZFIN:  

      • We now describe in the results section (Lines 72-76, page 3) the extent of the deletion in rpgrip1l ∆/∆ (22 exons out of 26), which creates an early stop at position 88 of 1256 amino acids. We have submitted our two novel mutant lines to ZFIN: rpgrip1l∆ is recorded as rpgrip1l bps1 and rpgrip1l ex4 as rpgrip1l bps2, and we provide this information in the text. Transcriptomics data confirmed that this allele behaves as a null, as the most down-regulated transcript found in the brain of rpgrip1l ∆/∆ is the rpgrip1l transcript itself (volcano plot in Fig 5A, described in the results, Line 270-71, page 9).

      • We also have provided in Supplementary Figure 1 A’ a picture of a typical genotyping gel for the rpgrip1l∆ allele. Sequences of both CRISPR guide RNAs and genotyping primers are provided in the Materials and Methods section. 

      (3) Throughout the manuscript, the authors refer to zebrafish mutant phenotypes as "juvenile scoliosis". However, scoliosis may not appear until 11 weeks post-fertilization in some animals. After 6-8 weeks of age, it would be more appropriate to describe the phenotype as "late-onset or adult scoliosis" to differentiate between other reported scoliosis mutants (such as hypomorphic or dominant negative alleles of scospondin) that start body curvatures at 3-5 dpf.

      • We think we can really qualify rpgrip1l-/- scoliosis as being a “juvenile scoliosis” as shown by the time course displayed in Fig 1B: rpgrip1l-/- scoliosis develops asynchronously between 4 weeks and 9 weeks (from 0.8 cm/1 cm to 1.6 cm, corresponding to juvenile stages according to Parichy et al, 2009 PMID: 19891001), after which it reaches a plateau. Half of the mutants are already scoliotic by 5 weeks and no scoliosis develops at adult stage, i.e., from 10 weeks on. We have acknowledged the late-onset scoliosis on page 3, line 93.

      (4) A more careful demonstration of the individual vertebrae, using magnified high-resolution pictures in Figures 1D-G, should be made to more clearly show no obvious vertebral malformations are present. 

      • We now provide a movie in Sup Data that presents 3D views of controls and mutant spines, which show the intervertebral spaces as well as vertebral shape and size. With these images we could exclude vertebral fusion and the presence of dysmorphic vertebrae.

      (5) On page 5: the authors comment on transgenic expression of RPGRIP1L in foxj1a-lineages as "rescuing" scoliosis. This terminology is confusing, as rescuing a condition could be interpreted as inducing it where it was once absent. "Suppressing" scoliosis may be a more appropriate term. 

      • We agree with the reviewers that the term “rescue” is confusing. We have replaced it with “suppress” in the title of the paragraph (line 95 page 3) and within the text (line 115 page 3).

      (6) On page 5, lines 155-156: the authors state that "Indeed, no tissue-specific rescue has been performed yet in zebrafish ciliary gene mutants". This is misleading, as ptk7a and katnb1 mutations both disrupt cilia, and transgenic reintroduction of both ptk7a and katnb1 in foxj1a- expressing lineages has previously been shown to suppress cilia defects as well as scoliosis in these models. The statement should be removed for accuracy. 

      • We agree that we were not precise enough in our sentence: when we mentioned “ciliary gene” mutants, we were referring to genes whose products are enriched within cilia and directly affecting ciliogenesis, cilia content and maintenance such as TZ or BBS genes, without encompassing genes like ptk7 and katnb1 whose products perform multiple functions on top of cilia maintenance such as Wnt signalling and remodelling of the whole microtubule network respectively. We have therefore modified our sentence by adding zebrafish ciliary “TZ and BBS” genes (line 104, page 4).

      (7) Figure 2: panels A-B: In the text (line 196) you state that cilia length was increased and that Arl13b content was severely reduced. However, Panel B shows no significant length difference between scoliotic mutants and controls. This statement and graph should be corrected for accuracy. Also, the Arl13b staining is difficult to see in panel A - can channels be split, and/or quantified? 

      • We have now split the Arl13b and glutamylated tubulin channels (Fig 2 A-C”). We think that the reduction of Arl13b staining intensity is now obvious in both straight and scoliotic mutants (compare 2A” with 2B” and 2C”). We were not able to quantify Arl13b staining using ciliary masks from glutamylated tubulin staining since the two stainings only partially overlap along the length of the cilium, Arl13b being more distal than glutamylated tubulin (Fig 2A’). 

      • Ciliary length was significantly increased (from 3.4 to 5.3 µm) in straight rpgrip1l-/-, while the values for scoliotic rpgrip1l-/- were heterogeneous (mean 4.1 µm) and therefore not significantly different from controls. This heterogeneity stems from the combined presence of both shorter and longer cilia in scoliotic fish, which we interpret as resulting from the potential breakage over time of the extra-long and thin cilia observed in scoliotic fish (as in Sup figure 1 H’’’, Sup Fig 2M’ and 2O’). 

      • We changed the text to be more accurate: we now state that cilia length increased in straight mutants, and became more heterogeneous than in controls in scoliotic mutants (line 143-144, page 5). 

      (8) Figure 3: Page 7, line 206: authors state that SCO-spondin secreting cells varied in number along SCO length. What is the evidence that these cells secrete SCO-spondin? The staining shown in Figure 3L-O appears to demonstrate extracellular accumulation of sspo:GFP. What is the evidence that this staining originated from cells in proximity to it? 

      The claim of SCO-secreting cells in Figure 2E-J is confusing. I assume you are using anatomy to infer the SCO is captured in these sections. This should be done in sspo-GFP animals (as in Figure 3) and/or dual anti-body labeling can be done to show SCO-secreting cells and cilia. 

      • We now show in Supplementary Figure 2 A-D a double staining for Sco-spondin-GFP and cilia (Ac-tub, Glu-Tub). Analyzing GFP staining along SCO length on successive sections, we identified the SCO producing cells on the diencephalic dorsal midline by their position under the posterior commissure (PC, which forms an Acetylated Tubulin positive arch), and counted the nuclei surrounded by cytoplasmic GFP from the most anterior region (24 cells wide, Sup Fig 2A-A’) to the most posterior region (4-8 cells wide, Sup Fig 2 C). 

      • Furthermore, the close-ups presented in Fig 2A’ and 2B’ make it possible to detect the cytoplasmic Sspo-GFP staining around SCO nuclei, above the region presenting primary cilia pointing towards the diencephalic ventricle, both in controls and in mutants at scoliosis onset (tail-up mutants), showing that the extracellular staining in B’ very likely originates from these cells. In these tail-up mutants, extracellular Sspo aggregates have not yet filled the whole diencephalic ventricle as in Fig 3 N and Q. 

      (9) Figure 5: Is the transcriptome data and proteomic data consistent for any transcripts and encoded protein products? Please highlight those consistent targets in both analyses. 

      • We would like to emphasize that the transcriptomic study was performed at scoliosis onset, at 5 weeks, while the proteomics analysis was performed at adult stage (3 months) so they cannot be directly compared.

      Moreover, low abundance proteins (such as centrosomal proteins and transcription factors like Foxj1a) are not detected by label-free proteomics without a prior subcellular fractionation procedure (Lindemann et al, 2017 PMID: 28282288). The extraction protocol also does not allow the purification of short neuropeptides such as Urp1-2.

      Nevertheless, we found four targets in common, now highlighted in red in Fig 5, Panel E: Anxa2, complement proteins C4 and C7a, and Stat3, all related to immune response, a GO term enriched in both studies as explained in the text (Lines 308-311, page 10). 

      The absence of many inflammation markers or immune response proteins at adult stage in scoliotic mutants most probably indicates a transient inflammatory episode at scoliosis onset, while astrogliosis, as detected by GFAP staining, increases with scoliosis severity. Along the same lines, the two-fold increase of Lcp1 cells within the tectum is present before axis curvature (in straight mutants) and disappears in scoliotic fish (Graph G in Sup Figure S5) as explained in the text, Lines 378-381, page 12.

      (10) Supplementary Figure 1 F-H: What stage/age samples were used for SEM? It is only stated that they were 'adults'. It is also stated that cilia tufts in straight rpgrip1l-/- fish were morphologically normal but 'less dense'- this was not obvious from the figure. Can density be quantified? (otherwise, data does not support the statement). Similarly, can the statement that "cilia of mono-ciliated ependymal cells showed abnormal irregular structures compared to controls, with either bulged or thinner parts" be supported with measurements/quantification? 

      • The SEM study was performed on 3-month-old fish, 3 controls and 5 mutants. We added this information in the figure legend. We could not quantify the number of ciliary tufts in the brain ventricle of the sole straight mutant that was analyzed. We therefore removed the statement that cilia were less dense in the straight mutant. Along the same lines, we now only mention that we could find mutant cilia of irregular shape (as shown in Supplementary Figure S1, F”, G’’, H’’ and H’’’; page 4, lines 124-129). 

      (11) Supplementary Figure 1D-E is never mentioned in the text. The Supplemental Figure legend also refers to a graph of cilia length that is not in the figure itself. As a result, many of the subsequent panel references are out of register. 

      • We now provide the correct version of the legend and refer to Sup Fig 1D-E in the text (page 3, lines 79-81) and its legend, page 53, lines 1616-1620.

      (12) Supplementary Figure 2A-F: Of interest, in panels C and F, it looks as though sspo:GFP is accumulating on cilia within the ventricles of rpgrip1l mutants. Can this be explored? Is it possible that abnormal aggregation of SSPO on cilia is ultimately leading to cilia loss, as you report for multi-ciliated cells surrounding the subcommissural organ? This could be a very interesting finding and possible mechanism for cilia loss.

      • Our observation of all brain sections led us to conclude that the majority of Sspo-GFP aggregates were floating within the brain ventricles of rpgrip1l-/- fish while a portion of aggregates were stuck on ventricle walls, in close contact with cilia as now shown on Supplementary figure S2 B’, outlined in legend page 54, lines 1634-1637. We agree that the contact between Sspo aggregates and cilia might have damaging consequences, either on cilia maintenance or on immune reaction induction and we now mention these possibilities in the discussion page 16, lines 524-526. These research lines will be explored in the near future.

      (13) Supplementary Figure 5A-F is not mentioned in the manuscript. Please clarify the role of Anxa2 in neuroinflammation. Is increased Anxa2 expression in rpgrip1l mutant zebrafish reduced after anti-inflammatory drug treatment? What is the expression level of anxa2 in cep290 mutant zebrafish? 

      • We now mention Supplementary Figure 5A-F in the text, page 10 lines 328-331. 

      • Unfortunately, after performing GFAP and Lcp1 staining, we did not have enough histological material to test Anxa2 staining on NACET-treated fish, nor to measure dilatation or quantify multiciliated cells. We agree this would have helped to better define which defect might be an indirect consequence of an inflammatory environment.

      • We tested the expression level of Anxa2 in cep290-/- fish. No labelling above control level was detected on cep290-/- brain sections that were positive for GFAP (N = 5). As GFAP staining in 3-4 weeks cep290-/- was not as intense and widespread as in adult rpgrip1l-/- (50% of GFAP + cells compared to 100% in the SCO for example), we concluded that Anxa2 expression may be upregulated after widespread or long-term astrogliosis/inflammation. Alternatively, Anxa2 overexpression could be specific to rpgrip1l-/- fish. 

      (14) A summary diagram at the end would be helpful for understanding the main findings. 

      We added a Graphical Abstract summarizing the main conclusions and hypotheses of this study. It is mentioned and explained in the Discussion section, p. 16 lines 504-508 and 516-529. 

      (15) The sspo-GFP zebrafish line should be listed in the STAR methods section: 

      The sspo-GFP line is now listed in the STAR methods, Scospondin-GFPut24, (Troutwine et al., 2020 PMID: 32386529), p.43, last line.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Comment 1) The CRISPR screen designed to detect regulators of damage-induced transcriptional repression is based on EU incorporation following a 7-day selection of stable knockout cells. As the authors point out, cell cycle arrest reduces rDNA transcription on its own. The screen, which assesses changes in sgRNA distribution in EU high cells, is thus likely to be dominated by factors that affect cell cycle progression. This is exemplified in the analyses of top hits related to neddylation. The screen's limitations in terms of identifying DDR effectors of damage-induced silencing need to be clearly stated. 

      Notably, our screen did identify known DNA damage response effectors of damage-induced silencing, for example ATM was a top hit, as discussed in the paper and shown in Fig. 5B. We consider that our unbiased approach had advantages because in addition to finding known DDR effectors, we uncovered novel requirements, such as the need for cells to be cycling, for transcriptional silencing in response to DNA damage. We didn’t find the canonical key cell cycle regulators in our screen. One possibility might be that cell cycle arrest or cell death upon their knock down may lead to out-competition during the seven-day treatment with doxycycline resulting in depletion from, rather than enrichment in, the targeting gRNAs from cells that maintain transcription 7 days after DNA damage.

      Comment 2) The authors confirm previous findings of DNA damage-induced repression of rDNA and histone gene transcription. The authors propose that these highly transcribed genes are more susceptible to silencing than the bulk of protein-coding genes and propose a global damage-induced signaling event that is independent of DNA breaks in cis. While this is possible, it is not demonstrated in this manuscript, and the authors should acknowledge alternative explanations. For example, the loci found to be repressed by bulk IR are highly repetitive gene arrays that tend to form nuclear sub-compartments (nucleoli, histone bodies). As such, their likelihood of being in the vicinity of DNA damage is high, at least for a fraction of gene copies. The findings, therefore, remain consistent with cis-induced silencing. Moreover, silencing may spread through the relevant nuclear sub-compartments, consistent with the formation of DNA damage compartments described recently (PMID: 37853125). 

      Our reason for “suggest(ing) that the reduced bulk abundance of nascent transcripts after IR may occur in trans as a programmed event” was the gene length-independent and IR dose-independent nature of the gene silencing (shown in Fig. 2D and Fig. 4C), not that rDNA and histone gene expression went down the most after IR. Indeed, we stated that “Those genes that were normally most highly transcribed were repressed after IR, while genes that were normally expressed at intermediate or low levels tended to be induced after IR (Fig. 4A). The mechanistic reason for this is unclear.” We thank the reviewer for the suggestion that this may be due to these genes existing in nuclear sub-compartments. We have now incorporated this possibility into the discussion.

      Other comments: 

      (1) The statement that silencing is due to transcription initiation rather than elongation is not sufficiently supported by the data. Could equivalent nascent transcript reduction not be the result of the suppression of elongating RNA PolII? To draw the proposed conclusion, the authors would need to demonstrate that RNA PolII initiation is altered, using RNA PolII ChIP and/or analysis of relevant RNA PolII phosphorylation patterns. 

      Figure 4F shows the distribution of nascent transcript reads throughout the open reading frame of the repressed genes. It shows that the transcript abundance throughout the ORF, including at the 5’ end, is reduced. This pattern is consistent with a defect in initiation. We have now clarified the description of these results to state that: “Our data is consistent with the possibility that the major mechanism for the repression of the ~1,000 protein coding genes after IR is at the transcriptional initiation stage. However, our data do not rule out that elongation may be additionally repressed after IR, as this would not be observed in our analyses due to concomitant repression of transcriptional initiation.” 

      (2) The lack of rDNA silencing in arrested cells is interesting, though the underlying mechanism remains unclear. To further corroborate the proposed defect in ATM-mediated signaling, the authors should look directly at ATM and Treacle phosphorylation upstream of TOPBP1. 

      We would love to have shown that ATM-dependent phosphorylation does not occur upon IR. We attempted this multiple times, but unfortunately the available phospho-Treacle antibodies were not suitable for rigorous analyses in our hands.

      (3) The "change in relative heights of the EU low (G1) and EU high (S/G2) peaks" in Figures 5D, 5E, and 6B is central to the proposed model of transcriptional changes being affected by cell cycle arrest. These differences should be visualized more clearly and quantified across independent experiments. Ideally, the cell cycle stage should be dissected as in Figure 2B. How do the authors envision cell cycle arrest triggers the defect in transcriptional silencing? 

      In the previous version, the last paragraph described one possibility for how rDNA may fail to be repressed in arrested cells after IR, based on the results shown in Fig. 7F and G. We have now added a paragraph in the discussion section beginning “Why would cell cycle arrest in G1 or G2 phases of the cell cycle prevent transcriptional repression of rDNA and histone genes after IR?”

      Reviewer #2:

      (1) Define ERCC normalization. 

      We apologize for this omission. We have now explained ERCC normalization and have added a citation to a commentary on spike-in controls that we wrote in 2015 for further explanation.

      (2) On page 8, the authors speculate that genes involved in immune response after IR were activated due to cytoplasmic DNA in pre-B cells. Where are these cytoplasmic DNAs from? Is there any literature indicating that 30-minute IR treatment can induce cytoplasmic DNA? 

      We have removed this speculation, as there is no evidence currently to support it.

      (3) Related to the points above, are ERVs or repetitive DNA elements up-regulated upon IR treatment, which in turn results in increased expression of genes involved in immune response? 

      The induction of cytokines as a rapid response to irradiation is a major part of the immediate early gene program induced in response to ROS (and now is explained in the manuscript).

      (4) Please explain in the results section how overall levels of transcription determined by EU are reduced after IR, and yet the number of genes with increased expression upon IR treatment is much greater than that of genes with reduced expression. 

      We have explained that while fewer genes have reduced expression after IR than have increased expression, the genes with reduced expression are extremely highly expressed to begin with. As a result, the bulk amount of transcripts is reduced after IR.
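      The arithmetic behind this point can be made concrete with a small worked example. All numbers below are hypothetical and chosen only to illustrate the reasoning; they are not taken from the data in the manuscript.

```python
# Each entry is (baseline transcript abundance, fold change after IR).
# Hypothetical values: two very highly expressed genes are halved,
# eight lowly expressed genes are doubled.
genes = [(10_000.0, 0.5)] * 2 + [(10.0, 2.0)] * 8

n_repressed = sum(1 for _, fold in genes if fold < 1)
n_induced = sum(1 for _, fold in genes if fold > 1)

bulk_before = sum(baseline for baseline, _ in genes)
bulk_after = sum(baseline * fold for baseline, fold in genes)

print(f"repressed genes: {n_repressed}, induced genes: {n_induced}")
print(f"bulk transcripts before IR: {bulk_before:.0f}, after IR: {bulk_after:.0f}")
```

      In this toy case four times as many genes are induced as repressed, yet the bulk transcript amount falls by roughly half, because the repressed genes dominate the total.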

      (5) Do cells treated with MLN4924 block the down-regulation of histone genes and ribosomal genes? 

      We have not addressed this directly. However, given that the reduction of gene expression that occurs after IR is largely due to repression of histone and rDNA genes, it is safe to speculate that these are the genes that are no longer repressed during cell cycle arrest.

      (6) Is IR-induced down-regulation of histone genes due to cell cycle changes? 

      We do not know for sure if this is the case. It is relevant to note that even without IR, histone expression per se is regulated by cell cycle changes, being lower outside of S phase – and the majority of non-arrested cells in our study are in S phase (Fig. 2B). As such, arrest of cells per se outside of S phase would be sufficient to reduce histone expression level.

      We would like to thank the reviewers again for their insightful suggestions and comments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript dissects the contribution of the CaBP 1 and 2 on the calcium current in the cochlear inner hair cells. The authors measured the calcium current inactivation from the double knock-out CaBP1 and 2 and showed that both proteins contribute to voltage-dependent and calcium-dependent inactivation. Synaptic release was reduced in the double KO. As a consequence, the authors observed a depressed activity within the auditory nerve. Taken together, this study identifies a new player that regulates the stimulation-secretion coupling in the auditory sensory cells. 

      Strengths: 

      In this study, the authors bring compelling evidence that CaBP 1 and 2 are both involved in the inactivation of the calcium current, from cellular up to system level, and by taking care to probe different experimental conditions such as different holding potentials and by rescuing the phenotype with the re-expression of CaBP2. Indeed, while changing the holding potential worsens the secretion, it completely changes the kinetics of the inactivation recovery. It alerts the reader that probing different experimental conditions that may be closer to physiology is better suited to uncovering any deleterious phenotype. This gave pretty solid results. 

      Weaknesses: 

      Although this study clearly points out that CaBP1 is involved in the calcium current inactivation, it is not clear how CaBP1 and CaBP2 act together (but this is probably beyond the scope of the study). Another point is that the authors re-express CaBP2 to largely rescue the phenotype in the double KO but no data are available to know whether the re-expression of both CaBP1 and CaBP2 would achieve a full recovery and what would be the effect of the sole re-expression of CaBP1 in the double KO.

      We would like to thank the reviewer for the appreciation of our work. We agree that the effect of the sole re-expression of CaBP1 in the double KO remains elusive and have planned to address this question in a follow-up study. 

      Reviewer #2 (Public Review): 

      Summary: 

      In the manuscript by Oestreicher et al, the authors use patch-clamp electrophysiology, immunofluorescent imaging of the cochlea, auditory function tests, and single-unit recordings of auditory afferent neurons to probe the unique properties of calcium signaling in cochlear hair cells that allow rapid and sustained neurotransmitter release. The calcium-binding proteins (CaBPs) are thought to modify the inactivation of the Cav1.3 calcium channels in IHCs that initiate vesicle fusion, reducing the calcium-dependent inactivation (CDI) of the channels to allow sustained calcium influx to support neurotransmitter release. The authors use knockout mice of Cabp1 and Cabp2 in a double knockout (Cabp1/2 DKO) to show that these molecules are required for enabling sustained calcium currents by reducing CDI and enabling proper IHC neurotransmitter release. They further support their evidence by re-introducing Cabp2 using an injection of AAV containing the Cabp2 sequence into the cochlea, which restores some of the auditory function and reduces CDI in patch-clamp recordings. 

      Strengths: 

      Overall the data is convincing that Cabp1/2 is required for reducing CDI in cochlear hair cells, allowing their sustained neurotransmitter release and sound encoding. Figures are well-prepared, recordings are careful and stats are appropriate, and the manuscript is well-written. The discussion appropriately considers aspects of the data that are not yet explained and await further experimentation.

      Weaknesses: 

      There are some sections of the manuscript that pool data from different experiments with slightly different conditions (wt data from a previous paper, different calcium concentrations, different holding voltages, tones vs clicks, etc). This makes the work harder to follow and more complicated to explain. However, the major conclusion, that cabp1 and 2 work together to reduce calcium-dependent inactivation of L-type calcium channels in cochlear inner hair cells, still holds. 

      Another weakness is that the authors used injections of AAV-containing sequences for Cabp2, but do not present data from sham surgeries. In most cases, the improvement of hearing function with AAV injection is believable and should be attributed to the cabp2 function. However, in at least one instance (Figure 4B), the results of the AAV injection experiments may be overinterpreted - the authors show that upon AAV injection, the hair cells have a much longer calcium current recovery following a large, long depolarization to inactivate the calcium channels. Without comparison to sham surgery, it is not known if this result could be a subtle result of the surgery or indeed due to the Cabp2 expression.  It would be great to see the auditory nerve recordings in AAV-injected animals that have a recovery of ABRs. However, this is a challenging experiment that requires considerable time and resources, so is not required.

      We would like to thank the reviewer for the appreciation of our work. We agree with the reviewer that sham surgery may convey more information that might benefit the interpretation of our data. The recovery experiments were very tedious and these long patch-clamp paradigms required extremely stable recordings. Based on our observations, we plan to address the recovery kinetics in more detail in the follow-up study. We consider side effects of the surgery (which may mainly affect middle ear function) and of the empty AAV vector on inner hair cell calcium current recovery rather unlikely, but we cannot exclude them. We thus added a sentence in the discussion to note this caveat. Based on previously published data on the effect of PHP.eB-Cabp2eGFP in WT animals, we expect some (mild) adverse effects on hearing from overexpression of CaBP2 and/or eGFP in the inner ear. In the future, we thus plan to further optimize the treatment. In terms of the in vivo recordings from the auditory nerve fibers of the rescued mice, we could not agree more. These are planned for the follow-up study.

      Reviewer #3 (Public Review): 

      Summary: 

      The authors attempted to unravel the role of the Ca2+-binding proteins CaBP1 and CaBP2 for the hitherto enigmatic lack of Ca2+-dependent inactivation of Ca2+ currents in sensory inner hair cells (IHCs). As Ca2+ currents through Cav1.3 channels are crucial for exocytosis, the lack of inactivation of those Ca2+ currents is essential for the indefatigable sound encoding by IHCs. Using a deaf mouse model lacking both CaBP1 and CaBP2, the authors convincingly demonstrate that both CaBP1 and CaBP2 together confer a lack of inactivation, with CaBP2 being far more effective. This is surprising given the mild phenotype of the single knockouts, which has been published by the authors before. Readmission of CaBP2 through viral gene transfer into the inner ear of double-knockout mice largely restored hearing function, normal Ca2+ current properties, and exocytosis. 

      Strengths: 

      (1) In vitro electrophysiology: perforated patch-clamp recordings of Ca2+/Ba2+ currents of inner hair cells (IHCs) from 3-4 week-old mice - very difficult recordings - necessary to not interfere with intracellular Ca2+ buffers, including CaBP1 and CaBP2. 

      (2) Capacitance (exocytosis) recordings from IHCs in perforated patch mode. 

      (3) The insight that a negative holding potential might underestimate the impact of lack of CaBP1/2 on the inactivation of ICa in IHCs. As the physiological holding potential is much more positive than a preferred holding potential in patch clamp experiments it has a strong impact on inactivation in the pauses between depolarization mimicking receptor potentials. This truly advances our thinking about the stimulation of IHCs and accumulating inactivation of the Cav1.3 channels. 

      (4) Insight that the voltage sine method with usual voltage excursions (35 mV) to determine the membrane capacitance (for exocytosis measurements) also favors the inactivated state of Cav1.3 channels 

      (5) Use of double ko mice (for both CaBP1 and CaBP2, DKO) and use of DKO with virally injected CaBP2eGFP into the inner ear. 

      (6) Use of DKO animals/IHCs/SGNs after virus-mediated CaBP2 gene transfer shows a great amount of rescue of the normal ICa inactivation phenotype.

      (7) In vivo measurements of SGN AP responses to sound, which is highly demanding. 

      (8) In vivo measurements of hearing thresholds, DPOAE characteristics, and ABR wave I amplitudes/latencies of DKO mice and DKO+injected mice compared to WT mice. 

      Very thorough analysis and presentation of the data, excellent statistical analysis.

      The authors achieved their aims. Their results fully support their conclusions. The methods used by the authors are state-of-the-art. 

      The impacts on the field are the following:

      Regulation of inactivation of Cav1.3 currents is crucial for the persistent functioning of Cav1.3 channels in sensory transduction. 

      The findings of the authors better explain the phenotype of the human autosomal recessive DFNB93, which is based on the malfunction of CaBP2. 

      Future work - by the authors or others - should address the molecular mechanisms of the interaction of CaBP1 and 2 in regulating Cav1.3 inactivation. 

      Weaknesses: 

      I do not see weaknesses. 

      What is not explained (but was not the aim of the authors) is how the CaBPs 1 and 2 interact with the Cav1.3 channels and with each other to reduce CDI. Also unexplained is why DFNB93, which is based on mutation of the CaBP2 gene, leads to a severe phenotype in humans, in contrast to the phenotype of the CaBP2 ko mouse.

      We would like to thank the reviewer for the appreciation of our work and the amount of effort that went into these experiments. These are questions that we are asking ourselves as well and would like to address in the future.

      Recommendations for the authors:

      Reviewing editor: 

      In the Introduction, the authors may also mention that Ca2+-dependent and voltage-dependent inactivation of L-type Ca channels has been reported at ribbon synapses of retinal bipolar cells (see von Gersdorff & Matthews, J Neurosci. 1996, 16(1):115-122). These are critical retinal interneurons involved in the continuous exocytosis of synaptic vesicles onto retinal ganglion cells.

      We would like to thank the reviewing editor for pointing that out; we have added the reference in the revised version of the manuscript.

      Reviewer #1 (Recommendations For The Authors): 

      Conditions worsen with age but no numbers regarding the threshold shift are provided. 

      For better readability, we have now included click threshold values for both genotypes and age groups in the results section of the MS text.

      Do the authors correlate the re-expression level of CaBP2 using GFP to the rescuing phenotype (for exocytosis or BK channels immunostaining)?

      The restoration of BK expression in the virus-treated IHC was a side observation of our study, which was not performed in sufficient replicates for proper quantification. In the future, we will address this question in greater detail, possibly with improved viral constructs. In a previous study, we attempted to correlate eGFP fluorescence intensity with residual depolarization-evoked calcium current in CaBP2-injected IHC of Cabp2 single KO animals. At that time, we were unable to establish a convincing correlation. This could be related to (i) large variability in the data, possibly requiring much larger datasets to observe a potential correlation above the noise, (ii) variable imaging conditions from prep to prep, or (iii) additional parameters that could influence the outcome of the current rescue, e.g. uncontrolled expression of the transgene. However, we did analyse the correlation between ABR click thresholds and mean IHC eGFP fluorescence in another, preliminary set of data that included different viruses at different titres. There, we were able to observe a relatively good correlation. Interestingly, some of the highest expression levels resulted in poorer threshold recovery, which could indicate harmful overexpression. Moreover, the correlation was only detected when the difference between the mean eGFP expression levels per organ was large. Furthermore, significantly less efficient ABR threshold recovery was observed in the non-injected contralateral ears, which showed significantly lower viral expression of the transgene. In our follow-up study, we will investigate the dose dependence of the rescue in more detail.

      Reviewer #2 (Recommendations For The Authors): 

      -  There are two paragraphs in the results text about supplemental figure #2, which suggests that it should be moved to the main figures. 

      We would like to thank the reviewer for this suggestion. Figure S2 has now been moved to the main figures (as current Figure 5) and has been modified to accommodate the BK cluster analysis panel. The histogram with the number of ribbon synapses was removed as the data was redundant with the numbers given in the MS text.  

      -  Overall it is hard to distinguish between dark blue and black in many figures, including the dual-color asterisks.

      To improve the readability and clarity of the figures, we exchanged dark blue with magenta. Dual-color asterisks in Fig. 3 were changed to single-color asterisks, and what they refer to is explained in the figure legend.

      -  Figure 4 legend - there is a mis-spelling of cabp in the fourth line from the bottom. 

      -  Figure 4 legend - the last line does not make sense - describes recovery as being both 'much faster' and 'slowest'.

      -  Figure 6 title - consider removing 'nearly blocked' and replacing it with 'impaired'.

      We would like to thank the reviewer for noticing these mistakes, which have been corrected in the revised version as suggested.

      -  The calculations of VDI and CDI could be better explained, specifically detailing that VDI is calculated first from currents using barium as a divalent, followed by the calculation of CDI. 

      We included an explanatory sentence in the results section as suggested and are additionally referring the readers to the methods section for the mathematical formulas.

      -  Why were two different tests (one parametric and one non-parametric) used for the Figure 3B data? 

      We performed a point-by-point comparison of the data. The choice of test was based on the distribution and the variance of the data points. We have now opted for a unified test, the t test with Welch correction, which assumes that samples come from normally distributed populations but makes no assumption about equal variances. The outcomes of these tests were similar.
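
      For readers unfamiliar with the Welch correction, the following is a minimal sketch of how the test statistic and its degrees of freedom are computed. It is an illustration only, not the analysis code used for Figure 3B; the function name `welch_t` and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test.

    Illustrative helper (hypothetical name), not the authors' analysis code.
    """
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean; variances are NOT pooled.
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical samples with clearly unequal variances.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_a, group_b)
print(t_stat, dof)
```

      Because the variances are not pooled, the degrees of freedom are generally non-integer and fall below the value used by the classical equal-variance t test when the two sample variances differ.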

      -  The much broader tuning of the auditory nerve fibers is interesting, consider including this in a figure. 

      For recording tuning curves, we use an automated algorithm which adapts the tone burst intensity and frequency depending on the preceding results. The threshold criterion is an increase of spiking by 20 Hz above spontaneous rate. This routine works fairly well in wild-type animals. However, DKO SGNs typically had very high thresholds at >80 dB across all frequencies, which can partly be explained by the fact that they had very low spike rates and did not reach that criterion. Besides tuning curve runs, we also tried systematic frequency sweeps and manual frequency control to determine a best frequency, followed by a rate-intensity function at that frequency to determine the “best threshold”.
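
      The threshold criterion described above can be sketched as follows. This is a simplified illustration that applies the 20 Hz criterion to a fixed sweep of sound levels; it does not reproduce the adaptive tracking of the actual recording routine, and the function name and all rate values are hypothetical.

```python
def rate_threshold(levels_db, rates_hz, spont_rate_hz, criterion_hz=20.0):
    """Return the lowest level whose driven rate reaches spontaneous rate
    plus the criterion, or None if the criterion is never reached.

    Simplified, non-adaptive illustration (hypothetical helper).
    """
    for level, rate in sorted(zip(levels_db, rates_hz)):
        if rate >= spont_rate_hz + criterion_hz:
            return level
    return None


# Hypothetical wild-type-like unit: the rate rises steeply with level.
wt_threshold = rate_threshold([10, 20, 30, 40], [5, 12, 30, 80], spont_rate_hz=5)

# Hypothetical low-rate unit: the rate never exceeds spontaneous rate by 20 Hz,
# so no threshold is found within the tested range.
low_rate_threshold = rate_threshold([60, 70, 80, 90], [1, 3, 8, 15], spont_rate_hz=1)

print(wt_threshold, low_rate_threshold)
```

      A unit whose evoked rate stays low at all levels yields no threshold under a fixed rate criterion, which is one way a very high apparent threshold can arise independently of cochlear sensitivity.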

      All this was difficult, because in the DKO SGNs, sound threshold detection was challenged by the strong dependence of spiking on the duration of the preceding silent interval. A preceding stimulus outside the frequency response area or below the activation threshold of the SGN would thus improve spiking by allowing for longer recovery, while a preceding efficient stimulus would reduce it. Thus, the sound threshold determined in a rate level sweep varied depending on the interstimulus interval and possibly even on the (randomized) order at which the intensities were played. 

      A meaningful threshold measure would require long silent interstimulus intervals, i.e. a long recording time. As tuning curves require multiple threshold measures, it seemed impossible to obtain a useful dataset at high quality. As we deemed the spike rate dependence on interstimulus intervals more important than the tuning, we instead focused on tone burst responses acquired at frequency/intensity combinations at which the hair cells and their synapses were maximally activated. In wild-types, these would be tone bursts at characteristic frequency or noise bursts in the saturated part of the rate-intensity function, which typically has a dynamic range of 10-25 dB. As we assume (based on DPOAE) that cochlear micromechanics and amplification are mostly normal in the DKOs, we hypothesize that the sensitivity and dynamic range of basilar membrane motion and inner hair cell transduction are normal and that the increase in single unit thresholds and loss of sharp tuning are another readout of synaptic dysfunction.

      - Figure S2 - please show separate panels for each channel, it is very difficult to make out the changes by eye in the merged panels. 

      Done.  

      - Figure S2 G - the results text stated that the BK channel clusters 'appeared' smaller - why was this not measured? 

      We have performed additional experiments to enable proper analysis of the BK channel clusters. The analysed data show that the BK clusters are considerably larger and more abundant in the WT than in CaBP1/2-deficient IHCs of approx. 4-week-old mice. The results of the analysis are included in the immunohistochemistry figure (now Fig. 5) and are discussed further in the results section.

      Reviewer #3 (Recommendations For The Authors): 

      I have only a few minor points on the MS: 

      (1) Some labels in Figure 1 are too small and hard to read, e.g. y-axis in B-F. Wherever you use subscripts on the axes, the labeling needs to be larger.

      (2) Fig. 1A: the colors for CaM and CaBP1.2 are too similar, at least on my printout. Please use more distant colors.

      (3) Reference 24 should be corrected (no longer in press).

      These points have been addressed in the revised version of the MS.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer #1:

      Comment 1:

      Summary:

      The authors sought to investigate the associations of age at breast cancer onset with the incidence of myocardial infarction (MI) and heart failure (HF). They employed a secondary data analysis of the UK Biobank. They used descriptive and inferential analysis, including Cox proportional hazards models, to investigate the associations. Propensity score matching was also used. They found that among participants with breast cancer, younger onset age was significantly associated with elevated risks of MI (HR=1.36, 95% CI: 1.19 to 1.56, P<0.001) and HF (HR=1.31, 95% CI: 1.18 to 1.46, P<0.001). They reported similar findings after propensity matching.

      Strengths:

      The use of a large dataset is a strength of the study as the study is well-powered to detect differences. Reporting both the unmatched and the propensity-matched estimates was also important for statistical inference.

      Weaknesses:

      Despite the merits of the paper, readers may get confused as to whether authors are referring to “age at breast cancer onset” or “age at breast cancer diagnosis”. I suppose the title refers to the latter, in which case it will be best to be consistent in using “age at breast cancer diagnosis” throughout the manuscript. I would recommend a revision to the title to make it explicit that the authors are referring to “age at breast cancer diagnosis”.

      Thank you for your nice comments and suggestions. Yes, as you mentioned, in this study, we focused on age at breast cancer diagnosis, which was obtained from the cancer registry data in the UK Biobank and was used in all the analyses. We agree with you that it would be better to consistently use “age at diagnosis of breast cancer” throughout the manuscript for a better understanding; therefore, we have replaced “age at breast cancer onset” with “age at diagnosis of breast cancer”.

      Change in the manuscript:

      “Age at breast cancer onset” was replaced with “age at diagnosis of breast cancer” in the title and throughout the manuscript.

      Recommendations For The Authors:

      Kindly review the references for the location of the full stop. Putting the full stop at the end of the parenthesis makes reading smoother than its current form, as it is difficult to know when the new sentence begins.

      Thank you for your suggestion. We have made revisions to the location of the full stop next to a reference.

      Change in the manuscript:

      The full stop was put at the end of the parenthesis of a reference throughout the manuscript.

      Response to Reviewer #2:

      Comment 1:

      This is a well-presented large analysis from the UK Biobank of nearly 250,000 female adults. The authors examined the associations of breast cancer diagnosis with incident myocardial infarction and heart failure by different onset age groups. Based on results from a series of statistical analyses, the authors concluded that younger onset age of breast cancer was associated with myocardial infarction and heart failure, highlighting the necessity of careful monitoring of cardiovascular status in women diagnosed with breast cancer, especially those younger ones.

      Comments to consider:

      It’s thoughtful for the authors to have included and adjusted for menopausal status, breast cancer surgery, and hormone replacement therapy in their sensitivity analysis. It would be informative if the authors presented the number and percentages of menopause and cancer treatments.

      Thank you for your comments. As suggested, we have provided more detailed information on the number and percentage of menopausal status and breast cancer treatments.

      Change in the manuscript:

      Page 11, Lines 208 to 211: added “Among participants with breast cancer, 11 460 (70.6%) participants were postmenopausal, 14 255 (87.6%) participants had undergone breast cancer surgery, and 6 784 (41.8%) participants had received hormone replacement therapy.”

      Change in the supplementary material:

      The number and percentage of menopausal status, breast cancer surgery, and hormone replacement therapy were added to Table S13.

      a Adjusted for age, ethnicity, education, current smoking, current drinking, obesity, exercise, low-density lipoprotein cholesterol, depressed mood, hypertension, diabetes, antihypertensive drug use, antidiabetic drug use, statin use, menopausal status, breast cancer surgery, and hormone replacement therapy.

      HR, hazard ratio; CI, confidence interval.

      Comment 2:

      The analytical baseline used for follow-up should be pointed out in the methods section. It’s confusing whether the analytic baseline was defined as the study baseline or the time at breast cancer diagnosis.

      We apologize for the confusion. In this study, the analytical baseline used for follow-up was defined as the baseline of UK Biobank (2006-2010) and we have pointed it out in the methods section as suggested.

      Change in the manuscript:

      Page 9, Lines 165 to 166: added: “The analytical baseline used for follow-up was defined as the baseline of UK Biobank (2006-2010).”

      Comment 3:

      Did the older onset age group have a longer follow-up duration? Could the authors provide information on the length of follow-up by age of onset in Supplementary Table S4? It would give the readers more information regarding different age groups.

      Thank you for your question. We compared the duration of follow-up among the three diagnosis age groups and found that although the durations were quite similar (as shown in Table S4), statistical analysis revealed a significant difference, with the older diagnosis age group showing a longer follow-up duration (P for Kruskal-Wallis test <0.001). This is understandable because, with large sample sizes, even a slight difference can reach statistical significance. Following your suggestion, we have added information on the length of follow-up by age at diagnosis in Supplementary Table S4.
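
      The point about sample size can be illustrated with a minimal sketch of the Kruskal-Wallis H statistic (with the standard tie correction). The code below is an illustration only, not the analysis performed on the UK Biobank data; the function name and the follow-up values are hypothetical. It assumes that the pooled values are not all identical.

```python
from collections import Counter


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    Illustrative helper (hypothetical name); assumes at least two distinct
    values across the pooled data.
    """
    counts = Counter(x for g in groups for x in g)
    n = sum(counts.values())
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    start = 1
    for value in sorted(counts):
        c = counts[value]
        rank_of[value] = start + (c - 1) / 2
        start += c
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # Correction factor for tied values.
    tie_term = sum(c ** 3 - c for c in counts.values())
    return h / (1 - tie_term / (n ** 3 - n))


# Hypothetical follow-up times (years) with a small shift between groups.
small_a, small_b = [11, 12, 13], [12, 13, 14]
h_small = kruskal_wallis_h(small_a, small_b)
# Same distributions, 50 times as many participants per group.
h_large = kruskal_wallis_h(small_a * 50, small_b * 50)
print(h_small, h_large)
```

      With the distributions held fixed, H increases as the groups grow, so a difference that is negligible in practical terms can still yield a very small P value in a cohort of this size.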

      Change in the supplementary material:

      Added the median and interquartile range of follow-up in Supplementary Table S4.

      The results are presented as the mean ± standard deviation, or No. (%).

      a The effect sizes are standardized mean differences for continuous outcomes and the Phi coefficient for dichotomous outcomes.

      LDL-C, low-density lipoprotein cholesterol.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This study addresses the temporal patterning of a specific Drosophila CNS neuroblast lineage, focusing on its larval development. They find that a temporal cascade, involving the Imp and Syp genes, changes the fate of one daughter cell/branch, from glioblast (GB) to programmed cell death (PCD), as well as gates the decommissioning of the NB at the end of neurogenesis.

      I believe there are some inaccuracies in this summary. We address temporal patterning during larval and pupal stages until the adult stage. The Imp and Syp genes change the fate of one daughter cell/branch from survival to programmed cell death (PCD). The change from glioblast (GB) to PCD, which occurs at an early time point, is not addressed here. The main point of the paper is missing:

      • Last-born MNs undergo apoptosis due to their failure to express a functional TF code, and this code is post-transcriptionally regulated by the opposite expression of Imp and Syp in immature MNs.

      Reviewer #2 (Public Review):

      Summary:

      Guan and colleagues address the question of how a single neuroblast produces a defined number of progeny, and what influences its decommissioning. The focus of the experiments are two well-studied RNA-binding proteins: Imp and Syp. The Authors find that these factors play an important role in determining the number of neurons in their preferred model system of VNC motor neurons coming from a single lineage (LinA/15) by separate functions taking place at specific stages of development of this lineage: influencing the life-span of the LinA neuroblast to control its timely decommissioning and functioning in the Late-born post-mitotic neurons to influence cell death after the appropriate number of progeny is generated. The post-mitotic role of Imp/Syp in regulating programmed-cell death (PCD) is also correlated with a specific code of key transcription factors that are suspected to influence neuronal identity, linking the fate of neuronal survival with its specification. This paper addresses a wide scope of phenotypes related to the same factors, thus providing an intriguing demonstration of how the nervous system is constructed by context-specific changes in key developmental regulators.

      The bulk of conclusions drawn by the authors are supported by careful experimental evidence, and the findings are a useful addition to an important topic in developmental neuroscience.

      I could not summarize the paper better.

      Strengths:

      A major strength is the use of a genetic labeling tool that allows the authors to specifically analyze and manipulate one neuronal lineage. This allows for simultaneous study of both the progenitors and post-mitotic progeny. As a result, the paper conveys a lot of useful information for this particular neuronal lineage. Furthermore, addressing the association of cell fate specification, taking advantage of this lab's extensive prior work in the system, with developmentally-regulated programmed cell death is an important contribution to the field.

      Beyond Imp/Syp, additional characterization of this model system is provided in characterizing a previously unrecognized death of a hemilineage in early-born neurons.

      Thanks!

      Weaknesses:

      The main observation that distinguishes this study from others that have investigated Imp/Syp in the fly nervous system is the role played in late-born post-mitotic neurons to regulate programmed cell death. This is an important and plausible (based on the presented findings) newly discovered role for these proteins. However, the precision of experiments is not particularly strong, which limits the authors' claims. The genetic strategy used to manipulate Imp/Syp or the TF code appears to be done throughout the entire lineage, or all neuronal progeny, and not restricted to only the late-born cells. Can the authors rule out survival of the early-born hemi-lineage normally fated to die? Therefore statements such as this:

      “To further investigate this possibility, we used the MARCM technique to change the TF code of last-born MNs without affecting the expression of Imp and Syp” should be qualified to specify that the result is obtained by misexpressing these factors throughout the entire lineage.

      We agree that our genetic manipulations affect the entire lineage or all neuronal progeny. We do not have genetic tools to gain such precision. We have changed our descriptions to specify the entire lineage or all neuronal progeny. As the reviewer raised, we were also concerned about the possibility that the overexpression of Imp or knockdown of Syp could induce the survival of the early-born hemilineage. We have two experiments that rule out this possibility:

      (1) In late LL3 larvae, Imp OE or syp MARCM clones do not change the number of cells (see Guan et al., 2022), indicating that the hemilineage that dies by PCD is not affected. If Imp or Syp played a role in the survival of the hemilineage, we would see at least a 50% increase in the number of MNs at this stage.

      (2) The MARCM experiment using the VGlut driver to overexpress P35 or Imp allows us to manipulate only elav+ VGlut+ neurons. The hemilineage removed by PCD is elav- VGlut- and is not affected by this experiment. Consequently, the increase in MNs in adults with genetic manipulation can only be the result of the survival of the other hemilineage (elav+, VGlut+). Moreover, this experiment shows an increase in the number of neurons in the adult but not in LL3, demonstrating that the hemilineage (elav- VGlut-) is still removed by PCD with this genetic manipulation.

      The authors make an observation that differs from other systems in which Imp/Syp have been studied: that the expression of the two proteins appears to be independent and not influenced by cross-regulation. However there is a lack of investigation as to what effect this may have on how Imp/Syp regulate temporal identity. A key implication of the previously observed cross-regulation in the fly mushroom body is that the ratio of Imp/Syp could change over the life of the NB which would permit different neuronal identities. Without cross-regulation, do the authors still observe a gradient in the expression pattern of time? Because the data is presented with Imp and Syp stained in different brain samples, and without quantification across different stages, this is unclear. The authors use the term 'gradient' but changes in levels of these factors are not evident from the presented data.

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time using smFISH. We have also quantified the relative expression of Imp and Syp protein in the NB over time by co-immunostaining. Additionally, we quantified the relative expression of Imp and Syp protein in postmitotic neurons as a function of their birth order in late LL3 larvae. All these data show opposite temporal gradients of Imp and Syp in the NB and opposite spatial gradients in immature neurons according to their birth order (Figure 4). How these gradients are established in our system remains to be elucidated.

      Reviewer #3 (Public Review):

      This study by Guan and co-workers focuses on a model neuronal lineage in the developing Drosophila nervous system, revealing interesting aspects about: a) the generation of supernumerary cells, later destined for apoptosis; and, b) new insights into the mechanisms that regulate this process. The two RNA-binding proteins, Imp and Syp, are shown to be expressed in temporally largely complementary patterns, their expression defining early vs later born neurons in this lineage, and thus also regulating the apoptotic elimination. Moreover, neuronal 'fate' transcription factors that are downstream of Imp and signatures of early-born neurons, can also be sufficient to convert later born cells to an earlier 'fate', including survival.

      The authors provide solid evidence for most of their statements, including the temporal windows during which the early and the later-born motoneurons are generated by this model lineage, how this relates to patterns of cell death by apoptosis and that mis-expression of early-born transcription factors in later-born cells can be sufficient to block apoptosis (part of, and perhaps indicative of the late-born identity).

      Other studies have previously outlined analogous, mutually antagonistic roles for Imp and Syp during nervous system development in Drosophila, in different parts and at different stages, with which the working model of this study aligns.

      Overall, this study adds to and extends current working models and evidence on the developmental mechanisms that underlie temporal cell fate decisions.

      I could not summarize the paper better.

      Reviewer #1 (Recommendations For The Authors):

      While this is an interesting topic, I raised two issues in my original review.

      (1) Against the backdrop of numerous previous studies linking many developmental regulators, including tTFs, to programmed cell death in the developing CNS, which in several cases have involved identifying key PCD genes and decoding the molecular regulatory interplay between regulators and PCD genes, this study does not provide any new insight into the regulation of developmental PCD in the CNS.

      The authors have not added any new data to address this shortcoming.

      I agree with the reviewer that we did not attempt to link Imp/Syp with the temporal transcription factor (tTF) cascade or spatial selectors such as Hox genes. However, this decision was intentional as our primary focus was on studying immature MNs. It is worth noting that the decommissioning of NBs by autophagic cell death or terminal differentiation, which is mediated by Imp/Syp in other lineages, has not been correlated with tTFs or spatial selectors. Although we have not directly examined the involvement of the hb + sv > kr > pdm > cas > cas-svp > Grh cascade in the decommissioning of the Lin A neuroblast, our preliminary data indicate that Hb, Sv, Pdm, and Cas are not expressed in the Lin A NB, while Grh is consistently expressed in the NB (Wenyue et al., 2022). Thus, it is unlikely that this particular tTF cascade is implicated in Lin A neuroblast decommissioning. In contrast, spatial selectors, such as the Hox gene Antp, play an opposing role compared to HOX transcription factors in abdominal NBs. In the Lin A lineage, Antp promotes survival (Baek, Enriquez, & Mann, 2013). Here, to avoid repeating what has already been described in the literature, we focused on the role of Imp/Syp in postmitotic neurons and revealed that the precise elimination of MNs is linked to the control of TFs expressed in the MNs.

      (2) I raised the issue that it is unclear if Imp/Syp acts in the NB, and/or in IMC/GMC, and/or in the daughter cells generated from these.

      I agree with the reviewer's concern regarding the unclear function of Imp/Syp, i.e., whether it acts in the NB, IMC/GMC, or daughter cells. To address this, one possible approach would be to attempt rescuing Imp and Syp mutants by transgenic expression in specific cell types, such as NBs, IMC/GMC, or GB/daughter cells. However, we have not conducted such experiments as we were skeptical about the outcome. Previously published work has used drivers expressed in NBs, IMC/GMC, or postmitotic neurons to decipher the function of a gene in a specific cell type, but the results of these experiments must be interpreted with caution. Using NB/GMC drivers to study gene function can lead to effects not only in the NB but also in its progeny, including GMCs or postmitotic neurons, due to the perdurance and stability of the Gal4 and UAS-gene expression system. For instance, dpn-Gal4 UAS-GFP not only labels the NB but also many of its progeny, even though Dpn is only expressed in NBs. And elav-Gal4 is expressed in the NB and GMCs.

      However, our overexpression of Imp in immature neurons using the VGlut driver demonstrates that Imp promotes cell survival through an autonomous function in these neurons. This driver is only expressed in postmitotic neurons (elav+) and not in the NB, IMC/GMC, or in the hemilineage eliminated by cell death (elav- VGlut-).

      Reviewer #2 (Recommendations For The Authors):

      Oddly knockdown of Imp in the neuroblast (Fig. 5D) only led to death at 8h APF, when Imp is no longer expressed. Do the authors have an explanation as to how the stem cell can survive until this point? A discussion would be helpful.

      The simplest explanation is the incomplete efficiency of the RNAi. The imp-/- MARCM clones (Guan et al., 2022) lead to a stronger reduction of MNs in LL3.

      A simple experiment I would recommend is to repeat the antibody stainings of staged larvae/pupae (Fig. 4) having the anti-Imp/Syp antibodies in the same brain sample, and perhaps a quantification of the ratio in the NB. Given that the species in which the ABs were raised seem compatible, this should be feasible. As it stands now, there is no indication of whether the ratio of Imp vs Syp changes over time.

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in the NB over time and quantified the relative expression of Imp and Syp proteins in postmitotic neurons as a function of their birth in late LL3 larvae. How these gradients are established in our system still remains to be identified.

      Minor errors/suggestions:

      Fig 4. Time legend at the top goes A, B, C, E, F (no D). So it doesn't match the panels below

      Yes, we have made the corrections.

      Sentence repeated in Intro:

      The process of terminating NB neurogenesis through autophagic cell death or terminal differentiation is commonly referred to as decommissioning.

      Yes, corrections have been made.

      In Figure 1 they say 'TYPE IB' and in Figure 2 they say 'TYPE 1B'.

      We have changed it to type 1b.

      In Fig2A-It's hard to see lack of Elav and Fig2G-It's hard to see presence of Dcp1. Panels could be adjusted to emphasize these results

      We have increased the size of the panels and made two separate panels where only the elav and Dcp1 signals are present.

      Observations that the result is equivalent in all thoracic segments is expected, since all legs need the same number of neurons. This is nice to have but can be in the supplement.

      Overall the figure number seems excessive, especially considering much of the results included (particularly the NB results) are findings consistent with previous papers and some is characterization of the system that does not fit well with the main focus regarding Imp/Syp (i.e., death of one hemi-lineage):

      Figure 5 and 6 can be joined as one.

      We have combined Figures 5 and 6, showing only the T1 segments.

      There is some discrepancy between graphs Fig7F and K: At LL3 the number of neurons is different for the control in 7F and the count in K

      Yes, because the genetic backgrounds are not the same and we are not counting the same type of cells. In 7F, we are counting the elav+ and VGlut+ cells, whereas in Figure 7K, we are counting all the elav+ in Lin A, including those elav+ VGlut-. VGlut expression arrives a bit later after elav+, which is why we have fewer elav+ cells in 7F. In other words, VGlut MARCM clones do not label all Lin A elav+ cells. I have clarified this in the figure.

      Reviewer #3 (Recommendations For The Authors):

      Main comment: on the notion of Imp and Syp gradients:

      p. 5, related to figure 4 - there are clearly distinct windows for predominantly (if not exclusively) Imp, and later, Syp expression in lineage 15, with a phase of co-expression.

      However, based on the data shown, it is unclear whether these windows represent gradients, as repeatedly stated. If the notion of gradients is derived from other studies, on other lineages, then this would be good to clarify. Alternatively, the idea of temporally opposing gradients of Imp and Syp would need to be demonstrated for this lineage.

      For example, a more accurate way to describe this study's data is given on p.7 "In conclusion, our findings demonstrate that the opposite expression pattern of Imp and Syp in postmitotic neurons precisely shapes the size of Lin A/15 lineage by controlling the pattern of PCD in immature MNs (Fig. 8)."

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in postmitotic neurons as a function of their birth in late LL3 larvae. How these gradients are established in our system still remains to be identified.

      Minor points:

      p.6, related to figure 7: Are numbers of EDU- early born and EDU+, late born, MNs expressed as means in the main text? As written, it suggests absence of any variability, which one would expect and which is shown in Fig.7 data.

      Yes, we have added averages in the text.

      Methods: the author name 'Lacin' has been mis-spelled

      Sorry about that, it's been corrected.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This paper proposes a valuable new method for the assessment of the mean kurtosis for diffusional kurtosis imaging by utilizing a recently introduced sub-diffusion model. The evidence supporting the claims that this technique is robust and accurate in brain imaging is incomplete. The work could be of interest in the research and clinical arena.

      We thank the editors for their assessment and the reviewers for their careful reading and feedback that helped to improve the manuscript. We have addressed all the reviewers’ concerns and would like to request an update of the assessment to reflect the revisions we have made.

      Below, we address the reviewers’ comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study introduces an innovative method for assessing the mean kurtosis, utilizing the mathematical foundation of the sub-diffusion framework. In particular, a new fitting technique that incorporates two different diffusion times is proposed to estimate the parameters of the sub-diffusion model. The evaluation of this technique, which generates kurtosis maps based on the sub-diffusion framework, is conducted through simulations and the examination of data obtained from human subjects.

      We thank Reviewer #1 for pointing out the novelty and innovation of our work.

      Strengths:

      The utilization of the sub-diffusion model for tissue characterization is a significant conceptual advancement for the field of diffusion MRI. This study adeptly harnesses this approach for an accurate estimation of the parameters of the widely employed diffusion model, DKI, leveraging their established analytical interconnection as evidenced in prior research. Notably, this approach not only proposes a robust, fast, and accurate technique for DKI parameter estimation but also underscores the viability of deploying the sub-diffusion model for tissue characterization, substantiated by both simulated and human subject analyses. The paper is very well written, well organized, and coherent. The simulation study included different aspects of water diffusion as captured by diffusion-weighted MRI such as varying diffusion times and different b-value subpopulations, resulting in a comprehensive and thorough discussion.

      We thank Reviewer #1 for highlighting the strengths of our work.

      Weaknesses:

      The primary objective of this study is to demonstrate a robust approach for estimating DKI parameters by directly calculating them using the parameters of the sub-diffusion model. This premise, however, relies on the assumption that the sub-diffusion model effectively characterizes the diffusion MRI signal and that its parameters are both robust and accurate. Throughout the manuscript, the term "ground truth kurtosis K" is frequently used to denote the "true K" value in the context of the simulation study. Nonetheless, given that the data is simulated using the new sub-diffusion model - an approximation of the DKI-based signal expression - this value cannot truly be considered the "ground truth K". The simulation study highlights the robustness and accuracy of D* and K*, but it inherently operates under the assumption that the observed data is in the form of the sub-diffusion model.

      It is correct that our study operates under the assumption that the observed data is in the form of the sub-diffusion model, and indeed one of the key outcomes of this work is to demonstrate the effectiveness of that assumption and the new possibilities it brings. Naturally, using any mathematical model at all carries assumptions. Over the past two decades, many mathematical and biophysical models have been proposed to characterise diffusion MRI signals. However, model validation remains an open challenge in the field. In this, as well as in our previous work (Yang et al, NeuroImage, 2022), we have shown that our proposed sub-diffusion model not only provides a much better fitting compared to the traditional DKI method, overcoming the major limitation of the traditional DKI method on the maximum b-value, but also generates brain maps with superior tissue contrast and elucidates previously unseen structure.

      We have replaced the term “ground truth kurtosis K” with “true kurtosis K”.

      The comment “… using the new sub-diffusion model – an approximation of the DKI-based signal expression…” is a bit misleading. In fact we propose that the reverse interpretation is the more suitable way to view the relationship: the DKI model is a degree-2 approximation of the sub-diffusion model, as in eq. (7).
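
      The degree-2 relationship can be checked numerically. The sketch below is illustrative only: it assumes that the sub-diffusion signal has the Mittag-Leffler form S(b) = E_β(-b D_SUB), and that the mapping to the DKI parameters is D* = D_SUB / Γ(1+β) and K* = 6 Γ(1+β)^2 / Γ(1+2β) - 3. These expressions are not reproduced in this response, so eqs. (7)-(9) of the manuscript remain the reference. The function names and the values of b, D_SUB and β are hypothetical.

```python
import math

def mittag_leffler(z, beta, terms=40):
    """Truncated power series of E_beta(z). Adequate only for small |z|."""
    return sum(z**k / math.gamma(1.0 + beta * k) for k in range(terms))

def dki_from_subdiffusion(d_sub, beta):
    """Assumed mapping from sub-diffusion parameters to (D*, K*)."""
    g1 = math.gamma(1.0 + beta)
    g2 = math.gamma(1.0 + 2.0 * beta)
    d_star = d_sub / g1
    k_star = 6.0 * g1**2 / g2 - 3.0
    return d_star, k_star

def log_signal_subdiffusion(b, d_sub, beta):
    """ln S under the assumed Mittag-Leffler signal form."""
    return math.log(mittag_leffler(-b * d_sub, beta))

def log_signal_dki(b, d_star, k_star):
    """Standard DKI form: ln S = -b D + (1/6) b^2 D^2 K."""
    return -b * d_star + (b * d_star) ** 2 * k_star / 6.0

# Illustrative values (assumptions): b in s/mm^2, d_sub in mm^2/s.
b, d_sub, beta = 200.0, 1.0e-3, 0.8
d_star, k_star = dki_from_subdiffusion(d_sub, beta)
gap = abs(log_signal_subdiffusion(b, d_sub, beta)
          - log_signal_dki(b, d_star, k_star))
print(f"D* = {d_star:.3e}, K* = {k_star:.3f}, |ln S difference| = {gap:.2e}")
```

      At low b the two log-signals agree closely because the DKI expression matches the first two terms of the expansion of ln E_β. For β = 1 the assumed mapping gives K* = 0, i.e. mono-exponential decay.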

      Reviewer #2 (Public Review):

      Summary: The authors present a technique for fitting diffusion magnetic resonance images (dMRI) to a sub-diffusion model of the diffusion process within brain imaging. The authors suggest that their technique provides robust and accurate calculation of diffusional kurtosis imaging parameters from which high quality images can be calculated from short dMRI data acquisitions at two diffusion times.

      Strengths: If the authors can show that the dMRI signal in brain tissue follows a sub-diffusion model decay curve then their technique for accurately and robustly calculating diffusional kurtosis parameters from multiple diffusion times would be of benefit for tissue microstructural imaging in research and clinical arenas.

      In Figure 7, we showed that the diffusion MRI signals follow the sub-diffusion model decay curves.

      Weaknesses: The applied sub-diffusion model has two parameters that are invariant to diffusion time, D_β and β which are used to calculate the diffusional kurtosis measures of a diffusion time dependent D* and a diffusion time invariant K*. However, the authors do not demonstrate that the D_β, β and K* parameters are invariant to diffusion time in brain tissue.

      In our proposed sub-diffusion model, D_β and β are assumed to be time-independent parameters, which is a key strength of the approach. The goal is to characterise tissue-specific properties (D_β for diffusivity and β for the extent of tissue complexity) that do not rely on the diffusion time setting in diffusion MRI experiments. To extract such time-independent properties, we proposed a new sampling and fitting strategy – fitting at least two diffusion time data together.

      The authors' results visually show that there is time dependence of the K* measure (in Figure 6) that is more apparent in white matter with K* values being higher for diffusion times of ∆=49 ms than ∆ = 19 ms. The diffusion time dependence of K* indicates there is also diffusion time dependence of β.

      The discrepancies in the fitted K* for ∆ = 19 ms and ∆ = 49 ms separately do not necessarily imply that there is a true time dependence in these parameters. Rather, this can be explained by a deficiency of data when fitting a two-dimensional surface (S is a function of q and ∆) based on data along a single curve for a fixed value of ∆. Without properly sampling the surface across two independent coordinates, one cannot expect a fully reliable fit. Indeed, a great advantage of our proposed method is to allow fitting data with multiple values of ∆, and thereby getting a richer data set with which to fit the full signal surface S(q, ∆). The results for fitting ∆ = 19 ms and ∆ = 49 ms data together clearly show the benefits of this approach, with superior contrast achieved.

      Furthermore, Figure 7 shows that there is a tissue specific root mean squared error in model fitting over the two diffusion times which indicates greater deviation from the model fit in white matter than grey matter.

      Although the errors are not completely tissue-independent, please note the magnitude of the RMSE is very small. The quality of the fitting in both white and grey matter is shown in sub-figures (A)-(H) for several representative voxels.

      To show that the sub-diffusion model is robust and accurate (and consequently that K* is robust and accurate) the authors would have to demonstrate that there is no diffusion time-dependence in both D_β and β in application to brain imaging data for each diffusion time separately. Simulated data should not be used to demonstrate the robustness and accuracy of the sub-diffusion model or to determine optimization of dMRI acquisition parameters without first demonstrating that D_β and β are invariant to diffusion time. This is because simulated signals calculated by using the sub-diffusion characteristic equation of dMRI signal decay will necessarily have diffusion time invariant D_β and β parameters. Without further information demonstrating diffusion time invariance of D_β, β and K* it is not possible to determine whether the authors have achieved their aims or that their results support their conclusions.

      First, as explained above, the dMRI signal S is a function of q and ∆, i.e., a two-dimensional surface S(q, ∆), and hence fitting data sampled from a single diffusion time (i.e., one curve on the surface) cannot provide reliable parameters, as seen in the discrepancies in K* in Figure 6 (bottom two rows). Our proposed new sampling and fitting strategy overcomes this issue. That is, to obtain a reliable fitting, one should fit data from at least two diffusion times together (i.e., sampling data from at least two curves on the signal surface).

      Second, to demonstrate that D_β and β are time invariant, one would require data at several diffusion times with high b values. Such data cannot be easily obtained. The data used in this current study is the MGH Connectome 1.0 human brain data, which only contains two diffusion times, ∆ = 19 ms and ∆ = 49 ms.

      Hence, we conducted numerical experiments to demonstrate our idea. In Figure 3, we showed that (i) the variability of the fitted parameters is significantly reduced when moving from fitting single diffusion time data to two diffusion time data, and (ii) the difference in fitting three diffusion times compared to two is very minor, indicating convergence towards the correct time-independent parameter values. The results from fitting human brain data (Figure 6 and Tables 2-4) agree with the expectations from our numerical experiments. Hence, we believe that we have provided sufficient evidence to support our proposed sub-diffusion model and its optimal fitting strategy.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      It is clear that the authors preferred generating the data by using sub-diffusion model's signal expression as it has many benefits, such as allowing different diffusion times to be incorporated, and hence investigation of the effect of the number of diffusion times on the accuracy of the parameter fitting. I recommend adding another simulation study by generating the data with the DKI model expression (as the goal of the study is to provide an accurate mapping of diffusional mean kurtosis), fitting the data to the sub-diffusion model's expression in Eq. (10), and then calculating K* and D* by Eqs. (8) and (9) only for a fixed diffusion time and one b-value subset.

      We appreciate the suggestion. However, unfortunately it is not appropriate to generate data with the DKI model, as the maximum b-value is limited to 2000-3000 s/mm^2 and hence the DKI model cannot represent diffusion MRI signals from a full spectrum of b-values. A key strength of our proposed model is that it removes this limitation.

      There is a typo on Page 24, Line 581; "b<=2400" should be b>=2400.

      We have fixed this typo.

      Reviewer #2 (Recommendations For The Authors):

      As the authors state the sub-diffusion model has two parameters, D_β and β that are invariant to diffusion time, and give rise to a time-varying diffusion coefficient in mm^2s^-1 and a time invariant kurtosis. However, there is a need to be clearer and more specific about the implications of the sub-diffusion model. The manuscript would be improved by the authors:

      (a) Defining the time-varying diffusion coefficient that arises from the model, its functional form and properties.

      We refer Reviewer #2 to eq. (5) and eq. (8) for the definition of time-varying diffusion coefficients D* and D_SUB and their relationship.

      (b) Clearly discuss the implications of this with respect to other time-varying diffusion coefficient methods in the current literature.

      We refer Reviewer #2 to the section “Time-dependence of diffusivity and kurtosis” under “Discussions”.

      (c) Demonstrating that D_β and β do not vary with diffusion time when estimated from dMRI acquired on human participants.

      We have addressed this comment in the public review.

      The manuscript would benefit from increases in clarity in all sections and the authors identifying typographical errors.

      We have updated the relevant text in the revised manuscript to make it clearer, including fixing typos.

      Specific improvements to clarity in the methods and results section would include:

      Line 620: Why were parameter approximations for model fitting to simulated data restricted to the ranges D_β∈[10^(-4),10^(-3)] and β∈[0.5,1] but in fitting to brain imaging data the ranges were D_β>0 and 0<β<=1?

      The parameter ranges for model fitting to both the simulated and human data were the same: D_β>0 and 0<β<=1. To generate simulated data, D_β and β ranges were restricted to reflect observations in human brain data. We have updated the text to make this clearer.

      Lines 622, 628 & 629: Which goodness of fit measure was used?

      The goodness of fit measure for all simulated results is the coefficient of determination, or R^2 value, as noted in the “Goodness-of-fit and region-based statistical analysis” section under Methods. We have updated the text to make this clearer.

      Line 666: The method for computation of R^2 within the coefficient of determination should be stated as there are several ways of calculating an R^2 value.

      The formula for computing R^2 has been added to the text.
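
      For orientation, one common definition of the coefficient of determination is R^2 = 1 - SS_res / SS_tot. The sketch below implements that definition only as an assumption; the formula actually added to the revised manuscript is not reproduced in this response and may differ. The function name and the sample values are hypothetical.

```python
def r_squared(observed, fitted):
    """R^2 = 1 - SS_res / SS_tot (one common definition, assumed here)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical normalised signal values and model predictions.
signal = [1.00, 0.80, 0.50, 0.30]
model = [0.98, 0.81, 0.52, 0.29]
print(f"R^2 = {r_squared(signal, model):.4f}")
```

      Under this definition a perfect fit gives 1, and a model that predicts only the mean of the observations gives 0.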

      Line 685: A t-test is mentioned but it is not clear as to the inputs to this test, or where the results of this analysis are presented.

      We have updated the text to make this clearer. The results of this analysis are presented in Table 5. The entries identified in italic under the optimal b-value heading were found to be significantly different from the benchmark mean K* reported in Table 2.

      Line 696: It is not clear how the intra-class correlation coefficient histograms are computed from six subjects. This applies to results in Figure 10 that require greater clarity in the description.

      The formula for computing the intra-class correlation coefficient has been added to the sub-section “Scan-rescan analysis using intraclass correlation coefficient (ICC)” under “Methods”.
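
      Several ICC variants exist, and this response does not state which one the manuscript uses. As a hedged illustration only, the sketch below computes the one-way random-effects form, ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) MSW), for n subjects each measured k times (k = 2 for scan-rescan). The choice of variant, the function name and the sample values are assumptions.

```python
def icc_one_way(measurements):
    """ICC(1,1) for a list of per-subject lists, each holding k repeated measures."""
    n = len(measurements)
    k = len(measurements[0])
    subject_means = [sum(row) / k for row in measurements]
    grand_mean = sum(subject_means) / n
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(measurements, subject_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical scan and rescan values of one parameter in six subjects.
scan_rescan = [
    [0.61, 0.63], [0.72, 0.70], [0.55, 0.56],
    [0.80, 0.78], [0.66, 0.69], [0.59, 0.58],
]
print(f"ICC(1,1) = {icc_one_way(scan_rescan):.3f}")
```

      Identical scan and rescan values give an ICC of 1. The value falls as within-subject differences grow relative to between-subject differences.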

      It would be helpful if the authors primarily report results pertaining to the model parameters D_β and β. This is because D* and K* are calculated from D_β and β. Conditions for robust and accurate estimation of D_β and β will provide robust and accurate measures for D* and K*.

      Two new tables for the model parameters D_β and β have been added. Please see Tables 3 and 4 in the revised manuscript.

      The authors state that fitted model parameters are not affected by maximum b-value (paragraph beginning line 366). This statement is based on their model simulation results. Could the authors provide data to support this based on the application of their model to the human brain imaging data?

      We would like to clarify that our statement is indeed based on human brain imaging. As stated in the paragraph beginning line 366, both results in Table 2 (using full dataset) and Table 5 (using dataset with optimal b-value sampling) are generated from the Connectome human brain data. If maximum b-value dependence is present, benchmark (Table 2) versus optimal region-specific results (Table 5, or previously Table 3) should show some systematic difference.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors investigate the role of chirping in a species of weakly electric fish. They subject the fish to various scenarios and correlate the production of chirps with many different factors. They find major correlations between the background beat signals (continuously present during any social interactions) or some aspects of social and environmental conditions with the propensity to produce different types of chirps. By analyzing more specifically different aspects of these correlations they conclude that chirping patterns are related to navigation purposes and the need to localize the source of the beat signal (i.e. the location of the conspecific).

      The study provides a wealth of interesting observations of behavior and much of this data constitutes a useful dataset to document the patterns of social interactions in these fish. Some data, in particular the high propensity to chirp in cluttered environments, raises interesting questions. Their main hypothesis is a useful addition to the debate on the function of these chirps and is worth considering and exploring further.

      After the initial reviewers' comments, the authors performed a welcome revision of the way the results are presented. Overall the study has been improved by the revision. However, one piece of new data is perplexing to me. The new Figure 7 presents the results of a model analysis of the strength of the EI caused by a second fish to localize when the focal fish is chirping. From my understanding of this type of model, EOD frequency is not a parameter in the model since it evaluates the strength of the field at a given point in time. Therefore the only thing that matters is the phase relationship and strength of the EOD. Assuming that the second fish's EOD is kept constant and the phase relationship is also the same, the only difference during a chirp that could affect the result of the calculation is the potential decrease in EOD amplitude during the chirp. It is indeed logical that if the focal fish decreased its EOD amplitude the target fish's EOD becomes relatively stronger. Where things are harder to understand is why the different types of chirps (e.g. type 1 vs type 2) lead to the same increase in signal even though they are typically associated with different levels of amplitude modulations. Also, it is hard to imagine that a type 2 chirp that is barely associated with any decrease in EOD amplitude (0-10% maybe), would cause doubling of the EI strength. There might be something I don't understand but the authors should provide a lot more details on how this result is obtained and convince us that it makes sense.

      We thank the Reviewer for the comments and we agree that the approach could have been better detailed. As anticipated by the Reviewer, the Boundary Element Method (BEM) model can be used simply to calculate the electric field and electric image at a specific point in time (instantaneously), regardless of EOD frequency. However, our model allows for the concatenation of consecutive instants and thus is able to render an entire sequence of electric fields - and resulting electric images - incorporating realistic EOD characteristics such as shape, duration, and frequencies (see Pedraja et al., 2014).

      Chirp-triggered EIs were modeled using real chirps produced by interacting fish. Each chirp was thus associated with its duration and peak parameters, as well as the fish positional information (distance and angle).

      However, since we did not know the beat phase at which chirps were produced, we computed electric images for each fish position and chirp scenario by simulating four equally spaced phases. Here, phase refers to the sender EOD and simply denotes the initial offset between the two interacting EODs. Because our simulations were run over a time window of 500 msec, all phases are likely to be covered, with a different temporal order relative to the chirp (always centered within the 500 msec).

      The simulation was run maintaining consistent timing for both chirp and non-chirp conditions, across approximately 800 body nodes. At each node, the current flow was calculated from the peak-to-peak of the EOD sum (i.e. the point-to-point of the difference between the beat positive and negative envelopes). Analyzing the EIs over this fixed time window enables us to assess the unitary changes of current flow induced by chirps over units of time (ΔI/Δt). From this, we can calculate a cumulative sum of current flow changes - expressed as delta(EI) and use it to show the effect of the chirps on the spatiotemporal EI (Figure 7C).

      One can express this cumulative change mapped onto the fish body (keeping the 800 points separated, as in Figure 7C) or further sum the current changes to obtain a single total (as shown in Figure 7D).

      One can check this by considering that a sum for example of a set of 500/800 points - judging from the size of the blue areas in C not all 800 points have a detectable change - each valued 0.1-to-0.3 mA/s, one could get circa 100 mA/s, which is what is shown in D. (is this what is happening ?)
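
      The order-of-magnitude check in the preceding paragraph can be made explicit. The sketch below treats every contributing body node as carrying a change between 0.1 and 0.3 mA/s, the figures quoted above, and multiplies by the number of nodes. It is a plausibility check on the summed total, not the BEM computation itself, and the function name is hypothetical.

```python
def summed_change_range(n_nodes, per_node_low, per_node_high):
    """Bounds on the summed current-flow change if every node lies in the given range."""
    return n_nodes * per_node_low, n_nodes * per_node_high

# Node counts and per-node range (mA/s) as quoted in the text.
for n_nodes in (500, 800):
    low, high = summed_change_range(n_nodes, 0.1, 0.3)
    print(f"{n_nodes} nodes: between {low:.0f} and {high:.0f} mA/s")
```

      With 500 contributing nodes the bounds bracket the figure of circa 100 mA/s mentioned for Figure 7D.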

      We do not know why chirps of different types triggered similar effects. It is possible that, since EI measurements are pooled over several chirps produced at different angles and distances, in case of a lower amount of chirps considered for a given type (as in the case of rises, very low) these measurements may not highlight more marked differences among types. In a publication we are currently working on, we are considering a larger dataset to better assess these results.

      The methods section has been edited to clarify the approach (not yet).

      Reviewer #2 (Public Review):

      Studying Apteronotus leptorhynchus (the weakly electric brown ghost knifefish), the authors provide evidence that 'chirps' (brief modulations in the frequency and amplitude of the ongoing electric signal) function in active sensing (specifically homeoactive sensing) rather than communication. Chirping is a behavior that has been well studied, including numerous studies on the sensory coding of chirps and the neural mechanisms for chirp generation.

      Chirps are largely thought to function in communication behavior, so this alternative function is a very exciting possibility that could have a great impact on the field.

      We thank the Reviewer for the extensive and constructive comments. We would like to add that, while it is true that many detailed studies have been published on the anatomy and physiology of the circuits implicated in the production and modulation of “electric chirps”, most of this research assumed, and focused exclusively on, their possible role in communication. Most behavioral studies did the same, and a meta-analysis of the existing literature on chirping allows the communication idea to be traced back mainly to two studies: Hagedorn and Heiligenberg, 1985 (“Court and spark: electric signals in the courtship and mating of gymnotoid fish”) and Hopkins, 1974 (“Electric Communication: Functions in the Social Behavior of Eigenmannia Virescens”). Importantly, in these studies only contextual observations were made (no playback experiments or other attempts to analyze more quantitatively the correlation of chirping with other behaviors).

      The authors do provide convincing evidence that chirps may function in homeoactive sensing. However, their evidence arguing against a role for chirps in communication is not as strong, and fails to sufficiently consider the evidence from a large body of existing research. Ultimately, the manuscript presents very interesting data that is sure to stimulate discussion and follow-up studies, but it suffers from dismissing evidence in support of, or consistent with, a communicative function for chirps.

      Although the tone of some statements present in our earlier draft may suggest otherwise, through our revisions we have made an effort to clarify that we do not intend to dismiss a function of chirps in communication. We only intend to debate and discuss valid alternative hypotheses, advanced from reasonable considerations.

      Before writing this manuscript, we attempted to survey all the existing literature on chirps (including studies focused on behavior, peripheral sensory physiology as well as brain physiology). Although it is not unlikely that some studies have eluded our attention, an effort for a comprehensive review was made. Based on this survey we realized that none of the studies provided a clear and unambiguous piece of evidence to support the communication hypothesis (we refer here to the weak points highlighted in the discussion and mentioned in the previous comment), which in fact is not without its weak points and contradictions (see later comments).

      What follows is a summary of the mentions of the communication theory in the different sections of the manuscript, including several edits we have applied in response to the Reviewer’s concern:

      In the abstract we clearly state that we are considering an alternative that is only hypothetically complementary, not certainly so. Nonetheless, we have identified a couple of instances that could sound dismissive of the “communication hypothesis” in the following section.

      In the introduction we write in fact about the possibility of interference between communication signals and conspecific electrolocation cues, as they are both detected as beat perturbations. We did not mean to use “Interference” here as “reciprocal canceling”, rather we intended it as “partial or more or less conspicuous overlap” in the responses triggered in electroreceptors.

      Hoping to convey a clearer message, we have edited the related statement and changed it to “both types of information are likely to overlap and interact in highly variable ways”.

      We have also removed the statement: “According to this idea, beats and chirps are not only detected through the same input channel, but also used for the same purpose.” as at this point in the manuscript it may be too strong.

      In the results section we do not include statements that might be seen as dismissive of the communication hypothesis but only statements in support of the “probing with chirps” idea (which is the central hypothesis of the study).

      In the discussion paragraphs we elaborate on why the current functional view is either flawed or incomplete (first paragraph, “existing functional hypotheses”). Namely: 1) multiple triggering factors implied in chirp responses covary and need to be disentangled (for example DF/sex); 2) findings on brown ghosts and a few other gymnotiforms have been used to advance the hypothesis of “communication through chirps” in all weakly electric fish (including pulse species); 3) social encounters, in which chirps are recorded, also imply other behaviors (such as probing) which have not been considered so far (this point is related to the first one on covariates); 4) most studies referring to big chirps as courtship chirps were not done in reproductive animals (added now); and 5) no causal evidence has been provided so far to justify a role of chirps in social communication.

      We are discussing these points as challenges to the communication hypothesis, not to dismiss the hypothesis, but rather to motivate future studies addressing these challenges.

      We do not want to appear dismissive of the communication hypothesis and had therefore previously edited the manuscript to avoid the impression of exclusivity of the probing hypothesis. We have now gone over the manuscript once more and edited several sentences. Nevertheless, we want to point out again that - despite the large consensus - the communication hypothesis has, until now, never been investigated with the kind of rigor applied here.

      The authors do acknowledge that chirps could function as both a communication and homeoactive sensing signal, but it seems clear they wish to argue against the former and for the latter, and the evidence is not yet there to support this.

      In both rounds of revision we have made an effort to convey a more inclusive interpretation of our findings. We tried our best to express our ideas as hypothetical, not as proof that communication through chirps does not exist. The aim of this study is to propose an alternative view, and this cannot be done without underlining the weak points of an existing hypothesis while providing and supporting reasonable arguments in favor of the alternative we advance. The actual evidence for a role of chirping in communication is much weaker than the sheer number of articles that have discussed chirps in this context would suggest.

      Regarding the weak evidence against communication, here we can list a few additional important points related to the proposed interpretations of chirp function (more specific than those made earlier):

      (1) A formally sound assessment of signal value/meaning - as typically done in animal communication studies - should involve:

      a) the isolation of a naturally occurring signal and determination of the context in which it is produced 

      b) the artificial replication of the signal

      c) the observation that such a mimic is capable of triggering reliable and stereotyped responses in a group of individuals (identified by sex and/or species) under the same conditions (conditioned, unconditioned, state-dependent, etc.), as discussed for instance in Bradbury and Vehrencamp, 2011; Laidre and Johnstone, 2013; Wyatt, 2015; Rutz et al., 2023.

      This approach has so far not been applied to weakly electric fish. The initial purpose of the present study was in fact to conduct this type of validation.

      (2) The hypothesis of chirps used for DF-sign discrimination - for “social purposes” - although plausible in the face of theoretical considerations, does not seem to be reasonable in practice, when one considers emission rates of 150 chirps per minute. We do find a strong correlation of chirp type with DF, which is often very abrupt and sudden (as if the fish were tracking beat frequency to guess its value) but the consideration made above on chirp rates seems to discourage this interpretation.

      (3) The hypothesis of chirp-patterning (i.e. chirping may have meaning based on the sequence of chirps of different types, a bit like syllables in birdsongs) - assessed by only one study conducted in our group - has not been sufficiently substantiated by replication. We have surveyed all possible combinations of chirps produced by interacting pairs in different behavioral conditions using different values for chirp sequence size: 2, 3,... ,8 chirps (both considering the sender alone as well as sender+receiver together). In all cases we found no evidence for a context-dependent “modulation” of chirp types (i.e. no specific chirp type sequence in specific contexts).
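
      As an illustration of the kind of survey described in point (3), the sketch below counts every run of 2 to 8 consecutive chirp labels in a sequence. It is a minimal Python sketch, not the analysis code used in the study; the function names (`count_ngrams`, `ngram_profile`) and the integer coding of chirp types are assumptions made for this example.

```python
from collections import Counter


def count_ngrams(labels, n):
    """Count every run of n consecutive chirp labels in a sequence."""
    return Counter(tuple(labels[i:i + n]) for i in range(len(labels) - n + 1))


def ngram_profile(labels, sizes=range(2, 9)):
    """Return one Counter per sequence size (2 to 8 chirps by default)."""
    return {n: count_ngrams(labels, n) for n in sizes}


if __name__ == "__main__":
    # Hypothetical sequence of chirp types produced in one recording.
    chirps = [1, 2, 1, 2, 3]
    print(count_ngrams(chirps, 2))
```

      Comparing such counts across behavioral conditions would be one way to look for context-specific sequences; any statistical test applied on top of the counts is left open here.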

      (4) The hypothesized role of “large chirps” as courtship signals could be easily criticized by noting the symmetrical distribution of these events around a DF of 0 Hz. One could invoke a failure to discriminate DF-sign to explain this well-known pattern. However, we know from Walter Heiligenberg’s work and physiological considerations that such a task can be solved easily through T-units and … in principle even just by motion (which would change the EOD phase in frequency-dependent ways, thus potentially revealing the DF sign).

      Overall, these considerations made us think that chirping certainly occurs in a social context, but it is the meaning of this behavior that remains elusive. We noticed that environmental factors are also strongly implicated … we then formulate an alternative hypothesis to explain chirping but we do so without dismissing the communication idea.

      All this seems to us just a careful way to critically discuss our results and those of other studies, without considering the issue resolved.

      In the introduction, the authors state, "Since both chirps and positional parameters (such as size, orientation or motion) can only be detected as perturbations of the beat, and via the same electroreceptors, the inputs relaying both types of information are inevitably interfering." I disagree with this statement, which seems to be a key assumption. Both of these features certainly modulate the activity of electroreceptors, but that does not mean those modulations are ambiguous as to their source. You do not know whether the two types of modulations can be unambiguously decoded from electroreceptor afferent population activity.

      We thank the Reviewer for noting this imprecision. We have addressed the Reviewer’s concern in another reply (see above).

      My biggest issue with this manuscript is that it is much too strong in dismissing evidence that chirping correlates with context. In your behavioral observations, you found sex differences in chirping as well as differences between freely interacting and physically separated fish. Chirps tended to occur in close proximity to another fish. Your model of chirp variability found that environmental experience, social experience, and beat frequency (DF) are the most important factors explaining chirp variability. Are these not all considered behavioral or social context? Beat frequency (DF) in particular is heavily downplayed as being a part of "context" but it is a crucial part of the context, as it provides information about the identity of the fish you're interacting with. The authors show quite convincingly that the types of chirps produced do not vary with these contexts, but chirp rates do.

      We believe the “perceived claim” may be an issue of unclear writing. We have now tried to better clarify that “context” affects chirp rates, but it does not affect chirp types as much (except when beat frequency is high).  

      We have edited two statements possibly susceptible to misinterpretation: 

      (1) In the results: “It also indicates that chirp parameters such as duration and FM do not seem to be associated with any particular context in a meaningful way, other than being affected by beat frequency.”

      (2) In the discussion: the statement

      “Recordings from interacting fish pairs confirmed the absence of any significant correlation between chirp type choice and behavioral context (Figure S2) although the variance of chirp parameters appears to be significantly affected by this factor (Figure 2). This may suggest that the effect of behavioral context is mainly detectable in the number of chirps produced (Figure S1), rather than the type (Figure S2).”

      has been changed to:

      “Recordings from interacting fish pairs confirmed the absence of any significant correlation between chirp type choice and behavioral context, except for those cases characterized by higher beat frequencies  (Figure S2). This suggests that the effect of behavioral context highlighted in our factor analysis (Figure 2) is mainly due to the number of chirps produced (Figure S1), rather than their type (Figure S2).”

      Finally, in the results we emphasize the relatively higher impact of previously unexplored factors on chirp variance: “The plot of individual chirps (Figure 2C) shows the presence of clustering around different categorical variables and it reveals that experience levels or swimming conditions are important factors affecting chirp distribution (note for instance the large central “breeding” cluster in which fish are divided and the smaller ones in which fish are free). Sender or receiver identity does not individuate any clear clustering relative to either sex (see the overlap of male_s/male_r and female_s/female_r) or social status (dominant/subordinate). Chirps labeled based on tank experience (i.e. resident vs intruder) are instead clearly separated.”.

      Further, in your playback experiments, fish responded differently to small vs. large DFs, males chirped more than females, type 2 chirps became more frequent throughout a playback, and rises tended to occur at the end of a playback. These are all examples of context-dependent behavior.

      We do note that male brown ghosts chirp more than females. But we do also say - and show in Figure 8 - that males move more in proximity to and around conspecifics. We do acknowledge that chirp time-course may be different during playbacks in a type-dependent manner. But how this can support the communication hypothesis - or other alternatives - is unclear. This result could equally imply the use of different chirp types for different probing needs. Since we cannot be sure about either, we do not want to put too much emphasis on it. Ultimately, the fact that “context” (here meant broadly to define different experimental situations in which social but also physical and environmental parameters are altered) affects chirping is undeniable: cluttered and non-cluttered environments do represent different contexts which differently affect chirping in conspicuous ways.

      In the results, the authors state, "Overall, the majority of chirps were produced by male subjects, in comparable amounts regardless of environmental experience (resident, intruder or equal; Figure S1A,C), social status (dominant or subordinate; Figure S1B) or social experience (novel or experienced; Figure S1D)." This is not what is shown in Figure S1. S1A shows clear differences between resident vs. intruder males, S1B shows clear differences between dominant vs. subordinate males, and S1D shows clear differences between naïve and experienced males. The analysis shown in Figure 2 would seem to support this. Indeed, the authors state, "Overall, this analysis indicated that environmental and social experience, together with beat frequency (DF) are the most important factors explaining chirp variability."

      The Reviewer is right in pointing out this imprecise reference and we are grateful to them for spotting this incongruence. The writing probably refers to an earlier version of the figure in which data were grouped and analyzed differently. We have now edited the text and changed it to: “Overall, the majority of chirps were produced by male subjects, at rates that seemed affected by environmental experience (resident, intruder or equal; Figure S1A,C), social status (dominant or subordinate; Figure S1B) and social experience (novel or experienced; Figure S1D).”

      The choice of chirp type varied widely between individuals but was relatively consistent within individuals across trials of the same experiment. The authors interpret this to mean that chirping does not vary with internal state, but is it not likely that the internal states of individuals are stable under stable conditions, and that individuals may differ in these internal states across the same conditions? Stable differences in communication signals between individuals are frequently interpreted as reflecting differences between those individuals in certain characteristics, which are being communicated by these signals.

      It seems here we have been unclear in the writing: while it is true that behavioral states are stable and can imply stable chirp patterning (if the two are related), since chirp types vary abruptly and in a reliable DF-dependent manner, different types of chirps are unlikely to be matched to different internal states following the same temporal order in such a reliable way (similarly repeated through consecutive trials).

      This would imply the occurrence of different internal states in rapid sequence, reliably triggered by repeated EOD ramps, regardless of whether the playback is 20 sec long or 180 sec long.

      We have edited this paragraph to better explain this: “The reliability by which the chirping response adapts to both the rate and direction of beat frequency is variable across individuals but rather stable across trials (relative to a given subject), further suggesting that chirp type variations may not reflect changes in internal states or in the animal motivation to specific behavioral displays (which are presumably subject to less abrupt variations and stereotypical patterning based on DF).”

      I am not convinced of the conclusion drawn by the analysis of chirp transitions. The transition matrices show plenty of 1-2 and 2-1 transitions occurring.

      The only groups in which 1-2 and 2-1 transitions are as frequent as 1-1 and 2-2 (1 and 2 being the numerical IDs of the two interacting fish) are F-F pairs. This is because chirp rates in females are so low that within-fish correlations end up being as low as between-fish correlations. We believe the Reviewer’s impression could be due to the fact that these are normalized maps (see legend of Figure 5A-B).

      Further, the cross-correlation analysis only shows that chirp timing between individuals is not phase-locked at these small timescales. It is entirely possible that chirp rates are correlated between interacting individuals, even if their precise timing is not.

      We agree with the Reviewer that this is a possibility. To address this point, we have edited the results section to acknowledge that what we see may be related to the time window chosen (i.e. 4 sec):

      “More importantly, they show that - at least in the social conditions analyzed here and within small-sized time windows - chirp time series produced by different fish during paired interactions are consistently independent of each other.”
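
      To make the role of the time window concrete, the sketch below builds a simple cross-correlogram: a histogram of time lags between the chirps of one fish and the chirps of the other, restricted to a fixed window. This is a minimal Python illustration and not the code used for the published analysis; the function name `lag_histogram`, the bin width, and the reading of the 4 sec window as a half-width around each chirp are all assumptions.

```python
def lag_histogram(times_a, times_b, window=4.0, bin_width=0.5):
    """Histogram of lags (t_b - t_a) falling in [-window, +window) seconds.

    times_a, times_b: chirp times (in seconds) of the two interacting fish.
    A flat histogram is what independent chirp time series would tend to give
    at this timescale; a peak would indicate time-locked chirping.
    """
    n_bins = int(round(2 * window / bin_width))
    hist = [0] * n_bins
    for ta in times_a:
        for tb in times_b:
            lag = tb - ta
            if -window <= lag < window:
                # Guard the index against floating-point edge cases.
                idx = min(int((lag + window) // bin_width), n_bins - 1)
                hist[idx] += 1
    return hist


if __name__ == "__main__":
    # Hypothetical chirp times for fish 1 and fish 2.
    print(lag_histogram([10.0], [9.0, 10.25, 20.0]))
```

      Because lags outside the window are discarded, correlations in chirp rate over longer timescales (the possibility raised by the Reviewer) would not be visible in such a histogram.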

      Further, it is not clear to me how "transitions" were defined. The methods do not make this clear, and it is not clear to me how you can have zero chirp transitions between two individuals when those two individuals are both generating chirps throughout an interaction.

      We thank the Reviewer for bringing up this unclear point. We have now clarified how transitions were calculated in the Methods section: “The number of chirp transitions present in each recording (dataset used for Figures 1, 2, 5) was measured by searching in a string array containing the 4 chirp types per fish pair, all their possible pairwise permutations (i.e. all possible permutations of 4+4=8 elements are: 1-1, 1-2, 1-3 … 7-6, 7-7, 7-8; considering the following legend 1 = fish1 type 1, 2 = fish 1 type 2, 3 = fish1 type 3 … 6 = fish2 type 2, 7 = fish2 type 3 and 8 = fish2 rise).”.

      Zero transitions are possible if two fish (or groups of fish) do not produce chirps of all types. Only transitions of produced types can be counted.
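
      The counting described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors’ implementation: chirps from both fish are assumed to be merged into one time-ordered list of labels 1 to 8 (1-4 for fish 1, 5-8 for fish 2, following the legend quoted above), and the function name `transition_counts` is hypothetical.

```python
def transition_counts(events, n_labels=8):
    """Count transitions between consecutive chirps in a time-ordered list.

    events: labels in 1..n_labels, where 1-4 are assumed to be the chirp
    types of fish 1 and 5-8 those of fish 2.
    Returns an n_labels x n_labels matrix; entry [a-1][b-1] is the number of
    times label a was immediately followed by label b.
    """
    counts = [[0] * n_labels for _ in range(n_labels)]
    for a, b in zip(events, events[1:]):
        counts[a - 1][b - 1] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical recording: fish 1 produces types 1 and 2, fish 2 only type 2.
    sequence = [1, 1, 2, 6, 6, 1]
    for row in transition_counts(sequence):
        print(row)
```

      In this example every row and column belonging to a chirp type that was never produced stays at zero, which is how zero transitions can arise even when both fish chirp throughout an interaction.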

      In the results, "Although all chirp types were used during aggressive interactions, these seemed to be rather less frequent in the immediate surround of the chirps (Figure 6A)." A lack of precise temporal correlation on short timescales does not mean there is no association between the two behaviors. An increased rate of chirping during aggression is still a correlation between the two behaviors, even if chirps and specific aggressive behaviors are not tightly time-locked.

      The Reviewer is right in pointing out the limited temporal scaling of our observations/analysis. We have now edited the last paragraph of the results related to figure 6 to include the possibility mentioned by the Reviewer: “The significantly higher extent of chirping during swimming and locomotion, consistently confirmed by 4 different approaches (PSTH, TM, CN, MDS), suggests that - although chirp-behavior correlations may exist at time-scales larger than those here considered - chirping may be linked more strongly with scanning and environmental exploration than with a particular motivational state, thus confirming findings from our playback experiments.”

      The Reviewer here raises an important point, yet, due to space limitations, we have considered only a sub-second scale. Most playback experiments in weakly electric fish involved the use of EOD mimics for a few tens of seconds - to avoid habituation in the fish behavioral responses - while inter-chirp intervals usually range from a few hundred milliseconds to seconds (depending on how often a fish would chirp). This suggested to us that a 4 second time window may not be a bad choice to start with.

      In summary, it is simply too strong to say that chirping does not correlate with context, or to claim that there is convincing evidence arguing against a communication function of chirps. Importantly, however, this does not detract from your exciting and well-supported hypothesis that chirping functions in homeoactive sensing. A given EOD behavior could serve both communication and homeoactive sensing. I actually suspect this is quite common in electric fish (both gymnotiforms and mormyrids), and perhaps in other actively sensing species such as echolocating animals. The two are not mutually exclusive.

      We agree with the Reviewer that context - broadly speaking - does affect chirping (as we mentioned above). We hope we have improved the writing and clarified that we do not dismiss communication functions of chirping, but we do lean towards electrolocation based on the considerations made above and our results.

      We do conclude the manuscript remarking that communication and electrolocation are not mutually exclusive: “probing cues could function simultaneously as proximity signals to signal presence, deter approaches, or coordinate behaviors like spawning, if properly timed (Henninger et al., 2018).” (see the conclusion paragraph of the discussion).

      Therein, we further add “These findings aim to stir the pot and initiate a discussion on possible alternative functions of chirps beyond their presumed communication role.”.

      With this, we hope we’ve made it clear how we intend our manuscript to be read.

      Reviewer #3 (Public Review):

      Summary:

      This important paper provides the best-to-date characterization of chirping in weakly electric fish using a large number of variables. These include environment (free vs divided fish, with or without clutter), breeding state, gender, intruder vs resident, social status, locomotion state and social and environmental experience, without and with playback experiments. It applies state-of-the-art methods for reducing the dimensionality of the data and finding patterns of correlation between different kinds of variables (factor analysis, K-means). The strength of the evidence, collated from a large number of trials with many controls, leads to the conclusion that the traditionally assumed communication function of chirps may be secondary to its role in environmental assessment and exploration that takes social context into account. Based on their extensive analyses, the authors suggest that chirps are mainly used as probes that help detect beats caused by other fish as well as objects.

      Strengths:

      The work is based on completely novel recordings using interaction chambers. The amount of new data and associated analyses is simply staggering, and yet, well organized in presentation. The study further evaluates the electric field strength around a fish (via modelling with the boundary element method) and how its decay parallels the chirp rate, thereby relating the above variables to electric field geometry.

      The main conclusions are that the lack of any significant behavioural correlates for chirping, and the lack of temporal patterning in chirp time series, cast doubt on a primary communication goal for most chirps. Rather, the key determinants of chirping are the difference frequency between two interacting conspecifics as well as individual subjects' environmental and social experience. The paper concludes that there is a lack of evidence for stereotyped temporal patterning of chirp time series, as well as of sender-receiver chirp transitions beyond the known increase in chirp frequency during an interaction.

      These conclusions by themselves will be very useful to the field. They will also allow scientists working on other "communication" systems to perhaps reconsider and expand the goals of the probes used in those senses. A lot of data are summarized in this paper, with thorough referencing to past work.

      The alternative hypotheses that arise from the work are that chirps are mainly used as environmental probes for better beat detection and processing and object localization, and in this sense are self-directed signals. This led to their prediction that environmental complexity ("clutter") should increase chirp rate, which in fact was revealed by their new experiments. The authors also argue that waveform EODs have less power across high spatial frequencies compared to pulse-type fish, with a resulting relatively impoverished power of resolution. Chirping in wave-type fish could temporarily compensate for the lower frequency resolution while still being able to resolve EOD perturbations with a good temporal definition (which pulse-type fish lack due to low pulse rates).

      The authors also advance the interesting idea that the sinusoidal frequency modulations caused by chirps are the electric fish's solution to the minute (and undetectable by neural wetware) echo-delays available to it, due to the propagation of electric fields at the speed of light in water. The paper provides a number of experimental avenues to pursue in order to validate the non-communication role of chirps.

      We thank the reviewer for the kind assessment.

      Weaknesses:

      My main criticism is that the alternative putative role for chirps as probe signals that optimize beat detection could be better developed. The paper could be clearer as to what that means precisely, especially since beating - and therefore detection of some aspects of beating due to the proximity of a conspecific - most often precedes chirping. One meaning the authors suggest, tentatively, is that the chirps could enhance electrosensory responses to the beat, for example by causing beat phase shifts that remediate blind spots in the electric field of view.

      We agree with the Reviewer that a better and more detailed explanation of how beat processing for conspecific electrolocation may be positively affected by chirps would be important to provide. We are currently working on a follow-up manuscript in which we intend to include these aspects. For space limitations and readability we had to discard from the current manuscript a lot of results that could further clarify these issues.

      A second criticism is that the study links the beat detection to underwater object localization. The paper does not significantly develop that line of thought given their data - the authors tread carefully here given the speculative aspect of this link. It is certainly possible that the image on the fish's body of an object in the environment will be slightly modified by introducing a chirp on the waveform, as this may enhance certain heterogeneities of the object in relation to its environment. The thrust of this argument derives mainly from the notion of Fourier analysis with pulse type fish EOD waveforms (see above, and radar theory more generally), where higher temporal frequencies in the beat waveform induced by the chirp will enable a better spatial resolution of objects. It remains to be seen whether experiments can show this to be significant.

      Perhaps the Reviewer refers to the last discussion paragraph before the conclusions in which we mention the performance of pulse or wave-type EODs in electrolocation (referring here to ideas illustrated in a recent review by Crampton, 2019). We added to this paragraph a statement which could better clarify that we do not propose that chirping could enhance object electrolocation. What we mean is that, in a context in which object electrolocation occurs through wave-type EODs - given the generally lower performance of such narrow-band signals in resolving the spatial features of any object, even a 3D electric field - chirping could improve beat detection during social encounters by increasing the amount of information obtained by the fish.

      The edited paragraph now reads: “While broadband pulse signals may be useful to capture highly complex environments rich in foliage, roots and other structures common in vegetation featuring the more superficial habitats in which pulse-type fish live, wave-type EODs may be a better choice in the relatively simpler river-bed environments in which many wave-type fish live (e.g., the benthic zone of deep river channels; Crampton, 2019). In this case, achieving a good spatial resolution is critical during social encounters, especially considering the limited utility of visual cues in these low-light conditions. In such habitats, social encounters may “electrically” be less “abrupt”, but spatially less “conspicuous” or blurred (as a 3D electric field may be). In such a scenario, chirps could serve as a means to supplement the spatial information acquired via the beat, accentuating these cues during periods of reduced resolution.”

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      None, my points in the original review have been properly addressed in this resubmission.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript presented a useful toolkit designed for CyTOF data analysis, which integrates 5 key steps as an analytical framework. A semi-supervised clustering tool was developed, and its performance was tested in multiple independent datasets. The tool was compared to human experts as well as supervised and unsupervised methods. 

      Strengths: 

      The study employed multiple independent datasets to test the pipeline. A new semi-supervised clustering method was developed. 

      Weaknesses: 

      The examination of the whole pipeline is incomplete. Lack of descriptions or justifications for some analyses. 

      We thank the reviewer for the overall summary and comments on this manuscript. In the last part of the results, we showcased the functionalities of ImmCellTyper on the covid dataset, including quality check, BinaryClust clustering, cell abundance quantification, state marker expression comparison within each identified cell type, cell population extraction, subpopulation discovery using unsupervised methods, and data visualization. We added more descriptions in the text based on the reviewer’s suggestions.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors have developed marker selection and k-means (k=2) based binary clustering algorithm for the first-level supervised clustering of the CyTOF dataset. They built a seamless pipeline that offers the multiple functionalities required for CyTOF data analysis. 

      Strengths: 

      The strength of the study is the potential use of the pipeline for the CyTOF community as a wrapper for multiple functions required for the analysis. The concept of the first line of binary clustering with known markers can be practically powerful. 

      Weaknesses: 

      The weakness of the study is that there's little conceptual novelty in the algorithms suggested from the study and the benchmarking is done in limited conditions. 

      We thank the reviewer for the overall summary and comments on this manuscript. While the concept of binary clustering by k-means is not novel, BinaryClust only uses it for individual markers to identify positive and negative cells, then combines the result with the pre-defined matrix for cell type identification. This has not been introduced elsewhere. Furthermore, ImmCellTyper streamlines the entire analysis process and enhances data exploration on multiple levels. For instance, users can evaluate functional marker expression level/cellular abundance across both main cell types and subpopulations. This computational framework also leverages the advantages of both semi-supervised and unsupervised clustering methods to facilitate subpopulation discovery. We believe these contributions warrant consideration as advancements in the field.

      As for the benchmarking, we limited the depth to main cell types rather than subpopulations. The reason is that we only apply BinaryClust to identify main cell types; for cell subset discovery, the unsupervised methods integrated in this pipeline have already been published and are widely used by the research community. Therefore, additional benchmarking does not seem to be necessary.
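
      The per-marker step described above (k-means with k=2 applied to a single marker) can be sketched as follows. This is a minimal Python illustration of one-dimensional two-means clustering, not the BinaryClust source code; the function name `two_means_1d`, the initialisation from the minimum and maximum, and the example intensities are assumptions.

```python
def two_means_1d(values, n_iter=50):
    """Split one marker's intensities into negative/positive with 2-means.

    Returns a list of booleans, True where the cell is called positive.
    Centroids start at the minimum and maximum value; each iteration assigns
    cells to the nearer centroid and recomputes the two means.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # No spread in the data: nothing to separate.
        return [False] * len(values)
    for _ in range(n_iter):
        cut = (lo + hi) / 2.0
        low = [v for v in values if v <= cut]
        high = [v for v in values if v > cut]
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # converged
        lo, hi = new_lo, new_hi
    cut = (lo + hi) / 2.0
    return [v > cut for v in values]


if __name__ == "__main__":
    # Hypothetical transformed intensities of one marker across six cells.
    cd3 = [0.1, 0.2, 0.15, 3.0, 3.2, 2.9]
    print(two_means_1d(cd3))
```

      Running this once per marker yields a positive/negative call for every cell and marker, which is the input the pre-defined matrix is then compared against.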

      Reviewer #3 (Public Review): 

      Summary: 

      ImmCellTyper is a new toolkit for Cytometry by time-of-flight data analysis. It includes BinaryClust, a semi-supervised clustering tool (which takes into account prior biological knowledge), designed for automated classification and annotation of specific cell types and subpopulations. ImmCellTyper also integrates a variety of tools to perform data quality analysis, batch effect correction, dimension reduction, unsupervised clustering, and differential analysis. 

      Strengths: 

      The proposed algorithm takes into account the prior knowledge. 

      The results on different benchmarks indicate competitive or better performance (in terms of accuracy and speed) depending on the method. 

      Weaknesses: 

      The proposed algorithm considers only CyTOF markers with binary distribution. 

      We thank the reviewer for the overall summary and comments on this manuscript. Binary classification can be considered an imitation of the human gating strategy, as it is applied to each marker. For example, when characterizing CD8 T cells, we aim for the CD19-CD14-CD3+CD4- population, which is binary in nature (each marker is either positive or negative) and follows the same logic as the method (BinaryClust) we developed. Results indicated that it works very well for well-defined main cell lineages, particularly when the expression of the defining marker is not continuous. The limitation lies in subpopulation identification, because a handful of markers behave in a continuous manner; we therefore suggest applying an unsupervised method after BinaryClust, which brings the additional advantage of identifying unknown subsets beyond our current knowledge, something none of the semi-supervised tools can achieve. To address the reviewer’s concern, we have considered the limitation of binary distribution, but it does not profoundly affect the application of the pipeline.
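
      The gating logic in the CD8 T cell example can be written down as a lookup against a definition table. The sketch below is a hedged Python illustration, not ImmCellTyper code: the dictionary layout, the function name `classify`, and the "unclassified" fallback label are assumptions, and only the marker combination given in the text (CD19-CD14-CD3+CD4-) is used.

```python
# Hypothetical definition table: True = marker positive, False = negative.
# Markers not listed for a cell type are ignored when matching.
DEFINITIONS = {
    "CD8 T cell": {"CD19": False, "CD14": False, "CD3": True, "CD4": False},
}


def classify(cell_calls, definitions=DEFINITIONS):
    """Return the first cell type whose required marker calls all match."""
    for name, rule in definitions.items():
        if all(cell_calls.get(marker) == wanted for marker, wanted in rule.items()):
            return name
    return "unclassified"


if __name__ == "__main__":
    # Per-marker positive/negative calls for one hypothetical cell.
    cell = {"CD19": False, "CD14": False, "CD3": True, "CD4": False, "CD8": True}
    print(classify(cell))
```

      A cell that matches no row of the table is left unclassified here; how such cells are handled in the actual pipeline is not specified in this response.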

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Many thanks for the reviewers’ comments and suggestions. Please see the point-by-point response below:

      (1) The style of in-text reference citation is not consistent. Many do not have published years.

      The style of the reference citation has been revised and improved.  

      (2) The font size in the table of Figure 1 is too small, so is Figure 2. 

      The font size has been increased.

      (3) Is flowSOM used as part of BinaryClust? How should the variable running speed of BinaryClust be interpreted, given that it is occasionally slower and sometimes faster than flowSOM in the datasets?

      To answer reviewer’s question, flowSOM is not a part of BinaryClust. They are separate clustering methods that have been incorporated into the ImmCellTyper pipeline. As described in Figure 1, BinaryClust, a semi-supervised method, is used to classify the main cell lineages; while flowSOM, an unsupervised method, is recommended here for further subpopulation discovery. So, they operate independently of each other. To avoid confusions, we slightly modified Figure 1 for clarification.

      Regarding the variability in running speed in Figure 4, the performance of algorithms can indeed be influenced by the characteristics of the datasets, such as size and complexity. The differences between the covid dataset and the MPN dataset, such as marker panel, experimental protocol, and data acquisition process, could account for this variation. Our explanation is that flowSOM is better suited to the data structure of the covid dataset, which might be why it is slightly faster there than on the MPN dataset. Moreover, for the covid dataset, the runtime of both BinaryClust and flowSOM is less than 100 s, and the difference is not notable.

      (4) In the Method section ImmCellTyper workflow overview, it is difficult to link the description of the pipeline to Figure 8. There are two sub-pipelines in the text and seven steps in the figure. What are their relations? Some steps are not introduced in the text, such as Data transformation and SCE object construction. What is co-factor 5?

      Figure 8 provides an overview of the entire workflow for CyTOF data analysis, starting from the raw fcs file data and proceeding until downstream analysis (seven steps). But the actual implementation of the pipeline was divided into two separate sections, as outlined in the vignettes of the ImmCellTyper GitHub page (https://github.com/JingAnyaSun/ImmCellTyper/tree/main/vignettes).

      Users will initially run ‘Intro_to_batch_exam_correct’ to perform a data quality check and identify potential batch effects, followed by ‘Intro_to_data_analysis’ for data exploration. We agree with the reviewer that the method description for this section is a bit confusing, so we’ve added more detail for clarification.

      In processing mass cytometry data, an arcsinh (inverse hyperbolic sine) transformation is commonly applied to handle zero values and skewed distributions, and to improve visualization as well as clustering performance. The co-factor is a parameter that scales down the data to control the width of the linear region before the arcsinh transformation is applied. We usually get the best results with a co-factor of 5 for CyTOF data.
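
      The scaling step can be illustrated with a minimal sketch. It assumes raw ion counts held in a plain Python list; the function name `arcsinh_transform` and the example values are hypothetical and are not taken from ImmCellTyper. The counts are divided by the co-factor before the inverse hyperbolic sine is applied:

```python
import math


def arcsinh_transform(counts, cofactor=5.0):
    """Divide raw counts by the co-factor, then apply the inverse hyperbolic sine.

    Values much smaller than the co-factor are mapped almost linearly,
    while large values are compressed roughly logarithmically.
    """
    return [math.asinh(c / cofactor) for c in counts]


# Hypothetical raw counts for one marker (illustrative values only).
raw = [0.0, 1.0, 5.0, 50.0, 500.0]
transformed = arcsinh_transform(raw)
```

      A larger co-factor widens the near-linear region around zero; the value 5 is simply the default stated in the response above.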

      (5) For differential analysis, could the pipeline analyze paired/repeated samples?

      For the statistical step, ImmCellTyper supports both two-study group comparison using Mann-Whitney Wilcoxon test, and multiple study group comparison (n>2) using Kruskal Wallis test followed by post hoc analysis (pairwise Wilcoxon test or Dunn’s test) with multiple testing correction using Benjamini-Hochberg Procedure.

      The pipeline is also flexible: users can extract the raw cell frequency data and apply whichever statistical methods are suitable for their design.
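
      For readers unfamiliar with the multiple testing correction mentioned above, the following is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. It is not the ImmCellTyper implementation; the function name and the example p-values are hypothetical:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four pairwise comparisons (illustrative only).
pvals = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(pvals)
```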

      (6) In Figure 2A, the range of the two axes is different for Dendritic cells, which could be misleading. Why the agreement is bad for dendritic cells?

      The range of each axis is automatically adapted to the data structure, which explains why the two ranges are not necessarily equal. The correlation coefficient for DCs is 0.958; compared with other cell types (> 0.99), this is relatively lower but does not indicate poor agreement.

      Moreover, DCs are much less abundant than other cell types, comprising approximately 2-5% of all cells. As a result, even small differences in abundance may appear as significant variations. For example, a difference of 1% in DC abundance represents a 2-fold change, which can be perceived as substantial.

      Overall, while the agreement for DCs may appear comparatively lower, it is not necessarily indicative of poor performance, considering both the correlation coefficient and the low relative abundance of DCs compared to other cell types.

      (7) In the Results section BinaryClust achieves high accuracy, what method was used to get the p-value, such as lines 212, 213, etc.?

      The accuracy of BinaryClust was tested using the F-measure and ARI against ground truth (manual gating); the detailed description and calculation can be found in the Methods. For lines 212 and 213, the p-value was calculated using ANOVA for the interaction plot shown in Figure 3. We’ve now added the statistical information to the figure legend.
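
      As an illustration of the F-measure mentioned above, here is a minimal sketch that assumes the standard definition (the harmonic mean of precision and recall, computed per cell type). The exact calculation used in the manuscript is given in its Methods; the function name and the toy labels below are hypothetical:

```python
def f_measure(truth, predicted, label):
    """Per-label F-measure: harmonic mean of precision and recall."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for six cells (illustrative only, not data from the study).
manual = ["DC", "DC", "B", "B", "T", "T"]   # ground truth from manual gating
binary = ["DC", "B", "B", "B", "T", "T"]    # labels from a clustering method
scores = {label: f_measure(manual, binary, label) for label in ("DC", "B", "T")}
```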

      (8) The performance comparison between BinaryClust and LDA is close. The current comparison design looks unfair. Given LDA only trained using half data, LDA may outperform BinaryClust.

      It is true that LDA was trained using half of the data. This is because the method requires manual gating results as a training dataset to build a model, which is then applied to the remaining files to label cell types. Here we used 50% of the whole dataset as the training set. We are of course very happy to implement any suggestions for a better partition ratio.

      (9) There are 5 key steps in the proposed workflow. However, not every step was presented in the Results.

      Thanks for the comments. The results primarily focused on demonstrating the precision and performance of BinaryClust in comparison with ground truth and existing tools. Additionally, a case study showcasing the application and functions of the entire pipeline on a dataset was presented. Owing to space limitations, the implementation details of the pipeline are described in the Methods section and the GitHub documentation, which users and readers can easily access.

      Reviewer #2 (Recommendations For The Authors): 

      The tools suggested by the authors could be potentially useful to the community. However, it's difficult to understand the conceptual novelty of the algorithms suggested here. The concept of binary clustering has been described before (https://doi.org/10.1186/s12859-022-05085-z, https://doi.org/10.1152/ajplung.00104.2022), and it mainly utilizes k-means clustering set to generate binary clusters based on selected markers. Other algorithms associated with the package are taken from other studies. 

      We acknowledge the reviewer’s comment regarding the novelty of our method. While binary clustering by k-means has previously been described for transcriptome data, our approach applies it to CyTOF data analysis, which has not been introduced elsewhere. Furthermore, ImmCellTyper streamlines the entire analysis process and enhances data exploration on multiple levels. For instance, users can evaluate functional marker expression levels and cellular abundance across both main cell types and subpopulations. Also, as stated in the manuscript, this computational framework leverages the advantages of both semi-supervised and unsupervised clustering methods to facilitate subpopulation discovery. We believe these contributions warrant consideration as advancements in the field.

      In addition, the benchmarking of clustering performance, especially to reproduce manual gating and comparison to tools such as flowSOM is not comprehensive enough. The result for the benchmarking test could significantly vary depending on how the authors set the ground truth (resolution of cell type annotations). The authors should compare the tool's performance by changing the depth of cell type annotations. Especially, the low abundance cell types such as gdT cells or DCs were not effectively captured by the suggested methods. 

      Thanks for the comment. We appreciate the reviewer’s concern. However, as illustrated in Figure 1, our approach uses BinaryClust, a semi-supervised method, to identify main cell types rather than directly targeting subpopulations. This is because a semi-supervised method relies on the user’s prior definitions and is therefore limited in its ability to discover novel subsets. In the ImmCellTyper framework, an unsupervised method is subsequently applied for subset exploration following the BinaryClust step.

      Regarding benchmarking, we focused on testing the precision of BinaryClust for main cell type characterization, because that is what the method is used for in the pipeline, and we believe this is sufficient. As for cell subset discovery, the unsupervised methods we integrated have already been published and are widely used by the research community. Additional benchmarking therefore does not seem necessary.

      Moreover, as shown in Figure 3 and Table 1, our results indicated that the F-measure for DCs and gdT cells in BinaryClust is 0.80 and 0.92 respectively, which were very close to ground truth and outperformed flowSOM, demonstrating its effectiveness. 

      We hope these clarifications address the reviewer’s concern.

      Minor comments: 

      (1) In Figure 4, it's perplexing to note that BinaryClust shows the slowest runtime for the COVID dataset, compared to the MPN dataset, which features a similar number of cells. What causes this variation? Is it dependent on the number of markers utilized for the clustering? This should be clarified/tested. 

      Thanks for the comment, but we are not sure that we fully understand the question. As shown in Figure 4, BinaryClust has a slightly higher runtime on the MPN dataset than on the covid dataset, which is reasonable because the MPN dataset contains around 1.6 million more cells than the covid dataset.

      (2) Some typos are noted: 

      - DeepCyTOF and LDA use a maker expression matrix extracted → "marker"?* 

      Corrected.

      - Datasets(Chevrier et al.)which → spacing* 

      Corrected.

      - This is due to the method's reliance → spacing*

      Corrected.

      Reviewer #3 (Recommendations For The Authors): 

      Is it possible to accommodate more than two levels within the clustering process, i.e., can the proposed semi-supervised clustering tool be extended to multi-levels instead of binary?

      Thanks for the comments. Binary classification can be considered an imitation of the human gating strategy, as it is applied to each marker. For example, when characterizing CD8 T cells, we aim for the CD19-CD14-CD3+CD4- population, which is binary in nature (either positive or negative) and follows the same logic as the method we developed (BinaryClust). Our results indicate that it works very well for well-defined main cell lineages. Its limitation lies in subpopulation identification, because a handful of markers behave in a continuous manner. We would therefore suggest applying an unsupervised method after BinaryClust, which brings the additional advantage of identifying unknown subsets beyond our current knowledge, something none of the semi-supervised tools can achieve. To answer the reviewer’s question, it is possible to set the number of levels to 3, 4, or 5 rather than just 2, but considering the design and rationale of the entire framework (as described in the manuscript and above), this does not seem necessary.
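
      The per-marker binary logic described above can be sketched as follows. This is a toy illustration under stated assumptions, not the BinaryClust implementation: it assumes a one-dimensional two-group k-means split per marker, and the helper names (`two_means_threshold`, `matches`) and the expression values are hypothetical:

```python
def two_means_threshold(values, iterations=50):
    """Split one marker's expression into low/high groups with 1-D 2-means.

    Returns the midpoint between the two group centroids.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        cut = (low + high) / 2
        lows = [v for v in values if v <= cut]
        highs = [v for v in values if v > cut]
        if not lows or not highs:
            break
        new_low = sum(lows) / len(lows)
        new_high = sum(highs) / len(highs)
        if (new_low, new_high) == (low, high):
            break  # centroids stopped moving
        low, high = new_low, new_high
    return (low + high) / 2


def matches(cell, thresholds, signature):
    """True if the cell is positive/negative for each marker as the signature requires."""
    return all(
        (cell[marker] > thresholds[marker]) == (sign == "+")
        for marker, sign in signature.items()
    )


# Toy transformed expression values (illustrative only).
cells = [
    {"CD19": 0.1, "CD14": 0.2, "CD3": 4.0, "CD4": 0.3},  # CD8 T-like
    {"CD19": 0.2, "CD14": 0.1, "CD3": 4.2, "CD4": 3.9},  # CD4 T-like
    {"CD19": 4.5, "CD14": 0.3, "CD3": 0.2, "CD4": 0.1},  # B-like
    {"CD19": 0.3, "CD14": 4.1, "CD3": 0.1, "CD4": 0.4},  # monocyte-like
]
markers = ["CD19", "CD14", "CD3", "CD4"]
thresholds = {m: two_means_threshold([c[m] for c in cells]) for m in markers}

# The CD19-CD14-CD3+CD4- definition used as the example in the text.
cd8_signature = {"CD19": "-", "CD14": "-", "CD3": "+", "CD4": "-"}
cd8_like = [i for i, c in enumerate(cells) if matches(c, thresholds, cd8_signature)]
```

      Extending this to three or more levels per marker would mean replacing the single threshold with several cut points, which is the generalization the reviewer asks about.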

      Could you please comment on why on the COVID dataset, BinaryClust was slower as compared to flowSOM?

      Thanks for the question. The performance of algorithms can indeed be affected by the characteristics of the datasets, such as their size and complexity. The covid and MPN datasets differ in various aspects, including marker panel, experimental protocol, and data acquisition process, which would account for the observed variation in speed. Our explanation is that flowSOM is better suited to the structure of the covid dataset than to that of the MPN dataset. Additionally, for the covid dataset, both BinaryClust and flowSOM have runtimes of less than 100 s, and the difference between the two is not particularly dramatic.

      Minor errors: 

      Line#215 "(ref) " reference is missing

      Added.

      Figure 3, increase the font of the text in order to improve readability. 

      Increased.

      Line#229 didn't --> did not. 

      Corrected

      Line#293 repetition of the reference. 

      The repetition is due to the format of the citation, which has been revised.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):  

      Summary: 

      In this study, Nandy and colleagues examine neural and behavioral correlates of perceptual variability in monkeys performing a visual change detection task. They used a laminar probe to record from area V4 while two macaque monkeys detected a small change in stimulus orientation that occurred at a random time in one of two locations, focusing their analysis on stimulus conditions where the animal was equally likely to detect (hit) or not-detect (miss) a briefly presented orientation change (target). They discovered two behavioral measures that are significantly different between hit and miss trials - pupil size tends to be slightly larger on hits vs. misses, and monkeys are more likely to miss the target on trials in which they made a microsaccade shortly before target onset. They also examined multiple measures of neural activity across the cortical layers and found some measures that are significantly different between hits and misses. 

      Strengths: 

      Overall the study is well executed and the analyses are appropriate (though multiple issues do need to be addressed). 

      We thank the reviewer for their enthusiasm and their constructive comments which we address below.

      Weaknesses: 

      My main concern with this study is that with the exception of the pre-target microsaccades, the physiological and behavioral correlates of perceptual variability (differences between hits and misses) appear to be very weak and disconnected. Some of these measures rely on complex analyses that are not hypothesis-driven and where statistical significance is difficult to assess. The more intuitive analysis of the predictive power of trial outcomes based on the behavioral and neural measures is only discussed at the end of the paper. This analysis shows that some of the significant measures have no predictive power, while others cannot be examined using the predictive power analysis because these measures cannot be estimated in single trials. Given these weak and disconnected effects, my overall sense is that the current results do not significantly advance our understanding of the neural basis of perceptual variability. 

      Reviewer #1 (Recommendations For The Authors): 

      (1) Most of the effects are very small. For example, the difference in pupil size between hits and misses is ~0.08 z-score units. The differences in firing rates between hits and misses are in the order of 1-2% of normalized firing rates. While these effects may be significant, their contribution to perceptual variability could be negligible, as suggested by the analysis of predictive power at the end of the result section. On a related note, it would be useful to mention the analysis of predictive power earlier in the paper. The finding that some of the measures do not have significant predictive power w/r to behavioral outcome raises questions regarding their importance. Finally, it would strengthen the paper if the authors could come up with methods to assess the predictive power of the PPC and interlaminar SSC. Without such analyses, it is difficult to assess the importance of these measures. 

      We expect that relatively small differences in early to intermediate sensory areas could cumulatively result in large differences in higher areas and contribute to the binary distinction between hits and misses. We certainly do not claim that these results completely explain state-dependent differences that determine the outcome of these trials. Instead, we have focused on neural signatures at the level of the V4 columnar microcircuit that might ultimately contribute to the variability in perception.

      We would like to emphasize that, based on the reviewer’s recommendation, we have now analyzed our results separately for each animal (see below). The consistency and significance of our findings across both animals give us confidence that what we have reported here are important neural signatures underlying perceptual variability at threshold.

      We would also like to note that SSC and PPC are now part of the standard toolkit of systems neuroscience and, to our knowledge, have been employed in numerous studies. While all measures come with their own caveats and limitations, these two measures provide a frequency-resolved metric of the relationship between two temporal processes (point or continuous), which we believe provides insight into the interlaminar flow of information that we report here.

      Unfortunately, limitations in the GLM method and the reliability of these analyses with limited data make it impossible for these two measures to be included. The GLM requires all variables to be defined for each trial in the input. SSC and PPC can be undefined at low firing rates and require a substantial amount of data to be reliably calculated. While we did consider imputing data or estimating SSC and PPC using multiple trials, we ultimately did not pursue this idea as the purpose of the GLM was to use simultaneous measurements from single trials. 

      (2) What is the actual predictive power of the GLM model (i.e., what is the accuracy of predicting whether a given held-out trial will lead to a hit or a miss)? How much of this predictive power is accounted for by the effect of microsaccades? 

      As the GLM is not a decoder, it does not classify whether a given held-out trial will be a hit or a miss. However, the GLM was highly predictive compared to a constant model. This information has been added to Table 3. The deviance of the GLM with and without microsaccades as a variable was not significantly different (p > 0.9).

      (3) The role of stimulus contrast is not explained clearly. Are all the analyses and figures restricted to a single contrast level? Was the contrast the same on both sides? If multiple contrasts are used, could contrast account for some of the observed neural-behavioral covariations? 

      All of the analyses include stimuli of all tested contrast levels. Stimulus contrasts were the same at both locations (attended and unattended). We have added a more detailed description of the contrast in hit and miss trials (Lines 289-296), reproduced here:

      “Non-target stimulus contrasts were slightly different between hits and misses (mean: 33.1% in hits, 34.0% in misses, permutation test, 𝑝 = 0.02), but the contrast of the target was higher in hits compared to misses (mean: 38.7% in hits, 27.7% in misses, permutation test, 𝑝 = 1.6e-31). Firing rates were normalized by contrast in Figure 3. In all other figures, we considered only non-target stimuli, which had very minor differences in contrast (<1%) across hits and misses. While we cannot completely rule out any other effects of stimulus contrast, the normalization in Figure 3 and minor differences for non-target stimuli should minimize them.”
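
      For readers unfamiliar with permutation tests of the kind quoted above, the following is a minimal sketch of a two-sided test on the difference in group means. The test statistic, the number of permutations, and the example values are assumptions made for illustration; the manuscript's exact procedure is not specified in this response:

```python
import random


def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        # Shuffle group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Hypothetical contrast values in percent (illustrative only).
hit_contrasts = [40, 38, 41, 39, 37, 42, 40, 38]
miss_contrasts = [27, 29, 26, 28, 30, 25, 27, 28]
p_value = permutation_test(hit_contrasts, miss_contrasts)
```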

      (4) Do the animals make false alarms (i.e., report seeing a target in non-target epochs)?

      If not, then it is not clear that the animals are performing near their perceptual threshold. If the false-alarm rate is non-zero, it should be reported and analyzed for neural/behavioral correlates. Does the logistic regression fit allow for a false alarm rate? More generally, it would be useful to see a summary of behavioral performance, such as distribution of thresholds, lower and upper asymptotes, and detection rates on foil trials vs. matched target trials. 

      The logistic regression does allow for a false alarm rate. We have reported additional behavioral parameters in Figure 1-figure supplement 3A-G.  

      (5) As far as I can tell, all the analyses in the paper are done on data combined across the two animals. Given that these effects are weak and that the analyses are complex, it is important to demonstrate for each analysis/figure that the results hold for each animal separately before combining the data across animals. This can be done in supplementary figures. 

      We have updated the paper to include all main results plotted separately for each animal as supplementary figures. 

      - Figure 2-figure supplement 2

      - Figure 3-figure supplement 1

      - Figure 3-figure supplement 2

      - Figure 4-figure supplement 1

      - Figure 5-figure supplement 2

      - Figure 7-figure supplement 1

      All the results except for the canonical correlation analysis were present, consistent, and significant when we analyzed them in each monkey independently.

      (6) The selection of the temporal interval used for the various analyses appears somewhat post hoc and is not explained clearly. Some analyses are restricted to the period immediately before or during target onset (e.g., 400 ms before target onset for analysis of the effect of microsaccade, 60 ms before stimulus onset for the analysis of the effect of neural variability). Other analyses are done on non-target rather than target stimuli. What is the justification for selecting these particular periods for these analyses? The differences in firing rates between hits and misses are restricted to the target epoch and are not present in the non-target epochs. Given these results, it seems important to compare the effects in target and non-target epochs in other analyses as well.

      Restricting the analysis of the Fano Factor to 60 ms before non-target onset seems odd. Given that the duration of the interval between stimulus presentations is random, how could this pre-stimulus effect be time-locked to target onset? 

      We selected a 200 ms time window during the pre-stimulus or stimulus-evoked period for almost all our analyses. The results relating to microsaccade occurrence were robust to narrower time windows more consistent with the other pre-stimulus windows we used, but we chose to use the 400 ms window to capture a larger fraction of trials with microsaccades.

      Only the Fano factor time window was selected post hoc, based on the traces in Figure 4A, and the result is robust across animals (new Figure 4-figure supplement 1). The inter-stimulus intervals are random, and we do not believe the neural variability is time-locked to upcoming stimuli, but rather that lower variability in this pre-stimulus window is characteristic of hits.

      We believe that the consistency of our results across both animals provides further evidence that our time window selection was appropriate. 

      We are interested in the extent to which these effects would remain consistent when applied only to target stimuli. However, restricting our analyses to only target stimuli substantially reduces the amount of neural data available for analysis. We plan to explore target stimulus representation more thoroughly in future studies.   
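
      For reference, the Fano factor discussed in this response is the ratio of the variance to the mean of spike counts across trials within a fixed window. A minimal sketch, assuming spike counts per trial in a plain list and the sample (n-1) variance; the counts below are invented for illustration and are not data from the study:

```python
from statistics import mean, variance


def fano_factor(spike_counts):
    """Variance-to-mean ratio of spike counts across trials for one time window."""
    m = mean(spike_counts)
    if m == 0:
        return float("nan")  # undefined when no spikes were recorded
    # statistics.variance returns the sample variance (n - 1 denominator).
    return variance(spike_counts) / m


# Invented spike counts with equal means but different trial-to-trial variability.
hit_counts = [3, 4, 3, 4, 3, 4]
miss_counts = [1, 6, 2, 5, 0, 7]
ff_hit = fano_factor(hit_counts)
ff_miss = fano_factor(miss_counts)
```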

      (7) Can the measured neural response be used to discriminate between target and non-target stimuli? If so, is the discriminability between target and non-target higher in hits vs. misses? 

      Thank you for raising this interesting point. We performed this analysis and found that target stimuli are more discriminable from non-targets in hits compared to misses. This has been added as a new Figure 3A.

      (8) How many trials were performed per session? Did miss probability tend to increase over time over the session? If so, could this slow change in hit probability account for some of the observed neural and behavioral correlations with perceptual decisions? 

      Monkeys initiated a median of 905 trials (range of 651 to 1086). This has been added to the manuscript (Line 106). Approximately 1/8 of those trials were at perceptual threshold. Hit probability at threshold does not change substantially over the course of the session. We now report this in new Figure 1-figure supplement 3I (error bars show standard deviation).

      (9) Did miss probability depend on the time of the change within the trial? If so, do any of the behavioral/neural metrics share a similar within-trial time course? 

      Change times were not significantly different across hit and miss trials (p=0.15, Wilcoxon rank sum test). We now report this in new Figure 1-figure supplement 3H.

      (10) "Deep layer neurons exhibit reduced low-frequency phase-locking in hit trials than in misses (Figure 5B), suggesting an improvement in pooled signal-to-noise among this neural population." - why does this metric suggest improved SNR? Is there any evidence for improved SNR in the data? Why just in deep layers? 

      Thank you for raising this question. We agree this statement is not fully supported by the data and have removed it.  

      (11) I may have missed this but what were the sizes of the Gabor stimuli? 

      This has been added to the methods section (Line 454). The Gaussian halfwidth was 2 degrees.  

      Reviewer #2 (Public Review):  

      In this manuscript, the authors conducted a study in which they measured eye movements, pupil diameter, and neural activity in V4 in monkeys engaged in a visual attention task. The task required the monkeys to report changes in the orientation of Gabors' visual stimuli. The authors manipulated the difficulty of the trials by varying the degree of orientation change and focused their analysis on trials of intermediate difficulty where the monkeys' hit rate was approximately 50%. Their key findings include the following: 1) Hit trials were preceded by larger pupil diameter, reflecting higher arousal, and by more stable eye positions; 2) V4 neurons exhibit larger visual responses in hit trials; 3) Superficial and deep layers exhibited greater coherence in hit trials during both the pre-target stimulus period and the non-target stimulus presentation period. These findings have useful implications for the field, and the experiments and analyses presented in this manuscript validly support the authors' claims. 

      Strengths: 

      The experiments were well-designed and executed with meticulous control. The analyses of both behavioural and electrophysiological data align with the standards in the field. 

      We thank the reviewer for their enthusiasm about our study and their constructive comments which we address below.

      Weaknesses: 

      Many of the findings appear to be incremental compared to previous literature, including the authors' own work. While incremental findings are not necessarily a problem, the manuscript lacks clear statements about the extent to which the dataset, analysis, and findings overlap with the authors' prior research. For example, one of the main findings, which suggests that V4 neurons exhibit larger visual responses in hit trials (as shown in Fig. 3), appears to have been previously reported in their 2017 paper. Additionally, it seems that the entire Fig1-S1 may have been reused from the 2017 paper. These overlaps should have been explicitly acknowledged and correctly referenced. 

      While the raw data used in this paper overlaps entirely with Nandy et al. (2017), all the analyses and findings in this manuscript are new and have not been previously reported. Figure 1-figure supplement 1 is modified and reproduced from that paper only to allow readers to understand the recording methods used to collect the data without needing to go back to the previous paper. We have added an explicit acknowledgment of this to the figure caption.

      Previous studies have demonstrated that attention leads to decorrelation in V4 population activity. The authors should have discussed how and why the high coherence across layers observed in the current study can coexist with this decorrelation. 

      We have updated the discussion section (Lines 347-351) to further elaborate on this interpretation. 

      Furthermore, the manuscript does not explore potentially interesting aspects of the dataset. For instance, the authors could have investigated instances where monkeys made 'false' reports, such as executing saccades towards visual stimuli when no orientation change occurred. It would be valuable to provide the fraction of the monkeys' responses in a session, including false reports and correct rejections in catch trials, to allow for a broader analysis that considers the perceptual component of neural activity over pure sensory responses. 

      We appreciate this feedback. While we agree these are interesting directions, we decided to limit the scope of this study to only focus on trials at threshold with an orientation change, and are considering these directions for future studies. 

      Reviewer #2 (Recommendations For The Authors): 

      • Figure Design: Since eLife does not impose space limitations, it is advisable for the authors to avoid using very small font sizes. Consistency in font size throughout the figures is recommended. Some figures are challenging to discern, for example, the mean+-sem in Fig. 2B, and the alpha values of green and purple colours for superficial/deep layers are too high, making them too transparent or pale. 

      We have increased the size of some small fonts and improved font size consistency throughout the figures. We have changed the layer colors to improve legibility. 

      • Line 119: trail, 

      This has been fixed.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The present study provides a phylogenetic analysis of the size of prefrontal areas in primates, aiming to investigate whether the relative size of the rostral prefrontal cortex (frontal pole) and dorsolateral prefrontal cortex volume vary according to known ecological or social variables.

      I am very much in favor of the general approach taken in this study. Neuroimaging now allows us to obtain more detailed anatomical data in a much larger range of species than ever before and this study shows the questions that can be asked using these types of data. In general, the study is conducted with care, focusing on anatomical precision in definition of the cortical areas and using appropriate statistical techniques, such as PGLS.

      I have read the revised version of the manuscript with interest. I agree with the authors that a focus on ecological vs laboratory variables is a good one, although it might have been useful to reflect that in the title.

      I am happy to see that the authors included additional analyses using different definitions of FP and DLPFC in the supplementary material. As I said in my earlier review, the precise delineation of the areas will always be an issue of debate in studies like this, so showing the effects of different decisions is vital.

      We thank the reviewer for these positive remarks and for these very useful suggestions on the previous version of this article.

      I am sorry the authors are so dismissive of the idea of looking at models where brain size and area size are directly compared in the model, rather preferring to run separate models on brain size and area size. This seems to me a sensible suggestion.

      We agree with reviewer 1, and the response of reviewer 3 also made it clear to us why this is an important issue. We have therefore addressed it more thoroughly this time.

      First, we have added a new analysis, with whole brain volume included as a covariate in the model accounting for regional volumes, together with the socio-ecological variables of interest. As expected given the very strong correlation across all brain measures (>90%), the effects of all socio-ecological factors disappear for both FP and DLPFC volumes when ‘whole brain’ is included as a covariate. This is coherent with our previous analysis showing that the same combination of socio-ecological variables could account for the volume of FP, DLPFC and the whole brain. Nevertheless, the interpretation of these results remains difficult, because of the hidden assumptions underlying the analysis (see below).

      Second, we have clarified the theoretical reasons that made us choose absolute vs relative measures of brain volumes. In short, we understand the notion of specificity associated with relative measures, but 1) the interpretation of relative measures is confusing and 2) we have alternative ways to evaluate the specificity of the effects (which are complementary to the idea of adding whole brain volume as covariate). 

      Our goal here was to evaluate the influence of socio-ecological factors on specific brain regions, based on their known cognitive functions in laboratory conditions (working memory for the DLPFC and metacognition for the frontal pole). Thus, the null hypothesis is that socio-ecological challenges supposed to mobilize working memory and metacognition do not affect the size of the brain regions associated with these functions (respectively DLPFC and FP). This is what our analysis is testing, and from that perspective, it seems to us that direct measures are better, because within regions (across species), volumes provide a good index of neural counts (since densities are conserved), which are indicative of the amount of computational resources available for the region. This is not the case when using relative measures, or when using the whole brain as covariate, since densities are heterogeneous across brain regions (e.g. Herculano-Houzel, 2011; 2017, but see below for further details on this).

      Quantitatively, the theoretical level of specificity of the relation between brain regions and socio-ecological factors is difficult to evaluate, given that our predictions are based on the cognitive functions associated with DLPFC and FP, namely working memory and metacognition, and that each of these cognitive functions also involves other brain regions. We would actually predict that other brain regions associated with the same cognitive functions as DLPFC or FP also show a positive influence of the same socio-ecological variables. Given that the functional mapping of cognitive functions in the brain remains debated, it is extremely difficult to evaluate quantitatively how specific the influence of the socio-ecological factors should be on DLPFC and FP compared to the rest of the brain, in the frame of our hypothesis.

      Critically, given that FP and DLPFC show a differential sensitivity to population density, a proxy for social complexity, and that this difference is in line with laboratory studies showing a stronger implication of the FP in social cognition, we believe that there is indeed some specificity in the relation between specific regions of the PFC and socioecological variables. Thus, our results as a whole seem to indicate that the relation between prefrontal cortex regions and socio-ecological variables shows a small but significant level of specificity. We hope that the addition of the new analysis and the corresponding modifications of the introduction and discussion section will clarify this point.

      Similarly, the debate about whether area volume and number of neurons can be equated across the regions is an important one, of which they are a bit dismissive.

      We are sorry that the reviewer found us a bit dismissive on this issue, and there may have been a misunderstanding.

      Based on the literature, it is clearly established that for a given brain region, area volume provides a good proxy for the number of neurons, and it is legitimate to generalize this relation across species if neuronal densities are conserved for the region of interest (see for example Herculano-Houzel 2011, 2017 for review). It seems to be the case across primates because cytoarchitectonic maps are conserved for FP and DLPFC, at least in humans and laboratory primates (Petrides et al, 2012; Sallet et al, 2013; Gabi et al, 2016; Amiez et al, 2019). But we make no claim about the difference in number of neurons between FP and DLPFC, and we never compared regional volumes across regions (we only compared the influence of socio-ecological factors on each regional volume), so their difference in cellular density is not relevant here. As long as the neuronal density is conserved across species but within a region (DLPFC or FP), the difference in volume for that region, across species, does provide a reliable proxy for the influence of the socioecological regressor of interest (across species) on the number of neurons in that region.

      Our claims are based on the strength of the relation between 1) cross-species variability in a set of socio-ecological variables and 2) cross-species variability in neural counts in each region of interest (FP or DLPFC). Since the effects of interest relate to inter-specific differences, within a region, our only assumption is that the neural densities are conserved across distinct species for a given brain region. Again (see previous paragraph), there is reasonable evidence for that in the literature. Given that assumption, regional volumes (across species, for a given brain region) provide a good proxy for the number of neurons. Thus, the influence of a given socio-ecological variable on the interspecific differences in the volume of a single brain region provides a reliable estimate of the influence of that socio-ecological variable on the number of neurons in that region (across species), and potentially of the importance of the cognitive function associated with that region in laboratory conditions. None of our conclusions are based on direct comparison of volumes across regions, and we only compared the influence of socioecological factors (beta weights, after normalization of the variables).
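
      As an illustration of why beta weights are compared only after normalization, the following minimal sketch (not the authors' analysis) uses ordinary least squares with a single predictor rather than PGLS with several, and the five data points are hypothetical. It shows that the raw slope depends on the predictor's units, whereas the slope obtained after z-scoring both variables does not, which is what makes normalized weights comparable across predictors.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def ols_slope(x, y):
    """Slope of the least-squares line y ~ a + b*x for a single predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardized_beta(x, y):
    """Slope after z-scoring both variables; it no longer depends on units."""
    return ols_slope(zscore(x), zscore(y))


# Hypothetical values for five species (NOT data from the study).
distance_km = [0.5, 1.2, 2.0, 3.1, 4.4]          # daily traveled distance
distance_m = [d * 1000 for d in distance_km]     # same quantity in metres
region_volume = [1.0, 1.9, 2.7, 4.2, 5.1]        # arbitrary units

raw_km = ols_slope(distance_km, region_volume)   # volume units per km
raw_m = ols_slope(distance_m, region_volume)     # volume units per m
beta_km = standardized_beta(distance_km, region_volume)
beta_m = standardized_beta(distance_m, region_volume)

print(raw_km, raw_m)     # raw slopes differ by the unit conversion factor
print(beta_km, beta_m)   # standardized slopes agree
```

      With a single predictor, the standardized slope equals the Pearson correlation; with several predictors, as in the study, each normalized weight reflects the contribution of one variable while the others are held fixed.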

      Note that this is yet another reason for not using relative measures and not including whole brain as covariate in the regression model: Given that whole brain and any specific region have a clear difference in density, and that this difference is probably not conserved across species, relative measures (or covariate analysis) cannot be used as proxies for neuronal counts (e.g. Herculano-Houzel, 2011). In other words, using the whole brain to rescale individual brain regions relies upon the assumption that the ratios of volumes (specific region/whole brain) are equivalent to the ratios of neural counts, which is not valid given the differences in densities.
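
      The density argument can be made concrete with a small numerical sketch. All numbers below are hypothetical and in arbitrary units, and `brain_density` stands for an assumed average density over the whole brain. The regional density is held constant across the two species, as assumed in the text, while the whole-brain density is allowed to differ.

```python
def neuron_count(volume, density):
    """Number of neurons, taken as volume times neuronal density."""
    return volume * density


# Hypothetical species (arbitrary units, NOT measured data).
# The regional density is conserved across species; the whole-brain density is not.
species = {
    "A": {"region_volume": 10.0, "brain_volume": 100.0,
          "region_density": 50.0, "brain_density": 100.0},
    "B": {"region_volume": 20.0, "brain_volume": 200.0,
          "region_density": 50.0, "brain_density": 60.0},
}


def region_count(s):
    return neuron_count(s["region_volume"], s["region_density"])


def volume_ratio(s):
    """Relative measure: regional volume divided by whole-brain volume."""
    return s["region_volume"] / s["brain_volume"]


def count_ratio(s):
    """Regional neuron count divided by whole-brain neuron count."""
    return region_count(s) / neuron_count(s["brain_volume"], s["brain_density"])


for name, s in species.items():
    print(name, volume_ratio(s), count_ratio(s), region_count(s))
```

      In this toy case both species have the same relative volume, yet their relative neuron counts differ, so the volume ratio is not a proxy for the count ratio. By contrast, the absolute regional volume doubles from A to B and so does the regional neuron count, because the density within the region is the same in both species.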

      Nevertheless, I think this is an important study. I am happy that we are using imaging data to answer wider phylogenetic questions. Combining detailed anatomy, big data, and phylogenetic statistical frameworks is an important approach.

      We really thank the reviewer for these positive remarks, and we hope that this study will indeed stimulate others using a similar approach.

      Reviewer #2 (Public Review):

      In the manuscript entitled "Linking the evolution of two prefrontal brain regions to social and foraging challenges in primates" the authors measure the volume of the frontal pole (FP, related to metacognition) and the dorsolateral prefrontal cortex (DLPFC, related to working memory) in 16 primate species to evaluate the influence of socio-ecological factors on the size of these cortical regions. The authors select 11 socio-ecological variables and use a phylogenetic generalized least squares (PGLS) approach to evaluate the joint influence of these socio-ecological variables on the neuro-anatomical variability of FP and DLPFC across the 16 selected primate species; in this way, the authors take into account the phylogenetic relations across primate species in their attempt to discover the influence of socio-ecological variables on FP and DLPFC evolution.

      The authors run their studies on brains collected from 1920 to 1970 and preserved in formalin solution. Also, they obtained data from the Muséum national d'Histoire naturelle in Paris and from the Allen Brain Institute in California. The main findings consist in showing that the volume of the FP, the DLPFC, and the Rest of the Brain (ROB) across the 16 selected primate species is related to three socio-ecological variables: body mass, daily traveled distance, and population density. The authors conclude that metacognition and working memory are critical for foraging in primates and that FP volume is more sensitive to social constraints than DLPFC volume.

      The topic addressed in the present manuscript is relevant for understanding human brain evolution from the point of view of primate research, which, unfortunately, is a shrinking field in neuroscience. But the experimental design has two major weak points: the absence of lissencephalic primates among the selected species and the delimitation of FP and DLPFC. Also, a general theoretical and experimental frame linking evolution (phylogeny) and development (ontogeny) is lacking.

      We are sorry that the reviewer still believes that these two points are major weaknesses.

      - We have added a point on lissencephalic species in the discussion. In short, we acknowledge that our work may not be applied to lissencephalic species because they cannot be studied with our method, but on the other hand, based on laboratory data there is no evidence showing that the functional organization of the DLPFC and FP in lissencephalic primates is radically different from that of other primates (Dias et al, 1996; Roberts et al, 2007; Dureux et al, 2023; Wong et al, 2023). Therefore, there is no a priori reason to believe that not including lissencephalic primates prevents us from drawing conclusions that are valid for primates in general. Moreover, as explained in the discussion, including lissencephalic primates would require using invasive functional studies, only possible in laboratory conditions, which would not be compatible with the number of species (>15) necessary for phylogenetic studies (in particular PGLS approaches). Finally, as pointed out by the reviewer, our study is also relevant for understanding human brain evolution, and as such, including lissencephalic species should not be critical to this understanding.

      - In response to the remarks of reviewer 1 on the first version of the manuscript, we had included a new analysis in the previous version of the manuscript, to evaluate the validity of our functional maps given another set of boundaries between FP and DLPFC. But one should keep in mind that our objective here is not to provide a definitive definition of what the regions usually referred to as DLPFC and FP should be from an anatomical point of view. Rather, as our study aims at taking into account the phylogenetic relations across primate species, we chose landmarks that enable a comparison of the volume of cortex involved in metacognition (FP) and working memory (DLPFC) across species. We have also updated the discussion accordingly.

      We agree that this is a difficult point and we have always acknowledged that this was a clear limitation in our study. In the light of the functional imaging literature in humans and non-human primates, as well as the neurophysiological data in macaques, defining the functional boundary between FP and DLPFC remains a challenging issue even in very well controlled laboratory conditions. As mentioned by reviewer 1, “the precise delineation of the areas will always be an issue of debate in studies like this, so showing the effects of different decisions is vital”. Again, an additional analysis using different boundaries for FP and DLPFC was included in the supplementary material to address that issue. Now, we are not aware of solid evidence showing that the boundaries that we chose for DLPFC vs FP were wrong, and we believe that the comparison between two sets of measures as well as the discussion on this topic should be sufficient for the reader to assess both the strength and the limits of our conclusion. That being said, if the reviewer has any reference in mind showing better ways to delineate the functional boundary between FP and DLPFC in primates, we would be happy to include it in our manuscript.

      - The question of development, which is an important question per se,  is neither part of the hypothesis nor central for the field of comparative cognition in primates. Indeed, major studies in the field do not mention development (e.g. Byrne, 2000; Kaas, 2012; Barton, 2012). De Casien et al (2022) even showed that developmental constraints are largely irrelevant (see Claim 4 of their article): [« The functional constraints hypothesis […] predicts more complex, ‘mosaic’ patterns of change at the network level, since brain structure should evolve adaptively and in response to changing environments. It also suggests that ‘concerted’ patterns of brain evolution do not represent conclusive evidence for developmental constraints, since allometric relationships between developmentally linked or unlinked brain areas may result from selection to maintain functional connectivity. This is supported by recent computational modeling work [81], which also suggests that the value of mosaic or concerted patterns may fluctuate through time in a variable environment and that developmental coupling may not be a strong evolutionary constraint. Hence, the concept of concerted evolution can be decoupled from that of developmental constraints »].

      Finally, when studies on brain evolution and cognition mention development, it is generally to discuss energetic constraints rather than developmental mechanisms per se (Heldstab et al, 2022; Smaers et al, 2021; Preuss & Wise, 2021; Dunbar & Schutz, 2017; MacLean et al, 2012; Mars et al, 2018; 2021). Therefore, development does not seem to be a critical issue, neither for our article nor for the field.

      Reviewer #3 (Public Review):

      This is an interesting manuscript that addresses a longstanding debate in evolutionary biology - whether social or ecological factors are primarily responsible for the evolution of the large human brain. To address this, the authors examine the relationship between the size of two prefrontal regions involved in metacognition and working memory (DLPFC and FP) and socioecological variables across 16 primate species. I recommend major revisions to this manuscript due to: 1) a lack of clarity surrounding model construction; and 2) an inappropriate treatment of the relative importance of different predictors (due to a lack of scaling/normalization of predictor variables prior to analysis).

      We thank the reviewer for his/her remarks, and for the clarification of his/her criticism regarding the use of relative measures. We are sorry to have missed the importance of this point in the first place. We also thank the reviewer for the cited references, which were very interesting and which we have included in the discussion. As reviewer 1 also shared these concerns, we wrote a detailed response above to explain how we addressed the issue.

      First, we did run a supplementary analysis where whole brain volume was added as covariate, together with socio-ecological variables, to account for the volume of FP or DLPFC. As expected given the very high correlation across all 3 brain measures, none of the socio-ecological variables remained significant. We have added a long paragraph in the discussion to tackle that issue. In short, we agree with the reviewer that the specificity of the effects (on a given brain region vs the rest of the brain) is a critical issue, and we acknowledge that since this is a standard in the field, it was necessary to address the issue and run this extra analysis. But we also believe that specificity could be assessed by other means: given the differential influence of ‘population density’ on FP and DLPFC, in line with laboratory data, we believe that some of the effects that we describe do show specificity. Also, we prefer absolute measures to relative measures because they provide a better estimate of the corresponding cognitive operation, and because standard allometric rules (i.e., body size or whole brain scaling) may not apply to the scaling and evolution of FP and DLPFC in primates. Indeed, given that we use these measures as proxies of functions (metacognition for FP and working memory for DLPFC), it is clear that other parts of the brain should show the same effect since these functions are supported by entire networks that include not only our regions of interest but also other cortical areas in the parietal lobe. Thus, the extent to which the relation with socio-ecological variables should be stronger in regions of interest vs the whole brain depends upon the extent to which other regions are involved in the same cognitive function as our regions of interest, and this is clearly beyond the scope of this study.

      More importantly, volumetric measures are taken as proxies for the number of neurons, but this is only valid when comparing data from the same brain region (across species), not across brain regions, since neural densities are not conserved. Thus, using relative measures (scaling with the whole brain volume) would only work if densities were conserved across brain regions, which is not the case. From that perspective, the interpretation of absolute measures seems more straightforward, and we hope that the specificity of the effects could be evaluated using the comparison between the 3 measures (FP, DLPFC and whole brain) as well as the analysis suggested by the reviewer. We hope that the additional analysis and the updated discussion will be sufficient to cover that question, and that the reader will have all the information necessary to evaluate the level of specificity and the extent to which our findings can be interpreted.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      In my previous review of the present manuscript, I pointed out the fact that defining parts, modules, or regions of the primate cerebral cortex based on macroscopic landmarks across primate species is problematic because it prevents comparisons between gyrencephalic and lissencephalic primate species. The authors have rephrased several paragraphs in their manuscript to acknowledge that their findings do apply to gyrencephalic primates.

      I also said that "Contemporary developmental biology has shown that the selection of morphological brain features happens within severe developmental constraints. Thus, the authors need a hypothesis linking the evolutionary expansion of FP and DLPFC during development. Otherwise, the claims for the mosaic brain and modularity lack fundamental support". I insisted that the authors should clarify their concept of homology of cerebral cortex parts, modules, or regions across species (in the present manuscript, the frontal pole and the dorsolateral prefrontal cortex). Those are not trivial questions because any phylogenetic explanation of brain region expansion in contemporary phylogenetic and evolutionary biology must be rooted in evolutionary developmental biology. In this regard, the authors could have discussed their findings in the frame of contemporary studies of cerebral cortex evolution and development, but, instead, they have rejected my criticism just saying that they are "not relevant here" or "clearly beyond the scope of this paper".

      The question of development, which is an important question per se, is neither part of the hypothesis nor central for the field of comparative cognition in primates. Indeed, the major studies in the field do not mention development and some even showed that developmental constraints were not relevant (see De Casien et al., 2022 and details in our response to the public review). When studies on brain evolution and cognition mention development, it is generally to discuss energetic constraints rather than developmental mechanisms per se (Heldstab et al, 2022; Smaers et al, 2021; Preuss & Wise, 2021; Dunbar & Schutz, 2017; MacLean et al, 2012; Mars et al, 2018; 2021).

      If the other reviewers agree, the authors are free to publish in eLife their correlations in a vacuum of evolutionary developmental biology interpretation. I just disagree. Explanations of neural circuit evolution in primates and other mammalian species should tend to standards like the review in this link: https://royalsocietypublishing.org/doi/full/10.1098/rstb.2020.0522

      In this article, Paul Cisek (a brilliant neurophysiologist) speculates on potential evolutionary mechanisms for some primate brain functions, but there is surprisingly little reference to the existing literature on primate evolution and cognition. There is virtually no mention of studies that involve a large enough number of species to address evolutionary processes and/or a comparison with fossils and/or an evaluation of specific socio-ecological evolutionary constraints. Most of the cited literature refers to laboratory studies on brain anatomy of a handful of species, and their relevance for evolution remains to be evaluated. These ideas are very interesting and they could definitely provide an original perspective on evolution, but they are mostly based on speculations from laboratory studies, rather than from extensive comparative studies. This paper is interesting for understanding developmental mechanisms and their constraints on neurophysiological processes in laboratory conditions, but we do not think that it would fit in the framework of our paper as it goes far beyond our main topic.

      Reviewer #3 (Recommendations For The Authors):

      Yes, I am suggesting that the authors also include analyses with brain size (rather than body size) as a covariate to evaluate the effects of other variables in the model over and above the effect on brain size. In a very simplified theoretical scenario: two species have the same body sizes, but species A has a larger brain and therefore a larger FP. In this case, species A has a larger FP because of brain allometric patterns, and models including body size as a covariate would link FP size and socioecological variables characteristic of species A (and others like it). However, perhaps the FP of species A is actually smaller than expected for its brain size, while the FP of species B is larger than expected for its brain size.

      As explained in our response to the public review, we did run this analysis and we agree with the reviewer’s point from a practical point of view: it is important to know the extent to which the relation with a set of socio-ecological variables is specific to the region of interest, vs less specific and present for other brain regions. Again, we are sorry not to have understood that earlier, and we acknowledge that since it is a standard in the field, it needs to be addressed thoroughly.

      We understand the scaling intuition, and the need to get a reference point for volumetric measures, but here the volume of each brain region is taken as a proxy for the number of neurons and therefore for the region’s computational capacities. Since, for a given brain region (FP or DLPFC), the neural densities seem to be well conserved across species, comparing regional volumes across species provides a good proxy for the contrast (across species) in neural counts for that region. All we predicted was that for a given brain region, associated with a given cognitive operation, the volume (number of neurons) would be greater in species for which socio-ecological constraints potentially involving that specific cognitive operation were greater. We do not understand how or why the rest of the brain would change this interpretation (of course, as discussed just above, beyond the question of specificity). And using whole brain volume as a scaling measure is problematic because the whole brain density is very different from the density of these regions of the prefrontal cortex (see above for further details). Again, we acknowledge that allometric patterns exist, and we understand how they can be interpreted, but we do not understand how they could prove or disprove our hypothesis (brain regions involved in specific cognitive operations are influenced by a specific set of socio-ecological variables). When using volumes as a proxy for computational capacities, the theoretical implications of scaling procedures might be problematic. For example, scaling implies that the computational capacities of a given brain region are scaled by the rest of the brain. All other things being equal, the computational capacities of a given brain region, taken as the number of neurons, should decrease when the size of the rest of the brain increases. But to our knowledge there is no evidence for that in the literature.

      Clearly these are very challenging issues, and our position was to take absolute measures because they do not rely upon hidden assumptions regarding allometric relations and their consequence on cognition.

      But since we definitely understand that scaling is a reference in the field, we have not only completed the corresponding analysis (including the whole brain as a covariate, together with socio-ecological variables) but also expanded the discussion to address this issue in detail. We hope that between this new analysis and the comparison of effects between non-scaled measures of FP, DLPFC and the whole brain, the reader will be able to judge the specificity of the effect.

      Models including brain (instead of body) size would instead link FP size and socioecological variables characteristic of species B (and others like it). This approach is supported by a large body of literature linking comparative variation in the relative size of specific brain regions (i.e., relative to brain size) to behavioral variation across species - e.g., relative size of visual/olfactory brain areas and diurnality/nocturnality in primates (Barton et al. 1995), relative size of the hippocampus and food caching in birds (Krebs et al. 1989).

      Barton, R., Purvis, A., & Harvey, P. H. (1995). Evolutionary radiation of visual and olfactory brain systems in primates, bats and insectivores. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 348(1326), 381-392.

      Krebs, J. R., Sherry, D. F., Healy, S. D., Perry, V. H., & Vaccarino, A. L. (1989). Hippocampal specialization of food-storing birds. Proceedings of the National Academy of Sciences, 86(4), 1388-1392. 

      We are grateful to the reviewer for mentioning these very interesting articles, and more generally for helping us to understand this issue and clarify the related discussion. Again, we understand the scaling principle but the fact that these methods provide interesting results does not make other approaches (such as ours) wrong or irrelevant. Since we have used both our original approach and the standard version as requested by the reviewer, the reader should be able to get a clear picture of the measures and of their theoretical implications. We sincerely hope that the present version of the paper will be satisfactory, not only because it is clearer, but also because it might stimulate further discussion on this complex question.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1:

      Thank you for your review and pointing out multiple things to be discussed and clarified! Below, we go through the various limitations you pointed out and refer to the places where we have tried to address them.

      (1) It's important to keep in mind that this work involves simplified models of the motor system, and often the terminology for 'motor cortex' and 'models of motor cortex' are used interchangeably, which may mislead some readers. Similarly, the introduction fails in many cases to state what model system is being discussed (e.g. line 14, line 29, line 31), even though these span humans, monkeys, mice, and simulations, which all differ in crucial ways that cannot always be lumped together.

      That is a good point. We have clarified this in the text (Introduction and Discussion), to highlight the fact that our model isn’t necessarily meant to just capture M1. We have also updated the introduction to make it more clear which species the experiments which motivate our investigation were performed in.

      (2) At multiple points in the manuscript thalamic inputs during movement (in mice) are used as a motivation for examining the role of preparation. However, there are other more salient motivations, such as delayed sensory feedback from the limb and vision arriving in the motor cortex, as well as ongoing control signals from other areas such as the premotor cortex.

      Yes – the motivation for thalamic inputs came from the fact that those have specifically been shown to be necessary for accurate movement generation in mice. However, it is true that the inputs in our model are meant to capture any signals external to the dynamical system modeled, and as such are likely to represent a mixture of sensory signals, and feedback from other areas. We have clarified this in the Discussion, and have added this additional motivation in the Introduction.

      (3) Describing the main task in this work as a delayed reaching task is not justified without caveats (by the authors' own admission: line 687), since each network is optimized with a fixed delay period length. Although this is mentioned to the reader, it's not clear enough that the dynamics observed during the delay period will not resemble those in the motor cortex for typical delayed reaching tasks.

      Yes, we completely agree that the terminology might be confusing. While the task we are modeling is a delayed reaching task, it does differ from the usual setting since the network has knowledge of the delay period, and that is indeed a caveat of the model. We have added a brief paragraph just after the description of the optimal control objective to highlight this limitation.

      We have also performed additional simulations using two different variants of a model-predictive control approach that allow us to relax the assumption that the go-cue time is known in advance. We show that these modifications of the optimal controller yield results that remain consistent with our main conclusions, and can in fact in some settings lead to preparatory activity plateaus during the preparation epoch as often found in monkey M1 (e.g. in Elsayed et al. 2016). We have modified the Discussion to explain these results and their limitations, which are summarized in a new Supplementary Figure (S9).

      (4) A number of simplifications in the model may have crucial consequences for interpretation.

      a) Even following the toy examples in Figure 4, all the models in Figure 5 are linear, which may limit the generalisability of the findings.

      While we agree that linear models may be too simplistic, many prior analyses of M1 data suggest that they are often good enough to capture key aspects of M1 dynamics; for example, the generative model underlying jPCA is linear, and Sussillo et al. (2015) showed that the internal activity of nonlinear RNN models trained to reproduce EMG data aligned best with M1 activity when heavily regularized; in this regime, the RNN dynamics were close to linear. Nevertheless, this linearity assumption is indeed convenient from a modeling viewpoint: the optimal control problem is more easily solved for linear network dynamics and the optimal trajectories are more consistent across networks. Indeed, we had originally attempted to perform the analyses of Figure 5 in the nonlinear setting, but found that while the results were overall similar to what we report in the linear regime, iLQR was occasionally trapped in local minima, resulting in more variable results, especially for inhibition-stabilized networks at the strongly connected end of the spectrum. Finally, Figure 5 is primarily meant to explore to what extent motor preparation can be predicted from basic linear control-theoretic properties of the Jacobian of the dynamics; in this regard, it made sense to work with linear RNNs (for which the Jacobian is constant).
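
      The remark that the Jacobian is constant for linear RNNs can be checked with a minimal sketch. The two-unit rate networks below, `dx/dt = -x + W x` and `dx/dt = -x + W tanh(x)`, and the weight matrix `W` are hypothetical choices made for illustration; they are not the networks used in the manuscript. The Jacobian is estimated by central finite differences at two different states.

```python
import math

# Hypothetical 2-unit weight matrix (illustration only).
W = [[0.0, 1.5],
     [-1.5, 0.0]]


def linear_rnn(x):
    """Linear rate dynamics: dx/dt = -x + W x."""
    n = len(x)
    return [-x[i] + sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]


def nonlinear_rnn(x):
    """Nonlinear rate dynamics: dx/dt = -x + W tanh(x)."""
    n = len(x)
    r = [math.tanh(v) for v in x]
    return [-x[i] + sum(W[i][j] * r[j] for j in range(n)) for i in range(n)]


def jacobian(f, x, eps=1e-6):
    """Central finite-difference estimate of d f_i / d x_j at state x."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        hi, lo = list(x), list(x)
        hi[j] += eps
        lo[j] -= eps
        f_hi, f_lo = f(hi), f(lo)
        for i in range(n):
            J[i][j] = (f_hi[i] - f_lo[i]) / (2 * eps)
    return J


def max_abs_diff(A, B):
    """Largest absolute entry-wise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


x0, x1 = [0.0, 0.0], [1.0, -2.0]   # two arbitrary states
linear_gap = max_abs_diff(jacobian(linear_rnn, x0), jacobian(linear_rnn, x1))
nonlinear_gap = max_abs_diff(jacobian(nonlinear_rnn, x0), jacobian(nonlinear_rnn, x1))

print(linear_gap)      # close to zero: same Jacobian at both states
print(nonlinear_gap)   # clearly non-zero: Jacobian depends on the state
```

      For the linear network, properties derived from the Jacobian therefore describe the dynamics everywhere, whereas for the nonlinear network they would only describe the neighbourhood of the state at which the Jacobian is evaluated.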

      b) Crucially, there is no delayed sensory feedback in the model from the plant. Although this simplification is in some ways a strength, this decision allows networks to avoid having to deal with delayed feedback, which is a known component of closed-loop motor control and of motor cortex inputs and will have a large impact on the control policy.

      This comment resonates well with Reviewer 3's remark regarding the autonomous nature (or not) of M1 during movement. Rather than thinking of our RNN models as anatomically confined models of M1 alone, we think of them as models of the dynamics which M1 implements possibly as part of a broader network involving “inter-area loops and (at some latency) sensory feedback”, and whose state appears to be near-fully decodable from M1 activity alone. We have added a paragraph of Discussion on this important point.

      (5) A key feature determining the usefulness of preparation is the direction of the readout dimension. However, all readouts had a similar structure (random Gaussian initialization). Therefore, it would be useful to have more discussion regarding how the structure of the output connectivity would affect preparation, since the motor cortex certainly does not follow this output scheme.

      We agree with this limitation of our model — indeed one key message of Figure 4 is that the degree of reliance on preparatory inputs depends strongly on how the dynamics align with the readout. However, this strong dependence is somewhat specific to low-dimensional models; in higher-dimensional models (most of our paper), one expects that any random readout matrix C will pick out activity dimensions in the RNN that are sufficiently aligned with the most controllable directions of the dynamics to encourage preparation.

      We did consider optimizing C away (which required differentiating through the iLQR optimizer, which is possible but very costly), but the question inevitably arises of what exactly C should be optimized for, and under what constraints (e.g. fixed norm or not). One possibility is to optimize C with respect to the same control objective that the control inputs are optimized for, and constrain its norm (otherwise, inputs to the M1 model, and its internal activity, could become arbitrarily small as C can grow to compensate). We performed this experiment (new Supplementary Figure S7) and obtained a similar preparation index; there was one notable difference, namely that the optimized readout modes led to greater observability compared to a random readout; thus, the same amount of “muscle energy” required for a given movement could now be produced by a smaller initial condition. In turn, this led to smaller control inputs, consistent with a lower control cost overall.

      Whilst we could have systematically optimized C away, we reasoned that (i) it is computationally expensive, and (ii) the way M1 affects downstream effectors is presumably “optimized” for much richer motor tasks than simple 2D reaching, such that optimizing C for a fixed set of simple reaches could lead to misleading conclusions. We therefore decided to stick with random readouts.
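
      The observability argument above can be illustrated with a toy calculation. The sketch below is not the model used in the paper: it assumes a hypothetical two-dimensional, discrete-time linear system with one slowly decaying and one quickly decaying mode, and compares the output energy produced by the same initial condition under two unit-norm readouts. All names and values (A, C_slow, C_fast, output_energy) are illustrative assumptions.

```python
# Toy illustration (hypothetical system, not the authors' model):
# output energy sum_t ||C A^t x0||^2 for two different readouts C.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def output_energy(A, C, x0, steps=200):
    """Accumulate the squared readout along the autonomous trajectory x_{t+1} = A x_t."""
    x = list(x0)
    energy = 0.0
    for _ in range(steps):
        y = matvec(C, x)
        energy += sum(v * v for v in y)
        x = matvec(A, x)
    return energy

A = [[0.9, 0.0], [0.0, 0.1]]   # assumed dynamics: one slow mode, one fast mode
x0 = [1.0, 1.0]                # same initial condition for both readouts
C_slow = [[1.0, 0.0]]          # unit-norm readout aligned with the slow mode
C_fast = [[0.0, 1.0]]          # unit-norm readout aligned with the fast mode

e_slow = output_energy(A, C_slow, x0)
e_fast = output_energy(A, C_fast, x0)
print(e_slow, e_fast)
```

      In this toy setting the readout aligned with the slow mode extracts more output energy from the same initial condition, so a smaller initial condition would suffice to reach a given energy, which is the qualitative effect described above.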

      Additional comments:

      (1) The choice of cost function seems very important. Is it? For example, penalising the square of u(t) may produce very different results than penalising the absolute value.

      Yes, the choice of cost function does affect the results, at least qualitatively. The absolute value of the inputs is a challenging cost to use, as iLQR relies on a local quadratic approximation of the cost function. However, we have included additional experiments in which we penalized the squared derivative of the inputs (Supplementary Figure S8; see also our response to Reviewer 3's suggestion on this topic), and we do see differences in the qualitative behavior of the model (though the main takeaway, i.e. the reliance on preparation, continues to hold). This is now referred to and discussed in the Discussion section.

      (2) In future work it would be useful to consider the role of spinal networks, which are known to contribute to preparation in some cases (e.g. Prut and Fetz, 1999).

      (3) The control signal magnitude is penalised, but not the output torque magnitude, which highlights the fact that control in the model is quite different from muscle control, where co-contraction would be a possibility and therefore a penalty of muscle activation would be necessary. Future work should consider the role of these differences in control policy.

      Thank you for pointing us to this reference! Regarding both of these concerns, we agree that the model could be greatly improved and made more realistic in future work (another avenue for this would be to consider a more realistic biophysical model, e.g. using the MotorNet library). We hope that the current Discussion, which highlights the various limitations of our modeling choices, makes it clear that a lot of these choices could easily be modified depending on the specific assumptions/investigation being performed.

      Reviewer 2:

      Thank you for your positive review! We very much agree with the limitations you pointed out, some of which overlapped with the comments of the other reviewers. We have done our best to address them through additional discussion and new supplementary figures. We briefly highlight below where those changes can be found.

      (1) Though the optimal control theory framework is ideal to determine inputs that minimize output error while regularizing the input norm, it however cannot easily account for some other varied types of objectives especially those that may lead to a complex optimization landscape. For instance, the reusability of parts of the circuit, sparse use of additional neurons when learning many movements, and ease of planning (especially under uncertainty about when to start the movement), may be alternative or additional reasons that could help explain the preparatory activity observed in the brain. It is interesting to note that inputs that optimize the objective chosen by the authors arguably lead to a trade-off in terms of other desirable objectives. Specifically, the inputs the authors derive are time-dependent, so a recurrent network would be needed to produce them and it may not be easy to interpolate between them to drive new movement variants. In addition, these inputs depend on the desired time of output and therefore make it difficult to plan, e.g. in circumstances when timing should be decided depending on sensory signals. Finally, these inputs are specific to the full movement chain that will unfold, so they do not permit reuse of the inputs e.g. in movement sequences of different orders.

      Yes, that is a good point! We have incorporated further Discussion related to this point. We have additionally included a new example in which we regularize the temporal complexity of the inputs (see also our response to Reviewer 3's suggestion on this topic), which leads to more slowly varying inputs, and may indeed represent a more realistic constraint and lead to simpler inputs that can more easily be interpolated between. We also agree that uncertainty about the upcoming go cue may play an important role in the strategy adopted by the animals. While we have not performed an extensive investigation of the topic, we have included a Supplementary Figure (S9) in which we used Model Predictive Control to investigate the effect of planning under uncertainty about the go cue arrival time. We hope that this will give the reader a better sense of what sort of model extensions are possible within our framework.

      (2) Relatedly, if the motor circuits were to balance different types of objectives, the activity and inputs occurring before each movement may be broken down into different categories that may each specialize into one objective. For instance, previous work (Kaufman et al. eNeuron 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021) has suggested that inputs occurring before the movement could be broken down into preparatory inputs 'stricto sensu' - relating to the planned characteristics of the movement - and a trigger signal, relating to the transition from planning to execution - irrespective of whether the movement is internally timed or triggered by an external event. The current work does not address which type(s) of early input may be labeled as 'preparatory' or may be thought of as a part of 'planning' computations.

      Yes, our model does indeed treat inputs in a very general way, and does not distinguish between the different types of processes they may be composed of. This is partly because we do not explicitly model where the inputs come from, such that our inputs likely encompass multiple processes. We have added discussion related to this point.

      (3) While the authors rightly point out some similarities between the inputs that they derive and observed preparatory activity in the brain, notably during motor sequences, there are also some differences. For instance, while both the derived inputs and the data show two peaks during sequences, the data reproduced from Zimnik and Churchland show preparatory inputs that have a very asymmetric shape that really plummets before the start of the next movement, whereas the derived inputs have larger amplitude during the movement period - especially for the second movement of the sequence. In addition, the data show trigger-like signals before each of the two reaches. Finally, while the data show a very high correlation between the pattern of preparatory activity of the second reach in the double reach and compound reach conditions, the derived inputs appear to be more different between the two conditions. Note that the data would be consistent with separate planning of the two reaches even in the compound reach condition, as well as the re-use of the preparatory input between the compound and double reach conditions. Therefore, different motor sequence datasets - notably, those that would show even more coarticulation between submovements - may be more promising to find a tight match between the data and the authors' inputs. Further analyses in these datasets could help determine whether the coarticulation could be due to simple filtering by the circuits and muscles downstream of M1, planning of movements with adjusted curvature to mitigate the work performed by the muscles while permitting some amount of re-use across different sequences, or - as suggested by the authors - inputs fully tailored to one specific movement sequence that maximize accuracy and minimize the M1 input magnitude.

      Regarding the exact shape of the occupancy plots, it is important to note that some of the more qualitative aspects (e.g. the relative height of the two peaks) will change if we change the parameters of the cost function. Right now, we have chosen the parameters to ensure that both reaches would be performed at roughly the same speed (as a way to very loosely constrain the parameters based on the observed behavior). However, small changes to the hyperparameters can lead to changes in the model output (e.g. one of the two consecutive reaches being performed using greater acceleration than the other), and since our biophysical model is fairly simple, changes in the behavior are directly reflected in the network activity. Essentially, what this means is that while the double occupancy is a consistent feature of the model, the exact shape of the peaks is more sensitive to hyperparameters, and we do not wish to draw any strong conclusions from them, given the simplicity of the biophysical model. However, we do agree that our model exhibits some differences with the data. As discussed above, we have included additional discussion regarding the potential existence of separate inputs for planning vs triggering the movement in the context of single reaches.

      Overall, we are excited about the suggestions made by the Reviewer here about using our approach to analyze other motor sequence datasets, but we think that in order to do this properly, one would need to adopt a more realistic musculo-skeletal model (such as one provided by MotorNet).

      (4) Though iLQR is a powerful optimization method to find inputs optimizing the authors' cost function, it also has some limitations. First, given that it relies on a linearization of the dynamics at each timestep, it has a limited ability to leverage potential advantages of nonlinearities in the dynamics. Second, the iLQR algorithm is not a biologically plausible learning rule and therefore it might be difficult for the brain to learn to produce the inputs that it finds. It remains unclear whether using alternative algorithms with different limitations - for instance, using variants of BPTT to train a separate RNN to produce the inputs in question - could impact some of the results.

      We agree that our choice of iLQR has limitations: while it offers the advantage of convergence guarantees, it does indeed restrict the choice of cost function and dynamics that we can use. We have now included extensive discussion of how the modeling choices affect our results.

      We do not view the lack of biological plausibility of iLQR as an issue, as the results are agnostic to the algorithm used for optimization. However, we agree that any structure imposed on the inputs (e.g. by enforcing them to be the output of a self-contained dynamical system) would likely alter the results. A potentially interesting extension of our model would be to do just what the reviewer suggested, and try to learn a network that can generate the optimal inputs. However, this is outside the scope of our investigation, as it would then lead to new questions (e.g. what brain region would that other RNN represent?).

      (5)  Under the objective considered by the authors, the amount of input occurring before the movement might be impacted by the presence of online sensory signals for closed-loop control. It is therefore an open question whether the objective and network characteristics suggested by the authors could also explain the presence of preparatory activity before e.g. grasping movements that are thought to be more sensory-driven (Meirhaeghe et al., Cell Reports 2023).

      It is true that we aren’t currently modeling sensory signals explicitly. However, some of the optimal inputs we infer may be capturing upstream information which could encompass some sensory information. This is currently unclear, and would likely depend on how exactly the model is specified. We have added new discussion to emphasize that our dynamics should not be understood as just representing M1, but more general circuits whose state can be decoded from M1.

      Reviewer #2 (Recommendations For The Authors):

      Additionally, thank you for pointing out various typos in the manuscript, we have fixed those!

      Reviewer 3:

      Thank you very much for your review, which makes a lot of very insightful points, and raises several interesting questions. In summary, we very much agree with the limitations you pointed out. In particular, the choice of input cost is something we had previously discussed, but we had found it challenging to decide on what a reasonable cost for “complexity” could be. Following your comment, we have however added a first attempt at penalizing “temporal complexity”, which shows promising behavior. We have only included those additional analyses as supplementary figures, and we have included new discussion, which hopefully highlights what we meant by the different model components, and how the model behavior may change as we vary some of our choices. We hope this can be informative for future models that may use a similar approach. Below, we highlight the changes that we have made to address your comments.

      The main limitation of the study is that it focuses exclusively on one specific constraint - magnitude - that could limit motor-cortex inputs. This isn't unreasonable, but other constraints are at least as likely, if less mathematically tractable. The basic results of this study will probably be robust with regard to such issues - generally speaking, any constraint on what can be delivered during execution will favor the strategy of preparing - but this robustness cuts both ways. It isn't clear that the constraint used in the present study - minimizing upstream energy costs - is the one that really matters. Upstream areas are likely to be limited in a variety of ways, including the complexity of inputs they can deliver. Indeed, one generally assumes that there are things that motor cortex can do that upstream areas can't do, which is where the real limitations should come from. Yet in the interest of a tractable cost function, the authors have built a system where motor cortex actually doesn't do anything that couldn't be done equally well by its inputs. The system might actually be better off if motor cortex were removed. About the only thing that motor cortex appears to contribute is some amplification, which is 'good' from the standpoint of the cost function (inputs can be smaller) but hardly satisfying from a scientific standpoint.

      The use of a term that punishes the squared magnitude of control signals has a long history, both because it creates mathematical tractability and because it (somewhat) maps onto the idea that one should minimize the energy expended by muscles and the possibility of damaging them with large inputs. One could make a case that those things apply to neural activity as well, and while that isn't unreasonable, it is far from clear whether this is actually true (and if it were, why punish the square if you are concerned about ATP expenditure?). Even if neural activity magnitude is an important cost, any costs should pertain not just to inputs but to motor cortex activity itself. I don't think the authors really wish to propose that squared input magnitude is the key thing to be regularized. Instead, this is simply an easily imposed constraint that is tractable and acts as a stand-in for other forms of regularization / other types of constraints. Put differently, if one could write down the 'true' cost function, it might contain a term related to squared magnitude, but other regularizing terms would be very likely to dominate. Using only squared magnitude is a reasonable way to get started, but there are also ways in which it appears to be limiting the results (see below).

      I would suggest that the study explore this topic a bit. Is it possible to use other forms of regularization? One appealing option is to constrain the complexity of inputs; a long-standing idea is that the role of motor cortex is to take relatively simple inputs and convert them to complex time-evolving inputs suitable for driving outputs. I realize that exploring this idea is not necessarily trivial. The right cost-function term is not clear (should it relate to low-dimensionality across conditions, or to smoothness across time?) and even if it were, it might not produce a convex cost function. Yet while exploring this possibility might be difficult, I think it is important for two reasons.

      First, this study is an elegant exploration of how preparation emerges due to constraints on inputs, but at present that exploration focuses exclusively on one constraint. Second, at present there are a variety of aspects of the model responses that appear somewhat unrealistic. I suspect most of these flow from the fact that while the magnitude of inputs is constrained, their complexity is not (they can control every motor cortex neuron at both low and high frequencies). Because inputs are not complexity-constrained, preparatory activity appears overly complex and never 'settles' into the plateaus that one often sees in data. To be fair, even in data these plateaus are often imperfect, but they are still a very noticeable feature in the response of many neurons. Furthermore, the top PCs usually contain a nice plateau. Yet we never get to see this in the present study. In part this is because the authors never simulate the situation of an unpredictable delay (more on this below) but it also seems to be because preparatory inputs are themselves strongly time-varying. More realistic forms of regularization would likely remedy this.

      That is a very good point, and it mirrors several concerns that we had in the past. While we did focus on the input norm for the sake of simplicity, and because it represents a very natural way to regularize our control solutions, we agree that a “complexity cost” may be better suited to models of brain circuits. We have addressed this in a supplementary investigation. We chose to focus on a cost that penalizes the temporal complexity of the inputs, as ||u(t+1) - u(t)||^2. Note that this required augmenting the state of the model, making the computations quite a bit slower; while it is doable if we only penalize the first temporal derivative, it would not scale well to higher orders.
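
      To make the state augmentation concrete, here is a minimal sketch of one standard way to do it. It assumes linear discrete-time dynamics x(t+1) = A x(t) + B u(t); the construction used in the paper may differ. The previous input is appended to the state, and the input increment v(t) = u(t) - u(t-1) becomes the new control, so that the temporal-complexity penalty is an ordinary quadratic cost on v. The function names, matrices, and the choice of a zero input before the first time step are all illustrative assumptions.

```python
# Sketch of state augmentation for a penalty on ||u(t+1) - u(t)||^2.
# Assumed setup (not taken from the paper): linear dynamics, and the input
# before the first time step is defined to be zero.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def identity_row(j, m):
    return [1.0 if k == j else 0.0 for k in range(m)]

def augment(A, B):
    """Build dynamics for the augmented state z = [x, u_prev] driven by v = u - u_prev."""
    n, m = len(A), len(B[0])
    A_aug = [list(A[i]) + list(B[i]) for i in range(n)]        # x' = A x + B u_prev (+ B v)
    A_aug += [[0.0] * n + identity_row(j, m) for j in range(m)]  # u' = u_prev (+ v)
    B_aug = [list(B[i]) for i in range(n)] + [identity_row(j, m) for j in range(m)]
    return A_aug, B_aug

def rollout(A, B, x0, controls):
    """Return the final state after applying the sequence of controls."""
    x = list(x0)
    for u in controls:
        x = [a + b for a, b in zip(matvec(A, x), matvec(B, u))]
    return x

A = [[0.9, 0.1], [0.0, 0.8]]
B = [[1.0], [0.5]]
x0 = [0.0, 0.0]
us = [[1.0], [1.5], [1.0]]                      # original inputs u(0), u(1), u(2)

# Increments v(t) = u(t) - u(t-1), with u(-1) = 0 by assumption.
prev = [[0.0]] + us[:-1]
vs = [[a - b for a, b in zip(u, p)] for u, p in zip(us, prev)]

A_aug, B_aug = augment(A, B)
x_final = rollout(A, B, x0, us)
z_final = rollout(A_aug, B_aug, x0 + [0.0], vs)  # augmented state starts at [x0, u(-1)]

# The temporal-complexity cost is now a plain quadratic cost on the new control.
smoothness_cost = sum(sum(c * c for c in v) for v in vs)
print(x_final, z_final, smoothness_cost)
```

      The augmented system reproduces the original trajectory while carrying the last input along in the state. Penalizing higher-order temporal derivatives in the same way would require storing additional past inputs, so the augmented state grows with the order of the derivative.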

      Interestingly, we did find that the activity in that setting was somewhat more realistic (see new Supplementary Figure S8), with more sustained inputs and plateauing activity. While we have kept the original model for most of the investigations, the somewhat more realistic nature of the results under that setting suggests that further exploration of penalties of that sort could represent a promising avenue to improve the model.

      We also found the idea of a cost that would ensure low-dimensionality of the inputs across conditions very interesting. However, it is challenging to investigate with iLQR as we perform the optimization separately for each condition; nevertheless, it could be investigated using a different optimizer.

      At present, it is also not clear whether preparation always occurs even with no delay. Given only magnitude-based regularization, it wouldn't necessarily have to be. The authors should perform a subspace-based analysis like that in Figure 6, but for different delay durations. I think it is critical to explore whether the model, like monkeys, uses preparation even for zero-delay trials. At present it might or might not. If not, it may be because of the lack of more realistic constraints on inputs. One might then either need to include more realistic constraints to induce zero-delay preparation, or propose that the brain basically never uses a zero delay (it always delays the internal go cue after the preparatory inputs) and that this is a mechanism separate from that being modeled.

      I agree with the authors that the present version of the model, where optimization knows the exact time of movement onset, produces a reasonably realistic timecourse of preparation when compared to data from self-paced movements. At the same time, most readers will want to see that the model can produce realistic looking preparatory activity when presented with an unpredictable delay. I realize this may be an optimization nightmare, but there are probably ways to trick the model into optimizing to move soon, but then forcing it to wait (which is actually what monkeys are probably doing). Doing so would allow the model to produce preparation under the circumstances where most studies have examined it. In some ways this is just window-dressing (showing people something in a format they are used to and can digest) but it is actually more than that, because it would show that the model can produce a reasonable plateau of sustained preparation. At present it isn't clear it can do this, for the reasons noted above. If it can't, regularizing complexity might help (and even if this can't be shown, it could be discussed).

      In summary, I found this to be a very strong study overall, with a conceptually timely message that was well-explained and nicely documented by thorough simulations. I think it is critical to perform the test, noted above, of examining preparatory subspace activity across a range of delay durations (including zero) to see whether preparation endures as it does empirically. I think the issue of a more realistic cost function is also important, both in terms of the conceptual message and in terms of inducing the model to produce more realistic activity. Conceptually it matters because I don't think the central message should be 'preparation reduces upstream ATP usage by allowing motor cortex to be an amplifier'. I think the central message the authors wish to convey is that constraints on inputs make preparation a good strategy. Many of those constraints likely relate to the fact that upstream areas can't do things that motor cortex can do (else you wouldn't need a motor cortex) and it would be good if regularization reflected that assumption. Furthermore, additional forms of regularization would likely improve the realism of model responses, in ways that matter both aesthetically and conceptually. Yet while I think this is an important issue, it is also a deep and tricky one, and I think the authors need considerable leeway in how they address it. Many of the cost-function terms one might want to use may be intractable. The authors may have to do what makes sense given technical limitations. If some things can't be done technically, they may need to be addressed in words or via some other sort of non-optimization-based simulation.

      Specific comments

      As noted above, it would be good to show that preparatory subspace activity occurs similarly across delay durations. It actually might not, at present. For a zero ms delay, the simple magnitude-based regularization may be insufficient to induce preparation. If so, then the authors would either have to argue that a zero delay is actually never used internally (which is a reasonable argument) or show that other forms of regularization can induce zero-delay preparation.

      Yes, that is a very interesting analysis to perform, which we had not considered before! When investigating this, we found that the zero-delay strategy does not rely on preparation in the same way as is seen in the monkeys. This seems to be a reflection of the fact that our “Go cue” corresponds to an “internal” go cue which would likely come after the true, “external go cue” – such that we would indeed never actually be in the zero delay setting. This is not something we had addressed (or really considered) before, although we had tried to ensure we referred to “delta prep” as the duration of the preparatory period but not necessarily the delay period. We have now included more discussion on this topic, as well as a new Supplementary Figure S10.

      I agree with the authors that prior modeling work was limited by assuming the inputs to M1, which meant that prior work couldn't address the deep issue (tackled here) of why there should be any preparatory inputs at all. At the same time, the ability to hand-select inputs did provide some advantages. A strong assumption of prior work is that the inputs are 'simple', such that motor cortex must perform meaningful computations to convert them to outputs. This matters because if inputs can be anything, then they can just be the final outputs themselves, and motor cortex would have no job to do. Thus, prior work tried to assume the simplest inputs possible to motor cortex that could still explain the data. Most likely this went too far in the 'simple' direction, yet aspects of the simplicity were important for endowing responses with realistic properties. One such property is a large condition-invariant response just before movement onset. This is a very robust aspect of the data, and is explained by the assumption of a simple trigger signal that conveys information about when to move but is otherwise invariant to condition. Note that this is an implicit form of regularization, and one very different from that used in the present study: the input is allowed to be large, but constrained to be simple. Preparatory inputs are similarly constrained to be simple in the sense that they carry only information about which condition should be executed, but otherwise have little temporal structure. Arguably this produces slightly too simple preparatory-period responses, but the present study appears to go too far in the opposite direction. I would suggest that the authors do what they can to address these issues via simulations and/or discussion. I think it is fine if the conclusion is that there exist many constraints that tend to favor preparation, and that regularizing magnitude is just one easy way of demonstrating that. Ideally, other constraints would be explored. But even if they can't be, there should be some discussion of what is missing - preparatory plateaus, a realistic condition-invariant signal tied to movement onset - under the present modeling assumptions.

      As described above, we have now included two additional figures. In the first one (S8, already discussed above), we used a temporal smoothness prior, and we indeed get slightly more realistic activity plateaus. In a second supplementary figure (S9), we have also considered using model predictive control (MPC) to optimize the inputs under an uncertain go cue arrival time. There, we found that removing the assumption that the delay period is known came with new challenges: in particular, it requires the specification of a “mental model” of when the Go cue will arrive. While it is reasonable to expect that monkeys will have a prior over the go cue arrival time that will be shaped by the design of the experiment, some assumptions must be made about the utility functions that should be used to weigh this prior. For instance, if we imagine that monkeys carry a model of the possible arrival time of the go cue that is updated online, they could nonetheless act differently based on this information, for instance by either preparing so as to be ready for the earliest go cue possible or alternatively to be ready for the average go cue. This will likely depend on the exact task design and reward/penalty structure. Here, we added simulations with those two cases (making simplifying assumptions to make the problem tractable/solvable using model predictive control), and found that the “earliest preparation” strategy gives rise to more realistic plateauing activity, while the model where planning is done for the “most likely go time” does not. We suspect that more realistic activity patterns could be obtained by e.g. combining this framework with the temporal smoothness cost. However, the main point we wished to make with this new supplementary figure is that it is possible to model the task in a slightly more realistic way (although here it comes at the cost of additional model assumptions). We have now added more discussion related to those points. Note that we have kept our analyses on these new models to a minimum, as the main takeaway we wish to convey from them is that most components of the model could be modified/made more realistic. This would impact the qualitative behavior of the system and match to data but – in the examples we have so far considered – does not appear to modify the general strategy of networks relying on preparation.

      On line 161, and in a few other places, the authors cite prior work as arguing for "autonomous internal dynamics in M1". I think it is worth being careful here because most of that work specifically stated that the dynamics are likely not internal to M1, and presumably involve inter-area loops and (at some latency) sensory feedback. The real claim of such work is that one can observe most of the key state variables in M1, such that there are periods of time where the dynamics are reasonably approximated as autonomous from a mathematical standpoint. This means that you can estimate the state from M1, and then there is some function that predicts the future state. This formal definition of autonomous shouldn't be conflated with an anatomical definition.

      Yes, that is a good point, thank you for making it so clearly! Indeed, as in previous work, we do not think of our “M1 dynamics” as being internal to M1; they may instead include sensory feedback / inter-area loops, which we summarize in the connectivity, chosen so that the resulting dynamics qualitatively resemble data. We have now incorporated more discussion regarding what exactly the dynamics in our model represent.

      Round 2 of reviews

      Reviewer 3:

      My remaining comments largely pertain to some subtle (but to me important) nuances at a few locations in the text. These should be easy for the authors to address, in whatever way they see fit.

      Specific comments:

      (1) The authors state the following on line 56: "For preparatory processes to avoid triggering premature movement, any pre-movement activity in the motor and dorsal pre-motor (PMd) cortices must carefully exclude those pyramidal tract neurons."

      This constraint is overly restrictive. PT neurons absolutely can change their activity during preparation in principle (and appear to do so in practice). The key constraint is looser: those changes should have no net effect on the muscles. E.g., if d is the vector of changes in PT neuron firing rates, and b is the vector of weights, then the constraint is that b'd = 0. d = 0 is one good way of doing this, but only one. Half the d's could go up and half could go down. Or they all go up, but half the b's are negative. Put differently, there is no reason the null space has to be upstream of the PT neurons. It could be partly, or entirely, downstream. In the end, this doesn't change the point the authors are making. It is still the case that d has to be structured to avoid causing muscle activity, which raises exactly the point the authors care about: why risk this unless preparation brings benefits? However, this point can be made with a more accurate motivation. This matters, because people often think that a null-space is a tricky thing to engineer, when really it is quite natural. With enough neurons, preparing in the null space is quite simple.
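
      The constraint b'd = 0 described above can be illustrated numerically. The following is only a minimal sketch: the four-neuron readout weights and the firing-rate changes are invented for illustration and are not taken from the manuscript or from data.

```python
import numpy as np

# Hypothetical readout weights b from four PT neurons onto one muscle (assumed values).
b = np.array([1.0, 1.0, -1.0, -1.0])

# Candidate preparatory changes d in PT firing rates (all hypothetical).
d_silent = np.zeros(4)                         # d = 0: PT neurons do not change
d_mixed = np.array([1.0, -1.0, 1.0, -1.0])     # half go up, half go down
d_all_up = np.array([1.0, 1.0, 1.0, 1.0])      # all go up, but half the weights are negative
d_arbitrary = np.array([3.0, 0.0, 1.0, 0.0])   # a pattern that does drive the muscle


def net_muscle_drive(b, d):
    """Return b'd, the net change in muscle drive caused by the rate change d."""
    return float(b @ d)


def project_to_null_space(b, d):
    """Remove the component of d along b, so that b'(result) = 0."""
    return d - (b @ d) / (b @ b) * b


# Any d can be made output-null by removing its component along b.
d_null = project_to_null_space(b, d_arbitrary)
```

      With a single readout vector, only one direction in firing-rate space is output-potent, so every other direction is available for preparation. This is the sense in which the null space becomes easier to satisfy as the number of neurons grows.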

      That is a good point – we have now reformulated this sentence to instead say “to avoid triggering premature movement, any pre-movement activity in the motor and dorsal premotor (PMd) cortices must engage the pyramidal tract neurons in a way that ensures their activity patterns will not lead to any movement”.

      (2) Line 167: 'near-autonomous internal dynamics in M1'.

      It would be good if such statements, early in the paper, could be modified to reflect the fact that the dynamics observed in M1 may depend on recurrence that is NOT purely internal to M1. A better phrase might be 'near-autonomous dynamics that can be observed in M1'. A similar point applies on line 13. This issue is handled very thoughtfully in the Discussion, starting on line 713. Obviously it is not sensible to also add multiple sentences making the same point early on. However, it is still worth phrasing things carefully, otherwise the reader may have the wrong impression up until the Discussion (i.e. they may think that both the authors, and prior studies, believe that all the relevant dynamics are internal to M1). If possible, it might also be worth adding one sentence, somewhere early, to keep readers from falling into this hole (and then being stuck there till the Discussion digs them out).

      That is a good point: we have now edited the text after line 170 to make it clear that the underlying dynamics may not be confined to M1, and have referenced the later discussion there.

      (3) The authors make the point, starting on line 815, that transient (but strong) preparatory activity empirically occurs without a delay. They note that their model will do this but only if 'no delay' means 'no external delay'. For their model to prepare, there still needs to be an internal delay between when the first inputs arrive and when movement generating inputs arrive.

      This is not only a reasonable assumption, but is something that does indeed occur empirically. This can be seen in Figure 8c of Lara et al. Similarly, Kaufman et al. 2016 noted that "the sudden change in the CIS [the movement triggering event] occurred well after (~150 ms) the visual go cue... (~60 ms latency)" Behavioral experiments have also argued that internal movement-triggering events tend to be quite sluggish relative to the earliest they could be, causing RTs to be longer than they should be (Haith et al. Independence of Movement Preparation and Movement Initiation). Given this empirical support, the authors might wish to add a sentence indicating that the data tend to justify their assumption that the internal delay (separating the earliest response to sensory events from the events that actually cause movement to begin) never shrinks to zero.

      While on this topic, the Haith and Krakauer paper mentioned above is good to cite because it does ponder the question of whether preparation is really necessary. By showing that they could get RTs to shrink considerably before behavior became inaccurate, they showed that people normally (when not pressured) use more preparation time than they really need. Given Lara et al, we know that preparation does always occur, but Haith and Krakauer were quite right that it can be very brief. This helped -- along with neural results -- change our view of preparation from something more cognitive that had to occur, to something more mechanical that was simply a good network strategy, which is indeed the authors' current point. Working a discussion of this into the current paper may or may not make sense, but if there is a place where it is easy to cite, it would be appropriate.

      This is a nice suggestion, and we thank the reviewer for pointing us to the Haith and Krakauer paper. We have now added this reference and extended the paragraph following line 815 to briefly discuss the possible decoupling between preparation and movement initiation that is shown in the Haith paper, emphasizing how this may affect the interpretation of the internal delay and comparisons with behavioral experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Review: 

      This study used ATAC-Seq to characterize chromatin accessibility during stages of GABAergic neuron development in induced pluripotent stem cells (iPSCs) derived from both Dravet Syndrome (DS) patients and healthy donors. The authors report accelerated GABAergic maturation to a point, followed by further differentiation into a perturbed chromatin profile, in the cells from patients. In a preliminary analysis, valproic acid, an anti-seizure medication commonly used in patients with DS, increased open chromatin in both patient and control iPSCs in a nonspecific manner, and to different degrees in cultures derived from different patients. These findings provide new information about DS-associated changes in chromatin, and provide further evidence for developmental abnormalities in interneurons with DS. 

      Strengths:

      This is a novel study that aims to investigate the epigenetic changes that occur in a sodium channel model of epilepsy; these changes are often ignored but may be an interesting area for future therapeutics. In general, the flow of the paper is good, and the figures are well-designed.

      Thank you for your positive feedback about our work.

      Weaknesses:

      The most substantial weakness relates to the observation that DS is often viewed as a monogenic form of epilepsy. It is directly linked to SCN1A gene haploinsufficiency (Yu et al, 2006; Ogiwara et al, 2007). The gene product is Nav1.1, the alpha subunit of voltage-gated sodium channel type I that regulates neuronal excitability. Yet, analysis was conducted at time points of GABAergic interneuron differentiation in which SCN1A is likely not expressed. The paper would be strengthened if SCN1A expression and Nav1.1 protein were examined across the experimental time course. If SCN1A is not yet expressed, this would complicate any explanation of how the observed epigenetic changes might arise. It also seems counterintuitive that the absence of a sodium channel can accelerate differentiation, when, a priori, one might expect the opposite (a 'less neuronal' signal). 

      Thanks, this is an important point! In our revised manuscript, we have incorporated data on the expression of SCN1A at d19 and d65 of GABAergic development in both the control and patient groups. We first retrieved data from our previous RNA-Seq analysis, showing SCN1A gene expression in our cells at both d19 and d65. We have now updated our text on SCN1A gene expression in the revised manuscript (Revised Supplementary Figure 1A, revised text Line 108-109). Second, we confirmed the dynamics of SCN1A expression by real-time quantitative RT-PCR analysis at four time points of GABAergic development (d0, d19, d35 and d65). Notably, expression of SCN1A was detected by qRT-PCR from d19 onwards, and the expression increased with differentiation. We have now included this information in the revised manuscript (Revised Supplementary Figure 1B, revised text Line 112).

      Related to this, another important limitation of the study is that the controls are cells derived from healthy individuals and not from isogenic lines. The usage of isogenic lines is extremely relevant for every study in which iPSC-derived somatic cells are used to model a disease, but specifically in diseases like DS, in which the genetic background has an ascertained impact on disease phenotype (Cetica et al, 2017 and others). This serious limitation should be considered.

      Yes, we fully agree that isogenic and edited patient-derived iPSC would have been the ideal controls. At an early stage we therefore invested considerable time and effort in generating isogenic lines from patient-derived iPSC. However, editing of the SCN1A variants in patient-derived iPSC proved unsuccessful after several trials and modifications, so we finally turned to iPSC from healthy donors. This is now discussed together with other limitations of our study in the revised manuscript (end of discussion section, lines 499-506).

      In addition, the authors should provide data on variability across cell lines and differentiations to help convince the reader that the results can be attributed to genetic defects, rather than variability across individuals. 

      This is a valuable point. In the revised manuscript, we have now added plots and IF staining from individual samples to give the readers a complete picture of how they are distributed (Revised Supplementary Figure 1C, Revised Supplementary Figure 2, and Revised Supplementary Figure 4).

      In the revised manuscript, we explain in more detail the strategy used to compare the two groups (cases vs. controls). In our analysis, we first compared the dynamic changes of chromatin accessibility cell line by cell line across differentiation. We then extracted the changes common to the different cell lines at each time point (Revised text line 152-155, line 226-228). Using this strategy, we extracted the common changes confined to the control and patient groups, respectively. With this approach we avoid capturing the variability across individuals.
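
      The "per cell line first, then keep what is shared" strategy described above can be sketched as a set intersection. This is a simplified illustration only: the cell line labels, regions, and directions are invented, and matching regions by identical coordinates is an assumption (a real analysis would more likely match peaks by genomic overlap).

```python
from functools import reduce


def common_changes(per_line_changes):
    """Return the (region, direction) changes shared by every cell line."""
    sets = [set(changes) for changes in per_line_changes.values()]
    return reduce(set.intersection, sets) if sets else set()


# Hypothetical differentially accessible regions between two time points,
# called separately in each cell line of one group.
group = {
    "line_1": {("chr1:100-600", "open"), ("chr2:50-400", "closed"), ("chr3:10-300", "open")},
    "line_2": {("chr1:100-600", "open"), ("chr2:50-400", "closed")},
    "line_3": {("chr1:100-600", "open"), ("chr2:50-400", "closed"), ("chr5:5-250", "open")},
}

# Changes seen in only one line (chr3, chr5) are dropped as individual variability.
shared = common_changes(group)
```

      Running the same function on the patient group and comparing the two resulting sets would then give the changes that are common within a group but differ between groups.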

      Additionally, the authors acknowledge the variability of the differentiations and cell lines, which is commendable, and they attribute this to "possibly reflecting cell line specific and endogenous differences reported previously", but it could also have to do with cell death. This is a large confounding factor for ATAC-seq. Certainly, Sup Fig 1C shows lower FRiP scores, consistent with cell death, and there seems to be a lot of death in the representative images. Moreover, the iGABA neurons are very difficult to keep alive, especially to 65 days, without co-culturing with glia and/or glutamatergic neurons. The authors should comment on how much these factors may have influenced their results.

      With this point in mind, we re-examined the QC of our ATAC-Seq data across all samples. As shown in revised Supplementary Figure 2C and Supplementary Figure 4C, our cutoff for FRiP is 15%, and all samples have a FRiP above 15%, including those at the later time points (d35 and d65). We therefore feel confident that the quality of the ATAC-Seq data is good enough for downstream analysis and data interpretation.
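
      For readers unfamiliar with the metric, FRiP (fraction of reads in peaks) is the number of reads falling inside called peaks divided by the total number of reads. The sketch below applies the 15% cutoff mentioned above; the sample names and read counts are invented for illustration, and the function names are hypothetical.

```python
def frip(reads_in_peaks, total_reads):
    """Fraction of reads in peaks: reads overlapping called peaks / all reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_peaks / total_reads


def passes_frip_qc(reads_in_peaks, total_reads, cutoff=0.15):
    """Return True when the sample's FRiP meets the cutoff (15% by default)."""
    return frip(reads_in_peaks, total_reads) >= cutoff


# Hypothetical (reads_in_peaks, total_reads) per sample; not real data.
samples = {
    "sample_A": (3_200_000, 16_000_000),
    "sample_B": (2_000_000, 16_000_000),
}

qc = {name: passes_frip_qc(*counts) for name, counts in samples.items()}
```

      A low FRiP means that most reads are background rather than signal, which is why the reviewer links it to cell death.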

      Regarding the differentiation protocol, we follow a directed differentiation protocol of iPSC towards interneurons. The protocol is described in detail by Maroof et al (reference 34) and was slightly modified in our lab (described in reference 13). With our modified protocol, GABAergic cells are viable beyond day 65 without the need for co-cultures with astrocytes or microglia. This is also reflected by the electrophysiological activity of interneurons at d65 and at later time points (reference 13). Additionally, our ambition was to obtain a homogeneous cell population for further analysis. Adding other cell types to the cultures would have interfered with downstream processes and created a need for cell sorting. Using our protocol, we obtain viable GABA interneurons after up to 100 days in culture. To assess the viability of our cells at the point of sampling (other than by morphological assessment), we used Trypan blue staining and an automated cell counter. Only samples with a viability >90%, which is a commonly used cut-off for cell viability, were processed for ATAC-seq. We have now modified the method section in the revised version to describe the GABAergic differentiation and sampling (line 519-529).

      Finally, changes in gene expression are only inferred, as no RNA levels were measured. If RNA-seq was not possible it would have been good to see at least some of the key genes/findings corroborated with RNA/protein levels vs chromatin accessibility alone, particularly given that these molecular readouts do not always correlate. 

      In our revised manuscript, we include our recently published RNA-seq data obtained at d19 and d65. We also correlated the RNA-seq and ATAC-seq data obtained from the same samples. The Pearson correlations between gene expression and chromatin accessibility were within the range 0.49-0.57 (Revised Supplementary Figure 2G, Revised Supplementary Figure 4G), which is acceptable according to standard criteria. The results confirmed that the quality of the ATAC-Seq data is good enough for analysis of expression levels and chromatin openness in key genes. We also added gene expression levels from RNA-seq (d19 and d65) in our revised manuscript (Revised Figure 1G, Revised Figure 2G). Finally, we performed qRT-PCR analysis of key genes in each cluster and the results are now included in the revised version (Revised Supplementary Figure 3E, Revised Supplementary Figure 5E).
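
      As an illustration of the kind of comparison described above, the Pearson correlation between per-gene expression and per-gene accessibility can be computed as follows. This is a sketch under stated assumptions: the values are invented, and how accessibility is summarized per gene (for example, signal at the promoter) is not specified here.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


# Hypothetical log-scale values for six genes measured in the same sample.
expression = np.array([0.5, 2.1, 3.3, 1.0, 4.2, 2.8])      # RNA-seq
accessibility = np.array([1.2, 1.9, 2.5, 2.2, 3.1, 1.4])   # ATAC-seq

r = pearson_r(expression, accessibility)
```

      A coefficient well below 1 is expected here, since chromatin accessibility is only one of several determinants of transcript abundance.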

      Additional Points:

      (1) Representative images for cell-identity markers for only D65 are shown, and not D0, D19, and D35 though it is stated in the text that this was performed. At a minimum, these representative images should be shown for all lines. 

      As suggested, we have now added images for cell identity markers of all iPSC lines in the revised version (Revised Supplementary Figure 1C).

      (2) What QC was performed on iPSC lines, i.e. karyotype/CNV analysis and confirmation of genotypes?

      All iPSC lines used in this study have been fully characterized according to standard and state-of-the-art procedures: Expression of pluripotency and stemness genes has been shown by immunostaining, flow cytometry and scorecard analysis; integrity of the genome has been assessed by karyotyping using G-banding; differentiation capacity was characterized using an embryoid body assay in combination with scorecard analysis; and genotypes were verified by Sanger sequencing. Please see the following publications for full datasets: Schuster et al., Neurobiol Dis 2019; Schuster et al., Stem Cell Res 2019; Sobol et al., Stem Cells and Development 2015. In our lab, the integrity of iPSC lines is routinely verified using flow cytometry (expression of TRA-1-60 and SSEA4), immunostaining (expression of NANOG, SOX2 and OCT4), Sanger sequencing (targeting variants in the SCN1A gene), cell morphology analysis and analysis of mycoplasma by MycoAlert® (Lonza).

      (3) Were all experiments performed on a single differentiation? Or multiples? Were the differentiations performed with the same type? If not, was batch considered in the analysis? 

      Thank you for raising this question. The Materials and Methods text has been modified as follows to better describe the differentiation and sampling procedure:

      “GABAergic interneuron differentiation from iPSCs was performed as previously described (reference 13). The protocol utilizes DUAL SMAD inhibition to induce neurogenesis towards neural stem cells for 10 days, followed by patterning with high levels of sonic hedgehog for nine days towards cortically fated neuronal progenitor cells (NPC) and subsequent maturation for 46 days, i.e. a total of 65 days (Figure 1A). Neuronal cells at day 65 and onwards are healthy and viable as judged by morphological assessment by light microscopy. Differentiation was performed at least 3 times per cell line.  

      Cell cultures were sampled at days 0 (D0), D19, D35 and D65, respectively, by harvesting cells with TrypLE and centrifugation (300 x g, 3 min). Harvested cells were counted and assessed for viability using trypan blue staining and an automated EVE cell counter (Nano Entek). Samples with a viability of >90% were chosen for ATAC-Seq library preparation (see below).”

      I also assume that technical replicates were merged, and then all three biological replicates were kept for each analysis and outliers were not removed, e.g. Control_D19_8F seems like an example of an outlier. 

      This is a valuable point. We agree that there is variability across the healthy donors and across the patients, respectively, but the quality of the ATAC-Seq data is good after multiple QC assessments (Revised Supplementary Figure 2B-D). The color code in Supplementary Figure 1C may be misleading, as the Pearson correlation of all samples was displayed. Overall, the correlations among replicates across all ATAC-seq samples are above 0.8. At the same time, we observed that samples at d0 cluster together, but those at the later time points do not. We interpret this as related to the cell-line-specific plasticity of chromatin dynamics during differentiation. The observation agrees with our results from PCA (Revised Supplementary Figure 2F).

      (4) In Figure 1C, it is intriguing that the ATACseq signal gets stronger in imN. One might expect it to be strongest in the iPSCs which are undifferentiated and have the highest levels of open chromatin. Is this a function of sequencing depth, or are all the Y-axes normalized across all time points? 

      This is another valuable point. Figure 1C presents the average chromatin openness for cluster-specific regions, not the chromatin openness across the entire genome, which is one reason why the chromatin openness at D35 is higher than at other time points. The genome-wide chromatin openness is presented in revised Supplementary Figure 2D, and we have now updated the figure legend to avoid any potential misunderstanding.

      The sequencing depth is in a similar range for all samples. To give the readers a complete picture, we also present the depth of sequencing reads for each sample (Revised Supplementary Figure 2A and Revised Supplementary Figure 4A). The Y-axes of the genome browser tracks were normalized, and we added the normalized values to the figures.

      (5) In Figure 1F, are these all enriched terms, or were they prioritized somehow? 

      Yes, the enriched terms are prioritized based on biological relevance, and we have now clarified this in the updated legend of the manuscript. In addition, all enriched terms are now included in revised Supplementary Table 2 and Supplementary Table 4.

      (6) In Figure 1G (also the same plots in Fig 2/3), are all these images normalized, i.e. there is no scale bar for each track, and do they represent an aggregate BAM/bigwig?

      Yes, the genome browser tracks were normalized and we have now revised the figures by adding scale bars.

      It would be good to show in supplement the variability across cell lines/diffs - particularly given the variability in the heatmap/PCA - and demonstrate the rigor/reproducibility of these results. This comment applies to all these plots across the 3 figures, particularly as in some instances the samples appear to cluster by individual first and then time point (Sup Fig 3B). 

      Thanks. We have now revised the figure with plots showing individual samples. 

      How confident are the authors that these effects are driven by genotype and not a single cell line? In the Fig 3D representation of NANOG, it is very difficult to see any difference between patient and control. 

      In Figure 3D, we showed common chromatin dynamics in the control and patient groups. To avoid any misunderstanding, we have now updated our legend in the revised manuscript. 

      (7) For the changes in occupancy annotation (UTR/exon/intron etc), are these differences still significant after correcting for variability from cell line to cell line at each time point? I.e. rather than average across all three samples, what is the range?

      Revised accordingly.

      (8) The VPA timepoint is not well-justified. Given that VPA would be administered in patients with fully mature inhibitory neurons, it is difficult to determine the biological relevance. I appreciate that this is a limitation of the model, but this should at least be addressed in the manuscript. 

      We agree that our model system of GABAergic interneuron development has limitations and that cells may not fully recapitulate the development and physiology in vivo. Obvious factors to consider in our system are the directed protocol to enrich for GABAergic interneurons and the differentiation timeline restricted to 65d. This is now discussed (lines 499-506).

      Recommendations for the authors:

      (1) The term 'mutation' has been replaced with the term 'pathogenic variant' or 'likely pathogenic variant' depending on the context, please see PMID: 25741868

      Thank you for pointing this out. We have replaced all instances of “mutation” with “pathogenic variant” throughout the manuscript.

      (2) It is unclear what the nomenclature for sample labelling is in Supplementary Figure 1, e.g. 7C, 8F, 1B.  

      We apologize for this confusion. These are cell line names. We labeled all data and images according to cell line name, i.e. control lines: Ctl1B, Ctl7C and Ctl8F; patient lines: DD1C, DD4A, DD5A. To avoid any potential confusion, we have added a note in the revised legend of Supplementary Figure 1B.

      (3) Can the authors confirm that the DESeq2 FDR values are Benjamini-Hochberg procedure corrected per default settings? If so, this should ideally be added to methods or legend for clarity

      Yes, the DESeq2 FDR values were obtained with default settings, and this has been added to the methods section of the revised manuscript.
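
      For reference, the Benjamini-Hochberg adjustment the reviewer asks about can be written in a few lines. This is a generic sketch of the procedure, not DESeq2's own code, and the p-values are invented; DESeq2 additionally applies independent filtering before adjustment, which is not reproduced here.

```python
import numpy as np


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


# Hypothetical raw p-values for four regions.
raw = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw)
```

      Regions whose adjusted value falls below the chosen FDR threshold would then be reported as differentially accessible.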

      (4) While it makes sense that the authors present the data in the order of Figure 1, and Figure 2, this actually makes it quite difficult to compare the two datasets, especially for the functional enrichment in the "F" figures. It may be helpful to consider re-organizing the figure order. For instance, for the long-term potentiation signal in the DS-iPSCs, what does this mean in terms of biological relevance? Or maybe Figure 2 needs to be supplementary given that Figure 3 is a more direct comparison.  

      Thank you for the suggestions. We attempted to reorganize during our revision. We still believe it is easier for the audience to grasp the main message if we organize it according to our current workflow—first presenting an individual differential landscape for controls and patients, and then comparing the common and unique aspects among them.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, entitled "Merging Multi-OMICs with Proteome Integral Solubility Alteration Unveils Antibiotic Mode of Action", Dr. Maity and colleagues aim to elucidate the mechanisms of action of antibiotics through combined approaches of omics and the PISA tool to discover new targets of five drugs developed against Helicobacter pylori.

      Strengths:

      Using transcriptomics, proteomic analysis, protein stability (PISA), and integrative analysis, Dr. Maity and colleagues have identified pathways targeted by five compounds initially discovered as inhibitors against H. pylori flavodoxin. This study underscores the necessity of a global approach to comprehensively understanding the mechanisms of drug action. The experiments conducted in this paper are well-designed and the obtained results support the authors' conclusions.

      Weaknesses:

      This manuscript describes several interesting findings. A few points listed below require further clarification:

      (1) Compound IVk exhibits markedly different behavior compared to the other compounds. The authors are encouraged to discuss these findings in the context of existing literature or chemical principles.

      This is a good point. We have added the following paragraph (Page No-13).

      “In several of our studies, compound IVk, which has a higher MIC, exhibits markedly different behavior. This difference in behavior may stem from different sources, including intercellular availability, inactivation inside the cell, or loss of target specificity. Multiple studies have previously demonstrated that there is only a 30% chance for a structurally similar compound to have similar biological activity (ref. 32).”

      (2) The incubation time for treating H. pylori with the drugs was set at 4 hours for transcriptomic and proteomic analyses, compared to 20 min for PISA analysis. The authors need to explain the reason for these differences in treatment duration.

      This is now explained on Pages 17 and 19, where the following paragraphs have been included:

      “The incubation time for transcriptomics and proteomics assays was determined based on the Time-Kill Curves assay (Fig. 6(A)). The 4-hour time point shows a significant amount of cell death compared to the control population.”

      “The target deconvolution method aims to evaluate the initial interaction with intracellular proteins. We selected a 20-minute time point based on intracellular ROS generation (not shown). It is a well-reported phenomenon that bactericidal drugs induce early production of ROS.”

      (3) The PISA method facilitates the identification of proteins stabilized by drug treatment. DnaJ and Trigger factor (Tig), well-known molecular chaperones, prevent protein aggregation under stress. Their enrichment in the soluble fraction is expected and does not necessarily indicate direct stabilization by the drugs. The possibility that their stabilization results from binding to other proteins destabilized by the drugs should be considered. To prevent any misunderstanding, the authors should clarify that their methodology does not solely identify direct targets. Instead, the combination of their findings sheds light on various pathways affected by the treatment.

      This is also a very valuable observation. We now clearly state this in new paragraphs on Pages 8 and 13:

      “Another target shared among several compounds is the chaperone protein trigger factor (Tig), which plays a crucial role in facilitating proper protein folding and is indispensable for the survival of bacterial cells. The solubility of this protein has been altered by all the compounds except IVk (Fig. 2(I-J)) in a concentration-dependent manner (Fig. S4(B, D, and E)). The possibility of Tig interacting with other proteins destabilized by the drug, along with the influence of the heat gradient during the PISA assay, may introduce potential noise in the data. Further investigation is required to confirm the interaction of the drug with Tig.”

      “The module “black” associated with this compound contains Tig, which is involved in facilitating proper protein folding, as a target, and it down-regulates multiple proteins associated closely with S12 ribosomal protein of the 30S subunit (Fig. S9(D)) indicating its involvement in stabilization of ribosomal protein.”

      (4) At the end of the manuscript, the authors conclude that four compounds "strongly interact with CagA". However, detailed molecule/protein interaction studies are necessary to definitively support this claim. The authors should exercise caution in their statement. As the authors mentioned, additional research (not mandated in the scope of this current paper) is necessary to determine the drug's binding affinity to the proposed targets.

      We have modified the sentence (Page 15) to say:

      “This study identifies four out of our five compounds that induce significant change in the solubility of CagA, the major virulence factor of H. pylori.”

      (5) The authors should clarify the PISA-Express approach over standard PISA. A detailed explanation of the differences between both methods in the main text is important.

      This was already explained on Page 5 (no changes have been made).

      Reviewer #2 (Public Review):

      Summary:

      This work has an important and ambitious goal: understanding the effects of drugs, in this case antimicrobial molecules, from a holistic perspective. This means that the effect of drugs on a group of genes and whole metabolic pathways is unveiled, rather than their immediate effect on a protein target only. To achieve this goal the authors successfully implement the PISA-Express method (Protein Integral Solubility Alteration), using combined transcriptomics, proteomics, and drug-induced changes in protein stability to retrieve a large number of genes and proteins affected by the used compounds. The compounds used in the study (compound IVa, IVb, IVj, and IVk) were all derived from the precursor compound IV, they are effective against Helicobacter pylori, and their mode of action on clusters of genes and proteins has been compared to the one of the known H. pylori drug metronidazole (MNZ). Due to this comparison, and confirmed by the diversity of responses induced by these very similar compounds, it can be understood that the approach used is reliable and very informative. Notably, although all compound IV derivatives were designed to target H. pylori flavodoxin (Fld), only one showed a statistically significant shift of Fld solubility (compound IVj, FIG S11). For most other compounds, instead, the involvement of other possible targets affecting diverse metabolic pathways was also observed, notably concerning a series of genes with other important functions: CagA (virulence factor), FtsY/FtsA (cell division), AtpD (ATP-synthase complex), the essential GTPase ObgE, Tig (protein export), as well as other proteins involved in ribosomal synthesis, chemotaxis/motility and DNA replication/repair.
Finally, for all tested molecules, in vivo functional data have been collected that parallel the omics predictions, corroborating them and showing that compound IV derivatives differently affect cellular generation of reactive oxygen species (ROS), oxygen consumption rates (OCR), DNA damage, and ATP synthesis.

      Strengths:

      The approach used is very potent in retrieving the effects of chemically active molecules (in this case antimicrobial ones) on whole cells, evidencing protein and gene networks that are involved in cell sensitivity to the studied molecules. The choice of these compounds against H. pylori is perfect, showcasing how different the real biological response is, compared to the hypothetical one. In fact, although all molecules were retrieved based on their activity on Fld, the authors unambiguously show that large unexpected gene clusters may be, and in fact are, affected by these compounds, and each of them in different manners.

      Impact:

      The present work is the first report relying on PISA-Express performed on living bacterial cells. Because of its findings, this work will certainly have a high impact on the way we design research to develop effective drugs, allowing us to understand the fine effects of a drug on gene clusters, drive molecule design towards specific metabolic pathways, and eventually better plan the combination of multiple active molecules for drug formulation. Beyond this, however, we expect this article to impact other related and unrelated fields of research as well. The same holistic approaches might also allow gaining deep, and sometimes unexpected, insight into the cellular targets involved in drug side effects, drug resistance, toxicity, and cellular adaptation, in fields beyond the medicinal one, such as cellular biology and environmental studies on pollutants.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Please modify these few concerns:

      -  It is unclear from the introduction and discussion whether conventional transcriptomic and proteomic analyses have previously been conducted on the compounds examined in this study. If only targeted studies have been performed please clarify this further.

      To make this clearer, we have added the following paragraph on page 5:

      “Our investigation into understanding the mode of action of nitro-benzoxadiazole compounds commenced with a comparison of the conventional transcriptional and translational changes induced by these compounds, the vehicle control (DMSO), and the commercially used drug MNZ. RNA sequencing (RNA-seq) and expressional proteomics were employed to identify transcriptional and translational changes, respectively.”

      -  The decision to monitor the oxygen consumption rate (OCR) is based on the hypothesis that the drugs would impact flavodoxin function. Could the authors cite specific studies that suggest a reduction in flavodoxin leads to decreased OCR that can be measured?

      The reviewer is correct that we carried out this study based on our hypothesis that a reduction in flavodoxin may lead to decreased OCR. To our knowledge, there are no previous studies indicating this, so we now clearly state (page 14) that it is our hypothesis.

      “On the other hand, given that these drugs indicated involvement of multiple factors from the electron transport chain including flavodoxin and we observed significant drop in the ATP production rate (Fig. 6(D)) associated to compounds IV and IVj, we have investigated the changes in oxygen consumption rate (OCR) as we hypothesize that a reduction in soluble flavodoxin could lead to decreased OCR”

      -  Increase font size in some figures and supplemental materials for clarity.

      We acknowledge the reviewer's comment and have addressed it to the best possible extent in the figures.

      -  Correct figure references throughout the text (example of mistake p4, Fig S1D, p6 S1C).

      We have corrected the figure references.

      -  Check spelling errors, for example, Figure S1B: "library preparation".

      We have revised the figures and corrected spelling errors.

      -  Ensure H. pylori is in italics.

      Done!

      -  Figure S4: Replace (D) by (E).

      Done! Thank you.

      -  Page 7: Check the sentence: "...RpleE, InfC) and F Furthermore, we..." .

      Corrected!  

      “The 20 common essential targets are mostly associated with cell division (for example, FtsZ), small subunit ribosomal proteins (RspC, RspE, RspL, RplE, InfC). Furthermore, we identified a few unique changes for compound IV (DnaN, involved in DNA tethering and processivity of DNA polymerases, and C694_06445, which could be a functional equivalent of delta subunit of DNA polymerase III).”

      -  Page 9: Please modify the name of one compound "Compounds IV, IVj (and not IVk) and MnZ downregulate...".

      Both reviewers raised this point, so we revisited the data. Fig. S8(B) shows that compounds IV, IVk, and MNZ cluster together and downregulate the genes associated with this pathway. We have therefore left the text unchanged.

      -  Figure S9: please clarify symbols (triangles and others) in the Figure legend.

      Done!

      -  Page 9: Is it the Figure S9B you are referring to? Talking about proteomics?

      Sorry, we have not understood the above comment.

      Reviewer #2 (Recommendations For The Authors):

      All figures are printed as one per page. In this format, almost all pictures suffer a severe problem with dimensions. Notably graph axes and axis values, subtitles, and legends within the pictures are too small, although the graphical part is almost always appropriate. Negative example (higher fonts are needed): Figure 1. Positive example (font ok): Figure 2A or Figure 3 right panels.

      We have carefully revised our figures to address the issues you mentioned, ensuring that elements are visible when printed one per page. In Fig. 1, we have increased the font sizes of the graph axes, axis values, subtitles, and legends to improve readability. Additionally, we have color-matched Gene Ontology (GO) terms across panels for better readability. In Fig. 2, to enhance clarity, we have resized the figure by removing the top 10 protein list, which is now presented in a separate table. This ensures that the figure's main content remains prominent. These modifications have been made across figures to maintain consistency and readability.

      For all figures, particularly for non-experts, not only a list of what is found in the picture should be provided, but also a minimal, simplified key of interpretation (of what is to be noticed). Particularly relevant for scatter plots.

      We have modified the legends to provide simplified key interpretation for the scatter plots. 

      In general for most analyses I see the involvement of FtsA, whereas most discussions concern FtsY and FtsZ. Maybe this point should be clarified. For example: i) FtsZ is quoted in the Second "Results" paragraph (page 6), but we can't find this gene in Figure 2, nor in the corresponding table (Figure 2A); ii) FtsY downregulation is quoted in the Fifth "Results" paragraph (page 9), but we can't find this gene in Figure 5, 9S or 10S.

      We are not entirely sure if we have understood the reviewer's comment correctly, as we did not mention FtsY in our discussion section. In the discussion section, we have focused on the involvement of FtsZ and FtsA with some of our compounds. We decided to discuss them together because FtsZ is the primary component that is recruited to the membrane by the actin-related protein FtsA, while the role of FtsY remains highly debated.

      Figure 1: same colour for the same GO: term in different panels should be used.

      Done!

      Figure 4: please specify (being it essential throughout the whole paper) that the group colouring only refers to Figure 4A, lower bar.

      Done!

      Figure 5, S9, and S10: having the combination of analysed sets (brown / IV, magenta / IVb, etc.) as a panel subtitle is almost a necessity, to avoid constant page turning. I did rewrite all of them by hand to be able to follow the main text story.

      Done!

      What are the triangles? (this is not written anywhere).

      We have now explained this in the legends of Fig5.

      Figures S9 and S10 are too crowded (please refer to Figure 5 for a good format/size).

      For supplementary figures S9 and S10 we prefer to keep the gene names, but in order to make them more legible we have now added subtitles to each panel.

      Second and third "Results" paragraph. Explicitly saying that the Second is only focused on TOP 10 hits, at the beginning of the paragraph (while the third on essential genes) would help enormously the non-specialist in orienting among the different sections.

      On page 7, we have revised the text to indicate that the paragraph is only focused on the top 10 hits. Additionally, we have included a table of top 10 hits for better clarity and accessibility. 

      Page 6: the following sentence should be in the introduction, to stress the novelty of the work: "This is the first time PISA assay, in the form of PISA-Express, has been successfully performed in living bacterial cells, with protocols adapted and modified from previous PISA studies in mammalian cells".

      Page- 2 

      We agree this is an important point. However, having stated it in both the abstract and the PISA section of the Results, we prefer not to state it once more in the Introduction.

      (no changes made)

      I couldn't find any reference to Figure S3 in the text.

      Included! (P 9)

      "Compounds IV, IVk, and MNZ downregulate the genes associated with this pathway (Fig. 4(B) & S8(B))": it seems to me that it is IVj rather than IVk to downregulate. Please check carefully.

      Both reviewers raised this point, so we revisited the data. Fig. S8(B) shows that compounds IV, IVk, and MNZ cluster together and downregulate the genes associated with this pathway. We have therefore left the text unchanged.

      Page 12: of the pre-defined target like flavodoxin => of the pre-defined target flavodoxin.

      Thanks! We have removed “like” from the sentence.

      Metronidazol (=MNZ) only appears on page 13 (MNZ already on page 8).

      Corrected! The correspondence is now first indicated on p. 3.

      Please resolve the ambiguity metronidazol/metronidazole (main text and figures).

      We now always say “metronidazole”

      The Sixth "Results" paragraph (pages 10-11) should be developed a bit more. All Figure 6 results are summarized in 8 lines at the end of the paragraph. This doesn't bring much, particularly to a non-specialist reader. Please, for each panel, clearly explain what is to be noticed and what main conclusion(s) can be extracted.

      We have improved the description of the section. The modified part now reads:

      …This indicates that the nitro-bearing groups have a higher propensity to generate ROS. We have also observed that the genes associated with the generation of ROS are significantly overexpressed for compounds IV, IVb, IVj, and MNZ (Fig. S12(A)). As described above and depicted in Fig. S12(B), multiple DNA damage repair proteins and genes are down-regulated in the presence of compounds IV, IVb, IVj, and MNZ. Additionally, DNA PolA was found to be a major target for compound IVj. Following these results, we investigated compound-induced DNA damage using the APO BrdU TUNEL assay. All the compounds, particularly IV and IVj, caused significant DNA damage (Fig. 6(C)).

      On the other hand, given that these drugs indicated involvement of multiple factors from the electron transport chain including flavodoxin and we observed significant drop in the ATP production rate (Fig. 6(D)) associated to compounds IV and IVj, we have investigated the changes in oxygen consumption rate (OCR) as we hypothesize that a reduction in soluble flavodoxin could lead to decreased OCR.  Though the signal-to-noise ratio of these data is poor…

      and we added figure S12 for clarity.   

      In the same section I found: "Compound IV and its derivatives cause a marked increase in ROS generation when compared to the control (DMSO)" => does this refer to THIS work or to previous work? (In the latter case, please cite it.)

      This data is from our current paper, as shown in Fig 6(B).

      In the same paragraph, "the signal-to-noise ratio of these data is considerable" => does it mean that you have good (high signal-to-noise) data, or that the noise is too high for precise quantification? I understood the latter, but this sentence definitely needs to be rewritten.

      Thank you for pointing out the mistake. Your interpretation is correct. We have corrected the sentence.

    1. Author response:

      The following is the authors’ response to the original reviews.

      (1) The conclusions in the text are very broad and general but often based on a limited number of examples. It would be important that the authors hit the appropriate tone when most of the analysis (in Figure 5) is derived from n=3 events.

      We have tried to hit the correct tone here by modifying our manuscript text. In particular, we have added a pie chart to Figure 4 (Figure 4C) that summarises data from all RBMX targets, not just the original n=3, and shows that most RBMX targets are rescued by RBMXL2.

      (2) The fractions of long/ultra-long exons actually bound by/regulated by RBMX are not clearly stated - which is in contrast to the general statement of the title (implying a global role for RBMX in proper splicing of ultra-long exons).

      (i) We have changed our title (now “An anciently diverged family of RNA binding proteins maintain correct splicing of a class of ultra-long exons through cryptic splice site repression”).

      (ii) We also state much more clearly the fractions of long/ultra-long exons bound by RBMX, with the following text:

      “…..This led us to test whether RBMX protein is preferentially associated with long exons. For this we plotted the distribution of internal exons bound and regulated by RBMX together with all internal exons expressed from HEK293 mRNA genes (Liu et al., 2017) (Figure 2 – Source Data 1). We found that RBMX controls and binds two different classes of exons: the first have comparable length to the average HEK293 exon, while the second were extremely long, exceeding 1000 bp in length (Figure 2F). We defined this second class as ‘ultra-long exons’, which represented the 18.9% of internal exons regulated by RBMX and 17.6% of the ones that contained RBMX iCLIP tags. These proportions were significantly enriched compared to the general abundance of internal ultra-long exons expressed from HEK293 cells, which was only 0.4% (Figure 2G)……”

      “…….We next wondered whether ultra-long exons regulated by RBMX (which represented 11.6% of all ultra-long internal exons from genes expressed in HEK293) had any particular feature compared to ultra-long exons that were RBMX-independent……..”
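
      As an illustration only (this is not the authors' pipeline), the proportions quoted above can be thought of as a simple length-threshold calculation. The sketch below assumes exon lengths are available as plain lists of base-pair counts; the function names and the toy numbers are hypothetical, and only the 1000 bp cut-off comes from the quoted text.

```python
ULTRA_LONG_BP = 1000  # cut-off taken from the quoted text: exons exceeding 1000 bp


def ultra_long_fraction(exon_lengths, threshold=ULTRA_LONG_BP):
    """Return the fraction of exons whose length exceeds the threshold."""
    if not exon_lengths:
        raise ValueError("need at least one exon length")
    n_ultra = sum(1 for length in exon_lengths if length > threshold)
    return n_ultra / len(exon_lengths)


def fold_enrichment(subset_lengths, background_lengths, threshold=ULTRA_LONG_BP):
    """Ratio of the ultra-long fraction in a subset to that in the background."""
    background = ultra_long_fraction(background_lengths, threshold)
    if background == 0:
        raise ValueError("background contains no ultra-long exons")
    return ultra_long_fraction(subset_lengths, threshold) / background


if __name__ == "__main__":
    # Hypothetical toy lengths (bp), not data from the study.
    regulated = [120, 1500, 3000, 180]
    all_expressed = [100, 150, 200, 1200, 90, 130, 140, 160]
    print(ultra_long_fraction(regulated))
    print(fold_enrichment(regulated, all_expressed))
```

      A real analysis would additionally need a statistical test for the enrichment; the quoted text does not name one, so none is assumed here.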

      (3) The authors should state what fraction of ultra-long exons show cryptic splicing in the RBMX siRNA that are corrected by RBMXL2 overexpression (rather than just showing the 3 events). There's some confusion about the global nature of the conclusions relative to the data displayed.

      This is a good point. We have used the RNAseq information as suggested, and included a pie chart (Figure 4C) that includes this information.

      (4) It would be helpful if the authors could identify if there are some motifs more present in ultra-long exons than others.

      Good point, we have included k-mer analysis of the ultra-long exons bound by RBMX, and also more generally ultra-long exons in the human genome, in Figure 2H and 2I. We also add the following text:

      K-mer analyses also showed that while ultra-long exons within mRNAs are rich in AT-rich sequences compared to shorter exons (Figure 2H), the ultra-long exons that are either regulated or bound by RBMX displayed enrichment of AG-rich sequences (Figure 2I), consistent with our identified RBMX-recognised sequences (Figure 2C).
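
      For readers unfamiliar with k-mer analysis, the following is a minimal sketch of the general idea: count overlapping k-mers in a foreground set of sequences and compare their frequencies with a background set. It is not the authors' implementation; the function names, the pseudocount, and the toy sequences are assumptions made for illustration.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping k-mers in one sequence, skipping any with non-ACGT letters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_frequencies(sequences, k):
    """Pool k-mer counts over several sequences and normalise to frequencies."""
    total = Counter()
    for seq in sequences:
        total.update(kmer_counts(seq, k))
    n = sum(total.values())
    return {kmer: c / n for kmer, c in total.items()} if n else {}


def log2_enrichment(foreground, background, k, pseudocount=1e-6):
    """log2 ratio of foreground to background frequency for every observed k-mer.

    The pseudocount (an arbitrary illustrative value) avoids division by zero
    for k-mers absent from one of the two sets.
    """
    fg = kmer_frequencies(foreground, k)
    bg = kmer_frequencies(background, k)
    return {
        kmer: math.log2((fg.get(kmer, 0.0) + pseudocount) / (bg.get(kmer, 0.0) + pseudocount))
        for kmer in set(fg) | set(bg)
    }


if __name__ == "__main__":
    # Hypothetical toy sequences: an AG-rich foreground and an AT-rich background.
    scores = log2_enrichment(["AGAGAGGA"], ["ATATATTA"], k=2)
    for kmer, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(kmer, round(score, 2))
```

      Positive scores mark k-mers over-represented in the foreground, which is the sense in which AG-rich sequences are described as enriched above.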

      (5) The authors should evaluate whether RBMX-repressed 3' splice sites have splice site scores/strengths similar to or lower than those of natural 3' splice sites.

      We have added splice site score analyses in Figure 1F and Figure 1 Supplement 1B. These show that the cryptic splice sites repressed by RBMX are not significantly different from those that are normally used. We add the following text to accompany these figure panels:

      “Furthermore, analysis of splice site strength revealed that, unlike splice sites activated by RBMX (Figure 1 – Figure supplement 1B), alternative splice sites repressed by RBMX have comparable strength to more commonly used splice sites (Figure 1F). This means that RBMX operates as a splicing repressor in human somatic cells to prevent use of ‘decoy’ splice sites that could disrupt normal patterns of gene expression.”

      (6) The section "RBMX protein-RNA interactions may insulate important splicing signals from the spliceosome." is a very preliminary look at possible mechanisms. Can you integrate the RNA Seq and CLIP datasets to generate "splicing maps" that would provide more generalized insights? In fact, where possible, it would be great to integrate the iCLIP data from the same cell types to generate RNA splicing maps (with the KD RNA-seq data)

      We have added “RNA map-type” plots to integrate iCLIP data with splicing patterns (Figure 2 Figure supplement 1D and 1E), and made corresponding changes to the text.

      Additional changes

      We also made some extra changes to respond to the further points raised by reviewers.

      (1) We have carried out gene ontology analysis of those genes that contain RBMX-regulated ultra-long exons versus all ultra-long exons (now Figure 3A, and also Figure 3- Figure supplement 1A and 1B).

      (2) We have corrected the cartoon summarising the branch point analysis (now Figure 3 – Figure Supplement 2F).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In this manuscript, by using simulation, in vitro and in vivo electrophysiology, and behavioral tests, Peng et al. nicely showed a new approach for the treatment of neuropathic pain in mice. They found that terahertz (THz) waves increased Kv conductance and decreased the frequency of action potentials in pyramidal neurons in the ACC region. Behaviorally, terahertz (THz) waves alleviated neuropathic pain in the mouse model. Overall, this is an interesting study. The experimental design is clear, the data is presented well, and the paper is well-written. I have a few suggestions.

      (1) The authors provide strong theoretical and experimental evidence for the impact of terahertz wave frequency on voltage-gated potassium channels. However, the modulation of action potentials also relies on non-voltage-dependent ion channels. For example, I noticed that the RMP was affected by THz application (Figure 3F) as well. As the RMP is largely regulated by the leak potassium channels (tandem-pore potassium channels), I would suggest testing whether terahertz wave photons also have any impact on the Kleak channels.

      Thank you for your positive comment and for providing us with this valuable suggestion. After testing the leak K+ current with and without HFTS on the SNI model, we observed a notable increase in the leak K+ current with HFTS when the holding potential surpassed -40 mV (please see the revised Figs. 2m and n). This finding prompted us to delve deeper into the shifts in the resting membrane potential (RMP). The data, along with statistical analysis, are detailed in Tables S1-3.

      (2) The activation curves of the Kv currents in Figure 2h seem to be not well-fitted. I would suggest testing a higher voltage (>100 mV) to collect more data to achieve a better fitting.

      Thanks for your advice. We repeated the experiment while maintaining the voltage of patched neurons at a higher level (>100 mV) to collect ample data for better fitting. The outcomes are illustrated in the revised Figs. 2g-j. Clearly, the data reveals a significant increase in K+ conductance in the HFTS group as compared to the SNI group. We have integrated these discoveries into the revised manuscript, replacing the earlier results.

      (3) In the part of behavior tests, the pain threshold increased after THz application and lasted within 60 mins. I suggest conducting prolonged tests to determine the end of the analgesic effect of terahertz waves.

      Thank you for your insightful comment. We echo your curiosity about the duration of the HFTS effect. In the process of revising our work, we conducted a comparative analysis of the analgesic duration resulting from 10-minute and 15-minute applications of HFTS. The findings are visualized in the revised Fig. 5c. Our observations indicate that after 160 minutes, the PWMT value for the 15-minute HFTS group decreased to a level comparable to that of the SNI group. Meanwhile, the analgesic effects persisted for 140 minutes in the case of the 10-minute HFTS application. These results imply a direct correlation between the duration of HFTS application and the duration of analgesia.

      (4) Regarding in vivo electrophysiological recordings, the post-HFTS recordings were acquired from a time window of up to 20 min. It seems that the HFTS effect lasted for minutes, but this was not tested in vitro where they looked at potassium currents. This long-lasting effect of HFTS is interesting. Can the authors discuss it and its possible mechanisms, or test it in slice electrophysiological experiments?

      Thank you for your comment. Based on the results from in vivo electrophysiological recordings, it was observed that the effect of HFTS can endure for a minimum of 20 minutes, and this duration was even more extended in behavioral assessments. Taking your advice, we employed slice electrophysiological recording for further testing. Following a 15-minute application of HFTS, we evaluated the K+ current at 5 and 20 minutes after incubation. Our observations clearly indicated a substantial and lasting increase in K+ current, with the effect persisting for at least 20 minutes (refer to Fig. 2l). This provides confirmation of the long-lasting influence of HFTS. The relevant data and statistical analysis are documented in Table S1-2.

      (5) How did the authors arrange the fiber for HFTS delivery and the electrode for in vivo multi-channel recordings? Providing a schematic illustration in Figure 4 would be useful.

      Thank you for your comment. To enhance the reader's understanding of the HFTS delivery device during multi-channel recording, we have included a schematic illustration in Fig. 4a in the revised manuscript. The top portion of Fig. 4a depicts a quantum cascade laser (QCL) with a center frequency located at approximately 36 THz. This laser is then connected to the recording electrode via a PIR fiber. The left section illustrates the detailed structure of the recording electrode.

      (6) Some grammatical errors should be corrected.

      Thank you for your thorough review. We have carefully checked and corrected grammar errors we found throughout the entire text to ensure that readers can better comprehend the content of the article.

      Reviewer #2 (Public Review):

      In this manuscript, Peng et al. reported that 36 THz high-frequency terahertz stimulation (HFTS) can suppress the activity of pyramidal neurons by enhancing the conductance of voltage-gated potassium channels. The authors also demonstrated the effectiveness of using 36 THz HFTS for treating neuropathic pain.

      Strengths:

      The manuscript is well written and the conclusions are supported by robust results. This study highlighted the potential of using 36 THz HFTS for neuromodulation.

      Weaknesses:

      More characterization of HFTS is needed, so the readers can have a better assessment of the potential usage of HFTS in their own applications.

      Thank you for your suggestion. We have created schematic diagrams illustrating the HFTS delivery (Fig. 4a and Fig. 5a in the revised manuscript). Fig. 4a presents the structure designed for in vivo multi-channel recording. Fig. 5a shows the structure used in the behavioral test, in which the recording electrode is replaced by a hollow metal tube, allowing the PIR fiber to pass through the tube and target the ACC region of the mice.

      (1) It would be very helpful to estimate the volume of tissue that can be influenced by HFTS. It is not clear how 15 mins HFTS was chosen for this functional study. Does a longer time have a stronger effect? A better characterization of the relationship between the stimulus duration of HFTS and its beneficial effects would be very useful.

      Thank you for your feedback. The degree of tissue influence is directly related to the size of the spot emerging from the fiber outlet. In our experiment, we used a PIR fiber with a 630 nm inner core diameter to propagate high-frequency THz waves. This core features a refractive index of 2.15 and has an effective numerical aperture (NA) of 0.35 ± 0.05.

      Our decision to apply HFTS for 15 minutes in the behavioral study was primarily based on observations from in vivo multi-channel recordings. Specifically, we noticed a considerable reduction in the average firing rate of PYR cells after 15 minutes of HFTS exposure. To further investigate the correlation between the duration of HFTS stimulation and its effects, we conducted a comparative study using a 10-minute HFTS session. The results, depicted in revised Fig. 5c, reveal that the PWMT value decreased to the level seen in the SNI group after approximately 160 minutes following 15 minutes of HFTS, and after about 140 minutes with 10 minutes of HFTS. This suggests a direct relationship between the length of HFTS application and its beneficial outcomes.

      (2) How long does the behavioral effect last after 15 minutes of HFTS? Figure 5b only presents the behavioral effect for one hour, but the pain level is still effectively reduced at this time point. The behavioral measurement should last until pain sensitization drops back to pre-stim level.

      Thank you for your feedback. A similar question was raised by Reviewer 1. As depicted in Fig. 5c, the analgesic effects lasted for 140-160 minutes after a 10-15-minute application of HFTS. Based on these findings, we conclude that in the SNI model, targeting the ACC with HFTS for 10-15 minutes produces an analgesic effect that lasts for roughly 140-160 minutes. This provides valuable insights into the potential clinical applications and duration of relief that can be achieved through HFTS treatment.

      (3) Although the manuscript only tested the ACC, it would also be useful to demonstrate the neural modulation effect in other brain regions. Would 36 THz HFTS also robustly modulate activities in other brain regions? Or are different frequencies needed for different brain regions?

      Thank you for your comment. We hypothesize that light waves at a frequency of approximately 36 THz effectively modulate neuronal activities in various brain regions, primarily due to their impact on K channels. Additionally, we speculate that the application of THz waves at different frequencies may influence other channels, such as Na and Ca channels, potentially facilitating or inhibiting neuronal activities. We believe this is a fascinating and significant area of research to explore in the future.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript by Peng et al. presents intriguing data indicating that high-frequency terahertz stimulation (HFTS) of the anterior cingulate cortex (ACC) can alleviate neuropathic pain behaviors in mice. Specifically, the investigators report that terahertz (THz) frequency stimulation widens the selectivity filter of potassium channels thereby increasing potassium conductance and leading to a reduction in the excitability of cortical neurons. In voltage clamp recordings from layer 5 ACC pyramidal neurons in acute brain slice, Peng et al. show that HFTS enhances K current while showing minimal effects on Na current. Current clamp recording analyses show that the spared nerve injury model of neuropathic pain decreases the current threshold for action potential (AP) generation and increases evoked AP frequency in layer 5 ACC pyramidal neurons, which is consistent with previous studies. Data are presented showing that ex-vivo treatment with HFTS in slice reduces these SNI-induced changes to excitability in layer 5 ACC pyramidal neurons. The authors also confirm that HFTS reduces the excitability of layer 5 ACC pyramidal neurons via in vivo multi-channel recordings from SNI mice. Lastly, the authors show that HFTS is effective at reducing mechanical allodynia in SNI using both the von Frey and Catwalk analyses. Overall, there is considerable enthusiasm for the findings presented in this manuscript given the need for non-pharmacological treatments for pain in the clinical setting.

      Strengths:

      The authors use a multifaceted approach that includes modeling, ex-vivo and in-vivo electrophysiological recordings, and behavioral analyses. Interpretation of the findings is consistent with the data presented. This preclinical work in mice provides new insight into the potential use of directed high-frequency stimulation to the cortex as a primary or adjunctive treatment for chronic pain.

      Weaknesses:

      There are a few concerns noted that if addressed, would significantly increase enthusiasm for the study.

      (1) The left Na current trace for SNI + HFTS in Figure 2B looks to have a significant series resistance error. Time constants (tau) for the rate of activation and inactivation for Na currents would be informative.

      Thank you for your feedback. We have carefully considered your comments and made several adjustments in the revised Figs. 2b-f to improve clarity and accuracy. Firstly, we have conducted a comparison of the time constants (tau) between the SNI group and the SNI+HFTS group. These time constants represent the latency of Na current activation or inactivation relative to the half-activated/inactivated voltage. Our analysis reveals that there is no statistically significant difference in tau between the two groups for both activation and inactivation curves. Secondly, we have updated the sample traces in Fig. 2b of the revised manuscript. These new traces illustrate that tau does not significantly differ between the SNI and SNI+HFTS groups, providing a visual representation of our findings. We believe that these modifications strengthen the presentation of our study's details and results, making the data more accessible and understandable for readers.

      (2) It is unclear why an unpaired t-test was performed for paired data in Figure 2. Also, statistical methods and values for non-significant data should be presented.

      Thank you for your comment. We believe you are referring to the results in Fig. 3. We agree that one-way ANOVA should be used to analyze these data, since more than two groups are compared. We therefore re-analyzed the data using one-way ANOVA in Figs. 3g-k, and have included detailed statistical methods and P values in the revised manuscript.
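
      To make the choice of test concrete, here is a minimal sketch of how a one-way ANOVA F statistic is computed for three or more groups. It is an illustration under simplifying assumptions (independent groups, no post hoc tests), not the authors' analysis; the function name and the toy values are hypothetical. In practice a library routine such as `scipy.stats.f_oneway` would normally be used, which also returns the P value.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    F is the between-group mean square divided by the within-group mean square.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and more values than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [fmean(g) for g in groups]

    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical spike-frequency values (Hz) for three groups, for illustration only.
    sham = [1.0, 2.0, 3.0]
    sni = [3.0, 4.0, 5.0]
    sni_hfts = [2.0, 3.0, 4.0]
    print(one_way_anova_f([sham, sni, sni_hfts]))
```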

      (3) It would seem logical to perform HFTS on ACC-Pyr neurons in acute slices from sham mice (i.e. Figure 3 scenario). These experiments would be informative given the data presented in Figure 4.

      Thank you for your valuable advice. During the revision process, we performed HFTS on ACC-PYR neurons in acute slices obtained from sham mice. The findings from this experiment have been integrated into the updated Fig. 3, where the sham group is represented by the green line and histogram (the revised Fig. 3 in the manuscript). It is noteworthy that a significant decrease in spike frequency was observed in the sham mice following HFTS.

      (4) As the data are presented in Figure 4g, it does not seem as if SNI significantly increased the mean firing rate for ACC-Pyr neurons, which is observed in the slice. The data were analyzed using a paired t-test within each group (sham and SNI), but there is no indication that statistical comparisons across groups were performed. If the argument is that HFTS can restore normal activity of ACC-Pyr neurons following SNI, this is a bit concerning if no significant increase in ACC-Pyr activity is observed in in-vivo recordings from SNI mice.

      Thank you for highlighting the inaccuracies in the analysis. After reviewing the data, we re-analyzed it using alternative statistical methods. In the revised version, since the data did not follow a normal distribution, we employed Wilcoxon matched-pairs signed-rank tests within the sham and SNI groups, and Mann-Whitney tests between the sham and SNI groups.

      Upon comparing the statistical outcomes across the groups, we found that the mean firing rate of 130 ACC neurons in SNI mice was significantly higher compared to that of 108 ACC neurons in sham mice (P = 0.0447, Mann-Whitney test). Notably, the mean firing rate of ACC-PYR exhibited a more pronounced increase with a P value of 0.0274 in SNI pre-HFTS versus sham pre-HFTS, while the mean firing rate of ACC-INT did not display a significant change across the groups. These findings align with the observations we made in the slice, reinforcing the validity of our results.
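
      The between-group comparison described above is rank-based. As a minimal sketch of what a Mann-Whitney test computes, the function below derives the U statistic from pooled midranks. The function name and the firing-rate values are hypothetical illustrations, not the authors' code or data, and the sketch stops at U; a P value would come from a statistics package.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, using midranks for ties."""
    # Pool both samples, tagging each value with its group (0 = x, 1 = y).
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    rank_sum_x = 0.0
    i, n = 0, len(pooled)
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_x, n_y = len(x), len(y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    u_y = n_x * n_y - u_x
    return u_x, u_y


# Hypothetical mean firing rates (Hz), for illustration only.
sham_rates = [1.2, 0.8, 2.1, 1.5]
sni_rates = [2.4, 3.0, 1.9, 2.8]
print(mann_whitney_u(sham_rates, sni_rates))
```

      A small U for one group means its values mostly rank below those of the other group; the within-group pre/post comparison would instead use a paired signed-rank procedure.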

      (5) The authors indicate that the effects of HFTS are due to changes in Kv1.2. However, they do not directly test this. A blocking peptide or dendrotoxin could be used in voltage clamp recordings to eliminate Kv1.2 current and then test if this eliminates the effects of HFTS. If K current is completely blocked in VC recordings then the authors can claim that currents they are recording are Kv1.1 or 1.2.

      Thank you for your kind suggestion. In our research, we employed the Kv1.2 structure as a model to determine the response frequency of terahertz waves. Through both in vitro and in vivo experiments, we were able to demonstrate that the frequency of approximately 36 THz affects the Kv channel and its corresponding spike frequency. Upon analyzing the action potential waveform, we observed a notable variance in the resting membrane potential (RMP). This RMP is predominantly controlled by leak potassium channels, specifically the Tandem-pore potassium channels. In accordance with the recommendation of reviewer 1, we have addressed this particular aspect of our experimentation in the revised manuscript.

      We agree that blocking peptides or dendrotoxin should be used to eliminate the Kv1.2 current. However, we encountered problems purchasing and receiving these drugs. We have therefore added an explanation to the Discussion emphasizing the value of this pharmacological experiment, which we plan to perform in future work.

      (6) The ACC is implicated in modulating the aversive aspect of pain. It would be interesting to know whether HFTS could induce conditioned place preference in SNI mice via negative reinforcement (i.e. alleviation of spontaneous pain due to the injury). This would strengthen the clinical relevance of using HFTS in treating pain.

      Thank you for this valuable advice. We share your interest in this experiment and fully recognize the importance and potential of further exploring this area. At present, however, limitations of our equipment and platform prevent us from conducting the necessary tests. We remain committed to pursuing this line of research in the future.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      (1) Study suggests that the effects of their tumor model on mouse behavior are largely non-specific to the tumor, as most behaviors are rescued by analgesic treatment. So, most of the changes were likely due to site-specific pain and not a unique signal from the tumor.

      The tumor generates pain at the site it is implanted, and it is likely amplified by the oral activities tumor-bearing mice have to engage in. As there is no pain in the absence of the tumor, the pain is, by definition, caused by the tumor, not by the site. Concerning the relationship between pain and behavior, the behavioral assays undertaken in our study (nesting, cookie test, wheel running) were very limited in scope. Two of these assays (nesting, cookie test) require use of the oral cavity. Only nesting and wheel running were assessed in the context of treatment for pain. Nesting behavior was completely restored with carprofen and buprenorphine treatment suggesting that in the absence of pain, mice were able to make perfect nests. Consistent with this, carprofen and buprenorphine treated animals also gained weight indicating that eating (another activity dependent on the oral cavity) was also restored. Wheel running, an activity that does not rely on the oral cavity, was only partially restored with drug treatment. While additional behavioral tests are necessary to confirm this finding, the data suggest that there is pain-independent information relayed to the brain which accounts for this decline in wheel running.

      Reviewer #2:

      (1) The main claim is that tumor-infiltrating nerves underlie cancer-induced behavioral alterations, but the experimental interventions are not specific enough to support this. For example, all TRPV1 neurons, including those innervating the skin and internal organs, are ablated to examine sensory innervation of the tumor. Within the context of cancer, behavioral changes may be due to systemic inflammation, which may alter TRPV1 afferents outside the local proximity of tumor cells. A direct test of the claims of this paper would be to selectively inhibit/ablate nerve fibers innervating the tumor or mouth region.

      We agree with the reviewer that a direct test of the hypothesis would require selectively inhibiting the nerve fibers innervating the tumor and assessing the impact on behavior. Studies in the lab are on-going using pharmacological interventions to do this. These studies are beyond the scope of this current manuscript.

      (2) Behavioral results from TRPV1 neuron ablation studies are in part confounded by differing tumor sizes in ablated versus control mice. Are the differences in behavior potentially explained by the ablated animals having significantly smaller tumors? The differences in tumor sizes are not negligible. One way to examine this possibility might be to correlate behavioral outcomes with tumor size.

      As suggested by the reviewer, we have graphed nesting scores and time-to-interact (cookie test) relative to tumor volume. In both cases, we used simple linear regression to fit the data and analyzed the slopes of the lines. In the case of nesting, there was no significant difference between the slopes. This is now included as Supplemental Figure 4A. In the case of the cookie test, there was a significant difference between the slopes. This is now included as Supplemental Figure 4B. Graphing the data in this way allows one to look at any given tumor volume and infer what the nesting score and the time-to-interact are for the two groups of mice. The linear regression model fits the time to interact with the cookie reasonably well; thus, from this graph, we can see that at any given tumor volume the time to interact with the cookie was generally shorter in TRPV1cre::DTAfl/wt animals than in C57BL/6 mice. Unfortunately, the linear regression does not fit the nesting data very well, and thus it is more difficult to compare nesting scores at a given tumor volume.

      The following text has been added to the results section.

      Given the impact of nociceptor neuron ablation on tumor growth, we wondered whether differences in tumor volume contributed to the behavioral differences we noted. Thus, the behavior data were graphed as a function of tumor volume (Supplemental Fig 4A, B). A simple linear regression model was used to fit the data. In the case of nesting scores, the linear regression did not fit the data points very well making it difficult to assess nesting scores at a given tumor volume (Supplemental Fig 4A). However, the linear regression model fit the time to interact data better. Here, the graph suggests that tumor volume did not influence behavior as at any given tumor volume the time to interact with the cookie is generally smaller in TRPV1-Cre::Floxed-DTA animals as compared to C57BL/6 animals (Supplemental Fig 4B).
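
      The reasoning above (fit a line per group, check how well it fits, then compare the groups at a common tumor volume) can be sketched as follows. This is an illustrative sketch only: the function names and all data values are hypothetical, and it does not reproduce the authors' analysis or their test for a difference between slopes.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Fraction of variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def predict(volume, slope, intercept):
    """Value of the fitted line at a given tumor volume."""
    return slope * volume + intercept


# Hypothetical tumor volumes (mm^3) and times to interact (s), one list per group.
volumes = [100, 200, 300, 400]
control_times = [20, 30, 40, 50]
ablated_times = [10, 15, 20, 25]

control_fit = fit_line(volumes, control_times)
ablated_fit = fit_line(volumes, ablated_times)

# Compare the two groups at the same tumor volume.
print(predict(250, *control_fit), predict(250, *ablated_fit))
print(r_squared(volumes, control_times, *control_fit))
```

      A low R² for one of the fits, as reported for the nesting data, is the signal that predictions read off that line at a given tumor volume should not be trusted.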

      Reviewer #3:

      (1) The authors mention in their Discussion the need for additional experiments. Could they also include / comment on the potential impact on the anti-tumor immune system in their model?

      The following text has been added to the discussion:

      Neuro-immune interactions have been studied in the context of a variety of conditions including, but not limited to, infection 109, inflammation 110,111, homeostasis in the gut 112-114, as well as neurological diseases 115,116. Neuro-immune communications in the context of cancer and behavior have also been studied (e.g., sickness behavior, depression) 117-119; however, these studies did not assess these interactions at the tumor bed. Investigations into neuro-immune interactions occurring within primary malignancies which harbor nerves have shed light on these critical communications. In the context of melanoma, which is innervated by sensory nerves, we identified that release of the neuropeptide calcitonin gene-related peptide (CGRP) induces immune suppression. This effect is mediated by CGRP binding to its receptor, RAMP1, which is expressed on CD8+ T cells 49. A study utilizing a different syngeneic model of oral cancer similarly found an immune suppressive role for CGRP 120-122. These studies demonstrate that neuro-immune interactions occur at the tumor bed. Our current findings indicating that tumor-infiltrating nerves connect to a circuit that includes regions within the brain suggest that neuro-immune interactions within the peripheral malignancy may contribute to the behavioral alterations we studied.

      (2) The authors mention the importance of inflammation contributing to pain in cancer but do not clearly highlight how this may play a role in their model. Can this be clarified?

      The following text has been added to the discussion section of the manuscript.

      Moreover, given that carprofen and buprenorphine decrease inflammation 104, their ability to restore normal nesting and cookie test behaviors (which require the use of the oral cavity where the tumor is located) suggests that inflammation at the tumor site contributed to the decline in these behaviors in vehicle-treated animals. Since both drugs were given systemically and each only partially restored wheel running, it suggests that systemic inflammation alone cannot fully account for the decline in wheel running seen in vehicle-treated animals. We posit that the inflammation- and pain-independent component of this behavioral decline is mediated via the transcriptional and functional alterations in the cancer-brain circuit.

      (3) The tumor model apparently requires isoflurane injection prior to tumor growth measurements. This is different from most other transplantable types of tumors used in the literature. Was this treatment also given to control (i.e., non-tumor) mice at the same time points? If not, can the authors comment on the impact of isoflurane (if any) in their model?

      Mice in all groups (tumor and non-tumor) were treated with isoflurane. This important detail has been added to the methods section.

      (4) The authors emphasize in several places that this is a male mouse model. They mention this as a limitation in the Discussion. Was there an original reason why they only tested male mice?

      The following text has been added in the discussion section:

      Head and neck cancer is predominantly a cancer in males; it occurs in males three times more often than in females 123, and this disparity increases in certain parts of the world. While smoking cigarettes and drinking alcohol are risk factors for HPV-negative head and neck squamous cell carcinoma, even males who do not smoke or drink have a higher susceptibility to this cancer than females 124,125. Thus, our studies used only male mice. However, we do recognize that females also get this cancer. In fact, female patients with head and neck cancer, particularly oral cancer, report more pain than their male counterparts 126,127. These findings suggest that differences in tumor innervation exist in males and females.

      Therefore, another project in the lab has been to compare disease characteristics (including innervation and behavior) in male and female mice. The findings from this second study are the topic of a separate manuscript.

      Recommendations For The Authors:

      Reviewing editor:

      (1) Tumors can communicate with the brain via blood-borne agents from the tumor itself or immune cells that are activated by the tumor in addition to neurons that invade the tumor. The xia and malaise that accompanies some tumors can be mediated by direct innervation and/or the humoral factors because both can activate the same parabrachial pathway. This paper makes the case for the direct innervation being important but ignores the possibility of both being involved. The interesting observation that innervation supports tumor growth (perhaps via substance P) is troublesome because the slower appearance of behavioral consequences (Figures 4 & 5) could be attributed to the smaller tumor size. A nice control for humoral effects would be to implant the tumor cells someplace in the body where innervation does not occur (if possible) and then examine behavioral outcomes.

      In the course of several projects, we have implanted different tumor cell lines in different locations in mice (oral cavity, hind limb, flank, peritoneal cavity). In each location, tumor innervation occurs. This is not a phenomenon found only in mice as we completed an immunohistological survey of human cancers from different sites and found they are all innervated (PMID 34944001). These data are consistent with tumor and locally-released factors that recruit nerves to the tumor bed (PMID: 30327461)(PMID: 32051587)(PMID: 27989802). Thus, an implantation site that does not result in tumor innervation is currently unknown and likely does not exist.

      (2) The authors should address whether there is an inflammatory component in this tumor model.

      MOC2-7 tumors have been characterized as non-inflamed and poorly immunogenic 129-131.

      This information has been added to the methods section.

      (3) The RTX experiment in Figure 5 would be more compelling if the drug was injected directly into the tumor rather than injecting it in the flank, thus ablating all TRPV1-expressing neurons as in the genetic approach.

      While we agree with the reviewer that ablating the TRPV1-expressing neurons at the tumor site directly would be ideal, RTX treatment takes approximately one week for ablation to occur, and a significant amount of inflammation is associated with it. Therefore, we wait a total of 4 weeks for the inflammation to resolve. By this time, tumors have generally reached sacrifice criteria. Thus, this approach would not enable the question to be answered. Moreover, we are not aware of any studies in which RTX has been injected in the oral cavity or face. While RTX is utilized clinically to treat pain, it is typically administered intrathecally, epidurally or intra-ganglionically (PMID: 37894723).

      (4) The authors address affective aspects of pain but do not adequately address the sensory aspects, e.g., sensitivity to touch, heat and/or cold. They attribute the decrease in food disappearance (consumption) and nest building to oral pain, but it could be due to anhedonia and anorexia that can accompany tumor progression.

      Assaying for touch and heat/cold sensitivity in the oral cavity is a critical aspect of studying head and neck cancer that needs to be addressed. However, in rodents these assays are not trivial given that any touch/heat/cold in the area of the tumor (oral cavity) impacts the sensitive whiskers in that region which directly influence these assays. Thus, we have been refining assays (e.g., OPAD, facial von Frey) to address these important questions. The findings from these studies are beyond the scope of this manuscript.

      The reviewer makes a good point about anhedonia and anorexia. The following text has been added to the results section:

      Pain-induced anhedonia is mediated by changes in the reward pathway. Specifically, in the context of pain, dopaminergic neurons in the ventral tegmental area (VTA) become less responsive to pain and release less serotonin. This decreased serotonin results in disinhibition of GABA release; the resulting increased GABA promotes an increased inhibitory drive leading to anhedonia 82 and, when extreme, anorexia. Carprofen and buprenorphine treatments completely reversed nesting behavior and significantly improved eating. Inflammation 83 and opioids 84 directly influence reward processing and though our tracing studies did not indicate that the tumor-brain circuit includes the VTA, this brain region may be indirectly impacted by tumor-induced pain in the oral cavity. Thus, an alternative interpretation of the data is that the effects of carprofen and buprenorphine treatments on nesting and food consumption may be due to inhibition of anhedonia (and anorexia) rather than, or in addition to, relieving oral pain.

      (5) Comment on why only males were used in this study.

      Please see response to public reviews.

      Reviewer #1:

      (1) Please provide a justification for the use of exclusively male mice and expand in the discussion if there is potential for these findings to be directly applicable to female mice as well.

      Please see response to public reviews.

      The following text has been added to the discussion:

      Head and neck cancer is predominantly a cancer in males; it occurs in males three times more often than in females 123, and this disparity increases in certain parts of the world. While smoking cigarettes and drinking alcohol are risk factors for HPV-negative head and neck squamous cell carcinoma, even males who do not smoke or drink have a higher susceptibility to this cancer than females 124,125. Thus, our studies used only male mice. However, we do recognize that females also get this cancer. In fact, female patients with head and neck cancer, particularly oral cancer, report more pain than their male counterparts 126,127. These findings suggest that differences in tumor innervation exist in males and females.

      (2) When discussing the results shown in Figure 2, please include some mention of Fus, since it was the highest expressed transcript.

      The following text has been added to the results section regarding Fus.

      The gene demonstrating the highest increase in expression, Fus, was of particular interest; it increases in expression within DRG neurons following nerve injury and contributes to injury-induced pain 51,52. Of note, we purposefully used whole trigeminal ganglia rather than FACS-sorted tracer-positive dissociated neurons to avoid artificially imposing injury and altering the transcript levels of these cells 53,54. Thus, significantly elevated expression of Fus by ipsilateral TGM neurons from tumor-bearing animals suggests the presence of neuronal injury induced by the malignancy. This is consistent with our previous findings 55 and those of others 56 showing that tumor-infiltrating nerves harbor higher expression of nerve-injury transcripts and neuronal sensitization.

      (3) In line 197 please clarify the mice used. Were all mice tumor-bearing and some had nociceptors ablated, or was there a control (no tumor) group as well?

      Line 197 refers to Figure 4D. In this figure, panels B-D show quantification of cFos and ΔFosB in the spinal nucleus of the TGM (SpVc), the parabrachial nucleus (PBN) and the central nucleus of the amygdala (CeA). These data are from C57BL/6 and TRPV1cre::DTAfl/wt animals, all of which were tumor-bearing. Supplementary Figure 3C also shows quantification of cFos and ΔFosB, but these data are from control, non-tumor-bearing animals. The fact that controls are non-tumor-bearing has been added to the supplemental figure legend, and the text of the results section has been clarified as follows.

      While Fos expression was similar between non-tumor bearing mice of the two genotypes (Supplemental Fig. 3C-E), the absence of nociceptor neurons in tumor-bearing animals decreases cFos and ΔFosB in the PBN, and ΔFosB in the SpVc (Fig. 4B, C).

      (4) Overall it would improve the readability of the figures if the colors for the IHC channels were on the image itself and not exclusively in the figure legend.

      The colors for all the staining have been added to each panel.

      (5) It is not a problem that complete cartography was not done, but please include a justification for why the brain regions that were focused on were chosen.

      In order to ensure that our neural tracing technique captured only nerves present within the tumor bed, we restricted the injection of tracer to only 2 µl. We demonstrated that this small volume did not leak out of the tumor (Figure 1), and thus any tracer-labeled neurons we identified were deemed to be connected in a circuit to nerves in the tumor bed. While we acknowledge that this calculated technical approach restricted our ability to label with tracer all neurons in the tumor bed (as well as those they share circuitry with), it ensured no tracer leakage and no inadvertent labeling of non-tumoral nerves. In non-tumor animals injected with 10 µl of tracer, labeled regions in the brain included the spinal nucleus of the trigeminal, the parabrachial nucleus, the central amygdala, the facial nucleus and the motor nucleus of the trigeminal. The regions that were tracer-positive when tumor was injected were limited to the spinal nucleus of the trigeminal, the parabrachial nucleus and the central amygdala. Thus, the regions in the brain that we focused on were the areas that became tracer-positive following injection of tracer into the tumor.

      (6) Were the cells that were injected cultured in media with 10% fetal calf serum? If so was any inflammatory response seen? If not please state in the methods section the media that cells for injection were cultured in.

      The cells injected into animals were cultured in media containing 10% fetal calf serum. When cells are harvested for tumor injections, they are first washed two times with PBS and then trypsinized to detach the cells from the plate. Cells are collected, washed again with PBS and resuspended with DMEM without serum; this is what is injected into animals. We harvest cells in this way in order to eliminate any serum being injected into mice. This information has been added to the Methods section.

      (7) Would any of the differences in drug treatment (Carprofen vs Buprenorphine) be due to the differing routes of administration and metabolism of the drugs?

      Since carprofen and buprenorphine each resulted in similar behavioral impacts (nesting and wheel running), their different routes of administration seem to play a minor or no role in the behaviors assessed.

      (8) Please include in the methods section the specific approach and software that was used for processing calcium imaging data and calculating a relative change in fluorescence.

      The specific approach used for processing calcium imaging data and calculating relative change in fluorescence as well as the software used are all included in the methods section. Please see below:

      Ca2+ imaging. TGM neurons from non-tumor and tumor-bearing animals (n=4-6 mice/condition) were imaged on the same day. Neurons were incubated with the calcium indicator, Fluo-4AM, at 37°C for 20 min. After dye loading, the cells were washed, and Live Cell Imaging Solution (Thermo-Fisher) with 20 mM glucose was added. Calcium imaging was conducted at room temperature. Changes in intracellular Ca2+ were measured using a Nikon scanning confocal microscope with a 10x objective. Fluo-4AM was excited at 488 nm using an argon laser with intensity attenuated to 1%. The fluorescence images were acquired in the confocal frame (1024 × 1024 pixels) scan mode. After 1 min of baseline measure, capsaicin (300 nM final concentration) was added. Ca2+ images were recorded before, during and after capsaicin application. Image acquisition and analysis were achieved using NIS-Elements imaging software. Fluo-4AM responses were standardized and shown as percent change from the initial frame. Data are presented as the relative change in fluorescence (ΔF/F0), where F0 is the basal fluorescence and ΔF = F - F0, with F being the measured intensity recorded during the experiment. Calcium responses were analyzed only for neurons responding to ionomycin (10 µM, positive control) to ensure neuronal health. Treatment with the cell permeable Ca2+ chelator, BAPTA (200 µM), served as a negative control.
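
      The relative change in fluorescence defined above, (F - F0)/F0, can be sketched in a few lines. This is an illustration under stated assumptions, not the NIS-Elements workflow the authors used: the function name, the choice to average a fixed number of baseline frames to obtain F0, and the intensity values are all hypothetical.

```python
def delta_f_over_f0(trace, n_baseline):
    """Relative fluorescence change (F - F0) / F0 for each frame of one cell.

    trace      -- fluorescence intensities, one value per frame
    n_baseline -- number of leading frames averaged to estimate F0 (assumed)
    """
    f0 = sum(trace[:n_baseline]) / n_baseline
    return [(f - f0) / f0 for f in trace]


# Hypothetical intensities: two baseline frames, then a response to stimulus.
trace = [100, 100, 150, 200]
relative = delta_f_over_f0(trace, n_baseline=2)
percent = [100 * r for r in relative]  # expressed as percent change
print(relative, percent)
```

      Setting `n_baseline=1` corresponds to normalizing to the initial frame alone, which matches the "percent change from the initial frame" wording in the methods text.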

      (9) Suggestions for Figure 1:

      - In Figures 1C, D, E, include labels for the days of tumor harvest.

      - Please make the size of the labels the same for 1K and 1L and align them.

      - Microscopy image in Figure 1L for SpVc looks like it may be at a different magnification.

      - If possible, include (either in the figure or the supplement) IHC images staining for Dcx and tau, which would complement the western blot data.

      The requested changes to the figures have been made. Unfortunately, we do not have Dcx and tau IHC staining of the day 4, 10 and 20 tumors.

      (10) Suggestions for Figure 2:

      - Include directly onto the graph in Figure 2a the legend for tumor-bearing (red) and non-tumor bearing (blue).

      - Keep consistent between Figure 2G and 2H/I if the tumor/nontumor will be labeled as T/N or Tumor/Control.

      The requested changes to the figures have been made.

      (11) Suggestions for Figure 3:

      - An example trace of calcium signal would complement Figure 3G, H well.

      Example tracings of calcium signal are already provided in Supplementary Figure 3A and B.

      Reviewer #2:

      (1) While the use of male mice is acknowledged, there is not a rationale for why female mice were not included in the study.

      Please see the response to Reviewer #1 (first question).

      (2) Criteria for euthanasia should be described in the Methods. This is especially needed for interpreting the survival curve in Figure 4H.

      Criteria for euthanasia in our IACUC approved protocol include:

      - maximum tumor volume of 1000 mm³

      - edema

      - extended period of weight loss progressing to emaciation

      - impaired mobility or lesions interfering with eating, drinking or ambulation

      - rapid weight loss (>20% in 1 week)

      - weight loss at or more than 20% of baseline

      In addition to tumor size and weight loss, we use the body condition score to evaluate the state of animals and to determine euthanasia.  These details have been added to the Methods section.

      (3) At what stage in cancer progression were the Fos studies conducted for Figure 4A-D?

      The brains used for Fos staining (Fig 4B-D) were harvested at week 5 post-tumor implantation.

      (4) For Fos counts, what are the bregma coordinates for the sections that were quantified?

      SpVc: -7.56 to -8.24 mm

      PBN: -4.96 to -5.52 mm

      CeA: -0.82 to -1.94 mm

      (5) Statistics are needed for the claim in Lines 171-173.

      The statistical analyses of Fos staining from tumor-bearing and non-tumor-bearing brains are included in Figure 3D-F. The statistical analyses of ex vivo Ca2+ imaging of brains from tumor-bearing and non-tumor-bearing animals are included in Figure 3I and J.

      (6) How long was the baseline period for weight and food intake measurements? How long were the animals single-housed before taking the baseline measurements?  

      Animals were singly housed for 2 weeks before baseline measurements began, and baseline weight and food intake were then measured over 2 weeks (a total of 4 weeks).

      Minor:

      (7) The authors might consider rewording the sentence on lines 59-62, given that it is abundantly clear from rodent studies that both the tumor and chemotherapy are associated with adverse behavioral outcomes.

      We have reworded the sentence as follows:  The association of cancer with impaired mental health is directly mediated by the disease, its treatment or both; these findings suggest that the development of a tumor alters brain functions.

      (8) Line 212 needs a space between the two sentences.

      This has been fixed.

      (9) Font size in Figure 2 is not consistent with the other figures.

      This has been fixed.

      (10) "DAPI" is more conventional than "DaPi".

      This has been fixed.

      Editorial Comments and Suggestions:

      (1) The Abstract would be better if it were more concise, e.g. ~175 words.

      The abstract has been shortened as requested and now reads:

      Cancer patients often experience changes in mental health, prompting an exploration into whether nerves infiltrating tumors contribute to these alterations by impacting brain functions. Using a mouse model for head and neck cancer and neuronal tracing we show that tumor-infiltrating nerves connect to distinct brain areas. The activation of this neuronal circuitry altered behaviors (decreased nest-building, increased latency to eat a cookie, and reduced wheel running). Tumor-infiltrating nociceptor neurons exhibited heightened calcium activity and brain regions receiving these neural projections showed elevated cFos and delta FosB as well as increased calcium responses compared to non-tumor-bearing counterparts. The genetic elimination of nociceptor neurons decreased brain Fos expression and mitigated the behavioral alterations induced by the presence of the tumor. While analgesic treatment restored nesting and cookie test behaviors, it did not fully restore voluntary wheel running indicating that pain is not the exclusive driver of such behavioral shifts. Unraveling the interaction between the tumor, infiltrating nerves, and the brain is pivotal to developing targeted interventions to alleviate the mental health burdens associated with cancer.

      (2) Lines 28, 104, 258, 486, 521, and many other places, "utilized" should be "used" because the former refers to an application for which it is not intended, e.g. a hammer was utilized as a doorstop.

      The requested changes have been made.

      (3) Lines 32 and 73, it is not clear whether the basal activity is heightened or whether excitability is increased. "manifest" might be better than "harbor" on line 73.

      We have changed the wording in the abstract to be clearer. Moreover, our finding that TGM neurons from tumor-bearing animals have increased expression of the σ1 receptor and phosphorylated TRPV1 (Fig 2G-I) indicates that these neurons have increased excitability.

      (4) Line 34 and elsewhere, it would be better to refer to Fos because there is no need to distinguish cellular, cFos, from viral, vFos, in this context.

      The requested changes have been made.

      (5) Line 38, It would be better to refer to what was actually measured rather than "oral movements".

      The requested changes have been made. The sentence now reads: “While analgesic treatment restored nesting and cookie test behaviors, it did not fully restore voluntary wheel running.”

      (6) Line 84, CXCR3-null mouse on a C57BL/6 background.

      The requested change has been made.

      (7) Lines 86,129 wild-type, male mice.

      The requested change has been made.

      (8) Lines 114-115, the brackets are not necessary.

      The requested change has been made.

      (9) Lines 118, 384, 409, 527, 589, 971, 974 always leave a space between numbers and units. Use Greek µ for micro.

      The requested change has been made.

      (10) Lines 123-124, it is not clear that there is meaningful labeling within the CeA.

      We have replaced this image with a more representative one of the CeA from a tumor-bearing animal with clear tracer labeling.

      (11) Lines 125, 138, and 246 transcription was not measured, only transcript levels were measured.

      The requested changes have been made.

      (12) Line 133, I think >4 fold is meant.

      Thank you for catching that. I have fixed it to >4 fold.

      (13) Line 165, single-time-point assessment (add hyphens).

      The requested change has been made.

      (14) Line 181 and elsewhere including figure, the superscripts refer to alleles of the genes; hence approved gene names should be used in italics (as in Methods), TRPV1-Cre:: Floxed-DTA (without italics) would be acceptable.

      The requested changes have been made.

      (15) Line 182, nociceptor-neuron-ablated mice (add hyphens).

      The requested changes have been made.

      (16) Line 197, It is not clear that the "speed" of food disappearance was measured or that it is due to oral pain vs loss of appetite.

      The reviewer makes a good point. We have changed the sentence to read:

      To evaluate the effects of this disruption on cancer-induced behavioral changes, we assessed the animals’ general well-being through nesting behavior 32 and anhedonia using the cookie test 76,77, as well as body weight and food disappearance as surrogates for oral pain and/or loss of appetite.

      (17) Line 199, The reduced tumor growth after ablation could account for most of the changes in the other parameters that were measured.

      We have graphed the nesting scores and time-to-interact with the cookie as a function of tumor volume. These data are now included as Supplemental Figure 4 and suggest that, at the same tumor volume, nesting scores and times-to-interact with the cookie differ between the groups.

      (18) Line 204 TPVP1 spelling. Is the TGN smaller after ablation of half of the neurons?

      The requested change has been made.

      (19) Line 235, "now" is not necessary.

      The requested change has been made.

      (20) Lines 238-239 and elsewhere, a few references for why the TGN-SpVc-PBN-CeA circuit is relevant would be helpful.

      The following references have been added regarding the relevance of this circuit to behavior:

      Molecular Brain 14: 94 (2021) (PMID 34167570)

      Neuropharmacology 198: 108757 (2021) (PMID 34461068)

      Frontiers in Cellular Neuroscience 16: 997360 (2022) (PMID 36385947)

      Neuropsychopharmacology 49(3): 508-520 (2024) (PMID 37542159)

      (21) Lines 371, 434 and Figures, gm should be g or grams in scientific usage. Include JAX lab stock numbers for these mouse lines.

      The requested changes have been made.

      (22) Line 432, removing food for one hour is not a fast.

      The sentence has been reworded as follows: One hour prior to testing, mouse food is removed and the animals are acclimated to the brightly lit testing room.

      (23) Line 476, 5-um sections (add hyphen).

      The hyphen has been added.

      (24) Lines 988, and 1023, DAPI are usually shown this way.

      The requested change has been made.

      (25) Figure 1K, add Bregma levels to figures.

      SpVc: -8.12 mm

      PBN: -5.34 mm

      CeA: -1.34 mm

      (26) Figure 3 line 1033, "area under the curve" What curve was examined?

      The curve examined was the change in fluorescence over time. This curve has been added as Supplemental Figure 3C.

      (27) Figure 3B, the circled area is the lateral PBN. At first glance, I thought scp was meant as the label for the circled area.

      Scp is noted in the figure legend as a landmark.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      For the colony analysis, it is unclear from the methods and main text whether the initial individual sorted colonies were split and subject to different conditions to support the claim of bi-potency. The finding that 40% of colonies displayed tenogenic differentiation, may instead suggest heterogeneity of the sorted progenitor population. The methods as currently described, suggest that two different plates were subject to different induction conditions. It is therefore difficult to assess the strength of the claim of bi-potency.

      Thanks for your valuable comment. We are sorry for the confusing description of the colony assay. We first obtained CD29+/CD56+ cells by FACS. These freshly isolated cells were then randomly seeded into 96-well plates at a density of 1 cell/well. Subsequently, the single cell in each well was cultured in growth medium for ten days to form colonies. Myogenic induction was then performed in three 96-well plates and tenogenic induction in another three 96-well plates for subsequent analyses. Thus, we agree with your point that the sorted progenitor population could be heterogeneous. Almost all the cells highly expressed the myogenic progenitor genes PAX7/MYOD1/MYF5 (Figure 1g), and over 95% of colonies successfully differentiated into myotubes (Figure 2g). We therefore believe that the CD29+/CD56+ cells obtained were myogenic progenitor cells, and that a subgroup of these cells possessed bi-potency.

      This group uses the well-established CD56+/CD29+ sorting strategy to isolate muscle progenitor cells; however, recent work has identified transcriptional heterogeneity within these human satellite cells (e.g., Barruet et al., eLife 2020). Given that they identify a tenocyte population in their human muscle biopsy in Figure 1a, it is critical to understand the heterogeneity contained within the population of human progenitors captured by the authors' FACS strategy and whether tenocytes contained within the muscle biopsy are also CD56+/CD29+.

      Thanks for your constructive suggestion. We will include more samples to perform scRNA-seq and reanalyze the data.

      The bulk RNA sequencing data presented in Figure 3 to contrast the expression of progenitor cells under different differentiation conditions are not sufficiently convincing. In particular, it is unclear whether more than one sample was used for the RNAseq analyses shown in Figure 3. The volcano plots have many genes aligned on distinct curves suggesting that there are few replicates or low expression. There is also a concern that the sorted cells may contain tenocytes as tendon genes SCX, MKX, and THBS4 were among the genes upregulated in the myogenic differentiation conditions (shown in Figure 3b).

      Thanks for your comment. Each group consisted of three samples for RNAseq analyses. We are sorry that there is a minor analysis mistake in Figure 3b and Figure 3c; these panels will be reanalyzed in the revised version. As for contamination by tenocytes, almost all the obtained cells highly expressed the myogenic progenitor markers PAX7/MYOD1/MYF5 (Figure 1g-h). Low expression levels of tendon markers were identified in these cells (Figure 2a-c). Furthermore, although tendon genes were slightly upregulated in myogenic differentiation conditions, these markers were dramatically upregulated in tenogenic differentiation conditions (Figure 2c). Thus, we believe that the tenogenic differentiation ability of the sorted cells can mainly be ascribed to CD29+/CD56+ myogenic progenitor cells.

      Reviewer #2 (Public Review):

      The scRNAseq assay using the total mononuclear cell population did not provide meaningful insight that enriched knowledge of the CD56+/CD29+ cell population. Information on CD56+/CD29+ cells may have been lost because these cells are a minority in the total skeletal muscle mononuclear population, especially given that the total cell number used for scRNAseq was very low and that no information was provided on the participant number and repeat sample number used for this assay. Using these data to claim a stem cell lineage relationship between MuSCs and tenocytes may not be convincing, as seeing both cell types in the total muscle mononuclear population does not establish a lineage connection between them.

      Thanks for your constructive suggestion. We will include more samples to perform scRNA-seq and reanalyze the data.

      The TGF-b pathway assay uses a small-molecule inhibitor of TGF-b to probe Smad2/3. The conclusion that the Smad2/3 pathway is responsible for tenocyte differentiation may be an overinterpretation without Smad2/3-specific inhibitors being applied in the experiments.

      Thanks for your comment. We agree and will revise this conclusion in the revised version.

      Reviewer #3 (Public Review):

      Comment: This dual differentiation capability was not observed in mouse muscle stem cells.

      Thanks for your comment. We have explored the tenogenic differentiation potential of mouse MuSCs both in vivo and in vitro. However, only low tenogenic differentiation ability was found (Figure 4), which might be due to species differences. It may be more demanding for humans to maintain the homeostasis of the locomotion system and whole-organism locomotion ability over a much longer life span and with a bigger body size. Thus, the current study also indicates that animal studies may not be clinically relevant when investigating human diseases.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript entitled "Magnesium modulates phospholipid metabolism to promote bacterial phenotypic resistance to antibiotics", Li et al. demonstrated the role of magnesium in promoting phenotypic resistance in V. alginolyticus. Using standard microbiological and metabolomic techniques, the authors have shown the significance of the fatty acid biosynthesis pathway in the resistance mechanism. This study is significant as it sheds light on the role of an exogenous factor in altering membrane composition, polarization, and fluidity, which ultimately leads to antimicrobial resistance.

      Strengths:

      (1) The experiments were carried out methodically and logically.

      (2) An adequate number of replicates were used for the experiments.

      Weaknesses:

      (1) The introduction section needs to be more informative and to the point.

      (2) The weakest point of this paper is in the logistics through the results section. The way authors represented the figures and interpreted them in the results section (or the figure legends) does not match. The figures are difficult to interpret and are not at all self-explanatory.

      (3) There are too many mislabeled figure panels in the main text, which makes it difficult to find out which figures the authors are explaining. There should be more explanation of why and how they did the experiments and how the results were interpreted.

      (1) We will extensively revise the introduction to make it more informative than the current version.

      (2) We will check the descriptions in the text and the labeling in the figures to make sure they are consistent and logical.

      (3) We will add explanations of the experiments to make clear why we performed the assays.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors aimed to identify if and how magnesium affects the ability of two particular bacteria species to resist the action of antibiotics. In my view, the authors succeeded in their goals and presented a compelling study that will have important implications for the antibiotic resistance research community. Since metals like magnesium are present in all lab media compositions and are present in the host, the data presented in this study certainly will inspire additional research by the community. These could include research into whether other types of metals also induce multi-drug resistance, whether this phenomenon can be observed in other bacterial species, especially pathogenic species that cause clinical disease, and whether the underlying molecular determinants (i.e. enzymes) of metal-induced phenotypic resistance could be new antimicrobial drug targets themselves.

      Strengths:

      This study's strengths include that the authors used a variety of methodologies, all of which point to a clear effect of exogenous Mg2+ on drug resistance in the targeted species. I also commend the authors for carrying out a comprehensive study, spanning evaluation of whole cell phenotypes, metabolic pathways, genetic manipulation, to enzyme activity level evaluation. The fact that the authors uncovered a molecular mechanism underlying Mg2+-induced phenotypic resistance is particularly important as the key proteins should be studied further.

      Weaknesses:

      I believe there are weaknesses in the manuscript, however. The authors take for granted that the reader is familiar with all the assays utilized, and do not properly explain some experiments, and thus I highly suggest that the authors add a brief statement in each situation describing the rationale for each selected methodology (more details are in the private review to the authors). The Results section is also quite long and bogs down at times, and I suggest that the authors reduce its length by 10 to 20%. In contrast, the Introduction is sparse and lacks key aspects, for example, there should be mention of the study's main purpose and approaches, plus an introduction to the authors' choice of species and their known drug resistance properties, as well as the drug of choice (balofloxacin). Another notable weakness is that the authors evaluated Mg2+-induced phenotypic resistance only against two closely related species, and thus the generalizability of this mechanism of drug resistance is not known. The paper would be strengthened if the authors could demonstrate this type of phenotypic resistance in at least one more Gram-negative species and at least one Gram-positive species (antimicrobial susceptibility evaluations would suffice), each of which should be pathogenic to humans. Demonstrating magnesium-induced phenotypic drug resistance in the WHO Priority Bacterial Pathogens would be particularly important.

      We will add explanations of the experiments to make clear why we performed the assays. We will also revise the introduction and shorten the manuscript. Expanding the range of bacterial species is a very good idea, and we will perform such experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      In this work, Odenwald and colleagues show that mutant biotin ligases used to perform proximity-dependent biotin identification (TurboID) can be used to amplify signal in fluorescence microscopy and to label phase-separated compartments that are refractory to many immunofluorescence approaches. Using the parasite Trypanosoma brucei, they show that fluorescent methods such as expansion microscopy and CLEM, which require bright signals for optimal detection, benefit from the elevated signal provided by TurboID fusion proteins when coupled with labeled streptavidin. Moreover, they show that phase-separated compartments, where many antibody epitopes are occluded due to limited diffusion and potential sequestration, are labeled reliably with biotin deposited by a TurboID fusion protein that localizes within the compartment. They show successful labeling of the nucleolus, likely phase-separated portions of the nuclear pore, and stress granules. Lastly, they use a panel of nuclear pore-TurboID fusion proteins to map the regions of the T. brucei nuclear pore that appear to be phase-separated by comparing antibody labeling of the protein, which is susceptible to blocking, to the degree of biotin deposition detected by streptavidin, which is not. 

      Strengths: 

      Overall, this study shows that TurboID labelling and fluorescent streptavidin can be used to boost signal compared to conventional immunofluorescence in a manner similar to tyramide amplification, but without having to use antibodies. TurboID could prove to be a viable general strategy for labeling phase-separated structures in cells, and perhaps as a means of identifying these structures, which could also be useful. 

      Weaknesses: 

      However, I think that this work would benefit from additional controls to address if the improved detection that is being observed is due to the increased affinity and smaller size of streptavidin/biotin compared to IgGs, or if it has to do with the increased amount of binding epitope (biotin) being deposited compared to the number of available antibody epitopes. I also think that using the biotinylation signal produced by the TurboID fusion to track the location of the fusion protein and/or binding partners in cells comes with significant caveats that are not well addressed here, mostly due to the inability to discern which proteins are contributing to the observed biotin signal. 

      To dissect the contributions of the TurboID fusion to elevating signal, anti-biotin antibodies could be used to determine if the abundance of the biotin being deposited by the TurboID is what is increasing detection, or if streptavidin is essential for this.

      We agree with the reviewer that it would be very interesting to distinguish whether the increase in signal comes from the multiple biotinylation sites, from streptavidin being a very good binder, or from both. However, this question is very hard to answer, as antibodies differ massively in their affinity for the antigen, which further depends on the respective IF conditions, and are therefore not directly comparable. Even if anti-biotin gives a better signal than anti-HA, this can be caused by the increase in antigen number (more biotin than HA-tag), by the higher binding affinity, or by a combination of both, and these are hard to distinguish. Nevertheless, we have tested monoclonal mouse anti-biotin, targeting the (non-phase-separated) NUP158. We found the signal from the anti-biotin antibody to be much weaker than that from anti-HA, indicating that at least this particular biotin antibody is not a very good binder in IF.

      Alternatively, HaloTag or CLIP tagging could be used to see if diffusion of a small molecule tag other than biotin can overcome the labeling issue in phase-separated compartments. There are Halo-biotin substrates available that would allow the conjugation of 1 biotin per fusion protein, which would allow the authors to dissect the relative contributions of the high affinity of streptavidin from the increased amount of biotin that the TurboID introduces. 

      This is a very good idea, as in this case the signals are both from streptavidin and are directly comparable. We expressed NUP158 with a HaloTag and added PEG-biotin as a Halo ligand. However, PEG-biotin is poorly cell-permeable and is in general only used on lysates. In trypanosomes, cell permeability is particularly restricted, and even Halo ligands that are considered highly cell-penetrant give only a weak signal. Even after overnight incubation, we could not get any signal with PEG-biotin. Our control, the TMR-ligand 647, gave a weak nuclear pore staining, confirming the correct expression and function of the HaloTag-NUP158.

      The idea of using the biotin signal from the TurboID fusion as a means to track the changing localization of the fusion protein or the location of interacting partners is an attractive idea, but the lack of certainty about what proteins are carrying the biotin signal makes it very difficult to make clear statements. For example, in the case of TurboID-PABP2, the appearance of a biotin signal at the cell posterior is proposed to be ALPH1, part of the mRNA decapping complex. However, because we are tracking biotin localization and biotin is being deposited on a variety of proteins, it is not formally possible to say that the posterior signal is ALPH1 or any other part of the decapping complex. For example, the posterior labeling could represent a localization of PABP2 that is not seen without the additional signal intensity provided by the TurboID fusion. There are also many cytoskeletal components present at the cell posterior that could be being biotinylated, not just the decapping complex. Similar arguments can be made for the localization data pertaining to MLP2 and NUP65/75. I would argue that the TurboID labeling allows you to enhance signal on structures, such as the NUPs, and effectively label compartments, but you lack the capacity to know precisely which proteins are being labeled.  

      We fully agree with the reviewer that tracking proteins by streptavidin imaging alone is problematic, because it cannot distinguish which protein is biotinylated. We therefore used words like “likely” in the description of the data. However, we still think it is a valid method, as long as it is confirmed by an orthogonal method. We have added this paragraph to the end of this chapter:

      “Importantly, tracking of proteins by streptavidin imaging requires orthogonal controls, as the imaging alone does not provide information about the nature of the biotinylated proteins. These can be proximity ligation assay, mass spectrometry or specific tagging visualisation of protein suspects by fluorescent tags. Once these orthogonal controls are established for a specific tracking, streptavidin imaging is an easy and cheap and highly versatile method to monitor protein interactions in a specific setting.”

      Reviewer #2 (Public Review): 

      Summary: 

      The authors noticed that there was an enhanced ability to detect nuclear pore proteins in trypanosomes using a streptavidin-biotin-based detection approach in comparison to conventional antibody-based detection, and this seemed particularly acute for phase-separated proteins. They explored this in detail for both standard imaging but also expansion microscopy and CLEM, testing resolution, signal strength, and sensitivity. An additional innovative approach exploits the proximity element of biotin labelling to identify where interacting proteins have been as well as where they are. 

      Strengths: 

      The data is high quality and convincing and will have obvious application, not just in the trypanosome field but also more broadly where proteins are tricky to detect or inaccessible due to phase separation (or some other steric limitations). It will be of wide utility and value in many cell biological studies and is timely due to the focus of interest on phase separation, CLEM, and expansion microscopy. 

      Thank you! We are glad you liked it.

      Reviewer #3 (Public Review): 

      Summary: 

      The authors aimed to investigate the effectiveness of streptavidin imaging as an alternative to traditional antibody labeling for visualizing proteins within cellular contexts. They sought to address challenges associated with antibody accessibility and inconsistent localization by comparing the performance of streptavidin imaging with a TurboID-HA tandem tag across various protein localization scenarios, including phase-separated regions. They aimed to assess the reliability, signal enhancement, and potential advantages of streptavidin imaging over antibody labeling techniques. 

      Overall, the study provides a convincing argument for the utility of streptavidin imaging in cellular protein visualization. By demonstrating the effectiveness of streptavidin imaging as an alternative to antibody labeling, the study offers a promising solution to issues of accessibility and localization variability. Furthermore, while streptavidin imaging shows significant advantages in signal enhancement and preservation of protein interactions, the authors must consider potential limitations and variations in its application. Factors such as the fact that tagging may sometimes impact protein function, background noise, non-specific binding, and the potential for off-target effects may impact the reliability and interpretation of results. Thus, careful validation and optimization of streptavidin imaging protocols are crucial to ensure reproducibility and accuracy across different experimental setups. 

      Strengths: 

      - Streptavidin imaging utilizes multiple biotinylation sites on both the target protein and adjacent proteins, resulting in a substantial signal boost. This enhancement is particularly beneficial for several applications with diluted antigens, such as expansion microscopy or correlative light and electron microscopy. 

      - This biotinylation process enables the identification and characterization of interacting proteins, allowing for a comprehensive understanding of protein-protein interactions within cellular contexts. 

      Weaknesses: 

      - One of the key advantages of antibodies is that they label native, endogenous proteins, i.e. without introducing any genetic modifications or exogenously expressed proteins. This is a major difference from the approach in this manuscript, and it is surprising that this limitation is not really mentioned, let alone expanded upon, anywhere in the manuscript. Tagging proteins often impacts their function (if not their localization), and this is also not discussed.

      - Given that BioID proximity labeling encompasses not only the protein of interest but also its entire interacting partner history, ensuring accurate localization of the protein of interest poses a challenge. 

      - The title of the publication suggests that this imaging technique is widely applicable. However, the authors did not show the ability to track the localization of several distinct proteins on the same sample, which could be an additional factor demonstrating the outperformance of streptavidin imaging compared with antibody labeling. Similarly, the work focuses only on small 2D samples. It would have been interesting to be able to compare this with 3D samples (e.g. cells encapsulated in an extracellular matrix) or to tissues.  

      Recommendations for the authors:

      To enhance the assessment from 'incomplete' to 'solid', the reviewers recommend that the following major issues be addressed: 

      Major issues: 

      (1) Anti-biotin antibodies in combination with TurboID labeling should be used to compare the signal/labelling penetrance to streptavidin results. That would show if elevated biotin deposition matters, or if it is really the smaller size, more fluors, and higher affinity of streptavidin that's making the difference. 

      We agree with the reviewer that it would be very interesting to distinguish whether the increase in signal comes from the multiple biotinylation sites, from streptavidin being a very good binder, or from both, and whether size matters (IgG versus streptavidin). However, this question is very hard to answer, as antibodies differ massively in their affinity for the antigen. Thus, even if anti-biotin gave a better signal than anti-HA, this could be caused by the increase in antigen number (more biotin than HA-tag), by the better binding affinity, or by a combination of both, and it would not truly answer the question. We have now tested anti-biotin antibodies, also in response to reviewer 1, and obtained a much poorer signal in comparison to anti-HA or streptavidin.

      Please note that we made another attempt, using nanobodies to target phase-separated proteins, to see whether size matters (Fig. 2I). The nanobody did not stain Mex67 at the nuclear pores, but gave a weak nucleolar signal for NOG1, which may suggest that the nanobody penetrates slightly better than IgG, but it does not rule out that the nanobody simply binds with higher affinity. Reviewer 1 has suggested using the HaloTag with PEG-biotin: this would indeed allow a direct comparison of the streptavidin signal caused by the TurboID with that of a single biotin added by the HaloTag. Unfortunately, PEG-biotin does not penetrate trypanosome cells. In conclusion, we are not aware of a method that would establish why streptavidin, but not IgGs, can penetrate phase-separated areas. We therefore prefer not to overinterpret our data, but to stick to what is supported by the data: “the inability to label phase-separated areas is not restricted to anti-HA but applies to other antibodies”.

      (3) Figure 4 A-B. The validity of claiming the correct localization demonstrated by streptavidin imaging comes into question, especially when endogenous fluorescence, via the fusion protein, remains undetectable (as indicated by the yellow arrow at apex). 

      In this figure, the streptavidin imaging does NOT show the correct localisation of the bait protein, but it does show proteins from historic interactions that have a localisation distinct from that of the bait. We had therefore introduced this chapter with the paragraph below, to make sure the reader is aware of the limitations (which we also see as an opportunity, if properly controlled):

      “We found that in most cases, streptavidin labelling faithfully reflects the steady state localisation of a bait protein, e.g., the localisation resembles those observed with immunofluorescence or direct fluorescence imaging of GFP-fusion proteins. For certain bait proteins, this is not the case, for example, if the bait protein or its interactors have a dynamic localisation to distinct compartments, or if interactions are highly transient. It is thus essential to control streptavidin-based de novo localisation data by either antibody labelling (if possible) or by direct fluorescence of fusion-proteins for each new bait protein.”

      In particular, on lines 450-460, there's a fundamental issue with the argument put forward here. It is not possible to formally know that the posterior labeling is ALPH1 vs. another part of the decapping complex that was associated with PABP2-Turbo, or if the higher detection capacity of the Turbo-biotin label is uncovering a novel localization of the PABP2. While it is likely that it is ALPH1, it is not possible to rule out other possibilities with this approach. These issues should be discussed here and more generally the possibility of off-target labeling with this approach should be addressed in the discussion. 

      We fully agree with the reviewer that tracking proteins by streptavidin imaging alone is problematic, because it cannot distinguish which protein is biotinylated. We therefore used words like “likely” in the description of the data. However, we still think it is a valid method, as long as it is backed up by an orthogonal method. We have added this paragraph to the end of this chapter:

      “Importantly, tracking of proteins by streptavidin imaging requires orthogonal controls, as the imaging alone does not provide information about the nature of the biotinylated proteins. These can be proximity ligation assay, mass spectrometry or specific tagging visualisation of protein suspects by fluorescent tags. Once these orthogonal controls are established for a specific tracking, streptavidin imaging is an easy and cheap and highly versatile method to monitor protein interactions in a specific setting.”

      (4) More discussion and acknowledgment of the general limitations in using tagged proteins are needed to balance the manuscript, especially if the hope is to draw a comparison with antibody labeling, which works on endogenous proteins (not requiring a tag). For example: (a) tagging proteins requires genetic/molecular work ahead of time to engineer the constructs and/or cells if trying to tag endogenous proteins; (b) tagged proteins should technically be validated in rescue experiments to confirm the tag doesn't disrupt function in the cell/tissue/context of interest; and (c) exogenous tagged proteins compete with endogenous untagged proteins, which can complicate the interpretation of data.  

      We have added this paragraph to the first paragraph of the discussion:

      “Like many methods that are frequently used in cell and molecular biology, streptavidin imaging is based on the expression of a genetically engineered fusion protein: it is essential to validate both function and localisation of the TurboID-HA tagged protein by orthogonal methods. If the fusion protein is non-functional or mis-localised, tagging at the other end may help, but if not, this protein cannot be imaged by streptavidin imaging. Likewise, target organisms that are not amenable to genetic manipulation, or those with restricted genetic tools, are unsuitable or less suitable for this method.”

      Also, we would like to point out that for non-mainstream organisms like trypanosomes, antibodies are not commercially available, and genetic manipulation is often more time-efficient and cheaper than the production of antiserum against the target protein.

      Also, the introduction would ideally be more general in scope and introduce the pros and cons of antibody labeling vs biotin/streptavidin, which are mentioned briefly in the discussion. The fact that the biotin-streptavidin interaction is ~100-fold higher affinity than an IgG binding to its epitope is likely playing a key role in the results here. The difference in size between IgG and streptavidin, the likelihood that the tetrameric streptavidin carries more fluors than an IgG secondary, and the fact that biotin can likely diffuse into phase-separated environments should be clearly stated. The current introduction segues from a previous paper that a more general audience may not be familiar with.

      We have now included this paragraph in the introduction:

      “It remains unclear why streptavidin was able to stain biotinylated proteins within these antibody-inaccessible regions, but possible reasons are: (i) tetrameric streptavidin is smaller and more compact than IgGs (60 kDa versus a tandem of two IgGs, each with 150 kDa), (ii) the interaction between streptavidin and biotin is ~100-fold stronger than a typical interaction between antibody and antigen and (iii) streptavidin contains four fluorophores, in contrast to only one per secondary IgG.”

      Minor issues: 

      The copy numbers of the HA and Ty1 epitope tags vary depending on the construct being used. For example, Ty1 is found as a single copy tag in the TurboID tag, but on the mNeonGreen tag there are 6 copies of the epitope. It makes it hard to know if differences in detection are due to variations in copies of the epitope tags. Line 372-374: can the authors explain why they chose to use nanobodies in this case? It would be great to show the innate mNeonGreen signal in 2K to compare to the Ty1 labeling. The presence of 6 copies of the Ty1 epitope could be essential to the labeling seen here.

      We agree with the reviewer that these data are somewhat confusing. We have now removed Figure 3K, as it is the only construct with 6 Ty1 copies instead of one, and it does not add to the conclusions (the mNeonGreen signal is entirely in the nucleolus, as shown by TrypTag). We have also added an explanation of why we used nanobodies (“The absence of a nanobody signal rules out that it’s simply the size of IgGs that prevents the staining of Mex67 at the nuclear pores, as nanobodies are smaller than (tetrameric) streptavidin”). However, as stated above, we prefer not to overinterpret the data, as signals from different antibody/nanobody–antigen combinations are not comparable. It was important to us to stress that the absence of signal in phase-separated areas is NOT restricted to the anti-HA antibody, which is clearly supported by the data.

      What does the innate streptavidin background labeling look like in cells that are not carrying a TurboID fusion, from the native proteins that are biotinylated? That should be discussed.

      We have now included the controls without the TurboID fusions for trypanosomes and HeLa cells: “Wild-type trypanosome and human cells showed only a very low streptavidin signal, indicating that the signal from naturally biotinylated proteins is negligible (Figure S8 in supplementary material).”

      Line 328-331: This is likely to be dependent on whether or not the protein moves to different localizations within the cell. 

      True, we agree, and we have added this paragraph:

      “The one exception is very motile proteins that produce a “biotinylation trail” distinct from the steady-state localisation; these exceptions, and how they can be exploited to understand protein interactions, are discussed in chapter 4 below.”

      Line 304-305: Does biotin supplementation not matter at all? 

      No, we never saw any increase in biotinylation when we added extra biotin to trypanosomes. The 0.8 µM biotin concentration in the medium was sufficient.

      Line 326-327: Was the addition of biotin checked for enhancement in the case of the mammalian NUP98? I would argue that there is a significant number of puncta in Figure 1D that are either green or magenta, not both. The amount of extranuclear puncta in the HA channel is also difficult to explain. Biotin supplementation to 500 µM was used in mammalian TurboID experiments in the original Nature Biotech paper- perhaps nanomolar levels are too low. 

      We have now tested HeLa cells with 500 µM biotin and saw an increase in signal, but also in background; because of the increased background, we conclude that low biotin concentrations are more suitable. We have also repeated the experiment using 4HA tags instead of 1HA, and we found a minor improvement in the antibody signal for NUP88 (while the phase-separated NUP54 was still not detectable). We have replaced the images in Figure 1D (NUP88) and in Figure 2F (NUP54) with improved images using 4HA tags. However, we would like to note that single-nuclear-pore resolution is beyond what can be expected of light microscopy.

      Line 371: In 2I, I see a signal that looks like the nucleus, similar to the Ty1 labeling in 2G, so I don't think it's accurate to say that that Mex67 was "undetectable". Does the serum work for blotting? 

      Thank you, yes, “undetectable” was not the correct phrase here. Mex67 localises to the nuclear pores, to the nucleoplasm and to the nucleolus (GFP-tagging or streptavidin). Antibodies, either to the tag or to the endogenous proteins, fail to detect Mex67 at the nuclear pores and also don’t show any particular enrichment in the nucleolus. They do, however, detect Mex67 in the (not phase-separated) area of the nucleoplasm. We have changed the text to make this clearer. The Mex67 antiserum works well on a western blot (see for example: Pozzi, B., Naguleswaran, A., Florini, F., Rezaei, Z. & Roditi, I. The RNA export factor TbMex67 connects transcription and RNA export in Trypanosoma brucei and sets boundaries for RNA polymerase I. Nucleic Acids Res. 51, 5177–5192 (2023)).

      Line 477: "lacked" should be "lagged".

      Thank you, corrected.

      Line 468-481: My previous argument holds here - how do you know that the difference in detection here is just a matter of much higher affinity/quantity of binding partner for the avidin?

      See answer to the second point of (3), above.

      483-491: Same issue - without certainty about what the biotin is on, this argument is difficult to make. 

      See answer to the second point of (3), above.

      Line 530: "bone-fine" should be "bonafide"

      Thank you, corrected.

      Line 602: biotin/streptavidin labeling has been used for expansion microscopy previously (Sun, Nature Biotech 2021; PMID: 33288959). 

      Thank you, we had overlooked this! We have now included this reference and describe the differences from our approach more clearly in the discussion part:

      “Fluorescent streptavidin has been previously used in expansion microscopy to detect biotin residues in target proteins produced by click chemistry (Sun et al., 2021). However, to the best of our knowledge, this is the first report that employs fluorescent streptavidin as a signal enhancer in expansion microscopy and CLEM, by combining it with multiple biotinylation sites added by a biotin ligase. Importantly, for both CLEM and expansion, streptavidin imaging is the only alternative approach to immunofluorescence, as denaturing conditions associated with these methods rule out direct imaging of fluorescent tags.”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment:

      This study presents valuable framework and findings to our understanding of the brain as a fractal object by observing the stability of its shape property within 11 primate species and by highlighting an application to the effects of aging on the human brain. The evidence provided is solid but the link between brain shape and the underlying anatomy remains unclear. This study will be of interest to neuroscientists interested in brain morphology, whether from an evolutionary, fundamental or pathological point of view, and to physicists and mathematicians interested in modeling the shapes of complex objects.

      We have now clarified the outstanding question of whether our model outputs can be related to actual primate brain anatomy. We believe this concern was mainly based on comments regarding the validity of our output of apparently thicker cortices than nature can produce.

      We address this point in more detail in the point-by-point response below, but want to address this misunderstanding directly here: Our algorithm does not produce thicker cortices with increasing coarse-graining scales; in fact, the cortical thickness never exceeds the actual cortical thickness in our outputs, but rather thins with each coarse-graining scale. In other words, we believe that our outputs are fully in line with neuroanatomy across species.

      Reviewer #2 (Public Review): 

      In this manuscript, the authors analyze the shapes of cerebral cortices from several primate species, including subgroups of young and old humans, to characterize commonalities in patterns of gyrification, cortical thickness, and cortical surface area. The authors state that the observed scaling law shares properties with fractals, where shape properties are similar across several spatial scales. One way the authors assess this is to perform a "cortical melting" operation that they have devised on surface models obtained from several primate species. The authors also explore differences in shape properties between brains of young (~20 year old) and old (~80) humans. A challenge the authors acknowledge struggling with in reviewing the manuscript is merging "complex mathematical concepts and a perplexing biological phenomenon." This reviewer remains a bit skeptical about whether the complexity of the mathematical concepts being drawn from are justified by the advances made in our ability to infer new things about the shape of the cerebral cortex. 

      To allow scientists from all backgrounds to adopt these complex ideas, we have made our code to “melt” the brains and for further downstream analysis publicly available. We have now also provided a graphical user interface, to allow users without substantial coding experience to run the analysis. We also believe that the algorithmic concepts are easy to understand due to the similarity to the coarse-graining procedures found in long-standing and well-accepted box-counting algorithms.

      Beyond the theoretical insight of the fractal nature of cortices and providing an explicit and crucial link between vastly different brains that are gyrified and those that are not, we believe that the advance gained by our methods for future applications is clearly demonstrated in our proof-of-principle with a four-fold increase in effect size. For reference, an effect size of 8 would translate to an almost perfect separation of groups, i.e. an ideal biomarker with near 100% sensitivity and specificity.
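
      As a rough illustration of the last sentence, the link between effect size and group separation can be sketched as follows. This is a minimal sketch under assumptions that are not stated in the response: both groups are taken to be normally distributed with equal variance, the effect size is read as Cohen's d, and subjects are classified with a single threshold at the midpoint between the group means. The value d = 2 is only an illustrative baseline; the function name is hypothetical.

```python
from statistics import NormalDist


def midpoint_accuracy(d: float) -> float:
    """Sensitivity (equal to specificity) for two equal-variance normal
    groups whose means differ by d standard deviations, when the decision
    threshold sits halfway between the two means."""
    # Each group mean lies d/2 standard deviations from the threshold.
    return NormalDist().cdf(d / 2)


# d = 2 is an illustrative baseline; d = 8 is the value quoted above.
for d in (2.0, 8.0):
    print(f"d = {d}: sensitivity = specificity = {midpoint_accuracy(d):.5f}")
```

      Under these assumptions, an effect size of 8 places the threshold four standard deviations from each group mean, which is why the two groups barely overlap.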

      (1) The series of operations to coarse-grain the cortex illustrated in Figure 1 produces image segmentations that do not resemble real brains.

      As re-iterated in our Methods and Discussion: “Note, of course, that the coarse-grained brain surfaces are an output of our algorithm alone and are not to be directly/naively likened to actual brain surfaces, e.g. in terms of the location or shape of the folds. Our comparisons here between coarse-grained brains and actual brains are purely on the level of morphometrics across the whole cortex.”

      Fig. 1 therefore serves as an explanation to the reader on the algorithmic outputs, but each melted brain is not supposed to be directly/visually compared to actual brains. Similar to algorithms measuring the fractal dimension, or the exposed surface area of a given brain, the intermediate outputs of these algorithms are not supposed to represent any biologically observed brain structures, but rather serve as an abstraction to obtain meaningful morphometrics.

      We additionally added a note to the caption of Fig. 1 to clarify this point:

      “Note that the actual size of the brains for analysis are rescaled (see Methods and Fig. 3); we display all brains scaled at an equal size here for the ease of visualisation of the method.”

      Finally, we also edited the entire paper for terminology to clearly distinguish the terms of (1) the cortex as a 3D object, (2) coarse-grained and voxelised versions thereof, and (3) summary morphological measures derived from the former. When we invite comparisons in our paper between real brains and coarse-grained brains, this is always at the level of summary morphological measures, not at the level of the 3D objects/voxelisations themselves.

      The process to assign voxels in downsampled images to cortex and white matter is biased towards the former, as only 4 corners of a given voxel are needed to intersect the original pial surface, but all 8 corners are needed to be assigned a white matter voxel. The reason for introducing this bias (and to the extent that it is present in the authors' implementation) is not provided.

      This detail was in the Supplementary, and we have now added additional clarification on this specific point to our Supplementary:

      “In detail, we assign all voxels in the grid with at least four corners inside the original pial surface to the pial voxelization. This process allows the exposed surface to remain approximately constant with increasing voxel sizes. A constant exposed surface is desirable, as we only want to gradually ‘melt’ and fuse the gyri, but not grow the bounding/exposed surface as well. We want the extrinsic area to remain approximately constant as we decrease the intrinsic area via coarse-graining; it is like generating iterates of a Koch curve in reverse, from more to less detailed, by increasing the length of smallest line segment.

      We then assign voxels with all eight corners inside the original white matter surface to the white matter voxelization. This is to ensure integrity of the white matter, as otherwise white matter voxels in gyri may become detached from the core white matter, and thus artificially increase white matter surface area. Indeed, the main results of the paper are not very sensitive to this decision using all eight corners, vs. e.g. only four corners, as we do not directly use white matter surface area for the scaling law measurements. However, we still maintained this choice in case future work wants to make use of the white matter voxelisations or derivative measures.”

      Note, on the point of white matter integrity, that if both grey and white matter voxelisations required all 8 corners to be inside the respective mesh, there would be voxels not assigned to either at the grey/white matter interface, causing potential downstream issues.
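
      The two corner rules described above can be sketched in a few lines. This is not the authors' implementation: it is a minimal sketch that assumes NumPy, replaces the pial and white matter meshes with hypothetical point-membership functions (here two concentric spheres, which are toy geometry and not brain data), and uses function names invented for illustration.

```python
import numpy as np


def corners_inside(inside, origin, voxel_size, shape):
    """Count how many of each voxel's 8 corners pass the test `inside`.

    `inside` is a hypothetical stand-in for a point-in-mesh test: it takes an
    (..., 3) array of points and returns a boolean array of shape (...).
    """
    nx, ny, nz = shape
    # The corner lattice has one more point than there are voxels per axis.
    axes = [origin[i] + voxel_size * np.arange(n + 1) for i, n in enumerate(shape)]
    points = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    mask = inside(points).astype(np.int8)
    count = np.zeros(shape, dtype=np.int8)
    # Sum the 8 shifted views of the corner mask, one per voxel corner.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                count += mask[dx:dx + nx, dy:dy + ny, dz:dz + nz]
    return count


def voxelise(inside_pial, inside_white, origin, voxel_size, shape):
    """Apply the quoted rules: at least 4 corners for pial, all 8 for white."""
    pial = corners_inside(inside_pial, origin, voxel_size, shape) >= 4
    white = corners_inside(inside_white, origin, voxel_size, shape) == 8
    return pial, white


def sphere(radius):
    """Toy membership test: points within `radius` of the origin."""
    return lambda p: np.linalg.norm(p, axis=-1) <= radius


# Toy geometry only: outer sphere as "pial", inner sphere as "white matter".
pial, white = voxelise(sphere(10.0), sphere(8.0),
                       origin=(-12.0, -12.0, -12.0), voxel_size=2.0,
                       shape=(12, 12, 12))
grey = pial & ~white  # voxels left over for the cortical ribbon
print(pial.sum(), white.sum(), grey.sum())
```

      Because the white matter rule is the stricter of the two, every white matter voxel in this sketch is also a pial voxel, so no voxel at the grey/white interface is left unassigned.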

      We further acknowledge:

      “Of course, our proposed procedure is not the only conceivable way to erase shape details below a given scale; and we are actively working on related algorithms that are also computationally cheaper. Nevertheless, the current version requires no fine-tuning, is computationally feasible and conceptually simple, thus making it a natural choice for introducing the methodology and approach.”

      The authors provide an intuitive explanation of why thickness relates to folding characteristics, but ultimately an issue for this reviewer is, e.g., for the right-most panel in Figure 2b, the cortex consists of several 4.9-sided voxels and thus a >2 cm thick cortex. A structure with these morphological properties is not consistent with the anatomical organization of typical mammalian neocortex. 

      We assume the reviewer refers to Fig. 1B with the panel on scale=4.9mm. We would like to point out that Fig. 1 serves as an explanation of the voxelisation method. For the actual analysis and Results, we are using re-scaled brains (see Fig. 2 with the ever decreasing brain sizes). The rescaling procedure is now expanded as below:

      “Morphological properties, such as cortical thicknesses measured in our ‘melted’ brains, are to be understood as a thickness relative to the size of the brain. Therefore, to analyse the scaling behaviour of the different coarse-grained realisations of the same brain, we apply an isometric rescaling process that leaves all dimensionless shape properties unaffected (more details in Suppl. S3.1). Conceptually, this process fixes the voxel size, and instead resizes the surfaces relative to the voxel size, which ensures that we can compare the coarse-grained realisations to the original cortices, and test if the former, like the latter, also scale according to Eqn. (1). Resizing, or more precisely, shrinking the cortical surface is mathematically equivalent to increasing the box size in our coarse-graining method. Both achieve an erasure of folding details below a certain threshold. After rescaling, as an example, the cortical thickness also shrinks with increasing levels of coarse-graining, and never exceeds the thickness measured at native scale.”
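
      The statement that isometric rescaling leaves dimensionless shape properties unaffected can be checked with a short sketch. The function names and the input numbers are hypothetical and purely illustrative (they are not measurements from the study); the only facts used are that areas scale with the square of a length factor and thickness scales linearly with it.

```python
import math


def rescale(total_area, exposed_area, thickness, factor):
    """Isometric rescaling: lengths scale by `factor`, areas by `factor`**2."""
    return total_area * factor**2, exposed_area * factor**2, thickness * factor


def shape_ratios(total_area, exposed_area, thickness):
    """Two dimensionless quantities: an area ratio and a relative thickness."""
    return total_area / exposed_area, thickness / math.sqrt(exposed_area)


original = (2000.0, 800.0, 2.5)  # arbitrary illustrative values, not data
shrunk = rescale(*original, factor=0.25)

print(shape_ratios(*original))
print(shape_ratios(*shrunk))
```

      The absolute thickness shrinks with the rescaling factor, while both ratios stay the same, so any relationship expressed purely in such dimensionless combinations is unchanged by the resizing.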

      We additionally added a note to the caption of Fig. 1 to clarify this point:

      “Note that the actual size of the brains for analysis are rescaled (see Methods and Fig. 3); we display all brains scaled at an equal size here for the ease of visualisation of the method.”

      Finally, we also edited the entire paper for terminology to clearly distinguish the terms of (1) the cortex as a 3D object, (2) coarse-grained versions thereof, and (3) summary morphological measures derived from the former. When we invite comparisons in our paper between real brains and coarse-grained brains, this is always at the level of summary morphological measures, not at the level of the 3D objects themselves and their detailed anatomical features.

      (2) For the comparison between 20-year-old and 80-year-old brains, a well-documented difference is that the older age group possesses more cerebrospinal fluid due to tissue atrophy, and the distances between the walls of gyri become greater. This difference is borne out in the left column of Figure 4b. It seems this additional spacing between gyri in 80-year-olds requires more extensive down-sampling (larger scale values in Figure 4a) to achieve a similar shape parameter K as for the 20-year-olds. The authors assert that K provides a more sensitive measure (associated with a large effect size) than currently used ones for distinguishing brains of young vs. old people. A more explicit, or elaborate, interpretation of the numbers produced in this manuscript, in terms of brain shape, might make this analysis more appealing to researchers in the aging field.

      We have removed the main results relating to K and aging from our last revision already to avoid confusion. This is now only in the supplementary analysis, and our claim of K being a more sensitive measure for age and ageing – whilst still true – will be presented in more detail in a series of upcoming papers.

      (3) In the Discussion, it is stated that self-similarity, operating on all length scales, should be used as a test for existing and future models of gyrification mechanisms. Given the lack of association between the abstract mathematical parameters described in this study and explicit properties of brain tissue and its constituents, it is difficult to envision how the coarse-graining operation can be used to guide development of "models of cortical gyrification."

      We have clarified in more detail what we meant originally in Discussion:

      “Finally, this dual universality is also a more stringent test for existing and future models of cortical gyrification mechanisms at relevant scales, and one that moreover is applicable to individual cortices. For example, any models that explicitly simulate a cortical surface as an output could be directly coarse-grained with our method and the morphological trajectories can be compared with those of actual human and primate cortices. The simulated cortices would only be ‘valid’ in terms of the dual universality, if it also produces the same morphological trajectories.”

      However, we agree with the reviewer that our paper could be misread as demanding direct comparisons of each coarse-grained brain with an actual brain, and we have now added the following text to clarify that this is not our intention for the proposed method or outputs.

      “Note, we do not suggest directly comparing coarse-grained brain surfaces with actual biological brain surfaces. As we noted earlier, the coarse-grained brain surfaces are an output of our algorithm alone and not to be directly/naively likened to actual brain surfaces, e.g. in terms of the location or shape of the folds. Our comparisons here between coarse-grained brains and actual brains are purely on the level of morphometrics across the whole cortex.”

      Indeed, the dual universality imposes restrictive constraints on the possible shapes of real cortices, but does not fully specify them. Presumably, the location of individual folds in different individuals and species will depend on their respective evolutionary histories, so there is no reason to expect a match in fold location between the ‘melted’ cortices of more gyrified species, on one hand, and the cortex of a less-gyrified one, on the other, even if their global morphological parameters and global mechanism of folding coincide.

      (4) There are several who advocate for analyzing cortical mid-thickness surfaces, as the pial surface over-represents gyral tips compared to the bottoms of sulci in the surface area. The authors indicate that analyses of mid-thickness representations will be taken on in future work, but this seems to be a relevant control for accepting the conclusions of this manuscript.

      In the context of some applications and methods, we agree that the mid-surface is a meaningful surface to analyse. However, in our work, the mid-surface is not. The fractal estimation rests on the assumption that the exposed area hugs the object of interest (hence the convex hull of the pial surface), as the relationship between the extrinsic and intrinsic areas across scales determines the fractal relationship (Eq. 2). If we used the mid-surface instead of the pial surface for all estimations, this would not represent the actual object of interest, and it would be separated from the convex hull. Estimating a new convex hull based on the mid-surface would be the equivalent of asking for the fractal dimension of the mid-surface, not of the cortical ribbon. In other words, it would be a different question, bound to yield a different answer.

      Hence, we indicated in our original response that we only have a provisional answer, but more work beyond the scope of this paper is required to answer this question, as it is a separate question. The mid-surface, as a morphological structure in its own right, will have its own scaling properties, and our provisional understanding is that these also yield a scaling law parallel to those of the cortical ribbon with the same or a similar fractal dimension. But more systematic work is required to investigate this question at native scale and across scales.

      Reviewer #3 (Public Review):

      Summary: Through a rigorous methodology, the authors demonstrated that within 11 different primates, the shape of the brain followed a universal scaling law with fractal properties. They enhanced the universality of this result by showing the concordance of their results with a previous study investigating 70 mammalian brains, and the discordance of their results with other folded objects that are not brains. They incidentally illustrated potential applications of this fractal property of the brain by observing a scale-dependant effect of aging on the human brain. 

      Strengths: 

      - New hierarchical way of expressing cortical shapes at different scales derived from previous report through implementation of a coarse-graining procedure 

      - Investigation of 11 primate brains and contextualisation with other mammals based on prior literature 

      - Proposition of tool to analyse cortical morphology requiring no fine tuning and computationally achievable 

      - Positioning of results in comparison to previous works reinforcing the validity of the observation. 

      - Illustration of scale-dependance of effects of brain aging in the human. 

      Weaknesses: 

      - The notion of cortical shape, while being central to the article, is not really defined, leaving some interpretation to the reader 

      - The organization of the manuscript is unconventional, leading to mixed contents in different sections (sections mixing introduction and method, methods and results, results and discussion...). As a result, the reader discovers the content of the article along the way, it is not obvious at what stages the methods are introduced, and the results are sometimes presented and argued in the same section, hindering objectivity. 

      To improve the document, I would suggest a modification and restructuring of the article such that: 1) by the end of the introduction the reader understands clearly what question is addressed and the value it holds for the community, 2) by the end of the methods the reader understands clearly all the tools that will be used to answer that question (not just the new method), 3) by the end of the results the reader holds the objective results obtained by applying these tools on the available data (without subjective interpretations and justifications), and 4) by the end of the discussion the reader understands the interpretation and contextualisation of the study, and clearly grasps the potential of the method depicted for the better understanding of brain folding mechanisms and properties. 

      We thank this reviewer again for their attention to detail and constructive comments. We have followed the detailed suggestions provided to us in the Recommendations For The Authors, and summarise the main changes here:

      - We have restructured all sections to be more clearly following Introduction, Methods, Results, and Discussion; by using subsections, we believe the structure is now more accessible to readers.

      - We have now clarified the concept of “cortical shape”, as we use it in our paper in several places, by distinguishing clearly the object of study, and the morphological properties measured from it.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): None 

      Reviewer #3 (Recommendations For The Authors): 

      I once again compliment the authors for their elegant work. I am happy with the way they covered my first feedback. My second review takes into account some comments made by other reviewers with which I agree. 

      We thank this reviewer again for their attention to detail and constructive comments.

      Recommendations for clarifications: 

      General comments: The purpose of the article could be made clearer in the introduction. When I differentiate results from discussion, I think of results as objective measures or observations, while discussion will relate to the interpretation of these results (including comparison with previous literature, in most cases). 

      We have restructured all sections to be more clearly following Introduction, Methods, Results, and Discussion; by using subsections, we believe the structure is now more accessible to readers.

      - l.39: define or discuss "cortical shape" 

      We have gone through the entire paper and corrected for any ambiguities. We specifically distinguish between the cortex as a structure overall, shape measures derived from this structure, and coarse-grained versions of the structure.

      - l.48-74: this would match either an introduction or a discussion rather than a methods section. 

      Done

      - l.98-106: this would match a discussion rather than a methods section. 

      Done

      - l.111: here could be a good spot to discuss the 4 vs 8 corners for inclusion of pial vs white matter voxelization 

      We have discussed this in the more detailed Supplementary section now, as after restructuring, this appears to be the more suitable place.

      - l.140-180: it feels that this section mixes methods, results and discussion of the results 

      We agree and we have resolved this by removing sentences and re-arranging sections.

      - l.183-217: mix of results and discussion 

      We agree and we have resolved this by removing sentences and re-arranging sections.

      Small cosmetic suggestions: 

      - l.44: conservation of 'some' quantities: vague 

      Changed to conservation of morphological relationships across evolution

      - l.66: order of citations ([24, 22,23]) 

      Will be fixed at proof stage depending on format of references.

      - l.77: delete space between citation and period 

      Done

      - l.77: I would delete 'say' 

      Done

      - l.86: 'but to also analyse' -> 'to analyse' 

      Done

      - l.105: remove 'we are encouraged that' 

      Done

      - l.111: 'also see' -> 'see also' 

      Done

      - l.164: 'remarkable': subjective 

      Done

      - l.189: define approx. abbreviation 

      Done

      - l.190: 'approx' -> 'approx.' 

      Revised

      - l.195: 'dramatic': subjective 

      removed

      - l.246: 'much' -> vague 

      explained

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Plasmacytoid dendritic cells (pDCs) represent a specialized subset of dendritic cells (DCs) known for their role in producing type I interferons (IFN-I) in response to viral infections. It was believed that pDCs originated from common DC progenitors (CDP). However, recent studies by Rodrigues et al. (Nature Immunology, 2018) and Dress et al. (Nature Immunology, 2019) have challenged this perspective, proposing that pDCs predominantly develop from lymphoid progenitors expressing IL-7R and Ly6D. A minor subset of pDCs arising from CDP has also been identified as functionally distinct, exhibiting reduced IFN-I production but a strong capability to activate T-cell responses. On the other hand, clonal lineage tracing experiments, as recently reported by Feng et al. (Immunity, 2022), have demonstrated a shared origin between pDCs and conventional DCs (cDCs), suggesting a contribution of common DC precursors to the pDC lineage.

      In this context, Araujo et al. investigated the heterogeneity of pDCs in terms of both development and function. Their findings revealed that approximately 20% of pDCs originate from lymphoid progenitors common to B cells. Using Mb1-Cre x Bcl11a floxed mice, the authors demonstrated that the development of this subset of pDCs, referred to as "B-pDCs," relied on the transcription factor BCL11a. Functionally, B-pDCs exhibited a diminished capacity to produce IFN-I in response to TLR9 agonists but secreted more IL-12 compared to conventional pDCs. Moreover, B-pDCs, either spontaneously or upon activation, exhibited increased expression of activation markers (CD80/CD86/MHC-II) and a heightened ability to activate T-cell responses in vitro compared to conventional pDCs. Finally, Araujo et al. characterized these B-pDCs at the transcriptomic level using bulk and single-cell RNA sequencing, revealing them as a unique subset of pDCs expressing certain B cell markers such as Mb1, as well as specific markers (Axl) associated with cells recently described as transitional DCs.

      Thus, in contrast to previous findings, this study posits that a small proportion of pDCs derive from B cell-committed lymphoid progenitors, and this subset of B-pDCs exhibits distinct functional characteristics, being less specialized in IFN-I production but rather in T cell activation.

      Strengths:

      Previously, the same research group delineated the significance of BCL11a as a critical transcription factor in pDC development (Ippolito et al., PNAS, 2014). This study elucidates the precise stage during hematopoiesis at which BCL11a expression becomes essential for the emergence of a distinct subset of pDCs, substantiated by robust genetic evidence in vivo. Furthermore, it underscores the shared developmental origin between pDCs and B cells, reinforcing prior research in the field that suggests a lymphoid origin of pDCs. Finally, this work attributes specific functional properties to pDCs originating from these lymphoid progenitors shared with B cells, emphasizing the early imprinting of functional heterogeneity during their development.

      Weaknesses:

      The authors delineate a subset of pDCs dependent on the BCL11a transcription factor, originating from lymphoid progenitors, and compare it to conventional pDCs, which they suggest differentiate from common DC progenitors of myeloid origin. However, this interpretation lacks support from the authors' data. Their single-cell RNA sequencing data identify cells corresponding to progenitors (Prog2), from which the majority of pDCs, termed conventional pDCs, likely originate. This progenitor cell population expresses Il7r, Siglech, and Ly6D, but not Csf1r. The authors describe this progenitor as resembling a "pro-pDC myeloid precursor," yet these cells align more closely with lymphoid (Il7r+) progenitors described by Rodrigues et al. (Nature Immunology, 2018) and Dress et al. (Nature Immunology, 2019). Furthermore, analysis of their Mb1 reporter mice reveals that only a fraction of common lymphoid progenitors (CLP) express YFP, giving rise to a fraction of YFP+ pDCs. However, this does not exclude the possibility that YFP- CLP could also give rise to pDCs. The authors could address this caveat by attempting to differentiate pDCs from both YFP+ and YFP- CLPs in vitro in the presence of FLT3L. Additionally, transfer experiments using these lymphoid progenitors could be conducted in vivo to assess their differentiation potential in competitive settings.

      Dear Reviewer 1, we appreciate your thoughtful comments. We made the decision to address the Prog2 cluster as “pro-pDC myeloid precursor” because, despite its lack of CSF1R, its CIPR similarity score indicated the highest transcriptional similarity to the population “SC.CDP.BM” (GEO accession number: GSM791114), which is shown to be Sca1- Flt3+ cKitlo.

      A similar population identified as “common dendritic cell progenitor” is shown by Onai and colleagues (Onai et al. 2013, Immunity) to be capable of differentiating into pDCs by upregulating E2-2 and subsequently downregulating M-CSFR. In addition, we were unable to infer a developmental trajectory between Prog2 and B-pDCs using SimplePPT on Monocle3 (Figure 5B). Since we know our B-pDCs are CLP derived and most likely share a B cell progenitor population, we feel this lack of connectivity to the UMAP myeloid partition corroborates our assignment of Prog2 as a myeloid pDC progenitor (not CLP derived). Of note, recent work by Medina and colleagues has shown that while IL-7Rα knockout mice exhibit a block in B cell development at the all-lymphoid progenitor (ALP) stage, PDCA-1+ pDCs identified within the initially gated BLP population persisted (PLoS One, 2013), suggesting the IL7R chain is not required for the development of PDCA1+ cells. 

      Using their Mb1-reporter mice, the authors demonstrate that YFP+ pDCs originating from lymphoid progenitors are functionally distinct from conventional pDCs, mostly in vitro, but their in vivo relevance remains unknown. It is crucial to investigate how Bcl11a conditional deficiency in Mb1-expressing cells affects the anti-viral immune response, for example, using the M-CoV infection model as described by Sulczewski et al. in Nature Immunology, 2023. In particular, the authors suggest that their B-pDCs act as antigen-presenting cells involved in T-cell activation compared to conventional pDCs. However, these findings contrast with those of Rodrigues et al., who have shown that pDCs of myeloid origin are more effective than pDCs of lymphoid origin in activating T-cell responses. The authors should discuss these discrepancies in greater detail. It is also notable that B-pDCs acquire the expression of ID2 (Figure S3A), commonly a marker of conventional/myeloid DCs. The authors could analyze in more detail the acquisition of specific myeloid features (CD11c, CX3CR1) by this B-pDC subset and discuss how the expression of ID2 may impair classical pDC features, as ID2 is a repressor of E2-2, a master regulator of pDC fate.

      Both reviewers expressed the need to further investigate how Bcl11a conditional deficiency in Mb1-expressing cells affects anti-viral responses of B-pDCs. While the functional characterization of B-pDC in the context of infection could be highly informative, it is really outside the scope of the present study. Our discovery that B-pDCs expand robustly upon TLR-9 agonist challenges in vivo and can prime T cells in vitro efficiently, however, suggests that these cells might play an important role during viral infections or anti-cancer immunity.

      Finally, through the analysis of their single-cell RNA sequencing data, the authors show that the subset of B-pDCs they identified expresses Axl, confirmed at the protein level. Given this specific expression profile, the authors suggest that B-pDCs are related to a previously described subset of transitional DCs, which were reported to share a common developmental path with pDCs (Sulczewski et al. in Nature Immunology, 2023). While intriguing, this observation requires further phenotypic and functional characterization to substantiate this claim.

      We agree with the reviewer’s comments. We are currently preparing a separate manuscript addressing the commonalities between human transitional DCs and murine non-conventional pDCs.

      Reviewer #2 (Public Review):

      Summary:

      The origin of plasmacytoid dendritic cells and their subclasses continues to be a debated field, akin to any immune cell field that is determined through the expression of surface markers (relative to clear subclass separation based on functional biology and experimentation). In this context, in this manuscript by Araujo et al., the authors attempt to demonstrate that a subtype of pDCs comes from lymphoid origin due to the presence of some B cell gene expression markers. They name these cells B-pDCs. Strikingly, pDCs function via expression of IFNa, whereas B-pDCs do not express IFNa, thereby raising the question of what their physiological or pathophysiological properties are. B-pDCs also express AXL, a marker not seen in mouse pDCs but observed in human pDCs. Overall, using a combination of gene expression profiling of immune cells isolated from mice via RNA-seq and single-cell profiling, the authors propose that B-pDCs are a novel subtype of pDCs in mice that were not previously identified and characterized.

      Weaknesses:

      My two points of discussion about this manuscript are as follows.

      (1) How new are these observations that pDCs could also originate from common lymphoid progenitors? This fact has been previously outlined by many laboratories, including Shigematsu et al., Immunity 2004. The studies in the manuscript can be considered new, based on the single-cell profiling presented, only if further characterization of the isolated B-pDCs is performed at the functional biology level. Overlapping gene expression profiles are often seen in developing immune cell types, especially when only evaluated at the RNA expression level, and can lead to cell type complexity (and identification of new cell types) that is not biologically and functionally relevant.

      Dear reviewer 2, we appreciate your thoughtful comments. We believe our single-cell RNA-seq analysis adds new information to the studies mentioned because of our broader approach to BM profiling. Because we selected cells on a single marker (PDCA1+), scRNA-seq allowed us not only to resolve several subpopulations of pDCs that, to our knowledge, had not previously been resolved in mice, but also to relate B-pDCs transcriptionally to myeloid-derived pDCs (and even other myeloid cell types), as well as to B cells.

      (2) The authors hardly perform any experiments to interrogate the function of these B-pDCs. The discussion on this topic can be enhanced. Ideally, some biological experiments would confirm that B-pDCs are important.

      Dear reviewer 2, we appreciate your thoughtful comment and agree about the need for further functional characterization of B-pDCs (please see comments directed to reviewer 1 above).

      (1) Considering that Bcl11a conditional deficiency severely impacts the B cell lineage, there is a possibility that such an effect on B cells may indirectly influence pDC development. To address this, the authors could repeat their bone marrow transfer experiments in a competitive setting by mixing both Bcl11a WT and CKO BM cells (using congenic markers to track the origin of the BM cells) and then specifically assess whether BM cells originating from Bcl11a CKO donors have impaired pDC output.

      Dear reviewer 2, while the comment above is valid (that the reduced number of mature B cells in our Bcl11a conditional knockout might indirectly impact B-pDC development), we and many others have previously shown that lack of transcriptional regulation of E2-2 and other pDC differentiation modulators by Bcl11a (including ID2 and MTG16) intrinsically and selectively disrupts the pDC lineage. At the current stage, we feel that rederiving Bcl11a cKOs and performing bone marrow transfers (which usually take several months) only to investigate indirect effects of B cells on pDC development is outside the scope of this publication.

      (2) As mentioned earlier, it is important to assess the potential of CLP, whether YFP- or YFP+, in their ability to give rise to pDCs both in vitro and in vivo. This is also crucial since the authors previously demonstrated that Bcl11a deficiency in all hematopoietic cells had a more drastic impact on pDC development than mb1-cre specific deficiency.

      We agree the manuscript could be strengthened by differentiation experiments. However, in our previous publication (mentioned above by the reviewer), we specifically show that although fewer overall LSK progenitors were detected in Vav-Cre+ F/F mice, both MDP and CDP progenitor populations persisted within the Flt3+ compartment in cKO mice at percentages similar to controls (MDP: Lin– Flt3+ Sca-1− CD115+ c-kithi; CDP: Lin– Flt3+ Sca-1− CD115+ c-kitlo). These data confirm that CLPs give rise to a substantial pool of pDC subpopulations. Other works have shown this as well, both in vivo and in vitro (Wang et al., Immunity 2004; Karsunky et al., JEM 2003, etc.). We therefore feel that confirming the previous observations that CLPs can give rise to pDCs is unnecessary, as our main goal in this manuscript was to describe a new pDC subpopulation that emerges primarily from CD79a+ B cell-biased progenitors.

      (3) The authors show a more severe impact of Bcl11a CKO on pDC depletion in the spleen than in the BM. Is this effect specific to the spleen, or can it also be observed in lymph nodes? What is the overall impact of Bcl11a conditional deficiency on pDC distribution in tissues such as the liver and lung? These questions are important to address to understand whether the heterogeneity of pDCs is differentially affected by their localization.

      We agree heterogeneity of pDCs can be affected by their microenvironment. Although phenotyping of lymph nodes in Bcl11a cKOs would greatly add to our manuscript, the genetically altered strains required are no longer being bred in our facility and resurrecting them from frozen sperm is outside the realm of this publication.

      (4) Regarding the functional study of pDCs, as emphasized previously, it is important to assess the in vivo relevance of B-pDCs in infectious settings.

      Dear reviewer 2, we appreciate your thoughtful comment. Please see our response directed to reviewer 1 above.

      (5) The authors injected CpG-ODN into mice and analyzed pDC phenotype upon activation. It is important to note that upon activation, especially upon induction of IFN-I production in vivo, mPDCA1 expression is no longer specific to pDCs (Blasius et al., Journal of Immunology, 2006). Therefore, to specifically characterize pDC phenotype upon activation, a differential gating strategy is required (CD11c, B220, Ly6C, and Siglec H) to ensure that bona fide pDCs are analyzed.

      We agree with the reviewer that this would be a more appropriate characterization. Regarding PDCA1 promiscuity in activated states, we are not aware of any cell types that express very high levels of B220 and PDCA1 simultaneously other than pDCs. We therefore firmly believe that our assignment is valid. Interestingly, gating B220+ cells of CpG-challenged mice that show intermediate expression of PDCA1 results in an increase in the frequency of CD19+ B cells, which we were careful to avoid by gating only the cells that most strongly express PDCA1.

      (6) How does pDC activation regulate their mb1 expression? Could conventional pDCs, upon activation, become B-PDCs? Could activation and induction of IFN-I production in vivo also affect CLP and increase the amount of YFP+ lymphoid progenitors and thus B-pDC output?

      Dear reviewer, we agree that this is a valid concern, although addressing it is beyond the scope of the present study. While changes in YFP MFI measured by flow cytometry upon vaccination were not substantial, we have included the following comment in the manuscript discussion, acknowledging the aforementioned possibility: “Of note, whether induction of IFN-I production in vivo could also affect CLP and increase the amount of YFP+ lymphoid progenitors and thus B-pDC output is unclear. Further research is required to answer this question.”

      (7) If pDCs are preferentially expanding upon in vivo stimulation, it would be informative to assess their Ki67 profile. This is a surprising observation since pDCs are generally considered quiescent cells that were previously described to die in response to activation and IFN-I (Swiecki et al, Journal of Experimental Medicine, 2011).

      We agree and have entered the following statement to address this concern: “Functionally, they expand more readily after TLR9 engagement than classical pDCs (either through increased proliferation or differentiation of other cell types) and excel at activating T cells in culture.”

      (8) How does the conditional deficiency of BCL11a affect the production of IFN-I and IL-12 in vivo (serum) upon CpG-ODN stimulation?

      Dear reviewer 2, we are currently unable to rederive the conditional knockout mouse strain in a timely fashion. However, our ELISA experiments performed under controlled in vitro activation conditions, along with the in vivo findings of Zhang et al. (PNAS 2017), warrant the hypothesis that B-pDCs most likely exhibit a similar cytokine-secreting profile under inflammatory conditions.

      (9) Given that B-PDCs show downregulation of pDC canonical markers, including IRF8 and TLR7, could the authors address how B-PDCs respond to TLR7 stimulation in vitro and assess a broader spectrum of cytokines produced by pDCs in response to such stimulation (IL-6, TNFa, CXCL10...)?

      Dear reviewer 2, although expanding our findings to include B-pDC responses to TLR-7 stimulation would greatly enhance our manuscript, a technical obstacle stands in our way. As mentioned previously, sorting B-pDCs for new experiments using reporter YFP mice is currently not possible, as we have retired this mouse strain. Sorting of live CD79a+ B-pDCs via FACS is also not feasible, as CD79a staining with most antibody clones requires permeabilization of cells for easier access to the intra-membrane portion of CD79a.

      (10) It would be informative to compare scRNA sequencing data between control and Bcl11a CKO mice to ascertain their contribution to B-PDCs and whether this deficiency may affect other pDC clusters and/or progenitors.

      We are unable to sort B-pDCs for new experiments, as we unfortunately retired the transgenic colony.

      (11) Transitional DCs were reported to give rise to a subset of cDC2. Given that the authors claim that B-PDCs are related to this subset of transitional DCs, could the authors observe any YFP staining in cDC2 upon the generation of their BM chimeras?

      We saw no YFP positivity in CD11c hi cells (cDCs) via flow or through scRNA-seq, indicating CD79a expression is unique in mature B cells and B-pDCs.

      (12) Most of the statistical analysis is done with a Student's t-test. This requires a normal distribution of the sample, which is highly unlikely given the size of the sample. Therefore, the authors should rather use a non-parametric test (Mann-Whitney) to compare their samples.

      We agree and have redone our statistical analyses using a non-parametric test (Mann-Whitney).
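
      For readers unfamiliar with the test named above, the following is a minimal sketch of a two-sided Mann-Whitney comparison for small samples, using an exact permutation p-value. It is an illustration only and not the authors' analysis code: the function names (`mann_whitney_u`, `exact_two_sided_p`) and the example values are hypothetical.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for sample x: count of pairs (a, b) with a > b; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact two-sided p-value by enumerating every relabelling of the pooled data.

    Only practical for small samples, since the number of relabellings
    grows combinatorially with sample size.
    """
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    mu = n1 * n2 / 2  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - mu)
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, n1):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mann_whitney_u(group_x, group_y) - mu) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical measurements (arbitrary units), four animals per group.
control = [1.0, 2.0, 3.0, 4.0]
knockout = [5.0, 6.0, 7.0, 8.0]
u_stat = mann_whitney_u(control, knockout)
p_value = exact_two_sided_p(control, knockout)
```

      With four values per group and complete separation, only 2 of the 70 possible relabellings are as extreme as the observed one, so the smallest attainable two-sided p-value is 2/70. This is one reason group size matters when a rank-based test replaces a t-test.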

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      (1)  In the subsets of the γδ T cells that exhibit reduced BLK expression in B6.SAP KO mice, have the authors examined the expression of Lck and/or Fyn?

      The reviewer raises an excellent point. We have included in the revised manuscript additional data on Lck and Fyn expression in our scRNAseq dataset (new Suppl. Fig. 1 and new Suppl. Fig. 4). These data revealed that, in contrast to Blk, which appears primarily restricted to the γδT17 clusters, Lck and Fyn exhibit a much broader distribution and lack restriction to specific clusters. We did note that, like Blk, Lck and Fyn transcripts were abundant in SAP-dependent C2 cluster cells. Pseudobulk analysis on the immature clusters revealed that neither Fyn nor Lck expression level differences reached our cut-off of 0.5 log2 FC (log2 FC Blk = 1.06), leading us to conclude that Blk is particularly dependent on SAP. We did note, however, that the magnitude of Lck differential expression was close to the 0.5 log2 FC cut-off and that its expression was increased in B6.SAP-/- γδ T cells (Suppl. Fig. 4). These results have been added to lines 202-212 in the Results section and lines 491-499 in the Discussion section.
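
      As a brief illustration of how a log2 fold-change cut-off of this kind is applied, the sketch below compares two mean expression values against a 0.5 threshold. It is not the authors' pseudobulk pipeline: the function names, the pseudocount, and the input values are assumptions made for the example, and only the 0.5 cut-off and the reported Blk value of 1.06 come from the response above.

```python
import math


def log2_fold_change(mean_a, mean_b, pseudocount=1.0):
    """log2 ratio of two mean expression values.

    The pseudocount (an assumption here) avoids division by zero
    for genes with no detected expression in one group.
    """
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))


def passes_cutoff(lfc, cutoff=0.5):
    """True if the absolute log2 fold change reaches the cut-off."""
    return abs(lfc) >= cutoff


# Hypothetical mean expression values for one gene in two genotypes.
example_lfc = log2_fold_change(3.0, 1.0)  # log2(4 / 2)

# The Blk value reported in the response, checked against the same cut-off.
blk_passes = passes_cutoff(1.06)
```

      On a log2 scale, a cut-off of 0.5 corresponds to roughly a 1.4-fold difference in either direction, so the reported Blk value of 1.06 represents slightly more than a two-fold change.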

      (2)  Does BLK directly associate with SLAMF1 and/or SLAMF6 receptors?

      The reviewer raises an interesting question given previous reports that BLK, LCK, and FYN have all been implicated in γδ T cell development. While SAP has a well-known ability to recruit FYN to SLAMF1 and there is evidence of a similar SAP-mediated recruitment of LCK to SLAMF6, we are not aware of any evidence of a SAP-BLK interaction or of a direct binding of BLK to SLAM family receptors. Future experiments to investigate this possibility are certainly warranted. In the revised manuscript, we have included additional discussion of these possibilities (lines 491-499).

      (3)  Given the emerging role of γδ T cells in host immunity, it would be useful if the authors could add a discussion of how their findings are relevant in disease conditions such as cancer. 

      We agree and have included new text in the Introduction (lines 37-45). 

      (4)  Delete repeated words in lines 546 and line 553. 

      Thank you—this has been corrected in the revised manuscript.

      Reviewer #2:

      This is a very complete study and requires no additional experimentation. One thing to keep in mind in assessing the ultimate fate of the "αβ wannabe cells" is that mechanisms exist to silence the γδ TCR as cells differentiate to the DP stage, and so their presence as diverted DP cells may not be evident by staining for γδ TCR expression and will only be evident transcriptomically.

      We appreciate this helpful comment from the reviewer which we will take into consideration in our future experimental design.

      There are a couple of minor points to raise: 

      (1)  Figure 3C is not called out in the text. 

      Thank you—this has been corrected in the revised manuscript.

      (2)  Line 546 - "dependent" is repeated.

      Thank you—this has been corrected in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to reviewers (minor points):

      We thank all reviewers for their very helpful suggestions and greatly appreciate their positive evaluation of our work.

      Reviewer #1:

      Ad 1) The reviewer states: Fig 5 While the data very nicely show that CPX and Syt1 have interdependent interactions in chromaffin cells, this seems not to be the case in neurons, where the loss of complexins and synaptotagmins have additive effects, suggesting independent mechanisms (e.g. Xue et al., 2010). This would be a good opportunity to discuss some possible differences between secretion in endocrine cells vs neurons.

      We greatly appreciate the insightful suggestion by the reviewer. To accommodate it, we now discuss this issue on page 21, lines 486-491: “In murine hippocampal neurons, loss of CpxI and Syt1 has additive effects on fast synchronous release, suggesting independent mechanisms (Xue et al., 2010). On the other hand, the same study also showed that Syt1 heterozygosity fails to reduce release probability in wild-type neurons, but does so in the absence of Cpx, again suggesting that Cpx and Syt1 may functionally interact in Ca2+-triggered release.”

      Ad 2) The reviewer states: Fig 8 shows an apparent shift in Ca sensitivity in N-terminal mutants, suggesting a modification of the Ca sensitivity of Syt1. Could there also be an alternative mechanism that explains this phenotype, based on a role of the N-terminus in lowering the energy barrier for fusion, which in turn shifts the corresponding fusion rates to take place at lower Ca saturation levels?

      We fully agree with the reviewer. While our data indicate that Cpx and Syt1 act in a dependent manner in accelerating exocytosis, they do not provide decisive evidence that the NTD of CpxII directly modulates the Ca2+ affinity of Syt1, an issue that we discuss on page 23, lines 523-529: “The results favor a model wherein the CpxII NTD either directly regulates the biophysical properties of the Ca2+-sensor by increasing the apparent forward rate of Ca2+-binding or indirectly affects SytI-SNARE or SytI-membrane interactions, thereby lowering the energy barrier of Ca2+-triggered fusion.”

      Reviewer #2:

      Ad 1) The reviewer states: The authors provide a "chromaffin cell-centric" view of the function of mammalian Cplx in vesicle fusion. With the exception of mammalian renal ribbon synapses (and some earlier RNAi knockdown studies that had off-target effects), there is very little evidence for a "fusion-clamp"-like function of Cplxs in mammalian synapses. At conventional mammalian synapses, genetic loss of Cplx (i.e. KO) consistently decreases AP-evoked release, and generally either also decreases spontaneous release rates or does not affect spontaneous release, which is inconsistent with a "fusion-clamp" theory. This is in stark contrast to invertebrate (D. m. and C. e.) synapses where genetic Cplx loss is generally associated with strong upregulation of spontaneous release, providing support for Cplx acting as a "fusion-clamp".

      We agree with the reviewer that it is difficult to reconcile contradictory findings regarding the role of Cpx in membrane fusion in vertebrates and invertebrates or between murine hippocampal neurons and neuroendocrine cells. On the other hand, we respectfully disagree with the statement of providing a "chromaffin cell-centric" view of the function of mammalian Cplx in vesicle fusion. In fact, a large number of model systems (in vitro and in vivo studies) support a scenario where complexin takes center stage in clamping of premature vesicle release. For example, in vitro analyses using a liposome fusion assay (Schaub et al., 2006, Nat Struct Mol Biol 13, 748; Schupp et al., 2016) or HeLa cells that ectopically express “flipped” SNAREs on their cell surface (Giraudo et al., 2008, JBC 283, 21211) showed that complexin can inhibit the SNARE-driven fusion machinery. Likewise, several studies boosting complexin action by either genetic overexpression or peptide supplementation have provided evidence for the complexin clamp function in neuronal and nonneuronal cells (e.g. Itakura et al., 1999, BBRC 265, 691; Liu et al., 2007, Biochemistry 72, 439; Abderrahmani et al., 2004, J Cell Sci 117, 2239; Archer et al., 2002, JBC 277, 18249; Tang et al., 2006, Cell 126, 1175; Vaithianathan et al., 2013, J Neurosci 33, 8216; Roggero et al., 2007, JBC 282, 26335).

      In addition, chromaffin cells enable the investigation of secretion on the background of a well-defined intracellular calcium concentration. Indeed, CplxII knock-out in chromaffin cells demonstrated an enhanced tonic release which is evident at elevated levels of [Ca]i (>100nM), but absent at low resting [Ca]i (Dhara et al., 2014). Given this observation, it is tempting to speculate that variations in [Ca]i among the different preparations may contribute to the deviating expression of the complexin null phenotype in different preparations.

      Ad 2) The reviewer states: The authors use a Semliki Forest virus-based approach to express mutant proteins in chromaffin cells. This strategy leads to a strong protein overexpression (~7-8 fold, Figure 3 Suppl. 1). Therefore, experimental findings under these conditions may not necessarily be identical to findings with normal protein expression levels.

      As shown in Fig. 4, we use the secretion response of wt cells as a control so that we can assess the specificity and quality of the rescue approach in our experiments. In addition, the comparative analysis of the CpxII mutants was performed with respect to the equally overexpressed CpxII wt protein (Fig. 3 Suppl. 1), which we used as a control to determine the standard response under these conditions.

      Ad 3) The reviewer states: Measurements of delta Cm in response to Ca2+ uncaging by ramping [Ca2+] from resting levels up to several µM over a time period of several seconds were used to establish changes in the release rate vs [Ca2+]i relationship. It is not clear to this reviewer if and how concurrently occurring vesicle endocytosis together with a possibly Ca2+-dependent kinetics of endocytosis may affect these measurements.

      By infusing bovine chromaffin cells with 50 µM free Ca2+, Smith and Betz have shown that the total capacitance increase is dominated by exocytosis and that significant endocytosis only sets in after 3 minutes (Smith and Betz, 1996, Nature, 380, 531). Along the same lines, we previously showed that mouse chromaffin cells (infused with 19 µM free calcium over 2 minutes) responded with a robust increase in membrane capacitance, which strongly correlated with the number of simultaneously recorded amperometric events monitoring fusion of single vesicles (Dhara et al., 2014, Fig. 5B). Thus, capacitance alterations recorded under tonic intracellular Ca2+ increase in chromaffin cells are solely due to exocytosis and are not contaminated by significant endocytosis. As our Ca2+ ramp experiments were carried out for 6 seconds and the intracellular free [Ca]i did not exceed 19 µM, the observed phenotypical differences between the experimental groups are most likely due to changes in exocytosis rather than endocytosis.

      Ad 4) The reviewer states: It should be pointed out that an altered "apparent Ca2+ affinity" or "apparent Ca2+ binding rate" does not necessarily reflect changes at Ca2+-binding sites (e.g. Syt1).

      We fully agree with the reviewer’s comment. As pointed out also in the response to reviewer 1, our experiments do not provide decisive evidence that the NTD of CpxII directly modulates the Ca2+ affinity of Syt1, an issue that we discuss on page 23, lines 523-529: “The results favor a model wherein the CpxII NTD either directly regulates the biophysical properties of the Ca2+-sensor by increasing the apparent forward rate of Ca2+-binding or indirectly affects SytI-SNARE or SytI-membrane interactions, thereby lowering the energy barrier of Ca2+-triggered fusion.”

      Ad 5) There are alternative models on how Cplx may "clamp" vesicle fusion (see Bera et al. 2022, eLife) or how Cplx may achieve its regulation of transmitter release without mechanistically "clamping" fusion (Neher 2010, Neuron). Since the data presented here cannot rule out such alternative models (in this reviewer's opinion), the authors may want to mention and briefly discuss such alternative models.

      The study by Bera et al. reiterates the model proposed by the Rothman group, which attributes the clamping function of Cpx to its accessory alpha helix hindering progressive SNARE complex assembly. We explicitly stated this issue in the original version of the manuscript (page 19, line 425): “As the accessory helix of Cpx has been found to bind to membrane proximal cytoplasmic regions of SNAP-25 and SybII (Malsam et al., 2012; Bykhovskaia et al., 2013; Vasin et al., 2016), an attractive scenario could be that both domains of CpxII, the CTD and the accessory helix, synergistically cooperate to stall final SNARE assembly”. In this context, we now also cite the study by Bera et al.

      A related view of the function of complexin suggested that it may act as an allosteric adaptor for SytI (Neher 2010, Neuron). Here, rather than postulating independent "clamp" and "trigger" functions for the dual action of complexin, these were explained as facets of a simple allosteric mechanism by which complexin modulates the Ca2+ dependence of release. Yet, this interpretation appears to be difficult to reconcile with the observations of our and other laboratories, showing that the fusion-promoting and clamping effects are separable (e.g. Dhara et al., 2014; Lai et al., 2014; Makke et al., 2018; Bera et al., 2022).

      Some parts of the Discussion are quite general and not specifically related to the results of the present study. The authors may want to consider shortening those parts.

      Considering the contrary findings in the field of SNARE-regulating proteins, the authors hope that the reviewer will agree that it is necessary to discuss the new observations in a broader context, as also acknowledged by the first reviewer.

      Last but not least, the presentation of the results could be improved to make the data more accessible to non-specialists, this concerns providing necessary background information, choice of colors, and labeling of diagrams.

      Done

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors): 

      Regarding figures: 

      (1) Please use clearly distinct colors in diagrams. For example, in Figure 2 Suppl. 3, four different shades of red (or reddish) are used to color the traces and the respective bars. These different shades of red are difficult to discriminate. In Figure 5 Suppl. 1, the two greens are nearly indistinguishable.  

      Done

      (2) RRP size and SRP size on the one hand, and SR rate on the other represent different quantities which are measured in different units. Please use a separate y-axis for the SR (a rate measured in fF/s) and do not combine with RRP and SRP (pool sizes measured in fF). This would also automatically alleviate the need for axis breaks in the plots of RRP size and SRP size. In general, please do not use axis breaks which make interpretation of data unnecessarily more complicated.  

      In order to clarify the display, we now define the different units together with the quantified parameter (e.g. RRP [fF], SRP [fF], SR [fF/s]) allowing us to omit a second axis in those subpanels.

      (3) When plotting bar graphs showing mean tau_RRP, mean tau_SRP, and mean delay, please always use the correct y-axis labels, i.e. use "tau_RRP", "tau_SRP" and "delay" as y-axis labels as it was done for example in Figure 4D, and do not use "tau_RRP", "tau_SRP" and "delay" as x-axis labels as it was done for example in Figure 1D and many other figure panels.  

      We have standardized the figure display. Yet, we would prefer to keep our way of labelling subpanels, which states the parameter underneath the bar graph and thereby makes the results more accessible.

      (4) Are the asterisks indicating statistical significance perhaps missing in Figure 4D, middle panel (tau_SRP)?

      There was no statistically significant difference (wt vs cpxIIko+CpxII EA, P=0.0826, Kruskal-Wallis with Dunn’s post hoc test).

      (5) According to the Results section (pages 12 to 13), I assume that in Figures 6 and 7 the labels "+Cplx XYZ" are used by the authors to identify an overexpression of Cplx XYZ in a Cplx WT background. The legend text reads however " ... cells expressing either Cplx2 wt or the mutant ...", which would not be correct. Please check.

      We have changed the formulations to “overexpression” accordingly.

      (6) The x-axis unit in Figure 8C is likely "µM" and not "M".

      Done.

      (7) The abbreviations "CplxII LL-EE" and "CplxII LL-WW", and "CplxII LLEE" and "CplxII LLWW" are very similar but refer to different mutants. Could you please think of a more specific and unambiguous abbreviation? Perhaps "CplxII L124E-L128E"?  

      We have changed the abbreviations, accordingly (i.e. CpxII L124E-L128E).  

      Regarding the manuscript text:  

      Line 65: "prevents" instead of "impairs"? 

      done

      Line 67: why "in vivo"? 

      We changed the formulation to ‘Several’

      Line 83: "in addition to the clamping function ..." This is misleading. Many of the studies listed here did not provide evidence for enhanced spontaneous release following Cplx loss and often observed the opposite, reduced spontaneous release. The enhanced delayed release was observed by Strenzke et al 2009 J.Neurosci. and by Chang et al. 2015 J.Neurosci. (which the authors may want to cite). However, that enhanced delayed release occurred despite reduced spontaneous release indicating that it is not simply the result of a missing "fusion clamp". 

      To accommodate the reviewer’s suggestion, we have changed the formulation to “Independent of the clamping function of Cpx….”

      Line 104: "speeds up exocytosis that is controlled by the forward rate of Ca2+ binding" This is difficult to understand without context.  

      We have now added the corresponding citations (Voets et al., 2001; Sorensen et al., 2003), which showed that exocytosis timing in chromaffin cells is largely determined by the kinetics of Ca2+-binding to SytI.

      Line 116: "Cplx2 knock out ..." Please provide (here or earlier in the manuscript) information to the reader about which Cplx paralogs are expressed in chromaffin cells.  

      We now state on line 111 that “CpxII is the only Cpx isoform expressed in chromaffin cells (Cai et al., 2008)”

      Line 118: "=~" either "=" or "~". 

      done

      Line 120: "instead" seems superfluous.

      done

      Line 272: "calcium binding rates" should perhaps better read "apparent calcium binding rates". 

      done

      Line 290: "enhancing SytI's Ca2+ affinity" should perhaps better be "enhancing the apparent Ca2+ affinity of the release machinery". Ca2+ binding kinetics is never directly assayed here.

      We agree and have phrased the sentence accordingly.

      Line 300: "Expression of Cplx ... in Syt1 R233Q ki cells, ..." Perhaps better "Overexpression of Cplx ... in Syt1 R233Q ki/Cplx2 wt cells, ..." for clarification?

      done

      Lines 313ff: What is assayed here is the apparent Ca2+ binding kinetics and apparent KD values of the release machinery. Ca2+ binding to Syt1 is never directly measured!  

      We agree and have changed the wording accordingly to “CpxII NTD supports the forward rate of calcium binding to SytI in accelerating exocytosis”

      Line 347: "Complexin plays a dual role ..." This is partially misleading. It does so in chromaffin cells and D.m. and C.e. NMJs but not at conventional mammalian synapses. 

      We agree and have changed the formulation to “In many secretory systems, Complexin plays a dual role in the regulation of SNARE-mediated vesicle fusion”

    1. Author response:

      We thank the reviewers for their constructive comments, which will help us clarify and strengthen the paper. We will be happy to address all the comments and adjust the text accordingly. Regarding the suggestion in the assessment to include a “more thorough comparison with human behavior”, we believe this reflects one reviewer’s comment about comparing with order effects (primacy and recency); we did not see any other comments to this effect (our existing simulations do make contact with other human behavior regarding error distributions, including probability of recall, precision, sensitivity to reinforcement history, and dopamine manipulation effects on human WM). We thank the reviewers for this comment and will conduct the appropriate simulations and analyses to compare with sequential effects in working memory.

    1. Author response:

      Reviewer #1 (Recommendations For The Authors): 

      This paper represents a huge amount of work on a condition whose patients' health and well-being have not always been prioritized, and only relatively recently has the immune dysregulation seen in patients with Down Syndrome (DS) been garnering major research interest. 

      This paper provides an unparalleled examination of immune disorder in patients with DS. In a truly herculean effort, the authors provided the cumulative examination of over 440 patients with DS, confirmed the alterations in immune cell subsets (n=292, 96 controls) and multi-organ autoimmunity seen in these patients as they age, and identified autoantibody production that could contribute to conditions co-occurring in patients with DS. They also sought to look at whether the early immunosenescence seen in DS was due to the inflammatory profile by comparing age-associated markers in DS patients and euploid controls separately, finding that several markers are regulated with age regardless of group, while comparing the effect of age versus DS status on cytokine status identified inflammatory markers elevated in DS patients across the lifespan that do not increase with age or that increase with age only in the DS cohort. This is very interesting in the context of DS in particular, and immunity during aging in general. 

      The second part of the manuscript presents the results from a clinical trial with the JAK inhibitor tofacitinib in DS patients. While the number of DS patients treated with tofacitinib was small, the results were often quite striking. Treatment was well-tolerated and the improvement of dermatological conditions was clear. The less responsive patients AA4 and AA2 provide a very clear illustration that these patients are sensitive to immune triggers during treatment. Additionally, the demonstration that patients' IFN scores and cytokine levels decreased without clear immunosuppression with tofacitinib treatment is encouraging, since treatment with this drug would need to be continuous. I would be curious to see if the patients added past the cutoff for interim analysis follow a similar trajectory. I would not ask the authors to add any data; the paper is well-written and logically constructed. 

      I only have a small comment: I really did not like how Figure 2 a, d, and g tethered the coloring to the magnitude of fold change to show the effect of DS particularly for 2a and 2g. Given that these fold changes are quite modest, the coloring is very light and hard to distinguish. The clear takeaway is that the effect on T cells is greatest, but there must be a better way to illustrate this. Perhaps displaying this graph on a non-white background could help with contrast. 

      We are grateful for the Reviewer’s very positive assessment of the manuscript and constructive feedback. We want to assure the Reviewer that similar analyses will be completed in the future for the entire cohort recruited into the trial to determine whether similar trajectories and results are observed with the larger sample size. Additionally, following the Reviewer’s guidance, we will explore alternative ways to present the data in Figure 2 for greater clarity in a revised version of the manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      • Although the focus of the patients in the first part of the paper is on autoimmune/inflammatory conditions, it will be useful to also list the non-autoimmune infectious manifestations for reference with prevalence data. For example, otitis media, or lung infections (mentioned within the paper), or mucosal candidiasis. Same for other manifestations such as cardiac or malignant conditions. Given the impressive number of patients, it will be useful to the readers to have prevalence data for these as well, even in brief statements within the results. 

      We appreciate this inquiry by the Reviewer and will present additional data on the co-occurring conditions mentioned by the Reviewer in a revised version of the manuscript.

      • Have the authors looked at DN T cells and whether they may be enriched in DS patients, given their enrichment in some autoimmune conditions? 

      Thanks for this inquiry. We did examine DN T cells (double negative T cells), which we referred to in our Figure 2 and Figure 2 – figure supplement 1 as non-CD4+ CD8+ T cells. Although this T cell subset is mildly elevated (in terms of frequency among T cells) in individuals with Down syndrome, the result did not reach statistical significance after multiple hypothesis correction. This negative result is shown in the heatmap in Figure 2 – figure supplement 1d.

      • It would be useful to move the segment of the discussion that discusses the interim predefined analysis of the phase 2 trial to the corresponding segment of the results. As this reviewer was reading the paper, it was unclear why the interim analysis was done, whether it was predefined and it was not until the discussion that it became apparent. I believe it will help the readers to have a brief mention that this interim analysis was predefined and set to occur at the first 10 DS enrollees. Also, it would be helpful to state what is the total number of DS patients planned for enrollment in the Phase 2 trial which is continuing recruitment. 

      We appreciate this comment and will modify the text following the Reviewer’s guidance in the revised manuscript. The trial will be considered complete once a total of 40 participants have undergone 16 weeks of treatment with good medication compliance (less than 15% missed doses).

      • Although the authors present data on TPO autoantibodies before and after tofacitinib, it remains unclear whether the other non-TPO autoantibodies were altered during treatment or whether this was a TPO autoantibody-specific phenomenon. Was there an alteration in mature B cells or plasmablast populations after tofacitinib? If these data are available, they would further enhance the manuscript. If they are not available, it would be useful for the authors to discuss those in the discussion of the manuscript. 

      We are grateful for this comment, which strongly aligns with our future research interests and plans for the analysis of the full cohort once the trial is completed. In the interim analysis, we analyzed only autoantibodies related to autoimmune thyroid disease and celiac disease, as shown in the manuscript. However, we plan to complete a more comprehensive analysis of the effects of JAK inhibition on autoantibody production once the full sample set is available at the end of the trial. Likewise, the clinical trial protocol contemplates collection and processing of blood samples for immune mapping using mass cytometry, which will enable us to answer the Reviewer’s question about potential changes in B cell or plasmablast populations. Following the Reviewer’s guidance, we will discuss these planned analyses in the Discussion of the revised manuscript.

      Reviewer #3 (Recommendations For The Authors): 

      (1) Cellular immune phenotyping data in Figure 2 presents a large number of patients with DS versus euploid controls (292 and 96 respectively). Given the relatively large cohort there would seem to be an opportunity to determine whether age or sex alters the immune phenotype shown, for example, TEMRAs, etc. Was the data analyzed in this way? 

      We welcome this comment, which clearly aligns with our research interests and planned additional analyses of these datasets generated by the Human Trisome Project. We can share with the Reviewer that although sex as a biological variable has minimal impacts on the strong immune dysregulation observed in Down syndrome, there are clear age-dependent effects, with some immune changes occurring early during childhood versus others taking place later in adult life. A manuscript describing a complete analysis of age-dependent effects on the multi-omics datasets in the Human Trisome Project is currently under preparation.

      (2) The authors should strongly consider incorporating/discussing the findings from Gansa et al, Journal of Clinical Immunology May 2024 - where they reviewed the immune phenotype of 1299 patients with Down syndrome. 

      Thanks for this suggestion, we will surely cite and discuss this recent paper in the revised manuscript.

      (3) It is difficult to differentiate patients Hs2 and Ps1 in Figure 5d. 

      Thanks for this observation, we will modify the labels for greater clarity in the revised manuscript.

      (4) Given their finding of no correlation between cytokine levels/immune phenotype and autoimmunity, some additional discussion of the relevance of hypercytokinemia in the pathogenesis of autoimmunity would seem relevant (given that this was the basis for the clinical trial). The authors mention that cytokine levels may not be appropriate measures of disease in the patients. 

      We welcome this opportunity to expand the discussion of the relevance of hypercytokinemia in the pathogenesis of autoimmunity and will do so in the revised manuscript.

      (5) Data availability statement: appropriate.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors): 

      The authors should perform experiments to answer this question: does Cav3 transcription increase in the G369i-KI, or is there instead some post-transcriptional modulation that permits surface expression of functional Cav3-containing channels in the absence of typical HVA Ca conductances? Also, the authors should determine whether G369i-KI can mediate Ca2+ release from intracellular stores and whether release from stores is upregulated as Cav3-containing channel expression (or function) is increased. 

      We performed transcriptomic (drop-seq) analysis to test whether a Cav3 subtype is upregulated in cones of G369i KI mice. These experiments show that, consistent with previous studies (PMID 35803735, 26000488), Cacna1h appears to be the primary Cav3 subtype expressed in mouse cones. However, as shown in new Supp.Fig.S3, there was no significant difference in the levels of Cacna1h transcripts in WT and G369i KI cones. Therefore, we propose that there may be some post-transcriptional modification, or alteration in a pathway that regulates channel availability, that enables the contribution of Cav3 channels to the whole-cell Ca2+ current in the absence of functional Cav1.4 channels in cones.

      We also performed Ca2+ imaging experiments in WT vs G369i KI cone terminals to assess whether the diminutive Cav3 current in G369i KI cone terminals may be compensated by upregulation of a Ca2+ signal such as from intracellular stores. Arguing against this possibility, depolarization-evoked Ca2+ signals in G369i KI cones were dramatically reduced compared to WT cones (new Fig.9). 

      Reviewer #2 (Recommendations For The Authors): 

      Major points- 

      (1) It is stated in too many places that cone features in the Cav1.4 knock-in are "intact", preserved, or spared, but this representation is not accurate. There are two instances in this study that qualify as intact when comparing KI to WT: 1) the photopic a-waves in the Cav1.4 knock-in (also demonstrated in Maddox et al 2020) and 2) latency to the platform (current MS, Figure 7f). However, in the numerous instances listed below, the authors compared the Cav1.4 knock-in to the Cav1.4 knock-out, and then referred to the KI as exhibiting intact responses. The reference point for intactness needs to be wildtype, as appropriately done for Figures 2 and 3, and when comparing the KI to the KO the phrasing should be altered; for example: "the KI was spared from the extensive degeneration witnessed in the KO....". 

      In most cases, we clearly note that there are key differences in the WT and the G369i KI cone synapses, which highlight the importance of Cav1.4-specific Ca2+ signals for certain aspects of the cone synapse. We disagree with the reviewer on the point that we did not often use the WT as a reference since most of our experiments involved comparisons of only WT and G369i KI (Figs. 3-6) or WT, G369i KI, and Cav1.4 KO (Figs.1,7—and in these cases comparisons specifically between WT and G369i KI mice were included). We used “intact” as a descriptor for G369i KI cone synapses since these are actually present, albeit abnormal in the G369i KI retina, whereas cone synapses are completely absent in the Cav1.4 KO retina. To avoid confusion, we modified our use of “intact” and “preserved” where appropriate.

      A. Abstract, line 34 to 35: ".......preserved in KI but not in KO.". 

      Abstract was rewritten and this line was removed.

      B. Line 36: "....synaptogenesis remains intact". The MS documents many differences in the morphology of KI and WT cones (immunofluorescence and electron microscopy data), which is counter to an intact phenotype. 

      The sentence was: “In CSNB2, we propose that Cav3 channels maintain cone synaptic output provided that the Ca2+-independent role of Cav1.4 in cone synaptogenesis remains intact.”

      Here the meaning of “intact” refers to the Ca2+ -independent role of Cav1.4, not synapses. Thus, we have left the sentence unchanged.

      C. This strikes the right balance, lines 67 to 68: "....although greatly impaired.....". 

      D. Line 149, "Cone signaling to a postsynaptic partner is intact in G369i KI mice". This description is inaccurate. Here there is only WT and KI, and the text reads as follows in line 162: "terminals (Figure 6b). The ON and OFF components of EPSCs in G369i KI HCs were measurable, although lower in amplitude than in WT (Figure 6a,b)." Neither "measurable" nor "lower in amplitude" meet the definition of "intact", and actual numerical values are lacking in the text. 

      We have added results showing that there are no light responses in the Cav1.4 KO horizontal cells and have modified the sentence to: “Cone synaptic responses are present in horizontal cells of G369i KI but not Cav1.4 KO mice”. 

      We have modified discussion of these results as (line 210-213): “Consistent with the lack of mature ribbons and abnormal cone pedicles (Fig.1), HC light responses were negligible in Cav1.4 KO mice (Fig.8a,b). In contrast, the ON and OFF responses were present in G369i KI HCs although significantly lower in amplitude than in WT HCs (Fig. 8a,b).”

      E. Please add a legend to Figure 6a to indicate the intensities. The shape of the KI responses is different from the control which is worthy of discussion: i) there is no clear cessation of HC EPSCs in the KI during the light ON period (when release stops, Im fluctuations should be minimal), and ii) the "peaked" appearances of the initial 500ms of the On and Off periods are very similar in shape for the KI (hard to interpret in the same fashion as a control response). How were the On and Off amplitudes analyzed? Furthermore, the OFF current is not summarized in Figure 6D, but should not this be when Cav3 should be opening and triggering release: Off response-EPSC? Lastly, Figure 6b,d shows a ~70% reduction in On-current in the KI, and the KI example of 6b an 80% reduction in Off current compared to WT. Yet, the only place asterisks are used to indicate sig diff is the DNQX data within each genotype in Fig 6d. These data cannot be described as showing "intact" KI responses, and the absence of numerical and statistical values needs to be addressed. 

      New Fig.8a depicting the horizontal cell light responses has been modified to include the legend indicating light intensities. The ON and OFF amplitudes were analyzed as the peak current amplitudes. This information has been added to the legend.

      The reviewer is correct in that the OFF response represents the EPSC whereas the ON response represents the decrease in the EPSC with light. To avoid confusion, we changed the y axis label for the averaged data to read ON or OFF “response” rather than “current” in new Fig.8b.

      As the reviewer suggests, the more transient nature of the KI response during the light ON period could result from aberrant continuation of vesicular release during the light-induced hyperpolarization of cones in the KI mice, in contrast to the prolonged suppression of release by light which is evident in the WT responses. We speculated on this difference as follows (lines 237-241):

      “In addition to its smaller amplitude, the transient nature of the ON response in G369i KI HCs suggested inadequate cessation of cone glutamate release by light (Fig.8b). Slow deactivation of Cav3 channels and/or their activation at negative voltages20 could give rise to Ca2+ signals that support release following light-induced hyperpolarization of G369i KI cones.”

      We added asterisks to new Fig.8b,d indicating statistical differences and a description of the tests to the legend.

      F. line 168 the section titled "Light responses of bipolar cells and visual behavior is spared in G369i KI but not Cav1.4 KO mice". 

      Changed to: “Light responses of bipolar cells and visual behavior are present in G369i KI but not Cav1.4 KO mice”

      Last sentence of ERG results, 189-190: "These results suggest that cone-to-CBC signaling is intact in G369i KI mice.". "Spared and intact" are not accurate descriptions. The ERG data presented here show massive differences between WT and the KI, except in the instance of a-waves.

      This sentence was removed.

      As for Figure 6, the results text related to Figure 7a-d does not present real numbers for ERG responses, and there is no indication of significant differences there or in the Figure panels. For instance, in Figure 7b, b-waves in the KI are comparable to KO, except at the two highest-intensity flashes that show KI responses ~20% the amplitude of WT. Presentation of KI and KO data on a scale expanded 6- to 10-fold relative to WT can be misleading: a quick read of these Figure panels might make one incorrectly conclude that the KI is intact while the KO is impaired when compared to WT. The Methods section needs more details on the ERG analysis (e.g. any filtering out of oscillatory potentials when measuring b-wave, and what was the allowable range of time-to-peak for b-wave amplitude, etc.).

      The vertical scaling of the ERG results in new Fig.10c,d has been changed to clearly reflect the diminished responses of the KO and KI vs the WT. Further details regarding the ERG analysis were added to the Methods section.

      G. Can you point to other studies that have used the "visible platform swim test" used in Figure 7e, f, and specify further how mice were dark/light adapted prior to the recordings? 

      As referenced in the Methods, original line 674, the methods we used for the swim test were described in our previous study (PMID 29875267). Other studies that have used this assay include PMIDs: 28262416, 26402607.

      (2) The Maddox et al 2020 study does not safely address whether rods have a residual T-type Ca2+ current in the Cav 1.4 KO or KI. The study showed that membrane currents measured from rods in the KI and KO retina were distinct from WT, supporting their claim that L-type Ca2+ current is absent in the KI and KO. However, the recordings had shortcomings that challenge the analysis of Ca2+ currents: i) collected at room temp (22-24°C), ii) at an unknown distance from the terminal (uncertain voltage clamp), iii) with a very slow voltage ramp rate that is not suitable for probing T-type currents (Figure 1d Maddox 2020, 140 mV over 1 sec: 7msec/1mV), and iv) at a signal-to-noise that does not allow to resolve a membrane current under 1 pA (avg wt rod Ca2+ current was -3.5 pA, and line noise ~1pA peak-to-peak in Maddox 2020). Suggestion: say T-type currents were not probed in Maddox et al 2020, but Davison et al 2022 did not find PCR signal for Cav3.2 in rods.

      We disagree that recordings in the Maddox 2020 study were not sufficient to uncover a T-type current. The voltage ramps in that study were not much slower than those of the Davison et al. 2022 study (they used 0.19 mV/ms). Moreover, in new Supp. Fig.S1, we show that, like the slower voltage ramp (0.15 mV/ms) used in the prior study of G369i KI rods, the voltage ramps we used in the present study (0.5 mV/ms), which clearly evoke currents with T-type properties in G369i KI cones (Fig.2a,b, Fig.3a,b), do not evoke currents in WT or G369i KI rods.
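The ramp rates quoted in this exchange are given in different units (mV over seconds, ms per mV, mV/ms), so a small conversion helps when comparing them. The following is only an illustrative sketch: the function names and the dictionary labels are hypothetical, and the only numbers used are those stated in the reviewer's comment and the authors' reply above.

```python
# Illustrative unit conversion for the voltage-ramp rates quoted above.
# All names here are hypothetical; the numbers come from the text.

def ramp_rate_mv_per_ms(span_mv: float, duration_ms: float) -> float:
    """Ramp rate in mV/ms for a ramp covering span_mv in duration_ms."""
    return span_mv / duration_ms


def ms_per_mv(rate_mv_per_ms: float) -> float:
    """Inverse of the ramp rate: milliseconds spent per millivolt."""
    return 1.0 / rate_mv_per_ms


# Rates in mV/ms as stated in the review exchange.
RAMP_RATES = {
    "reviewer's description (140 mV over 1 s)": ramp_rate_mv_per_ms(140.0, 1000.0),
    "prior study of G369i KI rods, as quoted by the authors": 0.15,
    "Davison et al. 2022, as quoted by the authors": 0.19,
    "present study, as quoted by the authors": 0.5,
}

if __name__ == "__main__":
    for label, rate in RAMP_RATES.items():
        print(f"{label}: {rate:.2f} mV/ms = {ms_per_mv(rate):.1f} ms/mV")
```

Under these figures, the reviewer's "140 mV over 1 sec" corresponds to 0.14 mV/ms, close to the 0.15 mV/ms the authors quote for the prior study.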

      Minor comments. 

      (1) Suggestion: add an overview panel to Figure 1 that shows the rod terminals in the KI. The problem is that cropping out the ribbon and active zone signals from rods, to highlight cones, can give the impression that the cones are partially spared in the KI, and the rods are not spared at all. (yet you nicely clarify this in Figure 4 and in the legend and text, etc.). 

      We chose to modify the legend with this information as in Fig.4 rather than modify the figure.

      (2) Mouse wt cone Ca2+ currents look like L-type currents, as do your monkey and squirrel cone recordings, and also much like those of mouse rods (see Figure S5, Hagiwara et al., 2018 or Grabner and Moser 2021). Your pharm data from mice and squirrels further supports your conclusion, and certainly took much effort. Davison et al 2022 J Neurosci showed PCR results that support their claim that a Cav3 current exists in wt cones. Questions: 1) have you tried PCR? 2) Can you offer more details on what Cav3 KO you tried and what antibodies failed to confirm the KO? As the authors know, one complication is that the deletion of one Cav can be compensated for by the expression of a new Cav. There are 3 types of Cav3s and removal of one type may be compensated for by another Cav3. 

      We have included drop-seq data (new Supp.Fig.S3) implicating Cav3.2 as the main Cav3 subtype in cones and have modified our discussion of these results accordingly. These experiments did not reveal any changes in Cav3 subtype expression in G369i KI vs WT cones.

      (3) Lines 95/96- onward, spend more time telling the story. When working out the biophysical and pharmacological behavior of the Ca2+ currents, you might want to initially refer to the membrane current as a membrane current, and then state how your voltage protocols, intra- and extra-cell solutions, and drugs helped you verify 1) L-type and 2) T-type Ca2+ currents. 

      We have modified the text with more detail.

      (4) If data is in hand, add a ramp I-V to Figure S2, which shows the response of the ground squirrel cone. The steps in S2a are excellent for making your point that a transient current is missing, and the bipolar is a great control to illustrate ML218 works. However, a comparison of a squirrel cone ramp to a bipolar ramp response could complete the figure. 

      See Response to #5 below.

      (5) Consider moving Supplementary Figures S2 and S3 to the main text; these are highly relevant to the story, novel, and well-executed. 

      Fig.S2 and S3 were added as new Figs.4,5. The new Fig.4 includes voltage ramps in ground squirrel cones (panel a) to compare with the bipolar data (panel f).

      (6) The nice electron microscopy reconstructions are not elaborated on in any detail, and there is no mention of ribbon size. Is the resolution sufficient to estimate ribbon size, the number of synaptic vesicles around the ribbon and in the adjacent cytosol? The images indicate major changes in the morphology of the terminals. Is the glial envelope similar in WT and KI? 

      Since ribbons were quantified extensively in the confocal analyses in Fig.6, we felt it unnecessary to add this to the EM analysis which focused mainly on aspects of 3D structure (i.e., arrangement of ribbons, postsynaptic wiring, cone pedicle morphology). We added further discussion of the change in morphology of the G369i KI cone pedicle (lines 200-203): “Compared to WT, ribbons in G369i KI pedicles appeared disorganized and were often parallel rather than perpendicular to the presynaptic membrane (Fig.7a-c). Consistent with our confocal analyses (Fig.1), G369i KI cone pedicles extended telodendria in multiple directions rather than just apically (Fig. 7a).”

      While we did not opt to characterize the glial envelope in WT cones, we did add an analysis of synaptic vesicles around ribbons to Table 2.

      (7) Discussion line 250: "we found no evidence for a functional contribution of Cav3 in our recordings of cones in WT mice (Figures. 2,3), ground squirrels, or macaque (Supplementary Figures S2 and S3).". I would not use "functional" in this context because when comparing your work to Davison et al 2022, they defined functional as a separate response component driven by Cav3. For instance, they examined the influence of their T-type current on exocytosis (by membrane capacitance) and other features like spiking Ca2+ transients. Suggestion: substitute functional with "detectable", and say "we found no detectable Cav currents". Or if you had T-type staining, but not T-type Ca2+ currents, then say "no functional current even though there is staining...".

      We have modified the text as (lines 336-338): “However, in contrast to recordings of WT mouse cone pedicles in a previous study21, we found no evidence for Cav3-mediated currents in somatic recordings of cones in WT mice (Figs.2,3).”

      We propose an alternative interpretation of the results in the Davison et al study concerning the conclusion that Cav3 channels contribute to Ca2+ spikes and exocytosis. That study used 100 µM Ni2+ to block a “T-type” contribution to spike activity in cones. In their Figs.4,5, the spikes are suppressed by 100 µM Ni2+ and 10 µM nifedipine, a Cav1 antagonist, and spared by the T-type selective drug Z944. This is problematic for several reasons. First, as shown by the authors (their Fig.2A1,A2) and others (PMID: 15541900), 100 µM Ni2+ inhibits Cav1-type currents in photoreceptors. Second, Z944 potentiates Cav1 current in their mouse cones (their Fig.2C1,C2). Thus, both reagents are suboptimal for dissecting the contribution of either Cav subtype to spiking activity. With respect to Cav3 channels and exocytosis, these authors interpreted a reduction in exocytosis upon holding at -39 mV compared to at -69 mV as indicating a loss of a T-type driven component of release. However, Cav1 channel inactivation (PMID: 12473074) could lead to the observed reduction in exocytosis at -30 mV.

      (8) Additional literature related to your Intro and Discussion. Regarding CSNB2, related mutations of active zone proteins, and what happens to Ca2+ currents when ribbons are deleted, you might want to consider the following studies that measure Ca2+ currents from rods: conditional KO of RIM1/2 (Grabner et al 2015 JN), KO of ELKS1/2 (Hagiwara et al, 2018 JCB), and KO of Ribeye (Grabner and Moser eLife 2021). In these studies, the Cav currents were absent in rods of the ELKS1/2 DKO, strongly reduced (80%) in the RIM1/2 DKO, but altered in more subtle ways (activation-inactivation) without significantly changing steady-state Ca2+ current in the Ribeye KO. This does not seem to support some of the arguments you have made in the Introduction and Discussion regarding ribbon size and Ca2+ currents, yet the suggested literature is related to the topic at hand.

      A description of these synaptic proteins as potential mediators of the effect of Cav1.4 on ribbon morphogenesis was added to the Discussion, lines 325-327.

      (9) Line 129: "Along with the major constituents of the ribbon, CtBP2, and RIBEYE", for clarity Ribeye has two domains, one that is identical to CtBP2 (B-domain) and the unique Ribeye domain (A-domain) that is only expressed at ribbon synapses. And, Piccolino is also embedded in the ribbon (Brandstaetter lab, Wichmann/Moser labs). In other words, Ribeye and Piccolino are the major constituents of the ribbon. 

      To avoid confusion, we simply mention CtBP2 and RIBEYE in the context of the corresponding antibodies that were used to label ribbons.

      (10) Abstract: consider rephrasing "Ca2+-independent role of Cav1.4" as "Ca2+-permeation-independent role of Cav1.4" or similar

      Sentence changed to: “In CSNB2, we propose that Cav3 channels maintain cone synaptic output provided that the nonconducting role of Cav1.4 in cone synaptogenesis remains intact.”

      Reviewer #3 (Recommendations For The Authors): 

      Cav1.4 voltage-gated calcium channels play an important role in neurotransmission at mammalian photoreceptor synapses. Mutations in the CACNA1f gene lead to congenital stationary night blindness that particularly affects the rod pathway. Mouse Cav1.4 knockout and Cav1.4 knockin models suggest that Cav1.4 is also important for the cone pathway. Deletion of Cav1.4 in the knockout models leads to signaling malfunctions and to abundant morphological re-arrangements of the synapse suggesting that the channel not only has a role in the influx of Ca2+ but also in the morphological organization of the photoreceptor synapse. Of note, also additional Cav-channels have been previously detected in cone synapses by different groups, including L-type Cav1.3 (Wu et al., 2007; pmid; Kersten et al., 2020; pmid), and also T-type Cav3.2 (Davison et al., 2021; pmid 35803735). 

      In order to study a conductivity-independent role of Cav1.4 in the morphological organization of photoreceptor synapses, the authors generated the knockin (KI) mouse Cav1.4 G369i in a previous study (Maddox et al., eLife 2020; pmid 32940604). The Cav1.4 G369i KI channel no longer works as a Ca2+-conducting channel due to the insertion of a glycine in the pore-forming unit (Maddox et al., eLife 2020; pmid 32940604). In this previous study (Maddox et al., eLife 2020; pmid 32940604), the authors analyzed Cav1.4 G369i in rod photoreceptor synapses. In the present study, the authors analyzed cone synapses in this KI mouse.

      For this purpose, the authors performed a comprehensive set of experimental methods including immunohistochemistry with antibodies (also with quantitative analyses), electrophysiological measurements of presynaptic Ca2+ currents from cone photoreceptors in the presence/absence of inhibitors of L-type- and T-type- calcium channels, electron microscopy (FIB-SEM), ERG recordings and visual behavior tests of the Cav G369i KI in comparison to the Cav1.4 knockout and wild-type control mice.

      The authors found that the non-conducting Cav channel is properly localized in cone synapses and demonstrated that there are no gross morphological alterations (e.g., sprouting of postsynaptic components that are typically observed in the Cav1.4 knockout). These findings demonstrate that cone synaptogenesis relies on the presence of Cav1.4 protein but not on its Ca2+ conductivity. This result, obtained at cone synapses in the present study, is similar to the previously reported results observed for rod synapses (Maddox et al., eLife 2020, pmid 32940604). No further mechanistic insights or molecular mechanisms were provided that demonstrated how the presence of the Cav channels could orchestrate the building of the cone synapse. 

      We respectfully disagree regarding the mechanistic advance of our study. As indicated by Reviewer 2, a major advance of our study is in providing a mechanism that can explain the longstanding conundrum that congenital stationary night blindness type 2 mutations that would be expected to severely compromise Cav1.4 function do not produce complete blindness. Our study provides an important contrast to the Maddox et al 2020 study in showing that rods and cones respond differentially to loss of Cav1.4 function, which is also relevant to the visual phenotypes of CSNB2. How the presence of Cav1.4 orchestrates cone synaptogenesis is an important topic that is outside the scope of our present study.

      In the present study, the authors also propose a homeostatic switch from L-type to (newly occurring) T-type calcium channels in the Cav1.4 G369i KI mouse as a consequence of the deficient calcium channel conductivity in this mouse. In cones of the Cav1.4 G369i, the high-voltage activated, L-type Ca2+-entry was abolished, in agreement with their previous paper (Maddox et al., eLife 2020, pmid 32940604). The authors found a low-voltage activated Ca2+ current instead that they assigned to T-type Ca2+-currents based on pharmacological inhibitor experiments. T-type Ca2+-currents/channels were already previously identified in other studies by independent groups and independent techniques (electrophysiology, RT-PCR, single-cell sequencing) in cones of wild-type mice (Davison et al., 2021, pmid 35803735; Macosko et al., 2015, pmid 26000488; Williams et al., 2022, pmid 35650675). In the present manuscript (Figures 3a/b), the authors also observed a low-voltage activated, T-type like current in cones of wild-type mice, that is isradipine-resistant and affected by the T-type inhibitor ML218. This finding appears compatible with a T-type-like current in wild-type cones and is consistent with the published data mentioned above, although the authors interpret this data in a different way in the discussion.

      Due to the noise inherent in whole cell voltage clamp measurements and some crossover effects in the pharmacology, we cannot completely exclude the presence of a T-type current in WT mouse cones. However, our results very clearly support a conclusion opposite to that stated by the reviewer. Namely, if WT mouse cones have T-type Ca currents, then they are far smaller than those in the Cav1.4 G369i KI and KO cones. In particular, while we identified message for Cav3.2 in WT mouse cones, we were unable to identify a functional T-type current by either voltage clamp measurements or pharmacology. See below for a detailed rebuttal.

      This proposal of a homeostatic switch is not convincingly supported in this reviewer's opinion (for further details, please see below). Furthermore, no data on possible molecular mechanisms were provided that would support such a proposal of a homeostatic switch of calcium channels. No mechanistic/molecular insights were provided for a proposed homeostatic switch between L-type to T-type channels that the authors propose to occur between wild-type and Cav1.4 G369i as a consequence of conduction-deficient Cav1.4 G369i channels. Is this e.g. based on posttranslational modifications that switch on T-type channels or regulation at the transcriptional level inducing expression of T-type calcium channel or on other mechanisms? The authors remain descriptive with their central hypotheses. No molecular mechanisms/signaling pathways were provided that would support the idea of such a homeostatic switch.

      Homeostatic plasticity refers to the maintenance of neuronal function in response to some perturbation in neuronal activity and can result from changes in the expression of ion channel genes (PMID: 36377048, 32747440, 19778903) or regulatory pathways that modulate ion channels (PMID: 15051886, 32492405). We present multiple lines of evidence showing that Cav3 currents appear in cones upon genetically induced Cav1.4 loss of function and can support cone synaptic responses and visual behavior if cone synapse structure is maintained. Our new transcriptomic studies show no difference between levels of Cav3 channel transcripts in WT and G369i KI cones, suggesting that the appearance of the Cav3 currents in G369i KI cones does not result from an increase in Cav3 gene expression. We are currently investigating our transcriptomic dataset to determine if Cav3 regulatory pathways are upregulated in G369i KI cones and will present this in a follow-up study.

      The authors show residual photopic signaling in the non-conducting Cav1.4 G369i KI mouse as judged by the recording of postsynaptic currents, ERG recordings and visual behavior tests though in a reduced manner. The residual cone-based signaling could be based on the non-affected T-type Ca2+ channel conductivity in cone synapses. Given that the L-type current through Cav1.4 is gone in the Cav1.4 G369i KI as previously shown (Maddox et al., 2020, pmid 32940604), the T-type calcium current will remain. However as discussed above, this does not necessarily support the idea of a homeostatic switch.

      A major point which we highlighted with new results is that despite the expression of Cav3 transcripts in WT mouse cones, Cav3 channels do not contribute to the cone Ca2+ current. This is at odds with the Davison et al study (PMID: 35803735, see our response to Reviewer 2, pt 7 for caveats of this study), but our results convincingly show that the Cav3 current appears only when Cav1.4 is genetically inactivated. Pharmacological or electrophysiological methods that should reveal the presence of Cav3 currents do not change the properties of the Ca2+ current in cones of WT mice, ground squirrel, or macaque:

      • Figs.2-4: Voltage steps to -40 mV (Fig 2e) that activate a sizeable T-current in G369i KI mouse cones produce a negligible transient at pulse onset in WT mouse cones. Similarly, transient currents that are obvious in G369i KI mouse cones during the final step to -30 mV are absent in WT cones.  When we block Cav1.4 with isradipine either in cones of WT mice or ground squirrel, the current that remains does not resemble a Cav3 current but rather a scaled down version of the L-type current. ML218, which readily blocks Cav3 channels in HEK293T cells and in G369i KI cones, has only minor effects in cones of WT mice and ground squirrel; these effects of ML218 can be attributed to non-specific actions on Cav1.4 (new Supp.Fig.S2). New Fig.4 (moved from the supplementary data to the main article) clearly shows that the ML218-sensitive current in ground squirrel cones exhibits properties of Cav1.4 not Cav3 channels. 

      • Figs.2,5: Holding voltages that inactivate Cav3 channels have no effect on the Ca2+ current in cones of WT mice or macaque (recordings of macaque cones were moved from the supplement to the main article as new Fig.5).

      In Figure 4 the authors measured an increase in the size of the active zone (as judged by the size of the bassoon cluster) and of the synaptic ribbons in the Cav1.4 G369i. A mechanistic explanation for this phenomenon was not provided and the underlying molecular mechanisms were not unraveled. 

      The FIB-SEM data uncover some ultrastructural alteration/misalignments of the synaptic ribbons and misalignments of the regular arrangement of the postsynaptic dendrites in the G369i KI mice. Also concerning this observation, the study remains descriptive and does not reveal the underlying mechanisms as it would be expected for eLife. 

      We respectfully disagree on the descriptive nature of our study and the need for a full characterization of the molecular mechanism underlying the cone synaptic defects in the G369i KI mouse.   

      An important study in the field (Zanetti et al., Sci. Rep. 2021; pmid 33526839) should be also cited that used a gain-of-function mutation of Cav1.4 to analyze its functional and structural role in the cone pathway. 

      We have added citation of this paper to the Discussion (lines 354-356).

      In conclusion, the study has been expertly performed but remains descriptive without deciphering the underlying molecular mechanisms of the observed phenomena, including the proposed homeostatic switch of synaptic calcium channels. Furthermore, a relevant part of the data in the present paper (presence of T-type calcium channels in cone photoreceptors) has already been identified/presented by previous studies of different groups (Macosko et al., 2015; pmid 26000488; Davison et al., 2021; pmid 35803735; Williams et al., 2022; pmid 35650675). The degree of novelty of the present paper thus appears limited. I think that the study might be better suited in a more specialized journal than eLife. 

      We thank the reviewer for acknowledging the rigor of our study but disagree with their evaluation regarding the novelty of our work as outlined in our responses above.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      My comments are largely limited to suggestions to make the manuscript easier to read and digest.

      In the abstract they say RNA sequencing highlights changes in innate...

      Could they be more specific? Innate immune system up or down? They do not indicate actual findings in the abstract.

      We thank the reviewer for the comment and we have revised the abstract accordingly.  

      Their use of non‐intuitive abbreviations is often confusing. Perhaps they can add a table in methods listing all the abbreviations so that the reader can follow the data better. mNGA, vmHT....etc.

      As suggested, we have now included a list of the abbreviations used in the paper.

      There are mis‐spellings in the manuscript.

      We have gone through the manuscript and corrected the mis-spellings.   

      Has the SPR RNAi line been validated?

      The SPR RNAi line that we used has been extensively validated by Yapici et al., 2007 and several subsequent publications. Importantly, the effectiveness of SPR knockdown is evident in female flies as they exhibit dramatically reduced egg laying and, importantly, lack the typical post-mating behaviors (such as rejection of male flies after initial mating) observed in the wild type mated female flies. In fact, female flies with RNAi-mediated SPR knockdown behave identically to females mated with SP-null male flies, confirming the effective disruption of the SP-SPR signaling pathway. We have revised the manuscript and added these statements in the results section concerning SPR RNAi.  

      In the figures showing the Climbing Index vs time, can they abbreviate seconds as sec vs s? At least I think it is seconds. At first, I thought it was Time or Times, and was confused about what they were indicating on those types of graphs (Figures 1D‐F).

      We have revised the figure as suggested by the reviewer.

      In Figure 3F, they have a significance indicated in an unclear manner. It looks like they are comparing neuropil to the cortex, but I think they really mean to compare the cortex of sham to cortex of D31?

      The reviewer was correct. We have revised figure 3F to make this clear.     

      In Figure 4B, what is the y‐axis? Percentage of what? Is that percentage of total flies?

      The reviewer was correct. We have revised the figure to make this clear. 

      In a figure like SF3 B, what is the y‐axis? "Norm. Accum. CI" Can they explain the abbreviation?

      We have revised the Y-axis label to be “Normalized accumulative CI”.  We have also made this clear in the legend.   

      In the methods, what does this mean: "Regions devoid of Hoechst and phalloidin signal in non‐physiologically appropriate areas were considered vacuoles"? What are non‐physiologically appropriate areas? To me, that would mean outside of the brain. I would have thought the areas should be physiologically appropriate (aka neuropil and cortex)? This is confusing.

      We have revised the method section to be more specific.  In the Drosophila brain, there are structures such as esophagus that are devoid of both Hoechst and phalloidin staining, which were excluded from our vacuole quantification.    

      Reviewer #2 (Recommendations For The Authors):

      Since I use mammalian systems, my comment about the confirmation of siRNA should be removed if this is not possible in the Drosophila system.

      We have revised the figures to include total N values when appropriate. Including individual n values for each experimental assay and condition will inevitably crowd the figure legends, so specific values are available upon request. 

      Regarding RNAi knockdown of sex peptide receptors (SPRs), we agree that confirmation of the knockdown by IHC or qRT-PCR will further strengthen our findings. It should be noted, however, that the RNAi line we used has been extensively validated by Yapici et al., 2007 and several subsequent publications. Importantly, the effectiveness of SPR knockdown is evident in female flies as they exhibit dramatically reduced egg laying and, importantly, lack the typical post-mating behaviors (such as rejection of male flies after initial mating) observed in the wild type mated female flies. In fact, female flies with RNAi-mediated SPR knockdown behave identically to females mated with SP-null male flies, confirming the effective disruption of the SP-SPR signaling pathway. We have revised the manuscript to include these statements in the results concerning the SPR RNAi knockdown.    

      Reviewer #3 (Recommendations For The Authors):

      (1) In Figures 1 and 2, the authors found that females have a lower climbing index in the acute phase in D17 injury, not due to neurodegeneration, as shown by the absence of significant changes in brain vacuolation and other markers. However, in Figure 3, the authors found that female flies have a lower climbing index, more brain vacuolation, and neurodegeneration in the late phase. It's not very convincing that having a lower climbing index at the late phase is due to neurodegeneration. Is it possible that females suffered from more severe acute effects, at least in D17 injury?

      We thank the reviewer for this point. Female flies injured on D17 displayed acute climbing deficits at 90 minutes post-injury. Since we did not observe significant structural changes in the brain at this time, we believe that this short-term functional deficit is not due to acute neuronal death. Here it is important to note that males did not display any acute climbing deficits when injured on D17, which suggests that the females suffered from more severe acute effects than males. However, these injured female flies recovered fully at 24 hours post-injury and displayed no climbing deficits. At two weeks post-injury, we observe climbing deficits and increased vacuole formation as a direct result of the injuries on D17 (see Supplemental Figure 3). When we assessed sensorimotor behavior and brain vacuolation on D45, we found that the injured females had significantly lower climbing indices and more brain vacuolation than the non-injured females of the same age. In this case, the concurrent observance of decreased climbing ability and increased brain vacuolation suggests chronic neurodegeneration in aged, injured females. This is not to be confused with the acute neuronal death observed by other groups using injury models of stronger severity. Overall, our data are consistent with the current view that in many neurodegenerative diseases, functional deficits often precede observable brain degeneration, which may take years to manifest.

      (2) The authors determined late‐life brain deficits and neurodegeneration purely based on climbing index and vacuole formation. These phenotypes are not really specific to TBI‐related neurodegeneration and the significance and mechanisms of vacuole formation are not clear. Indeed, in Figures 3 A and B, male flies especially D31inj tend to have a much larger variation than any other groups. What could be the reasons? The authors should perform additional analyses on TBI‐related neurodegeneration in flies, which have been shown before, such as retinal degeneration and loss, neuronal degeneration, and loss, neuromuscular junction abnormalities, etc (Genetics. 2015 Oct; 201(2): 377‐402).

      We thank the reviewer for the thorough evaluation of our manuscript. The reviewer raised a very important question: whether the neurodegeneration observed in our model is specific to TBI. As the reviewer rightly pointed out, the neurodegenerative phenotypes are unlikely to be specific to TBI-related neurodegeneration. Throughout the manuscript, we have tried to convey the notion that the mild physical impacts to the head represent one form of environmental insults, which in combination with other risk factors such as aging can lead to the emergence of neurodegenerative conditions. It should be noted that the negative geotaxis assay and vacuolation quantification are two well-established approaches to assess sensorimotor deficits and frank brain degeneration in fly brains. 

      It is important to emphasize that the head-specific impacts delivered to the flies in our study are much milder than those used in previous studies. As we showed in Figure 1, this very mild form of head trauma (referred to as vmHT) did not cause any death, nor did it affect the lifespan of the injured flies. Our supplemental data also show very minimal structural neuronal damage and no acute and chronic apoptosis induced by vmHT exposure. Consistently, we did not observe any exoskeletal or eye damage immediately following injuries, nor did we observe any retinal degeneration and pseudopupil loss at the chronic stage of these flies. We have incorporated these important points in the revised manuscript.

      (3) In Figure 4, it would be important to perform the behavior test fly speed and directional movement in the acute phase as well to determine whether the females have reduced performance at the acute phase.

      We thank the reviewer for this suggestion. Please note that our modified NGA has already improved the spatiotemporal resolution over the classic NGA.  The data presented in Fig.3 show that there are no acute deficits for young cohorts.  Therefore, we do not believe that the detailed analysis of the direction and speed of these flies is essential.  

      Unfortunately, the current setup for the AI-based analysis requires manual corrections of tracking errors, which are time-consuming and tedious.  We are building a newly designed AI-based NGA (NGA.ai) that will allow automatic tracking and quantification with minimal manual interventions. Once it is completed, we will perform some of the analyses that the reviewer suggested.  

      (4) In Figure 8, the authors performed an RNA‐seq analysis and identified some dysregulated gene expressions. However, it is really surprising to see so few DEGs even in wild‐type males and mated females, and to see that none of DEGs overlap among groups or related to the SP‐signaling. This raises questions about the validity of the RNAseq analysis. It is critical to independently verify their RNA‐sequencing results and to add some more molecular evidence to support their conclusion.

      We agree that future studies are needed to independently validate our RNA sequencing results. We believe that the small number of DEGs is likely due to two unique features of our study: (1) the very mild nature of our injury paradigm and (2) the chronic examination timepoint that was long after the head injury and SP exposure, which distinguish our study from previous fly TBI studies. As pointed out in the manuscript, our study aimed to understand how early life exposure to repetitive head traumatic insults could lead to the late-life onset of neurodegenerative conditions. We hope to further validate our results in our next phase of experiments using single-cell RNA sequencing and RT-qPCR.

      (5) The current results raise a series of interesting questions: what would be the implications of female fly mating and its associated Sex Peptide signaling for mammals or humans? Would mammalian female animals mating with wild‐type or sex hormone‐null male animals have different effects on their post‐injury behavior tests or neuropathological changes? What are the mechanisms underlying the sexual dimorphism?

      As the reviewer pointed out, it would be very interesting to explore the possible roles of sex peptide-signaling in other animals and humans. As far as we know, there is no known mammalian ortholog to the insect sex peptide, so it would be difficult to study SP or an SP-like molecule in mammalian models. However, we believe that prolonged post-mating changes associated with reproduction in female fruit flies contribute to their elevated vulnerability to neurodegeneration. In this regard, drastic changes within the biology of female mammals associated with reproduction can potentially lead to vulnerability to neurodegeneration. We agree that this demands further study, which may be done with future collaborators using rodent or large animal models. We have discussed this point in the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank you very much for reviewing our manuscript and express our sincere appreciation for the valuable and thoughtful comments that led us to significantly improve the manuscript on Fshr-ZsGreen reporter mice. We have taken your comments seriously and made a major revision of the manuscript; here is a summary of the revision:

      (1) New data on Fshr expression have been added to the revised manuscript:

      a. Fshr expression in the testis and adipose tissues (WAT and BAT) of B6 mice;

      b. Fshr expression in the testis of B6 by RNA-smFISH;

      c. Comparison of Fshr expression in the testis and ovary between Fshr-ZsGreen and B6 mice by ddRT-PCR to show that Fshr expression is not interrupted by insertion of the P2A-ZsGreen vector;

      d. Reduction of Fshr expression in osteocytes within the femoral sections from DMP1-CreERT2:Fshrfl/fl mice;

      e. Fshr expression in an established Leydig cell line, TM3, by immunofluorescence and ddRT-PCR, also showing Fshr located in the nuclei of TM3 cells;

      f. Fshr expression at scRNA-seq level from 5 public single cell portals as Supplementary Data 3 to support our findings of the widespread expression pattern of Fshr, particularly in Leydig cells.

      (2) Re-organization of Figure 2 with a new legend.

      (3) A new paragraph is added to the Discussion Section of the revised MS to explain the function of the P2A peptide in the generation of GFP reporter mice and why Fshr expression is not interrupted by the P2A-ZsGreen insertion in the Fshr-ZsGreen reporter.

      (4) Deletion of Figure 1-D-c, as it is not necessary.

      (5) Replacement of Figure 8-A (the left panel) with an image taken at a reduced exposure time.

      (6) Amended parts of the revised MS are labeled in red.

      A point by point response to the Reviewers’ comments:

      Reviewer 1:

      One of the shocking observations in this manuscript is the expression of FSHR in Leydig cells. Other observations are in the osteoblasts and endothelial cells as well as epithelial cells in different organs. The expression of ZsGreen in these tissues seems high and one shall start questioning if there are other mechanisms at play here.

      First, the turnover of fluorescent proteins is long, longer than 48h, which means that they accumulate at a different speed than the endogenous FSHR. This means that ZsGreen will accumulate in time while the FSHR receptor might be degraded almost immediately. This correlates with mRNA expression (by the authors) but not with the results of other single-cell sequencing studies (see below).

      The expression of ZsGreen in Leydig cells seems much higher than in Sertoli cells, this is "disturbing" to put it mildly. This is visible in both the ZsGreen expression and the FISH assay (Figure 2 B-D).

      Thank you for these valuable comments. In the revised MS, we added new data demonstrating the presence of Fshr in Leydig cells of B6 mice, detected by immunofluorescence staining, RNA-smFISH and ddRT-PCR, as well as in TM3 cells, a Leydig cell line isolated from a male mouse (Fig 2E, F and G). These data demonstrate that normal Fshr expression is not interrupted by insertion of the P2A-ZsGreen vector into a locus located between exon 10 and the stop codon. We use ZsGreen as an indicator of active Fshr promoter status, rather than as a method to measure Fshr expression, which is done by ddRT-PCR. These data are shown in Figure 2G of the revised MS.

      In addition, we provide scRNA-seq based evidence on Fshr expression in human Leydig cells from two single cell portals (DISCO and BioGPS) as shown in Supplementary Data 3 in the revised MS. We also cited a recent report on scRNA-seq analysis of Fshr expression in Hu sheep in the revised MS as Reference 65 (PMID: 37541020) 1, which also clearly showed Fshr expression in Leydig cells at single cell level in Hu Sheep.

      We believe that the lack of Fshr expression in some single cell databases may be due to the degradation of Fshr transcript in cells during the processing of single-cell populations. In our laboratory, we spent more than 6 months optimizing methods and reagents to preserve mRNA integrity at a value above 8 for RNA-seq.

      The expression in WAT and BAT is also questionable as the expression of ZsGreen is high everywhere. That makes it difficult to believe that the images are truly informative. For example, the stainings of aorta show the ZsGreen expression where elastin and collagen fibres are - these are not "cells" and therefore are not expressing ZsGreen.

      FISH expression (for FSHR) in WT mice is missing.

      Also, the tissue sections were stained with the IgG only (neg control) but in practice both the KI and the WT tissues should be stained with the primary and secondary antibodies. The only control that I could think of to truly get a sense of this would be a tagged receptor (N-terminal) that could then be analysed by immunohistochemistry.

      Reply 2 and 3: Thank you for these comments. New data on Fshr expression in WAT and BAT of B6 mice by immunofluorescence staining and in the testis of B6 mice by immunofluorescence staining and RNA-smFISH are added to the revised MS (Fig.2D and E, and Fig. 4G), showing similar patterns to that of Fshr-ZsGreen mice. Furthermore, we provide more evidence as Supplementary Data 3 on Fshr expression obtained from 4 public single cell portals, showing FSHR expression in a wide range of organs and tissues (including different fractions of adipose cells) of humans, mice and rats at the single-cell level. Please also check the Fshr expression pattern in adipose tissues by immunostaining for Fshr in previous reports (Fig. 3a of PMID: 28538730 and Fig. 2 of PMID: 25754247) 2 3, which showed a similar expression pattern to our finding. These data should address your concerns on Fshr expression in WAT and BAT and other organs/tissues.

      Regarding “For example, the stainings of aorta show the ZsGreen expression where elastin and collagen fibres are - these are not "cells" and therefore are not expressing ZsGreen.” We believe that you referred to the image of the aorta in Supplementary Data 2. However, please take a look at the images of the aorta in Figure 5-C, which show that the layer of ‘elastin and collagen fibres’ stains positively for EMCN and α-SMA, colocalized with Fshr expression and DAPI staining at 1000X magnification, indicating that endothelial cells and cellular membranes are present in this layer, not just ‘elastin and collagen’.

      The authors also claim:

      To functionally prove the presence of FSHR in osteoblasts/osteocytes, we also deleted FSHR in osteocytes using an inducible model. The conditional knockout of FSHR triggered a much more profound increase in bone mass and decrease in fat mass than blockade by FSHR antibodies (unpublished data).

      This would be a good control for all their images. I think it is necessary to make the large claim of extragonadal expression, as well as intragonadal such as Leydig cells.

      Thank you for this very encouraging comment. As you suggested, we added a result showing reduced Fshr expression in osteocytes from DMP1-CreERT2+:Fshrfl/fl mice treated with tamoxifen to the revised MS (Figure 3D), demonstrating the presence of Fshr in osteocytes and the specificity of the Fshr antibody. Furthermore, we incorporated your advice on making the ‘large claim of extragonadal and intragonadal expression of Fshr’ into the revised MS in red.

      Claiming that the under-developed Leydig cells in FSHR KO animals are due to a direct effect of the FSHR, and not via a cross-talk between Sertoli and Leydig cells, is too much of a claim. It might be speculated to some degree but as written at the moment it suggests this is "proven".

      Thank you for pointing out this incorrect claim; we apologize for it. In the revised MS, we have deleted this claim.

      We also do not know if this FSHR expressed is a spliced form that would also result in the expression of ZsGreen but in a non-functional FSHR, or whether the FSHR is immediately degraded after expression. The insertion of the ZsGreen might have disturbed the epigenetics, transcription, or biosynthesis of the mRNA regulation.

      Thanks for this comment. In the revised MS, we added a new section explaining the function of the P2A peptide in generating a GFP reporter by sgRNA-guided, site-specific knock-in of the P2A-ZsGreen vector through CRISPR/Cas9. We also provide a new result comparing Fshr expression in the testes and ovaries of Fshr-ZsGreen and B6 mice, which shows equivalent Fshr expression in the two strains (Figure 2G) and indicates that insertion of the P2A vector does not interrupt Fshr expression.

      The authors should go through single-cell data of WT mice to show the existence of the FSHR transcript(s). For example here: https://www.nature.com/articles/sdata2018192

      Thank you so much for the valuable comment. Yes, we took your critical advice and checked Fshr expression in 4 single-cell portals, including DISCO, GTEx, BioGPS and the Human single cell portal, and present the collected data as Supplementary Data 3 in the revised MS; these data strongly support our finding of wider Fshr expression. In particular, Fshr expression in Leydig cells is demonstrated by scRNA-seq studies of human cells from DISCO and BioGPS, as well as by a recent study in Hu sheep (PMID: 37541020) (ref. 1), which we cite in the revised MS.

      Reviewer 2:

      Is the FSHR expression pattern affected by the knockin mice (no side-by-side comparison between wt and GSGreen mice, using in situ hybridization and ddRTPCR, at least in the gonads, is provided)?

      Thanks for the comment. In the revised MS, we provide a set of new data on Fshr expression in the testis, ovary, WAT and BAT of B6 mice by immunofluorescence staining and by RNA-smFISH, showing similar expression patterns. Additionally, we performed ddRT-PCR to compare Fshr expression in the testes and ovaries of Fshr-ZsGreen and B6 mice, demonstrating equivalent Fshr expression in the two strains. Interestingly, we also observed significantly higher Fshr expression in the testis than in the ovary (more than 30-fold).

      Is the splicing pattern of the FSHR affected in the knockin compared to wt mice, at least in the gonads?

      Thanks for the question. Please see our reply to Reviewer 1 on the function of the P2A peptide used to generate GFP reporters. Although we did not directly assess the splicing pattern, we provide a comparison of Fshr expression in Figure 2F of the revised MS, which indirectly indicates no change in the splicing pattern. In the future, we will assess the splicing pattern of Fshr, which has been neglected in the field.

      “Are there any additional off-target insertions of GSGreen in these mice?” and “Are similar results observed in separate founder mice?”

      Thanks for the questions. As described in detail in the Methods section of the MS, the Fshr-ZsGreen reporter was produced by site-specific long ssDNA recombination of the P2A-ZsGreen targeting vector into the locus between Exon 10 and the stop codon by CRISPR/Cas9, guided by a site-specific single guide RNA (sgRNA). The results of Southern blot, DNA sequencing and site-specific PCR confirm the site-specific insertion of P2A-ZsGreen, as shown in Figure 1. Because the recombination is site-specific, only one founder line is required for the study, and there are no additional off-target insertions.

      How long is GSGreen half-life? Could a very long half-life be a major reason for the extremely large expression pattern observed?

      Thanks for the question. The half-life of ZsGreen, also called ZsGreen1, is at least 26 h in mammalian cells, or slightly longer owing to its tetrameric structure, in contrast with the monomeric configuration of other well-known fluorescent proteins (PMID: 17510373) (ref. 4). The rationale for using this protein is that ZsGreen is an exceptionally bright green fluorescent protein, up to 4X brighter than EGFP, and is well suited to whole-cell labelling and promoter-reporter studies, given the high turnover and rapid degradation of the Fshr transcript. In this study, we used ZsGreen as an indicator of the active endogenous Fshr promoter, rather than as a means of measuring promoter activity. Therefore, regardless of whether it accumulates, ZsGreen driven by the Fshr promoter indicates the presence of an active Fshr promoter in the defined cells. To quantify Fshr expression levels, we instead used ddRT-PCR. In addition, we provide single-cell sequencing-based evidence from 4 public single-cell portals to support our finding of wide Fshr expression. Please see Supplementary Data 3 in the revised MS.
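
      The reviewer's concern about reporter accumulation can be made concrete with a simple first-order decay model. The sketch below is illustrative only and is not an analysis from the study: it assumes exponential decay and constant synthesis, takes the 26 h figure quoted above as a lower bound, and uses hypothetical function names and rates.

```python
import math

def fraction_remaining(hours: float, half_life_h: float = 26.0) -> float:
    """Fraction of an existing protein pool left after `hours` (first-order decay assumed)."""
    return 0.5 ** (hours / half_life_h)

def steady_state_level(synthesis_rate: float, half_life_h: float = 26.0) -> float:
    """Steady-state amount under constant synthesis and first-order decay.

    `synthesis_rate` is in arbitrary units per hour (a hypothetical input).
    """
    decay_constant = math.log(2) / half_life_h  # per hour
    return synthesis_rate / decay_constant

if __name__ == "__main__":
    for t in (26, 52, 104):
        print(f"{t:>3} h: {fraction_remaining(t):.3f} of the initial pool remains")
    # Under these assumptions a longer half-life raises the steady-state
    # signal for the same synthesis rate.
    print(steady_state_level(1.0, 26.0), steady_state_level(1.0, 2.0))
```

      Under these assumptions, a long-lived reporter amplifies the signal from weak promoter activity, which is consistent with treating ZsGreen as a presence/absence indicator and quantifying expression separately by ddRT-PCR.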

      References:

      (1) Su J, Song Y, Yang Y, et al. Study on the changes of LHR, FSHR and AR with the development of testis cells in Hu sheep. Anim Reprod Sci. Sep 2023;256:107306. doi:10.1016/j.anireprosci.2023.107306

      (2) Liu P, Ji Y, Yuen T, et al. Blocking FSH induces thermogenic adipose tissue and reduces body fat. Nature. Jun 1 2017;546(7656):107-112. doi:10.1038/nature22342

      (3) Liu XM, Chan HC, Ding GL, et al. FSH regulates fat accumulation and redistribution in aging through the Galphai/Ca(2+)/CREB pathway. Aging Cell. Jun 2015;14(3):409-20. doi:10.1111/acel.12331

      (4) Bell P, Vandenberghe LH, Wu D, Johnston J, Limberis M, Wilson JM. A comparative analysis of novel fluorescent proteins as reporters for gene transfer studies. J Histochem Cytochem. Sep 2007;55(9):931-9. doi:10.1369/jhc.7A7180.2007

    1. Author response:

      eLife assessment 

      This important study identifies a novel gastrointestinal enhancer of Ctnnb1. The authors present convincing evidence to support their claim that the dosage of Wnt/β-catenin signaling controlled by this enhancer is critical to intestinal epithelia homeostasis and the progression of colorectal cancers. The study will be of interest to biomedical researchers interested in Wnt signaling, tissue-specific enhancers, intestinal homeostasis, and colon cancer. 

      We greatly appreciate editors’ and reviewers’ extensive and constructive comments and suggestions. We will do our utmost to revise the manuscript accordingly.

      Public Reviews: 

      Reviewer #1 (Public Review)

      Summary: 

      Ctnnb1 encodes β-catenin, an essential component of the canonical Wnt signaling pathway. In this study, the authors identify an upstream enhancer of Ctnnb1 responsible for the specific expression level of β-catenin in the gastrointestinal tract. Deletion of this promoter in mice and analyses of its association with human colorectal tumors support that it controls the dosage of Wnt signaling critical to the homeostasis in intestinal epithelia and colorectal cancers. 

      Strengths: 

      This study has provided convincing evidence to demonstrate the functions of a gastrointestinal enhancer of Ctnnb1 using combined approaches of bioinformatics, genomics, in vitro cell culture models, mouse genetics, and human genetics. The results support the idea that the dosage of Wnt/β-catenin signaling plays an important role in the pathophysiological functions of intestinal epithelia. The experimental designs are solid and the data presented are of high quality. This study significantly contributes to the research fields of Wnt signaling, tissue-specific enhancers, and intestinal homeostasis. 

      Weaknesses: 

      One weakness of this manuscript is an insufficient discussion on the Ctnnb1 enhancers for different tissues. For example, do specific DNA motifs and transcriptional factors contribute to the tissue-specificity of the neocortical and gastrointestinal enhancers? It is also worth discussing the potential molecular mechanisms controlling the gastrointestinal expression of Ctnnb1 in different species since the identified human and mouse enhancers don't seem to share significant similarities in primary sequences. 

      We agree with the reviewer that the manuscript lacks sufficient discussion of how enhancers control cell-type-specific expression of target genes, which is one of the most important questions in the field of transcription regulation. Equally important are the common and species-specific features of this regulation. In general, motif composition, location, order, and affinity with trans-factors within enhancers are four key elements. We will elaborate on this point in the revision.

      Reviewer #2 (Public Review): 

      Wnt signaling is the name given to a cell-communication mechanism that cells employ to inform on each other's position and identity during development. In cells that receive the Wnt signal from the extracellular environment, intracellular changes are triggered that cause the stabilization and nuclear translocation of β-catenin, a protein that can turn on groups of genes referred to as Wnt targets. Typically these are genes involved in cell proliferation. Genetic mutations that affect Wnt signaling components can therefore affect tissue expansion. Loss of function of APC is a drastic example: APC is part of the β-catenin destruction complex, and in its absence, β-catenin protein is not degraded and constitutively turns on proliferation genes, causing cancers in the colon and rectum. And here lies the importance of the finding: β-catenin has for long been considered to be regulated almost exclusively by tuning its protein turnover. In this article, a new aspect is revealed: Ctnnb1, the gene encoding for β-catenin, possesses tissue-specific regulation with transcriptional enhancers in its vicinity that drive its upregulation in intestinal stem cells. The observation that there is more active β-catenin in colorectal tumors not only because the broken APC cannot degrade it, but also because transcription of the Ctnnb1 gene occurs at higher rates, is novel and potentially game-changing. As genomic regulatory regions can be targeted, one could now envision that mutational approaches aimed at dampening Ctnnb1 transcription could be a viable additional strategy to treat Wnt-driven tumors. 

      We thank the reviewer for acknowledging the potential significance of the manuscript. We also recognize that targeting genomic regulatory regions to dampen Ctnnb1 transcription could be a promising strategy for treating Wnt-driven tumors, including many colorectal carcinomas. However, we would like to point out that there are significant technical challenges associated with AAV delivery to the GI epithelium, including the hostile environment, immune response, and low delivery efficiency.

      Reviewer #3 (Public Review): 

      The authors of this paper identify an enhancer upstream of the Ctnnb1 gene that selectively enhances expression in intestinal cells. This enhancer sequence drives expression of a reporter gene in the intestine and knockout of this enhancer attenuates Ctnnb1 expression in the intestine while protecting mice from intestinal cancers. The human counterpart of this enhancer sequence is functional and involved in tumorigenesis. Overall, this is an excellent example of how to fully characterize a cell-specific enhancer. The strength of the study is the thorough nature of the analysis and the relevance of the data to the development of intestinal tumors in both mice and humans. A minor weakness is that the loss of this enhancer does not completely compromise the expression of the Ctnnb1 gene in the intestine, suggesting that other elements are likely involved. Adding some discussion on that point would be helpful.

      We are quite encouraged by the reviewer’s positive comments. We agree with the reviewer that other cis-regulatory elements may be involved in the transcription of Ctnnb1 within the GI epithelium. It is also possible that the basal transcription of Ctnnb1 within the GI epithelium is relatively high, and that enhancers can only boost transcription within a certain range. We will discuss these possibilities in the revision.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The manuscript presents a machine-learning method to predict protein hotspot residues. The validation is incomplete, along with the misinterpretation of the results with other current methods like FTMap.

      We believe that validation is complete: The two most common techniques for testing and validating machine-learning methods are to split the dataset into either (1) a training set and a test set with a fixed ratio (e.g., 70% for training and 30% for testing) or (2) multiple subsets/folds; i.e., cross-validation. We did not employ a training set to train the model and a separate test set to evaluate its performance, as Reviewer 2 assumed. Instead, we employed cross-validation, as it helps reduce the variability in performance estimates compared to a single training/test split, and utilizes the entire dataset for training and testing, making efficient use of the limited data. Each fold was used once as a test set and the remaining folds as the training set - this process was repeated for each fold and the model's performance was measured using the F1 score. We had listed the mean validation F1 score in Table 1.

      We have clarified our comparison with FTMAP  - see reply to point 1 of reviewer 1 below. 

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper describes a program developed to identify PPI-hot spots using the free protein structure and compares it to FTMap and SPOTONE, two webservers that they consider as competitive approaches to the problem. On the positive side, I appreciate the effort in providing a new webserver that can be tested by the community but have two major concerns as follows.

      (1) The comparison to the FTMap program is wrong. The authors misinterpret the article they refer to, i.e., Zerbe et al. "Relationship between hot spot residues and ligand binding hot spots in protein-protein interfaces" J. Chem. Inf. Model. 52, 2236-2244, (2012). FTMap identifies hot spots that bind small molecular ligands. The Zerbe et al. article shows that such hot spots tend to interact with hot spot residues on the partner protein in a protein-protein complex (emphasis on "partner"). Thus, the hot spots identified by FTMap are not the hot spots defined by the authors. In fact, because the Zerbe paper considers the partner protein in a complex, the results cannot be compared to the results of Chen et al. This difference is missed by the authors, and hence the comparison of the FTMap is invalid. I did not investigate the comparison to SPOTONE, and hence have no opinion.

      Brenke et al. (Bioinformatics 2009 25: 621-627), who developed FTMAP, defined hot spots as regions of the binding surface that “contribute a disproportionate amount to the binding free energy”. Kozakov et al. (Proc. Natl. Acad. Sci. 2011:108, 13528-1353) used unbound protein structures as input to FTMap to predict binding hot spots for protein-protein interactions (PPIs), which are defined as regions (so-called consensus sites) on a protein surface that bind multiple probe clusters − the main hot spot is the largest consensus site binding the largest number of probe clusters. 

      Zerbe et al. (J. Chem. Inf. Model. 2012:52, 2236) noted that a consensus “site is expected to be important in any interaction that involves that region of the target independent of any partner protein.” They showed that hot spot residues found by Ala scanning not only overlapped with the probe ligands but also formed consensus sites, as shown in Figure 4. They stated that “A residue can also be identified as a hot spot by alanine scanning if it contributes to creating such a favorable binding environment by being among the residues forming a consensus site on the protein to which it belongs.”

      To clarify the comparison with FTmap in the revised version, we have added the following sentence in the Abstract on p. 3:

      “We explored the possibility of detecting PPI-hot spots using (i) FTMap in the PPI mode, which identifies hot spots on protein-protein interfaces from the free protein structure, and (ii) the interface residues predicted by AlphaFold-Multimer.”

      We have added the following sentences in the Introduction section on p. 4:

      “We explored the possibility of detecting PPI-hot spots using the FTMap server in the PPI mode, which identifies hot spots on protein-protein interfaces from free protein structures.45 These hot spots are identified by consensus sites − regions that bind multiple probe clusters.42,45,59 Such regions are deemed to be important for any interaction involving that region of the target, independent of partner protein.42 PPI-hot spots were identified as residues in van der Waals (vdW) contact with probe ligands within the largest consensus site containing the most probe clusters.”

      and in the Results section on p. 5:

      “Given the free protein structure, PPI-HotspotID and SPOTONE53 predict PPI-hot spots based on a probability threshold (> 0.5). FTMap, in the PPI mode, detects PPI-hot spots as consensus sites/regions on the protein surface that bind multiple probe clusters.59 Residues in vdW contact with probe molecules within the largest consensus site were compared with PPI-hotspotID/SPOTONE predictions.”

      (2) Chen et al. use a number of usual features in a variety of simple machine-learning methods to identify hot spot residues. This approach has been used in the literature for more than a decade. Although the authors say that they were able to find only FTMap and SPOTONE as servers, there are dozens of papers that describe such a methodology. Some examples are given here: (Higa and Tozzi, 2009; Keskin, et al., 2005; Lise, et al., 2011; Tuncbag, et al., 2009; Xia, et al., 2010). There are certainly more papers. Thus, while I consider the web server as a potentially useful contribution, the paper does not provide a fundamentally novel approach.

      Our paper introduces several novel elements in our approach: 

      (1) Most PPI-hot spot prediction methods employ PPI-hotspots where mutations decrease protein binding free energy by > 2 kcal/mol (J. Chem. Inf. Model. 2022, 62, 1052). In contrast, our method incorporates not only PPI-hot spots with such binding free energy changes, but also those whose mutations have been curated in UniProtKB to significantly impair/disrupt PPIs. Because our method employs the largest collection of experimentally determined PPI-hot spots, it could uncover elusive PPI-hot spots not within binding interfaces, as well as potential PPI-hot spots for other protein partners (see point 3 below). 

      (2) Whereas most machine-learning methods for PPI-hot spot prediction focus on features derived from (i) primary sequences or (ii) protein-protein complexes, we introduce novel features such as per-residue free energy contributions derived from unbound protein structures. We further revealed the importance of one of our novel features, namely, the gas-phase energy of the target protein relative to its unfolded state and provided the physical basis for its importance. For example, PPI-hot spots can enhance favorable enthalpic contributions to the binding free energy through hydrogen bonds or van der Waals contacts across the protein’s interface. This makes them energetically unstable in the absence of the protein’s binding partner and solvent; hence providing a rationale for the importance of the gas-phase energy of the target protein relative to its unfolded state.

      (3) As a result of these novel elements, our approach, PPI-HotspotID, could identify many true positives that were not detected by FTMap or SPOTONE (see Results and Figure 1). Previous methods generally predict residues that make multiple contacts across the protein-protein interface as PPI-hot spots. In contrast, PPI-HotspotID can detect not only PPI-hot spots that make multiple contacts across the protein-protein interface, but also those lacking direct contact with the partner protein (see Discussion).

      (4) Unlike most machine-learning methods which require feature customization, data preprocessing, and model optimization, our use of AutoGluon’s AutoTabular module automates data preprocessing, model selection, hyperparameter optimization, and model evaluation. This automation reduces the need for manual intervention.

      We have revised and added the following sentences on p. 9 in the Discussion section to highlight the novelty of our approach: 

      “Here, we have introduced two novel elements that have helped to identify PPI-hot spots using the unbound structure. First, we have constructed a dataset comprising 414 experimentally known PPI-hot spots and 504 nonhot spots, and carefully checked that PPI-hot spots have no mutations resulting in ΔΔGbind < 0.5 kcal/mol, whereas nonhot spots have no mutations resulting in ΔΔGbind ≥ 0.5 kcal/mol or impact binding in immunoprecipitation or GST pull-down assays (see Methods). In contrast, SPOTONE53 employed nonhot spots defined as residues that upon alanine mutation resulted in ΔΔGbind < 2.0 kcal/mol. Notably, previous PPI-hot spot prediction methods did not employ PPI-hot spots whose mutations have been curated to significantly impair/disrupt PPIs in UniProtKB (see Introduction). Second, we have introduced novel features derived from unbound protein structures such as the gas-phase energy of the target protein relative to its unfolded state.”

      Strengths:

      A new web server was developed for detecting protein-protein interaction hot spots.

      Weaknesses:

      The comparison to FTMap results is wrong. The method is not novel.

      See reply to points 1 and 2 above.

      Reviewer #2 (Public Review):

      Summary:

      The paper presents PPI-hotspot a method to predict PPI-hotspots. Overall, it could be useful but serious concerns about the validation and benchmarking of the methodology make it difficult to predict its reliability.

      Strengths:

      Develops an extended benchmark of hot-spots.

      Weaknesses:

      (1) Novelty seems to be just in the extended training set. Features and approaches have been used before.

      The novelty of our approach extends beyond just the expanded training set, as summarized in our reply to Reviewer #1, point 2 above. To our knowledge, previous studies did not leverage the gas-phase energy of the target protein relative to its unfolded state for detecting PPI-hot spots from unbound structures. Previous studies did not automate the training and validation process. In contrast, we used AutoGluon’s AutoTabular module to automate the training of (i) individual “base” models, including LightGBM, CatBoost, XGBoost, random forests, extremely randomized trees, neural networks, and K-nearest neighbours, then (ii) multiple “stacker” models. The predictions of multiple “stacker” models were fed as inputs to additional higher layer stacker models in an iterative process called multi-layer stacking. The output layer used ensemble selection to aggregate the predictions of the stacker models. To improve stacking performance, AutoGluon used all the data for both training and validation through repeated k-fold bagging of all models at all layers of the stack, where k is determined by best precision. This comprehensive approach, including repeated k-fold bagging of all models at all layers of the stack, sets our methodology apart from previous studies, including SPOTONE (see Methods).

      (2) As far as I can tell the training and testing sets are the same. If I am correct, it is a fatal flaw.

      The two most common techniques for testing and validating machine-learning methods are to split the dataset into either (1) a training set and a test set with a fixed ratio (e.g., 70% for training and 30% for testing) or (2) multiple subsets/folds; i.e., cross-validation. We did not employ a training set to train the model and a separate test set to evaluate its performance. Instead, we employed cross-validation, where the model was trained and evaluated multiple times. Each fold was used once as a test set and the remaining folds served as the training set; this process was repeated for each fold. For each test set, we assessed the model's performance using the F1 score. We had listed the mean validation F1 score in Table 1 in the original manuscript. Cross-validation helps reduce the variability in performance estimates compared to a single training/test split. It also utilizes the entire dataset for training and testing, making efficient use of the limited data. We have clarified this on p. 14 in the revised version:

      “AutoGluon was chosen for model training and validation due to its robustness and user-friendly interface, allowing for the simultaneous and automated exploration of various machine-learning approaches and their combinations. Instead of using a single training set to train the model and a separate test set to evaluate its performance, we employed cross-validation, as it utilizes the entire dataset for both training and testing, making efficient use of the limited data on PPI-hot spots and PPI-nonhot spots. AutoGluonTabular automatically chose a random partitioning of our dataset into multiple subsets/folds for training and validation. Notably, the training and validation data share insignificant homology, as the average pairwise sequence identity in our dataset is 26%. Each fold was used once as a test set, while the remaining folds served as the training set. For each test set, the model's performance was measured using the F1 score.”
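
      The fold-wise procedure described above can be sketched in a few lines. This is a minimal illustration only: it does not use AutoGluon, and a toy one-feature threshold classifier (a hypothetical stand-in) replaces the stacked ensemble. The helper names and the synthetic data are assumptions made for the example.

```python
import random
from statistics import mean

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # convention used here when no true positives are found
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition sample indices into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_threshold(xs, ys):
    """Toy model: midpoint between the class means of a single feature."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (mean(pos) + mean(neg)) / 2

def cross_validated_f1(xs, ys, k=5, seed=0):
    """Use each fold once as the test set; train on the rest; average the F1 scores."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        threshold = fit_threshold([xs[i] for i in train_idx],
                                  [ys[i] for i in train_idx])
        preds = [1 if xs[i] > threshold else 0 for i in test_idx]
        scores.append(f1_score([ys[i] for i in test_idx], preds))
    return mean(scores)

if __name__ == "__main__":
    rng = random.Random(1)
    ys = [1] * 40 + [0] * 60
    xs = [rng.gauss(1.0 if y else 0.0, 0.7) for y in ys]  # synthetic feature
    print(f"mean validation F1: {cross_validated_f1(xs, ys):.2f}")
```

      Every sample is scored exactly once while held out, so the mean F1 reflects performance on data not seen during that fold's training, which is the point at issue in the reviewer's comment.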

      (3) Comparisons should state that: SPOTONE is a sequence (only) based ML method that uses similar features but is trained on a smaller dataset. FTmap I think predicts binding sites, I don't understand how it can be compared with hot spots. Suggesting superiority by comparing with these methods is an overreach.

      In the Introduction on page 3, we had already stated that:

      “SPOTONE53 predicts PPI-hot spots from the protein sequence using residue-specific features such as atom type, amino acid (aa) properties, secondary structure propensity, and mass-associated values to train an ensemble of extremely randomized trees. The PPI-hot spot prediction methods have mostly been trained, validated, and tested on data from the Alanine Scanning Energetics database (ASEdb)55 and/or the Structural Kinetic and Energetic database of Mutant Protein Interactions (SKEMPI) 2.0 database.56”

      On p. 4, we have clarified how we used FTMAP to detect hot spots - see reply to Reviewer #1, point 1. 

      “We explored the possibility of detecting PPI-hot spots using the FTMap server in the PPI mode, which identifies hot spots on protein-protein interfaces from free protein structures.45 These hot spots are identified by consensus sites − regions that bind multiple probe clusters.42,45,59 Such regions are deemed to be important for any interaction involving that region of the target, independent of partner protein.42 PPI-hot spots were identified as residues in van der Waals (vdW) contact with probe ligands within the largest consensus site containing the most probe clusters.”

      (4) Training in the same dataset as SPOTONE, and then comparing results in targets without structure could be valuable.

      We think that the dataset used by SPOTONE is not as “clean” as ours, since SPOTONE employed nonhot spots defined as aa residues that upon alanine mutation resulted in ΔΔGbind < 2.0 kcal/mol. In contrast, we define nonhot spots as residues whose mutations resulted in ΔΔGbind changes < 0.5 kcal/mol. Moreover, we carefully checked that the nonhot spots have no mutations resulting in ΔΔGbind changes ≥ 0.5 kcal/mol or impact binding in immunoprecipitation or GST pull-down assays (see Methods). We cannot compare results in targets without structure because we require the free protein structure to compute the per-residue free energy contributions.
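
      The practical difference between the two nonhot-spot cutoffs can be shown with a small labelling sketch. The function, its arguments and the example ΔΔG values below are hypothetical and are not taken from either dataset; only the 0.5 and 2.0 kcal/mol cutoffs come from the text above.

```python
STRICT_CUTOFF = 0.5  # kcal/mol, nonhot-spot cutoff stated in this response
LOOSE_CUTOFF = 2.0   # kcal/mol, cutoff attributed to SPOTONE in this response

def is_nonhot_spot(ddg_values, cutoff, disrupts_binding_in_assays=False):
    """Label a residue as a nonhot spot.

    A residue qualifies only if every measured mutation has a binding
    free-energy change below `cutoff` and no assay (e.g. pull-down)
    reports impaired binding. Residues with no measurements are not labelled.
    """
    if not ddg_values or disrupts_binding_in_assays:
        return False
    return all(ddg < cutoff for ddg in ddg_values)

if __name__ == "__main__":
    # Hypothetical residues mapped to ΔΔG_bind values (kcal/mol) of their mutations.
    residues = {"A": [0.1, 0.3], "B": [1.2], "C": [0.2, 0.8]}
    for name, ddgs in residues.items():
        print(name,
              "strict:", is_nonhot_spot(ddgs, STRICT_CUTOFF),
              "loose:", is_nonhot_spot(ddgs, LOOSE_CUTOFF))
```

      In this toy example, residues B and C count as nonhot spots under the 2.0 kcal/mol cutoff but are excluded under the 0.5 kcal/mol cutoff, which is the sense in which the stricter definition yields a cleaner negative set.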

      (5) The paper presents as validation of the prediction and experimental validation of hotspots in human eEF2. Several predictions were made but only one was confirmed, what was the overall success rate of this exercise?

      We did not test all predicted PPI-hot spots but only the PPI-hot spot with the highest probability of 0.67 (F794) and 7 other predicted PPI-hot spots that were > 12 Å from F794 as well as 4 predicted PPI-nonhot spots. Among the 13 predictions tested, F794 and the 4 predicted nonhot spots were confirmed to be correct. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Remove the comparison to FTMap, and find a more appropriate reference method, even if it requires installing programs rather than using the available web servers.

      We have clarified comparison to FTMap in the revised ms - see our reply above.

    1. Author response:

      eLife assessment

      This useful study examines the neural activity in the motor cortex as a monkey reaches to intercept moving targets, focusing on how tuned single neurons contribute to an interesting overall population geometry. The presented results and analyses are solid, though the investigation of this novel task could be strengthened by clarifying the assumptions behind the single neuron analyses, and further analyses of the neural population activity and its relation to different features of behaviour.

      Thanks for recognizing the content of our research, and please stay tuned for our follow-up studies on neural dynamics during interception.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study addresses the question of how task-relevant sensory information affects activity in the motor cortex. The authors use various approaches to address this question, looking at single units and population activity. They find that there are three subtypes of modulation by sensory information at the single unit level. Population analyses reveal that sensory information affects the neural activity orthogonally to motor output. The authors then compare both single unit and population activity to computational models to investigate how encoding of sensory information at the single unit level is coordinated in a network. They find that an RNN that displays similar orbital dynamics and sensory modulation to the motor cortex also contains nodes that are modulated similarly to the three subtypes identified by the single unit analysis.

      Strengths:

      The strengths of this study lie in the population analyses and the approach of comparing single-unit encoding to population dynamics. In particular, the analysis in Figure 3 is very elegant and informative about the effect of sensory information on motor cortical activity. The task is also well designed to suit the questions being asked and well controlled.

      We appreciate these kind comments.

      It is commendable that the authors compare single units to population modulation. The addition of the RNN model and perturbations strengthen the conclusion that the subtypes of individual units all contribute to the population dynamics. However, the subtypes (PD shift, gain, and addition) are not sufficiently justified. The authors also do not address that single units exhibit mixed modulation, but RNN units are not treated as such.

      We’re sorry for not providing sufficient grounds to introduce the subtypes. We chose PD shift, gain, and addition as the pertinent subtypes based on the classical cosine tuning model (Georgopoulos et al., 1982) and with reference to gain modulation studies (e.g. Pesaran et al., 2010; Bremner and Andersen, 2012). Here, we applied this subtype analysis as a criterion for identifying modulation in the neuronal population rather than for sorting neurons into distinct cell types. We will update the Methods in the revised version of the manuscript.
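      As an illustration only (not the authors' analysis code), the three subtypes can be written as modifications of a cosine tuning curve. The function name, parameter names, and numerical values below are assumptions chosen for the sketch: gain scales the tuning depth, PD shift rotates the preferred direction, and addition offsets the baseline.

```python
import math

def tuned_rate(theta, baseline=10.0, depth=5.0, pd=0.0,
               gain=1.0, pd_shift=0.0, offset=0.0):
    """Cosine-tuned firing rate with three hypothetical modulation terms.

    theta    : reach direction in radians
    gain     : multiplicative scaling of tuning depth ("gain" subtype)
    pd_shift : rotation of the preferred direction ("PD shift" subtype)
    offset   : additive change in baseline rate ("addition" subtype)
    """
    return offset + baseline + gain * depth * math.cos(theta - (pd + pd_shift))

# Eight reach directions, 45 degrees apart.
directions = [k * math.pi / 4 for k in range(8)]

static = [tuned_rate(t) for t in directions]                       # no modulation
gain_mod = [tuned_rate(t, gain=1.5) for t in directions]           # deeper tuning
shifted = [tuned_rate(t, pd_shift=math.pi / 4) for t in directions]  # peak moves
added = [tuned_rate(t, offset=3.0) for t in directions]            # whole curve up
```

      In real units the three effects can coexist, which is the mixed modulation discussed in this exchange; the sketch separates them only to make each term explicit.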

      Weaknesses:

      The main weaknesses of the study lie in the categorization of the single units into PD shift, gain, and addition types. The single units exhibit clear mixed selectivity, as the authors highlight. Therefore, the subsequent analyses looking only at the individual classes in the RNN are a little limited. Another weakness of the paper is that the choice of windows for analyses is not properly justified and the dependence of the results on the time windows chosen for single-unit analyses is not assessed. This is particularly pertinent because tuning curves are known to rotate during movements (Sergio et al. 2005 Journal of Neurophysiology).

      Mixed selectivity, or more precisely mixed modulation, is indeed a significant feature of the neuronal population in the present study. The purpose of the subtype analysis was to serve as a criterion for the potential modulation mechanisms. However, the results appear to be a spectrum rather than clusters. The analysis still offers some insight into the distribution of modulation, and we will refine the description in the next version. In the current version, we examined single-unit tuning and population neural state with sliding windows, focusing on the period around movement onset (MO) because of the emergence of a ring-like structure. We will clarify the choice of windows and assess the dependence of the results on them in the next version. It’s a great suggestion to consider the role of rotating tuning curves in neural dynamics during interception.

      This paper shows sensory information can affect motor cortical activity whilst not affecting motor output. However, it is not the first to do so and fails to cite other papers that have investigated sensory modulation of the motor cortex (Stavisky et al. 2017 Neuron, Pruszynski et al. 2011 Nature, Omrani et al. 2016 eLife). These studies should be mentioned in the Introduction to capture better the context around the present study. It would also be beneficial to add a discussion of how the results compare to the findings from these other works.

      Thanks for the reminder. We will introduce the relevant research in the next version of the manuscript.

      This study also uses insights from single-unit analysis to inform mechanistic models of these population dynamics, which is a powerful approach, but is dependent on the validity of the single-cell analysis, which I have expanded on below.

      I have clarified some of the areas that would benefit from further analysis below:

      (1) Task:

      The task is well designed, although it would have benefited from perhaps one more target speed (for each direction). One monkey appears to have experienced one more target speed than the others (seen in Figure 3C). It would have been nice to have this data for all monkeys.

      Great suggestion! However, it’s hard to implement as the implanted arrays have been removed.

      (2) Single unit analyses:

      In some analyses, the effects of target speed look more driven by target movement direction (e.g. Figures 1D and E). To confirm target speed is the main modulator, it would be good to compare how much more variance is explained by models including speed rather than just direction. More target speeds may have been helpful here too.

      Nice suggestion! The goodness of fit of the simple model (motor direction only) is much lower than that of the complex model (including target speed). We will update the results in the next version.
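      A minimal sketch of the kind of nested-model comparison discussed here, on synthetic data. The design matrices, speed levels (arbitrary units), and coefficients are all assumptions for illustration and are not taken from the manuscript. Because the models are nested, in-sample variance explained can only go up when speed terms are added, so a cross-validated comparison would be the fairer test on real data.

```python
import numpy as np

def r_squared(X, y):
    """Fraction of variance in y explained by a least-squares fit on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
n = 400
theta = rng.uniform(0.0, 2.0 * np.pi, n)                       # reach direction (rad)
speed = rng.choice(np.array([-2.0, -1.0, 0.0, 1.0, 2.0]), size=n)  # hypothetical speed levels

# Synthetic unit: cosine tuning whose gain and baseline both depend on speed.
rate = (10.0 + 5.0 * (1.0 + 0.2 * speed) * np.cos(theta)
        + 1.0 * speed + rng.normal(0.0, 1.0, n))

ones = np.ones(n)
# Direction-only model: intercept + cosine tuning.
X_dir = np.column_stack([ones, np.cos(theta), np.sin(theta)])
# Full model: adds speed and speed-by-direction interaction terms.
X_full = np.column_stack([X_dir, speed,
                          speed * np.cos(theta), speed * np.sin(theta)])

r2_dir = r_squared(X_dir, rate)
r2_full = r_squared(X_full, rate)
print(f"direction only: {r2_dir:.2f}, direction + speed: {r2_full:.2f}")
```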

      The choice of the three categories (PD shift, gain, addition) is not completely justified in a satisfactory way. It would be nice to see whether these three main categories are confirmed by unsupervised methods.

      A good point. We will try unsupervised methods.

      The decoder analyses in Figure 2 provide evidence that target speed modulation may change over the trial. Therefore, it is important to see how the window considered for the firing rate in Figure 1 (currently 100ms pre - 100ms post movement onset) affects the results.

      Thanks for the suggestion and close reading. We will test the decoder in other epochs.

      (3) Decoder:

      One feature of the task is that the reach endpoints tile the entire perimeter of the target circle (Figure 1B). However, this feature is not exploited for much of the single-unit analyses. This is most notable in Figure 2, where the use of a SVM limits the decoding to discrete values (the endpoints are divided into 8 categories). Using continuous decoding of hand kinematics would be more appropriate for this task.

      This is a very reasonable suggestion. In this study, we discretized the reach direction as in previous studies (Li et al., 2018, 2022) and considered discrete decoding sufficient to show the interaction of sensory and motor variables. In future studies, we will try continuous decoding of hand kinematics.

      (4) RNN:

      Mixed selectivity is not analysed in the RNN, which would help to compare the model to the real data where mixed selectivity is common. Furthermore, it would be informative to compare the neural data to the RNN activity using canonical correlation or Procrustes analyses. These would help validate the claim of similarity between RNN and neural dynamics, rather than allowing comparisons to be dominated by geometric similarities that may be features of the task. There is also an absence of alternate models to compare the perturbation model results to.

      Thank you for these helpful suggestions. We will perform decoding analysis on RNN units to verify if there is interaction of sensory and motor variables as in real data, as well as the canonical correlation or Procrustes analysis.
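      For concreteness, a Procrustes comparison of two sets of condition-averaged population states could look like the sketch below. This is an assumption-laden illustration, not the authors' pipeline: the "ring" data are synthetic, and the function returns the residual after the best translation, scaling, and orthogonal transformation (0 means identical geometry).

```python
import numpy as np

def procrustes_disparity(A, B):
    """Residual mismatch between two point sets of identical shape after
    optimal translation, uniform scaling, and rotation/reflection."""
    A0 = A - A.mean(axis=0)
    B0 = B - B.mean(axis=0)
    A0 = A0 / np.linalg.norm(A0)   # unit Frobenius norm
    B0 = B0 / np.linalg.norm(B0)
    s = np.linalg.svd(A0.T @ B0, compute_uv=False)
    return 1.0 - s.sum() ** 2

rng = np.random.default_rng(0)
angles = np.arange(8) * np.pi / 4
# Hypothetical tilted ring: 8 reach directions embedded in 3 dimensions.
ring = np.column_stack([np.cos(angles), np.sin(angles), 0.5 * np.cos(angles)])

# A rotated, rescaled, shifted copy should match the ring almost exactly.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
same_geometry = 2.0 * ring @ Q + 1.0
# An unrelated point cloud serves as a control.
unrelated = rng.normal(size=ring.shape)

d_same = procrustes_disparity(ring, same_geometry)
d_ctrl = procrustes_disparity(ring, unrelated)
```

      A control of this kind speaks to the reviewer's concern: a low disparity is only informative if it is clearly lower than what task geometry alone would produce.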

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Zhang et al. examine neural activity in the motor cortex as monkeys make reaches in a novel target interception task. Zhang et al. begin by examining the single neuron tuning properties across different moving target conditions, finding several classes of neurons: those that shift their preferred direction, those that change their modulation gain, and those that shift their baseline firing rates. The authors go on to find an interesting, tilted ring structure of the neural population activity, depending on the target speed, and find that (1) the reach direction has consistent positioning around the ring, and (2) the tilt of the ring is highly predictive of the target movement speed. The authors then model the neural activity with a single neuron representational model and a recurrent neural network model, concluding that this population structure requires a mixture of the three types of single neurons described at the beginning of the manuscript.

      Strengths:

      I find the task the authors present here to be novel and exciting. It slots nicely into an overall trend to break away from a simple reach-to-static-target task to better characterize the breadth of how the motor cortex generates movements. I also appreciate the movement from single neuron characterization to population activity exploration, which generally serves to anchor the results and make them concrete. Further, the orbital ring structure of population activity is fascinating, and the modeling work at the end serves as a useful baseline control to see how it might arise.

      Thank you for recognizing our work.

      Weaknesses:

      While I find the behavioral task presented here to be excitingly novel, I find the presented analyses and results to be far less interesting than they could be. Key to this, I think, is that the authors are examining this task and related neural activity primarily with a single-neuron representational lens. This would be fine as an initial analysis since the population activity is of course composed of individual neurons, but the field seems to have largely moved towards a more abstract "computation through dynamics" framework that has, in the last several years, provided much more understanding of motor control than the representational framework has. As the manuscript stands now, I'm not entirely sure what interpretation to take away from the representational conclusions the authors made (i.e. the fact that the orbital population geometry arises from a mixture of different tuning types). As such, by the end of the manuscript, I'm not sure I understand any better how the motor cortex or its neural geometry might be contributing to the execution of this novel task.

      The present study shows sensory modulation of motor tuning in single units and in the neural state during the motor execution period. It’s a pity that the findings were restricted to certain time windows. We are still working on this topic and hope to address related questions in our follow-up studies.

      Main Comments:

      My main suggestions to the authors revolve around bringing in the computation-through-dynamics framework to strengthen their population results. The authors cite the Vyas et al. review paper on the subject, so I believe they are aware of this framework. I have three suggestions for improving or adding to the population results:

      (1) Examination of delay period activity: one of the most interesting aspects of the task was the fact that the monkey had a random-length delay period before he could move to intercept the target. Presumably, the monkey had to prepare to intercept at any time between 400 and 800 ms, which means that there may be some interesting preparatory activity dynamics during this period. For example, after 400ms, does the preparatory activity rotate with the target such that once the go cue happens, the correct interception can be executed? There is some analysis of the delay period population activity in the supplement, but it doesn't quite get at the question of how the interception movement is prepared. This is perhaps the most interesting question that can be asked with this experiment, and it's one that I think may be quite novel for the field--it is a shame that it isn't discussed.

      Great idea! We are working on this and are close to completing the puzzle.

      (2) Supervised examination of population structure via potent and null spaces: simply examining the first three principal components revealed an orbital structure, with a seemingly conserved motor output space and a dimension orthogonal to it that relates to the visual input. However, the authors don't push this insight any further. One way to do that would be to find the "potent space" of motor cortical activity by regression to the arm movement and examine how the tilted rings look in that space (this is actually fairly easy to see in the reach direction components of the dPCA plot in the supplement--the rings will be highly aligned in this space). Presumably, then, the null space should contain information about the target movement. dPCA shows that there's not a single dimension that clearly delineates target speed, but the ring tilt is likely evident if the authors look at the highest variance neural dimension orthogonal to the potent space (the "null space")--this is akin to PC3 in the current figures, but it would be nice to see what comes out when you look in the data for it.

      Nice suggestion. Target-speed modulation mainly influences PC3, which is consistent with the ‘null space’ hypothesis. We will try other methods of dimensionality reduction (e.g. dPCA, Manopt) to determine the potent and null spaces.
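      One simple way to estimate potent and null spaces, sketched here under strong assumptions, is to regress hand velocity on neural activity and take the span of the readout weights as the potent space and its orthogonal complement as the null space. Everything in the example is synthetic: the "neurons" are linear mixtures of a two-dimensional motor signal and a one-dimensional sensory signal, and all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 500, 20                                   # time samples, neurons
motor = rng.normal(size=(T, 2))                  # stand-in for hand velocity (x, y)
sensory = rng.normal(size=(T, 1))                # stand-in for target speed

# Mix the three latent signals into N neurons along orthonormal directions.
Q, _ = np.linalg.qr(rng.normal(size=(N, N)))
loadings = Q[:3]                                 # 3 x N, orthonormal rows
X = np.hstack([motor, sensory]) @ loadings + 0.1 * rng.normal(size=(T, N))
X = X - X.mean(axis=0)
V = motor - motor.mean(axis=0)

# Linear readout from neural activity to velocity (N x 2).
W, *_ = np.linalg.lstsq(X, V, rcond=None)

# Potent space: orthonormal basis for the columns of W. Null space: the rest.
P, _ = np.linalg.qr(W)                           # N x 2
null_projector = np.eye(N) - P @ P.T
X_null = X @ null_projector                      # activity invisible to the readout

# Highest-variance null dimension, via SVD of the null-projected activity.
_, _, Vt = np.linalg.svd(X_null, full_matrices=False)
top_null = X_null @ Vt[0]
corr = np.corrcoef(top_null, sensory[:, 0])[0, 1]
```

      In this toy case the leading null dimension tracks the sensory signal, which is the pattern the reviewer suggests looking for; whether real motor cortical data behave this way is the empirical question.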

      (3) RNN perturbations: as it's currently written, the RNN modeling has promise, but the perturbations performed don't provide me with much insight. I think this is because the authors are trying to use the RNN to interpret the single neuron tuning, but it's unclear to me what was learned from perturbing the connectivity between what seems to me almost arbitrary groups of neurons (especially considering that 43% of nodes were unclassifiable). It seems to me that a better perturbation might be to move the neural state before the movement onset to see how it changes the output. For example, the authors could move the neural state from one tilted ring to another to see if the virtual hand then reaches a completely different (yet predictable) target. Moreover, if the authors can more clearly characterize the preparatory movement, perhaps perturbations in the delay period would provide even more insight into how the interception might be prepared.

      We are sorry that we didn’t clarify the definition of the “none” type, which can be misleading. The 43% of unclassified nodes includes inactive ones; when only active (task-related) nodes are included, the proportion of unclassified nodes would be much lower. By perturbing the connectivity, we intended to explore the interaction between different modulations.

      Thank you for the great advice. We tried moving neural states from one ring to another without changing the directional cluster, but this perturbation didn’t have a significant influence on network performance as expected. We will check this result again and try perturbations in the delay period.

      Reviewer #3 (Public Review):

      Summary:

      This experimental study investigates the influence of sensory information on neural population activity in M1 during a delayed reaching task. In the experiment, monkeys are trained to perform a delayed interception reach task, in which the goal is to intercept a potentially moving target.

      This paradigm allows the authors to investigate how, given a fixed reach endpoint (which is assumed to correspond to a fixed motor output), the sensory information regarding the target motion is encoded in neural activity.

      At the level of single neurons, the authors found that target motion modulates the activity in three main ways: gain modulation (scaling of the neural activity depending on the target direction), shift (shift of the preferred direction of neurons tuned to reach direction), or addition (offset to the neural activity).

      At the level of the neural population, target motion information was largely encoded along the 3rd PC of the neural activity, leading to a tilt of the manifold along which reach direction was encoded that was proportional to the target speed. The tilt of the neural manifold was found to be largely driven by the variation of activity of the population of gain-modulated neurons.

      Finally, the authors studied the behaviour of an RNN trained to generate the correct hand velocity given the sensory input and reach direction. The RNN units were found to similarly exhibit mixed selectivity to the sensory information, and the geometry of the « neural population » resembled that observed in the monkeys.

      Strengths:

      - The experiment is well set up to address the question of how sensory information that is directly relevant to the behaviour but does not lead to a direct change in behavioural output modulates motor cortical activity.

      - The finding that sensory information modulates the neural activity in M1 during motor preparation and execution is nontrivial, given that this modulation of the activity must occur in the nullspace of the movement.

      - The paper gives a complete picture of the effect of the target motion on neural activity, by including analyses at the single neuron level as well as at the population level. Additionally, the authors link those two levels of representation by highlighting how gain modulation contributes to shaping the population representation.

      Thanks for your recognition.

      Weaknesses:

      - One of the main premises of the paper is the fact that the motor output for a given reach point is preserved across different target motions. However, as the authors briefly mention in the conclusion, they did not record muscle activity during the task, but only hand velocity, making it impossible to directly verify how preserved muscle patterns were across movements. While the authors highlight that they did not see any difference in their results when resampling the data to control for similar hand velocities across conditions, this seems like an important potential caveat of the paper whose implications should be discussed further or highlighted earlier in the paper.

      Thanks for the suggestion. We will highlight the resampling results as an important control in the next version of the manuscript.

      - The main takeaway of the RNN analysis is not fully clear. The authors find that an RNN trained given a sensory input representing a moving target displays modulation to target motion that resembles what is seen in real data. This is interesting, but the authors do not dissect why this representation arises, and how robust it is to various task design choices. For instance, it appears that the network should be able to solve the task using only the motion intention input, which contains the reach endpoint information. If the target motion input is not used for the task, it is not obvious why the RNN units would be modulated by this input (especially as this modulation must lie in the nullspace of the movement hand velocity if the velocity depends only on the reach endpoint). It would thus be important to see alternative models compared to true neural activity, in addition to the model currently included in the paper. Besides, for the model in the paper, it would therefore be interesting to study further how the details of the network setup (eg initial spectral radius of the connectivity, weight regularization, or using only the target position input) affect the modulation by the motion input, as well as the trained population geometry and the relative ratios of modulated cells after training.

      Great suggestions. Regrettably, we did not dissect why this representation forms or which factors influence it in the current version. We previously tried several combinations of inputs: in the network that received only motor intention and GO inputs, there were rings but no tilt related to target speed; in the network that received only target location and GO inputs, there were ring-like structures but no clear directional clusters. We will check these results and try alternative models in the next version. In future studies, we will examine the influence of network setup details.

      - Additionally, it is unclear what insights are gained from the perturbations to the network connectivity the authors perform, as it is generally expected that modulating the connectivity will degrade task performance and the geometry of the responses. If the authors wish to make claims about the role of the subpopulations, it could be interesting to test whether similar connectivity patterns develop in networks that are not initialized with an all-to-all random connectivity or to use ablation experiments to investigate whether the presence of multiple types of modulations confers any sort of robustness to the network.

      Thank you for the great suggestions. With the perturbations, we intended to explore the contribution of interactions between certain subpopulations. We tried ablation experiments, but the result was not significant. This is probably because most units showed mixed selectivity: units with only a single type of modulation were too few for bootstrapping, or the random samples drawn from a single subpopulation (bearing mixed selectivity) could be repeated. We will consider these suggestions carefully in the revised version.

      - The results suggest that the observed changes in motor cortical activity with target velocity result from M1 activity receiving an input that encodes the velocity information. This also appears to be the assumption in the RNN model. However, even though the input shown to the animal during preparation is indeed a continuously moving target, it appears that the only relevant quantity to the actual movement is the final endpoint of the reach. While this would have to be a function of the target velocity, one could imagine that the computation of where the monkeys should reach might be performed upstream of the motor cortex, in which case the actual target velocity would become irrelevant to the final motor output. This makes the results of the paper very interesting, but it would be nice if the authors could discuss further when one might expect to see modulation by sensory information that does not directly affect motor output in M1, and where those inputs may come from. It may also be interesting to discuss how the findings relate to previous work that has found behaviourally irrelevant information is being filtered out from M1 (for instance, Russo et al, Neuron 2020 found that in monkeys performing a cycling task, context can be decoded from SMA but not from M1, and Wang et al, Nature Communications 2019 found that perceptual information could not be decoded from PMd)?

      How and where sensory information modulates M1 are very interesting and open questions. We will discuss this topic further in the next version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Semenova et al. have studied a large cross-sectional cohort of people living with HIV on suppressive ART, N=115, and performed high dimensional flow cytometry to then search for associations between immunological and clinical parameters and intact/total HIV DNA levels.

      A number of interesting data science/ML approaches were explored on the data and the project seems a serious undertaking. However, like many other studies that have looked for these kinds of associations, there was not a very strong signal. Of course, the goal of unsupervised learning is to find new hypotheses that aren't obvious to human eyes, but I felt in that context, there were (1) results slightly oversold, (2) some questions about methodology in terms mostly of reservoir levels, and (3) results were not sufficiently translated back into meaning in terms of clinical outcomes.

      We appreciate the reviewer’s perspective. In our revised version of the manuscript, we have attempted to address these concerns by more adequately explaining the limitations of the study and by more thoroughly discussing the context of the findings. We are not able to associate the findings with specific clinical outcomes for individual study participants, but we speculate about the overall biological meaning of these associations across the cohort. We cannot disagree with the reviewer, but we find the associations statistically significant, potentially reflecting real biological associations, and forming the basis for future hypothesis-testing research.

      Strengths:

      The study is evidently a large and impressive undertaking and combines many cutting-edge statistical techniques with a comprehensive experimental cohort of people living with HIV, notably inclusive of populations underrepresented in HIV science. A number of intriguing hypotheses are put forward that could be explored further. Sharing the data could create a useful repository for more specific analyses.

      We thank the reviewer for this assessment.

      Weaknesses:

      Despite the detailed experiments and methods, there was not a very strong signal for the variable(s) predicting HIV reservoir size. The Spearman coefficients are ~0.3, (somewhat weak, and acknowledged as such) and predictive models reach 70-80% prediction levels, though sometimes categorical variables are challenging to interpret.

      We agree with the reviewer that individual parameters are only weakly correlated with the HIV reservoir, likely reflecting the complex and multi-factorial nature of reservoir/immune cell interactions.  Nevertheless, these associations are statistically significant and form the basis for functional testing in viral persistence.

      There are some questions about methodology, as well as some conclusions that are not completely supported by results, or at minimum not sufficiently contextualized in terms of clinical significance. On associations: the false discovery rate correction was set at 5%, but data appear underdetermined with fewer observations than variables (144 variables > 115 participants), and it isn't always clear if/when variables are related (e.g., inverses of one another, for instance, %CD4 and %CD8).

      When deriving a list of cell populations whose frequency would be correlated with the reservoir, we focused on well-defined cell types for which functional validation exists in the literature to consider them as distinct cell types.  For many of the populations, gating based on combinations of multiple markers leads to recovery of very few cells, and so we excluded some potential combinations from the analysis.  We are also making our raw data available for others to examine and find associations not considered by our manuscript.
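      As background for the 5% false discovery rate threshold raised in this comment, the sketch below shows the Benjamini-Hochberg step-up procedure. Its use here is an assumption: the exchange does not state which FDR method the study applied, and the p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg procedure at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected

# Hypothetical p-values from ten correlation tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(pvals, alpha=0.05)
```

      Note that the procedure's standard guarantee assumes independent or positively dependent tests, so strongly related variables (such as complementary cell percentages) are worth flagging, as the reviewer does.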

      The modeling of reservoir size was unusual; typically, intact and defective HIV DNA are analyzed on a log10 scale (both for decays and predicting rebound). Also, sometimes in this analysis levels are normalized (presumably to max/min?, e.g. S5), and given the large within-host variation of levels we see in other works, it is not trivial to predict any downstream impact of normalization across population vs within-person.

      We have repeated the analysis using log10 transformed data and the new figures are shown in Figure 1 and S2-S5.

      Also, the qualitative characterization of low/high reservoir is not standard and naturally will split by early/later ART if done as above/below median. Given the continuous nature of these data, it seems throughout that predicting above/below median is a little hard to translate into clinical meaning.

      Our ML models included time before ART as a variable in the analysis, and this was not found to be a significant driver of the reservoir size associations, except for the percentage of intact proviruses (see Figure 2C). Furthermore, we analyzed whether any of the reservoir correlated immune variables were associated with time on ART and found that, although some immune variables are associated with time on therapy, this was not the case for most of them (Table S4). We agree that it is challenging to translate above or below median into clinical meaning for this cohort, but we emphasize that this study is primarily a hypothesis generating approach requiring additional validation for the associations observed.  We attempted to predict reservoir size as a continuous variable using the data and this approach was not successful (Figure S13). We believe that a significantly larger cohort will likely be required to generate a ML model that can accurately predict the reservoir as a continuous variable.  We have added additional discussion of this to the manuscript.

      Lastly, the work is comprehensive and appears solid, but the code was not shared to see how calculations were performed.

      We now provide a link to the code used to perform the analyses in the manuscript, https://github.com/lesiasemenova/ML_HIV_reservoir.

      Reviewer #2 (Public Review):

      Summary:

      Semenova et al. performed a cross-sectional analysis of host immunophenotypes (using flow cytometry) and the peripheral CD4+ T cell HIV reservoir size (using the Intact Proviral DNA Assay, IPDA) from 115 people with HIV (PWH) on ART. The study mostly highlights the machine learning methods applied to these host and viral reservoir datasets but fails to interpret these complex analyses into (clinically, biologically) interpretable findings. For these reasons, the direct translational take-home message from this work is lost amidst a large list of findings (shown as clusters of associated markers) and sentences such as "this study highlights the utility of machine learning approaches to identify otherwise imperceptible global patterns" - lead to overinterpretation of their data.

      We have addressed the reviewer’s concern by modifications to the manuscript that enhance the interpretation of the findings in a clinical and biological context.

      Strengths:

      Measurement of host immunophenotyping measures (multiparameter flow cytometry) and peripheral HIV reservoir size (IPDA) from 115 PWH on ART.

      Major Weaknesses:

      (1) Overall, there is little to no interpretability of their machine learning analyses; findings appear as a "laundry list" of parameters with no interpretation of the estimated effect size and directionality of the observed associations. For example, Figure 2 might actually give an interpretation such as: for each X increase in an immunophenotyping parameter, we saw a Y increase/decrease in the HIV reservoir measure.

      We have added additional text to the manuscript in which we attempt to provide more immunological and clinical interpretation of the associations.  We also have emphasized that these associations are still speculative and will require additional validation.  Nevertheless, our data should provide a rich source of new hypotheses regarding immune system/reservoir interaction that could be tested in future work.

      (2) The correlations all appear to be relatively weak, with most Spearman R in the 0.30 range or so.

      We agree with the reviewer that the associations are mostly weak, consistent with previous studies in this area. This is likely an inherent feature of the underlying biology: the reservoir is likely associated with the immune system in complex ways and involves stochastic processes that will limit the predictability of reservoir size using any single immune parameter. We have added additional text to the manuscript to make this point clearer.

      (3) The Discussion needs further work to help guide the reader. The sentence: "The correlative results from this present study corroborate many of these studies, and provide additional insights" is broad. The authors should spend some time here to clearly describe the prior literature (e.g., describe the strength and direction of the association observed in prior work linking PD-1 and HIV reservoir size, as well as specify which type of HIV reservoir measures were analyzed in these earlier studies, etc.) and how the current findings add to or are in contrast to those prior findings.

      We have added additional text to the manuscript to help guide the readers through the possible biological significance of the findings and the context with respect to prior literature.

      (4) The most interesting finding is buried on page 12 in the Discussion: "Uniquely, however, CD127 expression on CD4 T cells was significantly inversely associated with intact reservoir frequency." The authors should highlight this in the abstract, and title, and move this up in the Discussion. The paper describes a very high dimensional analysis and the key takeaways are not clear; the more the author can point the reader to the take-home points, the better their findings can have translatability to future follow-up mechanistic and/or validation studies.

      We appreciate the reviewer’s comment.  We have increased the emphasis on this finding in the revised version of the manuscript.

      (5) The authors should avoid overinterpretation of these results. For example in the Discussion on page 13 "The existence of two distinct clusters of PWH with different immune features and reservoir characteristics could have important implications for HIV cure strategies - these two groups may respond differently to a given approach, and cluster membership may need to be considered to optimize a given strategy." It is highly unlikely that future studies will be performing the breadth of parameters resulting here and then use these directly for optimizing therapy.

      Our analyses indicate that membership of study participants in cluster 1 or cluster 2 can be fairly accurately determined by a small number of individual parameters (KLRG1 etc, Figure 4F), and measuring the cells of PWH with the degree of breadth used in this paper would not be necessary to classify PWH into these clusters.  As such, we feel that it is not unrealistic to speculate that this finding could turn out to be clinically useful, if it becomes clear that the clusters are biologically meaningful.

      (6) There are only TWO limitations listed here: cross-sectional study design and the use of peripheral blood samples. (The subsequent paragraph notes an additional weakness which is misclassification of intact sequences by IPDA). This is a very limited discussion and highlights the need to more critically evaluate their study for potential weaknesses.

      We have expanded on the list of limitations discussed in the manuscript. In particular, we now address the size of the cohort, the composition with respect to different genders and demographics, lack of information for the timing of ART and the lack of information regarding intracellular transcriptional pathways.

      (7) A major clinical predictor of HIV reservoir size and decay is the timing of ART initiation. The authors should include these (as well as other clinical covariate data - see #12 below) in their analyses and/or describe as limitations of their study.

      All of the participants that make up our cohort were treated during chronic infection, and the precise timing of ART initiation is unclear in most of these cases.  We have added additional information to explain this in the manuscript and include this in the list of limitations.

      Reviewer #3 (Public Review):

      Summary:

      This valuable study by Semenova and colleagues describes a large cross-sectional cohort of 115 individuals on ART. Participants contributed a single blood sample which underwent IPDA, and 25-color flow with various markers (pre and post-stimulation). The authors then used clustering, decision tree analyses, and machine learning to look for correlations between these immunophenotypic markers and several measures of HIV reservoir volume. They identified two distinct clusters that can be somewhat differentiated based on total HIV DNA level, intact HIV DNA level, and multiple T cell cellular markers of activation and exhaustion.

      The conclusions of the paper are supported by the data but the relationships between independent and dependent variables in the models are correlative with no mechanistic work to determine causality. It is unclear in most cases whether confounding variables could explain these correlations. If there is causality, then the data is not sufficient to infer directionality (ie does the immune environment impact the HIV reservoir or vice versa or both?). In addition, even with sophisticated and appropriate machine learning approaches, the models are not terribly predictive or highly correlated. For these reasons, the study is very much hypothesis-generating and will not impact cure strategies or HIV reservoir measurement strategies in the short term.

      We appreciate the reviewer’s comments regarding the value of our study.  We fully acknowledge that the causal nature and directionality of these associations are not yet clear and agree that the study is primarily hypothesis generating in nature.  Nevertheless, we feel that the hypotheses generated will be valuable to the field.  We have added additional text to the manuscript to emphasize the hypothesis generating nature of this paper.

      Strengths:

      The study cohort is large and diverse in terms of key input variables such as age, gender, and duration of ART. Selection of immune assays is appropriate. The authors used a wide array of bioinformatic approaches to examine correlations in the data. The paper was generally well-written and appropriately referenced.

      Weaknesses:

      (1) The major limitation of this work is that it is highly exploratory and not hypothesis-driven. While some interesting correlations are identified, these are clearly hypothesis-generating based on the observational study design.

      We agree that the major goal of this study was hypothesis generating and that our work is exploratory in nature. Performing experiments with mechanism testing goals in human participants with HIV is challenging.  Additionally, before such mechanistic studies can be undertaken, one must have hypotheses to test. As such we feel our study will be useful for the field in helping to identify hypotheses that could potentially be tested.

      (2) The study's cross-sectional nature limits the ability to make mechanistic inferences about reservoir persistence. For instance, it would be very interesting to know whether the reservoir cluster is a feature of an individual throughout ART, or whether this outcome is dynamic over time.

      We agree with the reviewer’s comment. Longitudinal studies are challenging to carry out with a study cohort of this size, and addressing questions such as the one raised by the reviewer would be of great interest. We believe our study nevertheless has value in identifying hypotheses that could be tested in a longitudinal study.

      (3) A fundamental issue is that I am concerned that binarizing the 3 reservoir metrics in a 50/50 fashion is for statistical convenience. First, by converting a continuous outcome into a simple binary outcome, the authors lose significant amounts of quantitative information. Second, the low and high reservoir outcomes are not actually demonstrated to be clinically meaningful: I presume that both contain many (?all) data points above levels where rebound would be expected soon after interruption of ART. Reservoir levels would also have no apparent outcome on the selection of cure approaches. Overall, dividing at the median seems biologically arbitrary to me.

      The reviewer raises a valid point that the clinical significance of above or below median reservoir metrics is unclear, and that the size of the reservoir has potentially little relation to rebound and cure approaches.  In the manuscript, we attempted to generate models that can predict reservoir size as a continuous variable in Figure S13 and find that this approach performs poorly, while a binarized approach was more successful. As such we have included both approaches in the manuscript.  It is possible that future studies with larger sample sizes and more detailed measurements will perform better for continuous variable prediction.  While this is a fairly large study (n=115) by the standards of HIV reservoir analyses, it is a small study by the standards of the machine learning field, and accurate predictive ML models for reservoir size as a continuous variable will likely require a much larger set of samples/participants.  Nevertheless, we feel our work has value as a template for ML approaches that may be informative for understanding HIV/immune interactions and generates novel hypotheses that could be validated by subsequent studies.
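
      The trade-off discussed here can be illustrated with a minimal sketch of a median split. The helper name, the handling of values equal to the median (assigned to "low"), and the example values are assumptions made for illustration; they do not reproduce the study's pipeline or data.

```python
from statistics import median


def binarize_at_median(values):
    """Label each value 'high' if strictly above the cohort median, else 'low'.

    Sending values equal to the median to 'low' is an assumption of this sketch.
    """
    cutoff = median(values)
    labels = ["high" if v > cutoff else "low" for v in values]
    return cutoff, labels


# Hypothetical reservoir frequencies (copies per million CD4 T cells).
total_hiv_dna = [120.0, 480.0, 510.0, 2300.0]
cutoff, labels = binarize_at_median(total_hiv_dna)
```

      In this hypothetical cohort the values 480 and 510 fall on opposite sides of the cutoff, while 510 and 2300 receive the same label. This is the loss of quantitative information that the reviewer describes, and it is the price paid for the simpler classification task.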

      (4) The two reservoir clusters are of potential interest as high total and intact with low % intact are discriminated somewhat by immune activation and exhaustion. This was the most interesting finding to me, but it is difficult to know whether this clustering is due to age, time on ART, other co-morbidity, ART adherence, or other possible unmeasured confounding variables.

      We agree that this finding is one of the more interesting outcomes of the study. We examined a number of these variables for association with cluster membership, and these data are reported in Figure S8A-D.  Age, years of ART and CD4 Nadir were all clearly different between the clusters.   The striking feature of this clustering, however, is the clear separation between the two groups of participants, as opposed to a continuous gradient of phenotypes.  This could reflect a bifurcation of outcomes for people with HIV, dynamic changes in the reservoir immune interactions over time, or different levels of untreated infection.  It is certainly possible that some other unmeasured confounding variables contribute to this outcome and we have attempted to make this limitation clearer.

      (5) At the individual level, there is substantial overlap between clusters according to total, intact, and % intact between the clusters. Therefore, the claim in the discussion that these 2 cluster phenotypes may require different therapeutic approaches seems rather speculative. That said, the discussion is very thoughtful about how these 2 clusters may develop with consideration of the initial insult of untreated infection and / or differences in immune recovery.

      We agree with the reviewer that this claim is speculative, and we have attempted to moderate the language of the text in the revised version.

      (6) The authors state that the machine learning algorithms allow for reasonable prediction of reservoir volume. It is subjective, but to me, 70% accuracy is very low. This is not a disappointing finding per se. The authors did their best with the available data. It is informative that the machine learning algorithms cannot reliably discriminate reservoir volume despite substantial amounts of input data. This implies that either key explanatory variables were not included in the models (such as viral genotype, host immune phenotype, and comorbidities) or that the outcome for testing the models is not meaningful (which may be possible with an arbitrary 50/50 split in the data relative to median HIV DNA volumes: see above).

      We acknowledge that the predictive power of the models generated from these data is modest and we have clarified this point in the revised manuscript. As the reviewer indicates, this may result from the influence of unmeasured variables and possible stochastic processes.  The data may thus demonstrate a limit to the predictability of reservoir size which may be inherent to the underlying biology.  As we mention above, this study size (n=115) is fairly small for the application of ML methods, and an increased sample size will likely improve the accuracy of the models. At this stage, the models we describe are not yet useful as predictive clinical tools, but are nonetheless useful as tools to describe the structure of the data and identify reservoir associated immune cell types.

      (7) The decision tree is innovative and a useful addition, but does not provide enough discriminatory information to imply causality, mechanism, or directionality in terms of whether the immune phenotype is impacting the reservoir or vice versa or both. Tree accuracy of 80% is marginal for a decision tool.

      The reviewer is correct about these points.  In the revised manuscript, we have attempted to make it clear that we are not yet advocating using this approach as a decision tool, but simply a way to visualize the data and understand the structure of the dataset.  As we discuss above, the models will likely need to be trained on a larger dataset and achieve higher accuracy before use as a decision tool.

      (8) Figure 2: this is not a weakness of the analysis but I have a question about interpretation. If total HIV DNA is more predictive of immune phenotype than intact HIV DNA, does this potentially implicate a prior high burden of viral replication (high viral load &/or more prolonged time off ART) rather than ongoing reservoir stimulation as a contributor to immune phenotype? A similar thought could be applied to the fact that clustering could only be detected when applied to total HIV DNA-associated features. Many investigators do not consider defective HIV DNA to be "part of the reservoir" so it is interesting to speculate why these defective viruses appear to have more correlation with immunophenotype than intact viruses.

      We agree with the reviewer that this observation could reflect prior viral burden and we have added additional text to make this clearer.  Even so, we cannot rule out a model in which defective viral DNA is engaged in ongoing stimulation of the immune system during ART, leading to the stronger association between total DNA and the immune cell phenotypes. We hypothesize that the defective proviruses could potentially be triggering innate immune pattern recognition receptors via viral RNA or DNA, and a higher burden of the total reservoir leads to a stronger apparent association with the immune phenotype.  We have included text in the discussion about this hypothesis.

      (9) Overall, the authors need to do an even more careful job of emphasizing that these are all just correlations. For instance, HIV DNA cannot be proven to have a causal effect on the immunophenotype of the host with this study design. Similarly, immunophenotype may be affecting HIV DNA or the correlations between the two variables could be entirely due to a separate confounding variable.

      We have revised the text of the manuscript to emphasize this point, and we acknowledge that any causal relationships are, at this point, simply speculation. 

      (10) In general, in the intro, when the authors refer to the immune system, they do not consistently differentiate whether they are referring to the anti-HIV immune response, the reservoir itself, or both. More specifically, the sentence in the introduction listing various causes of immune activation should have citations. (To my knowledge, there is no study to date that definitively links proviral expression from reservoir cells in vivo to immune activation as it is next to impossible to remove the confounding possible imprint of previous HIV replication.) Similarly, it is worth mentioning that the depletion of intact proviruses is quite slow such that proviral expression can only be stimulating the immune system at a low level. Similarly, the statement "Viral protein expression during therapy likely maintains antigen-specific cells of the adaptive immune system" seems hard to dissociate from the persistence of immune cells that were reactive to viremia.

      We updated the text of the manuscript to address these points and have added additional citations as per the reviewer’s suggestion.

      (11) Given the many limitations of the study design and the inability of the models to discriminate reservoir volume and phenotype, the limitations section of the discussion seems rather brief.

      We have now expanded the limitations section of the discussion and added additional considerations. We now include a discussion of the study cohort size, composition and the detail provided by the assays.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A few specific comments:

      "This pattern is likely indicative of a more profound association of total HIV DNA with host immunophenotype relative to intact HIV DNA."

      Most studies I have seen (e.g. single cell from Lichterfeld/Yu group) show intact proviruses are generally more activated/detectable/susceptible to immune selection, so I have a hard time thinking defective proviruses are actually more affected by immunotype.

      We hypothesize that this association is actually occurring in the opposite direction – that the defective proviruses are having a greater impact on the immune phenotype, due to their greater number and potential ability to engage innate or adaptive immune receptors. We have clarified this point in the manuscript.

      "The existence of two distinct clusters of PWH with different immune features and reservoir characteristics could have important implications for HIV cure strategies - these two groups may respond differently to a given approach, and cluster membership may need to be considered to optimize a given strategy."

      I find this a bit of a reach, given that the definition of 2 categories depended on the total size.

      We have modified the language of this section to reduce the level of speculation.

      "This study is cross-sectional in nature and is primarily observational, so caution should be used interpreting findings associated with time on therapy".

      I found this an interesting statement because ultimately time on ART shows up throughout the analysis as a significant predictor, do you mean something about how time on ART could indicate other confounding variables like ART regimen or something?

      We have rephrased this comment to avoid confusion.  We were simply trying to make the point that we should avoid speculating about longitudinal dynamics from cross sectional data.

      "As expected, the plots showed no significant correlation for intact HIV DNA versus years of ART (Figure 1B), while total reservoir size was positively correlated with the time of ART (Figure 1A, Spearman r = 0.31)."

      Is this expected? Studies with longitudinal data almost uniformly show intact decay, at least for the first 10 or so years of ART, and defective/total stability (or slight decay). Also probably "time on ART" to not confuse with the duration of infection before ART.

      We have updated the language of this section to address this comment.  We have avoided comparing our data with respect to time on ART to longitudinal studies for reasons given above.

      On dimensionality reduction, as this PaCMAP seems a relatively new technique (vs tSNE and UMAP which are more standard, but absolutely have their weaknesses), it does seem important to contextualize. I think it would still be useful to show PCA and assess the % variance of each additional dimension to gauge the effective dimensionality. It would be helpful to show a plot of % variance by # components to see if there is a cutoff somewhere, and whether PaCMAP is really picking this up when it indicates that 2 dimensions/2 clusters is ideal. Figure 4B ultimately shows a lot of low/high across those clusters, and since low/high is defined categorically it's hard to know which of those dots are very close to the other categories.

      We have added this analysis to the manuscript – found in Figure S9. The PCA plot indicates that members of the two clusters also separate on PCA although this separation is not as clear as for the PaCMAP plot.
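
      For readers unfamiliar with the "% variance by # components" view that the reviewer requests, the calculation can be sketched as follows. The sketch assumes the eigenvalues of the feature covariance matrix have already been obtained from a PCA; the function name and the eigenvalues shown are hypothetical and are not the values behind Figure S9.

```python
from itertools import accumulate


def explained_variance(eigenvalues):
    """Return per-component and cumulative fractions of total variance.

    Components are ordered from largest to smallest eigenvalue, as in a
    scree plot.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    ratios = [ev / total for ev in ordered]
    cumulative = list(accumulate(ratios))
    return ratios, cumulative


# Hypothetical covariance eigenvalues for illustration only.
ratios, cumulative = explained_variance([1.0, 4.0, 2.0, 1.0])
```

      A sharp drop in the per-component fractions after the first few components would suggest a low effective dimensionality, whereas a gradual decline would argue against reading too much into a two-dimensional embedding.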

      Minor comments on writing etc:

      Intro

      -Needs some references on immune activation sequelae paragraph.

      We have added some additional references to this section.

      -"promote the entry of recently infected cells into the reservoir" -- that is only one possible mechanistic explanation, it's not unreasonable but it seems important to keep options open until we have more precise data that can illuminate the mechanism of the overabundance.

      We have modified the text to discuss additional hypotheses.

      -You might also reference Pankau et al Ppath for viral seeding near the time of ART.

      We have added this reference.

      -"Viral protein expression during therapy likely maintains antigen-specific cells of the adaptive immune system" - this was unclear to me, do you mean HIV-specific cells that act against HIV during ART? I think most studies show immunity against HIV (CD8 and CD4) wanes over time during ART.

      The Goonetilleke lab has recently generated data indicating that antiviral T cell responses are remarkably stable over time on ART, but we agree with the reviewer that the idea that ongoing antigen expression in the reservoir maintains these cells is speculative.  We have modified the text to make this point clearer.

      -Overall I think the introduction lacked a little bit of definitional precision: i.e. is the reservoir intact vs replication competent vs all HIV DNA and whether we are talking about PWH on long-term ART and how long we should be imagining? The first years of ART are certainly different than later, in terms of dynamics. The ultimate implications are likely specific for some of these categorizations.

      -"persistent sequelae of the massive disruptions to T cell homeostasis and lymphoid structures that occur during untreated HIV infection" needs a lot more context/referencing. For instance, Peter Hunt showed a decrease in activation after ART a long time ago.

      -Heather Best et al show T cell clonality stays perturbed after ART.

      We have updated the text of the introduction and added references to address the reviewer’s comments.

      Results

      -It would be important to mention the race of participants and any information about expected clades of acquired viruses, this gets mentioned eventually with reference to the Table but the breakdown would be helpful right away.

      We have added this information to the results section.

      -"performed Spearman correlations", may be calculated or tested?

      We have corrected the language for this sentence.

      Comments on figures:

      -Figure 1 data on linear scale (re discussion above) -- hard to even tell if there is a decay (to match with all we know from various long-term ART studies).

      -Figure 4 data is shown on ln (log_e) scale, which is hard to interpret for most people.

      -Figures 4 C,D, and E should have box plots to visually assess the significance.

      -Figure 4B legend says purple/pink but I think the colors are different in the plot, could be about transparency

      -Figure 5 it is now not clear if log_e(?).

      -Figure 6 "HIV reservoir characteristics" might be better to make this more explicit. Do you mean for instance in the 6B title Total HIV DNA per million CD4+ T cells I think?

      We have made these modifications.

      Reviewer #2 (Recommendations For The Authors):

      Minor Weaknesses:

      (1) The Introduction is too long and much of the text is not directly related to the study's research question and design.

      We have streamlined the introduction in the revised manuscript.

      (2) While no differences were seen by age or race, according to the authors, this is unlikely to be useful since the numbers are so small in some of these subcategories. Results from sensitivity analyses (e.g., excluding these individuals) may be more informative/useful.

      We agree that the lower numbers of participants for some subgroupings makes it challenging to know for sure if there are any differences based on these variables.  We have added text to clarify this. We have added age, race and gender to the LOCO analysis and to the variable inflation importance analysis (Table S5).

      (3) For Figure 4, based on what was described in the Results section of the manuscript, the authors should clarify that the figures show results for TOTAL HIV DNA only (not intact DNA): "Dimension reduction machine learning approaches identified two robust clusters of PWH when using total HIV DNA reservoir-associated immune cell frequencies (Figure 4A), but not for intact or percentage intact HIV DNA (Figure 4B and 4C)".

      We have added this information.

      (4) The statement on page 5, first paragraph, "Interestingly, when we examined a plot of percent intact proviruses versus time on therapy (Figure 1C), we observed a biphasic decay pattern," is not new (Peluso JCI Insight 2020, Gandhi JID 2023, McMyn JCI 2023). Prior studies have clearly demonstrated this biphasic pattern and should be cited here, and the sentence should be reworded with something like "consistent with prior work", etc.

      We have added citations to these studies and rephrased this comment.

      (5) The Cohort and sample collection sections are somewhat thin. Further details on the cohort details should include at the very minimum some description of the timing of ART initiation (is this mostly a chronic-treated cohort?) and important covariate data such as nadir CD4+ T cell count, pre-ART viral load, duration of ART suppression, etc.

      The cohort was treated during chronic infection, and we have clarified this in the manuscript.  Information regarding CD4 nadir and years on ART are included in Table 1.  Unfortunately, pre-ART viral load was not available for most members of this cohort, so we did not use it for analyses. The partial pre-ART viral load data is included with the dataset we are making publicly available.

      Reviewer #3 (Recommendations For The Authors):

      Minor points:

      (1) What is meant by CD4 nadir? Is this during primary infection or the time before ART initiation?

      We have clarified this description in the manuscript.  This term refers to the lowest CD4 count recorded during untreated infection.

      (2) The authors claim that determinants of reservoir size are starting to emerge but other than the timing of ART, I am not sure what studies they are referring to.

      We have updated the language of this section.  We intended to refer to studies looking at correlates of reservoir size, and feel that this is a more appropriate term than ‘determinants’.

      (3) The discussion does not tie in the model-generated hypotheses with the known mechanisms that sustain the reservoir: clonal proliferation balanced by death and subset differentiation. It would be interesting to tie in the proposed reservoir clusters with these known mechanisms.

      We have added additional text to the manuscript to address these mechanisms.

      (4) Figure 1: Total should be listed as total HIV DNA.

      We have updated this in the manuscript.

      (5) Figure 1C: Worth mentioning the paper by Reeves et al which raises the possibility that the flattening of intact HIV DNA at 9 years may be spurious due to small levels of misclassification of defective as intact.

      We have added this reference.

      (6) "Total reservoir frequency" should be "total HIV DNA concentration"

      We respectfully feel that “frequency” is a more accurate term than “concentration”, since we are expressing the reservoir as a fraction of the CD4 T cells, while “concentration” suggests a denominator of volume.

      (7) Figure S2-5: label y-axis total HIV DNA.

      We have updated this figure.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript entitled "Rtf1 HMD domain facilitates global histone H2B monoubiquitination and regulates morphogenesis and virulence in the meningitis-causing pathogen Cryptococcus neoformans" by Jiang et al., the authors employ a combination of molecular genetics and biochemical approaches, along with phenotypic evaluations and animal models, to identify the conserved subunit of the Paf1 complex (Paf1C), Rtf1, and functionally characterize its critical roles in mediating H2B monoubiquitination (H2Bub1) and the consequent regulation of gene expression, fungal development, and virulence traits in C. deneoformans or C. neoformans. Specifically, the authors found that the histone modification domain (HMD) of Rtf1 is sufficient to promote H2B monoubiquitination (H2Bub1) and the expression of genes related to fungal mating and filamentation, and restores the fungal morphogenesis and pathogenicity defects caused by RTF1 deletion.

      Strengths:

      The manuscript is well-written and presents the findings in a clear manner. The findings are interesting and contribute to a better understanding of Rtf1-mediated epigenetic regulation of fungal morphogenesis and pathogenicity in a major human fungal pathogen, and potentially in other fungal species, as well.

      Weaknesses:

      A major limitation of this study is the absence of genome-wide information on Rtf1-mediated H2B monoubiquitination (H2Bub1), as well as a lack of detail regarding the function of the Plus3 domain. Although overexpression of HMD in the rtf1Δ mutant restored global H2Bub1 levels, it did not rescue certain critical biological functions, such as growth at 39 °C and melanin production (Figure 4C-D). This suggests that the precise positioning of H2Bub1 is essential for Rtf1's function. A comprehensive epigenetic landscape of H2Bub1 in the presence of HMD or full-length Rtf1 would elucidate potential mechanisms and shed light on the function of the Plus3 domain.

      We thank the reviewer (and other reviewers) for this excellent suggestion. We plan to carry out a CUT&Tag assay to obtain a comprehensive epigenetic landscape of H2Bub1 in the presence of HMD or full-length Rtf1 under conditions where overexpression of HMD failed to rescue the phenotypes of the rtf1Δ mutant, such as growth at 39 °C.

      Reviewer #2 (Public Review):

      Summary:

      The authors set out to determine the role of Rtf1 in Cryptococcal biology, and demonstrate that Rtf1 acts independently of the Paf1 complex to exert regulation of Histone H2B monoubiquitylation (H2Bub1). The biological impact of the loss of H2Bub1 was observed in defects in morphogenesis, reduced production of virulence factors, and reduced pathogenic potential in animal models of cryptococcal infection.

      Strengths:

      The molecular data is quite compelling, demonstrating that the Rtf1-dependent functions require only this histone modifying domain of Rtf1, and are dependent on nuclear localization. A specific point mutation in a residue conserved with the Rtf1 protein in the model yeast demonstrates the conservation of that residue in H2Bub1 modification. Interestingly, whereas expression of the HMD alone suppressed the virulence defect of the rtf1 deletion mutant, it did not suppress defects in virulence factor production.

      Weaknesses:

      The authors use two different species of Cryptococcus to investigate the biological effect of Rtf1 deletion. The work on morphogenesis utilized C. deneoformans, which is well-known to be a robust mating strain. The virulence work was performed in the C. neoformans H99 background, which is a highly pathogenic isolate. The study would be more complete if each of these processes were assessed in the other strain to understand if these biological effects are conserved across the two species of Cryptococcus. H99 is not as robust in morphogenesis, but reproducible results assessing mating and filamentation in this strain have been performed. Similarly, C. deneoformans does produce capsule and melanin.

      This is a fair point raised by the reviewer, and we are going to test whether these biological effects are conserved across the two species. We will assess the effects of RTF1 deletion on bisexual mating hyphal formation in the C. neoformans H99 background and on capsule and melanin production in the C. deneoformans XL280 background.

      There are some concerns with the conclusions related to capsule induction. The images reported in Figure B are purported to be grown under capsule-inducing conditions, yet the H99 panel is not representative of the induced capsule for this strain. Given the lack of a baseline of induction, it is difficult to determine if any of the strains may be defective in capsule induction. Quantification of a population of cells with replicates will also help to visualize the capsular diversity in each strain population.

      We thank the reviewer for raising this concern. We are going to confirm the conclusions related to capsule induction under multiple capsule-inducing conditions, including Dulbecco’s Modified Eagle’s Medium (DMEM), Littman’s medium, and 10% fetal bovine serum (FBS) agar medium [1].

      The authors demonstrate that for specific mating-related genes, the expression of the HMD recapitulated the wild-type expression pattern. The RNA-seq experiments were performed under mating conditions, suggesting specificity under this condition. The authors raise the point in the discussion that there may be differences in Rtf1 deposition on chromatin in H99, and under conditions of pathogenesis. The data that overexpression of HMD restores H2Bub1 by western is quite compelling, but does not address at which promoters H2Bub1 is modulating expression under pathogenesis conditions, and when full-length Rtf1 is present vs. only the HMD.

      We thank the reviewer for raising these concerns. As mentioned in the response to Reviewer 1, our CUT&Tag assay will provide evidence to address these questions.

      Reviewer #3 (Public Review):

      Summary:

      In this very comprehensive study, the authors examine the effects of deletion and mutation of the Paf1C protein Rtf1 gene on chromatin structure, filamentation, and virulence in Cryptococcus.

      Strengths:

      The experiments are well presented and the interpretation of the data is convincing.

      Weaknesses:

      Yet, one can be frustrated by the lack of experiments that attempt to directly correlate the change in chromatin structure with the expression of a particular gene and the observed phenotype. For example, the authors observed a strong defect in the expression of ZNF2, a known regulator of filamentation, mating, and virulence, in the rtf1 mutant. Can this defect explain the observed phenotypes associated with the RTF1 mutation? Is the observed defect in melanin production associated with altered expression of laccase genes and altered chromatin structure at this locus?

      We completely agree with the reviewer, and as mentioned in our responses to Reviewers 1 and 2, we are going to conduct a CUT&Tag assay to investigate the genetic relationship between Rtf1-mediated H2Bub1 and the expression of particular genes.

      (1) Jang, E.-H., et al., Unraveling Capsule Biosynthesis and Signaling Networks in Cryptococcus neoformans. Microbiology Spectrum, 2022. 10(6): p. e02866-22.

    1. Author response:

      We thank the editor and reviewers for the time they spent reviewing our manuscript entitled ‘Overnight fasting facilitates safety learning by changing the neurophysiological response to relief from threat omission’, which was submitted as an original paper for potential publication in eLife.

      Since we take the reviewers' comments to heart and recognize the complexity of our previous and current results, we will take more time to rethink the paper. We will use this time to revisit the interpretation of the results of our previous behavioral study, the preregistration plan, and the findings of our current fMRI (replication) study.

      We aim to address the fundamental issues indicated by the reviewers as soon and as clearly as possible.

    1. Author response:

      “Overall, the paper has several strengths, including leveraging large-scale, multi-modal datasets, using computational reasonable tools, and having an in-depth discussion of the significant results.”

      We thank the reviewer for the very supportive comments.

      Based on the comments and questions, we have grouped the concerns and corresponding responses into three categories.

      (1) The scope and data selection

      “The results are somewhat inconclusive or not validated.

      The overall results are carefully designed, but most of the results are descriptive. While the authors are able to find additional evidence either from the literature or explain the results with their existing knowledge, none of the results have been biologically validated. Especially, the last three result sections (signaling pathways, eQTLs, and TF binding) further extended their findings, but the authors did not put the major results into any of the figures in the main text.”

      The goal of this manuscript is to provide a list of putative childhood obesity target genes to yield new insights and help drive further experimentation. Moreover, the outputs from signaling pathways, eQTLs, and TF binding, although noteworthy and supportive of our method, were not particularly novel. In our manuscript we placed our focus on the novel findings from the analyses. We did, however, report the part of the eQTLs analysis concerning ADCY3, which brought new insight to the pathology of obesity, in Figure 4C.

      “The manuscript would benefit from an explanation regarding the rationale behind the selection of the 57 human cell types analyzed. it is essential to clarify whether these cell types have unique functions or relevance to childhood development and obesity.”

      We elected to comprehensively investigate the GWAS-informed cellular underpinnings of childhood development and obesity. By including a diverse range of cell types from different tissues and organs, we sought to capture the multifaceted nature of cellular contributions to obesity-related mechanisms, and open new avenues for targeted therapeutic interventions.

      There are clearly cell types that are already established as being key to the pathogenesis of obesity when dysregulated: adipocytes for energy storage, immune cell types regulating inflammation and metabolic homeostasis, hepatocytes regulating lipid metabolism, pancreatic cell types intricately involved in glucose and lipid metabolism, skeletal muscle for glucose uptake and metabolism, and brain cell types in the regulation of appetite, energy expenditure, and metabolic homeostasis.

      While it is practical to focus on cell types already proven to be associated with or relevant to obesity, this approach has its limitations. It confines our understanding to established knowledge and rules out the potential for discovering novel insights from new cellular mechanisms or pathways that could play significant roles in the pathogenesis of obesity. Therefore, it was essential to reflect known biology against the unexplored cell types to expand our overall understanding and potentially identify innovative targets for treatment or prevention.

      “I wonder whether the used epigenome datasets are all from children. Although the authors use literature to support that body weight and obesity remain stable from infancy to adulthood, it remains uncertain whether epigenomic data from other life stages might overlook significant genetic variants that uniquely contribute to childhood obesity.”

      The datasets utilized in our study were derived from a combination of sources, both pediatric and adult. We recognize that epigenetic profiles can vary across different life stages but our principal effort was to characterize susceptibility BEFORE disease onset.

      “Given that the GTEx tissue samples are derived from adult donors, there appears to be a mismatch with the study's focus on childhood obesity. If possible, identifying alternative validation strategies or datasets more closely related to the pediatric population could strengthen the study's findings.” 

      We thank the reviewer for raising this important point. We acknowledge that the GTEx tissue samples are derived from adult donors, which might not perfectly align with the study's focus on childhood obesity. The ideal strategy would be a longitudinal design that follows individuals from childhood into adulthood to bridge the gap between pediatric and adult data, offering systematic insights into how early-life epigenetic markers influence obesity later in life. In future work, we aim to carry out such efforts, which will represent a substantial time and financial commitment.

      Along the same lines, the Developmental Genotype-Tissue Expression (dGTEx) Project is a new effort to study development-specific genetic effects on gene expression at 4 developmental windows spanning from infant to post-puberty (0-18 years). Donor recruitment began in August 2023 and remains ongoing. Tissue characterization and data production are underway. We hope that with the establishment of this resource, our future research in the field of pediatric health will be further enhanced.

      “Figure 1B: in subplots c and d, the results are either from Hi-C or capture-C. Although the authors use different colors to denote them, I cannot help wondering how much difference between Hi-C and capture-C brings in. Did the authors explore the difference between the Hi-C and capture-C?”

      Thank you for your comment. It is not within the scope of our paper to explore the differences between the Hi-C and Capture-C methods. In the context of our study, both methods serve the same purpose of detecting chromatin loops that bring putative enhancers to sometimes genomically distant gene promoters. Consequently, our focus was on utilizing these methods to identify relevant chromatin interactions rather than comparing their technical differences.

      (2) Details on defining different categories of the regions of interest

      “Some technical details are missing.

      While the authors described all of their analysis steps, a lot of the time, they did not mention the motivation. Sometimes, the details were also omitted.”

      We will add a section to the revision to address the rationale behind the different OCR categories.

      “Line 129: should "-1,500/+500bp" be "-500/+500bp"?”

      A gene promoter was defined as a region 1,500 bases upstream to 500 bases downstream of the TSS. Most transcription factor binding sites are distributed upstream (5’) of the TSS, and the assembly of transcription machinery occurs up to 1,000 bases 5’ of the TSS. Given our interest in SNPs that can potentially disrupt transcription factor binding, this defined promoter length allowed us to capture such SNPs in our analyses.
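
      For illustration only, the promoter definition above can be sketched in Python. The function names, the 1-based inclusive coordinates, and the strand-aware handling of "upstream" are assumptions made for this sketch; it is not the pipeline used in the study.

```python
from typing import Tuple


def promoter_window(tss: int, strand: str,
                    upstream: int = 1500, downstream: int = 500) -> Tuple[int, int]:
    """Return the (start, end) promoter interval around a TSS.

    Assumes 1-based, inclusive genomic coordinates. "Upstream" is taken
    relative to the direction of transcription, so it lies at lower
    coordinates for '+' genes and at higher coordinates for '-' genes.
    """
    if strand == "+":
        return max(tss - upstream, 1), tss + downstream
    if strand == "-":
        return max(tss - downstream, 1), tss + upstream
    raise ValueError("strand must be '+' or '-'")


def snp_in_promoter(pos: int, tss: int, strand: str) -> bool:
    """True if a SNP position falls inside the -1,500/+500 promoter window."""
    start, end = promoter_window(tss, strand)
    return start <= pos <= end
```

      Under these assumptions, a SNP 1,400 bases 5’ of a plus-strand TSS is captured by the window, whereas the same position relative to a minus-strand TSS lies 3’ of the gene start and outside the window.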

      “How did the authors define a contact region?”

      Chromatin contact regions identified by Hi-C or Capture-C assays are always reported as pairs of chromatin regions. The Supplementary eMethods provide details on the method of processing and interaction calling from the Hi-C and Capture-C data.

      “The manuscript would benefit from a detailed explanation of the methods used to define cREs, particularly the process of intersecting OCRs with chromatin conformation data. The current description does not fully clarify how the cREs are defined.”

      “In the result section titled "Consistency and diversity of childhood obesity proxy variants mapped to cREs", the authors introduced the different types of cREs in the context of open chromatin regions and chromatin contact regions, and TSS. Figure 2A is helpful in some way, but more explanation is definitely needed. For example, it seems that the authors introduced three chromatin contacts on purpose, but I did not quite get the overall motivation.”

      We apologize for the confusion. Our definition of cREs is consistent throughout the study. To aid the reader, Figure 2A will become Figure 1A in the revision.

      The 3 representative chromatin loops illustrate different ways the chromatin contact regions (pairs of blue regions under blue arcs) can overlap with OCRs (yellow regions under yellow triangles – ATAC peaks) and gene promoters.

      [1] The first chromatin loop has one contact region that overlaps with an OCR at one end and with a gene promoter at the other. This satisfies our definition of a cRE; thus, the area under the yellow ATAC-peak triangle is green.

      [2] The second loop overlaps with an OCR at only one end, with no gene promoter nearby, so it does not qualify as a cRE.

      [3] The third chromatin loop has an OCR and a promoter overlapping at the same end. We defined this as a special case of cRE formation; thus, the area under the yellow ATAC-peak triangle is green.

      To avoid further confusion for the reader, we will eliminate this variation in the new illustration for the revised manuscript.
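
      The three cases above can be sketched as a simple interval-overlap test. This is a minimal illustration under stated assumptions: the `Interval` type, the function names, and the reading of case [3] as "the OCR itself overlaps a promoter" are ours for this sketch, and the actual interaction calling is described in the Supplementary eMethods.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def is_cre(ocr: Interval,
           loops: List[Tuple[Interval, Interval]],
           promoters: List[Interval]) -> bool:
    """Classify an OCR as a putative cRE.

    Case [3]: the OCR directly overlaps a gene promoter.
    Case [1]: the OCR overlaps one end of a chromatin loop whose other
              end overlaps a gene promoter.
    Case [2]: anything else is not a cRE.
    """
    if any(overlaps(ocr, p) for p in promoters):
        return True
    for left, right in loops:
        # Check the loop in both orientations.
        for near, far in ((left, right), (right, left)):
            if overlaps(ocr, near) and any(overlaps(far, p) for p in promoters):
                return True
    return False
```

      For example, an OCR sitting in a loop anchor whose partner anchor contains no promoter returns `False`, matching case [2].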

      “Figure 2A: The authors used triangles filled differently to denote different types of cREs but I wonder what the height of the triangles implies. Please specify.”

      The triangles are illustrations for ATAC-seq peaks, and the yellow chromatin regions under them are OCRs. The different heights of ATAC-seq peaks are usually quantified as intensity values for OCRs. However, in our study, when an ATAC-seq peak passed the significance threshold from the data pipeline, we only considered their locations, regardless of their intensities. To avoid further confusion for the reader, we will eliminate this variation in the new illustration for the revised manuscript.

      “Figure 1B-c. the title should be "OCRs at putative cREs". Similarly in Figure 1B-d.”

      cREs are a subset of OCRs.

      - In the section "Cell type specific partitioned heritability", the authors used "4 defined sets of input genomic regions". Do these correspond to the four types of regions in Figure 2A?

      Figure 2A will become Figure 1A in the revision and will be modified to showcase how we define OCRs and cREs.

      “It seems that the authors described the 771 proxies in "Genetic loci included in variant-to-genes mapping" (ln 154), and then somehow narrowed down from 771 to 94 (according to ln 199) because they are cREs. It would be great if the authors could describe the selection procedure together, rather than isolated, which made it quite difficult to understand.”

      In the Methods section entitled "Genetic loci included in variant-to-genes mapping", we described the process of LD expansion to include 771 proxies from 19 sentinel signals significantly associated with obesity. Not all of these proxies are located within our defined cREs. Figure 2B, now Figure 2A in the revision, illustrates the proportions of these proxies located within different types of regions, reducing the proxy list to 94 located within our defined cREs.

      “Figure 2. What's the difference between the 771 and 758 proxies?”

      13 out of 771 proxies did not fall within any defined regions. The remaining 758 were located within contact regions of at least one cell type regardless of chromatin state.

      (3) Typos

      “In the paragraph "Childhood obesity GWAS summary statistics", the authors may want to describe the case/control numbers in two stages differently. "in stage 1" and "921 cases" together made me think "1,921" is one number.”

      This will be amended in the revision.

      “Hi-C technology should be spelled as Hi-C. There are many places, it is miss-spelled as "hi-C". In Figure 1, the author used "hiC" in the legend. Similarly, Capture-C sometime was spelled as "capture-C" in the manuscript.”

      “At the end of the fifth row in the second paragraph of the Introduction section: "exisit" should be "exist".”

      “In Figure 2A: "Within open chromatin contract region" should be "Within open chromatin contact region".”

      These typos and terminology inconsistencies will be amended in the revision.

    1. Author response:

      Provisional author response to Reviewer #1

      We would like to thank the reviewer for his/her careful evaluation of our manuscript and appreciate his/her appraisal of the strengths of our study. Regarding the weaknesses, we plan to address these as thoroughly as possible during the revision of our manuscript.

      We can already state that miR-26b has clear anti-inflammatory effects on human liver slices, which is in line with our results demonstrating that miR-26b plays a protective role in MASH development in mice. The observation that patients with liver cirrhosis have increasing plasma levels of miR-26b seems contradictory at first glance. However, we believe that this increased miR-26b expression is a compensatory mechanism to counteract the MASH/cirrhotic effects. The exact source of this miR-26b remains to be elucidated in future studies.

      The kinase activity analysis revealed that miR-26b affects kinases that play a particularly important role in inflammation and angiogenesis. Strikingly, and in support of these data, these effects could be reversed by LNP treatment. Combined, these results already provide strong mechanistic insights at the molecular and intracellular signalling level. Although the exact target of miR-26b remains elusive and its identification is probably beyond the scope of the current manuscript due to its complexity, we believe that the kinase activity results already provide a solid mechanistic basis.

      Provisional author response to Reviewer #2

      We would like to thank the reviewer for his/her careful evaluation of our manuscript and appreciate his/her appraisal of the strengths of our study. Regarding the weaknesses, we plan to address these as thoroughly as possible during the revision of our manuscript. The validation suggestions in particular are very valuable, and we plan to address them in the revision by performing additional experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Komarova et al. investigate the clinical prognostic ability of cell-level metabolic heterogeneity quantified via the fluorescence lifetime characteristics of NAD(P)H. Fluorescence lifetime imaging microscopy (FLIM) has been studied as a minimally invasive approach to measure cellular metabolism in live cell cultures, organoids, and animal models. Its clinical translation is spearheaded through macroscopic implementation approaches that are capable of large sampling areas and enable access to otherwise constrained spaces but lack cellular resolution for a one-to-one transition with traditional microscopy approaches, making the interpretation of the results a complicated task. The merit of this study primarily lies in its design by analyzing with the same instrumentation and approach colorectal samples in different research scenarios, namely in vitro cells, in vivo animal xenografts, and tumor tissue from human patients. These conform to a valuable dataset to explore the translational interpretation hurdles with samples of increasing levels of complexity. For human samples, the study specifically investigates the prediction ability of NAD(P)H fluorescence metrics for the binary classification of tumors of low and advanced stage, with and without metastasis, and low and high grade. They find that NAD(P)H fluorescence properties have a strong potential to distinguish between high- and low-grade tumors and a moderate ability to distinguish advanced-stage tumors from low-stage tumors. This study provides valuable results contributing to the deployment of minimally invasive optical imaging techniques to quantify tumor properties and potentially migrate into tools for human tumor characterization and clinical diagnosis.

      Strengths:

      The investigation of colorectal samples under multiple imaging scenarios with the same instrument and approach conforms to a valuable dataset that can facilitate the interpretation of results across the spectrum of sample complexity.

      The manuscript provides a strong discussion reviewing studies that investigated cellular metabolism with FLIM and the metabolic heterogeneity of colorectal cancer in general.

      The authors do a thorough acknowledgement of the experimental limitations of investigating human samples ex vivo, and the analytical limitation of manual segmentation, for which they provide a path forward for higher throughput analysis.

      Weaknesses:

      To substantiate the changes in fluorescence properties at the examined wavelength range (associated with NAD(P)H fluorescence) in relationship to metabolism, the study would strongly benefit from additional quantification of metabolic-associated metrics using currently established standard methods. This is especially interesting when discussing heterogeneity, which is presumably high within and between patients with colorectal cancer, and could help explain the particularities of each sample leading to a more in-depth analysis of the acquired valuable dataset.

      In order to address this issue, we have performed immunohistochemical staining of the available tumor samples for the two standard metabolic markers GLUT3 and LDHA.

      The results are included in the Supplementary Material (Figure S4), and the Discussion has been extended.

      Additionally, NAD(P)H fluorescence does not provide a complete picture of the cell/tissue metabolic characteristics. Including, or discussing the implications of including fluorescence from flavins would comprise a more compelling dataset. These additional data would also enable the quantification of redox metrics, as briefly mentioned, which could positively contribute to the prognosis potential of metabolic heterogeneity.

      We agree with the Reviewer that fluorescence from flavins could be helpful to obtain more complete data on cellular metabolic states. However, we were unable to detect sufficiently intense emission from flavins in colorectal cancer cells and tissues. A paragraph about flavins was added to the Discussion, and representative images were added to the Supplementary Material (Figure S5).

      In the current form of the manuscript, there is a diluted interpretation and discussion of the results obtained from the random forest and SHAP analysis regarding the ability of the FLIM parameters to predict clinicopathological outcomes. This is, not only the main point the authors are trying to convey given the title and the stated goals, but also a novel result given the scarce availability of these type of data, which could have a remarkable impact on colorectal cancer in situ diagnosis and therapy monitoring. These data merit a more in-depth analysis of the different factors involved. In this context, the authors should clarify how is the "trend of association" quantified (lines 194 and 199).

      We thank the Reviewer for this suggestion. The section has been updated with SHAP analysis using different parameters (dispersion D of t2, a1, tm and bimodality index BI of t2, a1, tm). It is now clearer that D-a1 is more strongly associated with clinicopathological outcomes than the other variables. We have also added some biological interpretation of these results to the Discussion.

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Metabolic heterogeneity of colorectal cancer as a prognostic factor: insights gained from fluorescence lifetime imaging" by Komarova et al., the authors used fluorescence lifetime imaging and quantitative analysis to assess the metabolic heterogeneity of colorectal cancer. Generally, this work is logically well-designed, including in vitro and in vivo animal models and ex vivo patient samples. However, since the key parameter presented in this study, the BI index, is already published in a previous paper by this group (Shirshin et al., 2022), and the quantification method of metabolic heterogeneity has already been well (and even better) described in previous studies (such as the one by Heaster et al., 2019), the novelty of this study is doubted. Moreover, I am afraid that the way of data analysis and presentation in this study is not well done, which will be mentioned in detail in the following sections.

      Strengths:

      (1) Solid experiments are performed and well-organized, including in vitro and in vivo animal models and ex vivo patient samples.

      (2) Attempt and efforts to build the association between the metabolic heterogeneity and prognosis for colorectal cancer.

      Weaknesses:

      (1) The human sample number (from 21 patients) is very limited. I wonder how the limited patient number could lead to reliable diagnosis and prognosis.

      Eight additional samples of patients’ tumors, collected while the manuscript was under review, were added to the present data. We agree that the number is still too limited to draw conclusions about the prognostic value of cell-level metabolic heterogeneity. At this point, however, we can expect that this parameter will become a metric for prognosis. We will continue this study to collect more samples of colorectal tumors and expand the approach to different cancer types.

      (2) The BI index or similar optical metrics have been well established by this and other groups; therefore, the novelty of this study is doubted.

      The purpose of this research was to quantify and compare the cellular metabolic heterogeneity across the systems of different complexity - commercial cell lines, tumor xenografts and patients’ tumors - using previously established FLIM-based metrics. For the first time, using FLIM, it was shown that heterogeneity of patients’ samples is much higher than of laboratory models and that it has associations with clinical characteristics of the tumors - the stage and the grade. In addition, this study provides evidence that bimodality (BI) in the distribution of metabolic features in the cell population is less important than the width of the spread (the dispersion value D).

      Some corrections have been made in the text on this point.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The following comments should be addressed to strengthen the rigor and clarity of the manuscript.

      (1) The ethical committee that approved the human studies should also be mentioned in the methods section, as was done with the animal studies.

      Information about the ethics committee has been added in the Manuscript.

      The study with the use of patients’ material was approved by the ethics committee of the Privolzhsky Research Medical University (approval № 09 from 30.06.2023).

      (2) The captions in Figures 2 and 3 must be revised. In Figure 2, it seems the last 2 sentences for the description of (C) do not belong there, and instead, the last sentence in the description of (D) may need to be included in (C) instead. Figure 3 is similar.

      The captions were revised.

      (3) From supplement Figure S2 it seems that EpCam and vimentin staining were only done in two of the mouse tumor types. No further mention is made in the results or methods section. Is there any reason this was not performed in the other tumor types? Were the histology and IHC protocols the same for the mouse and human tumors?

      The data on other tumor types and patients’ tumors have been added to Figure S3. The Discussion was extended with the following paragraph.

      One of the possible reasons for metabolic heterogeneity could be the presence of stromal cells or diversity of epithelial and mesenchymal phenotypes of cancer cells within a tumor. Immunohistochemical staining of tumors for EpCam (epithelial marker) and vimentin (mesenchymal marker) showed that the fraction of epithelial, EpCam-positive, cells was more than 90% in tumor xenografts and on average 76±10% in patients’ tumors (Figure S3). However, the ratio of EpCam- to vimentin-positive cells in patients’ samples correlated with neither D-a1 nor BI-a1, which means that the presence of cells with mesenchymal phenotype did not contribute to the metabolic heterogeneity of tumors identified by NAD(P)H FLIM.

      (4) Clarify the design of the experiments: The results come from 50 - 200 cells in each sample (except 30 in the CaCo2 cell culture) that were counted from 5 - 10 images acquired from each sample. There were 21 independent human samples. How many independent samples were included in the cell culture experiments and the mouse tumor models? Why is there an order of magnitude fewer cells included in the CaCo2 group compared to the other groups (Figure 1)? From the image (Figure 1A - CaCo2), it seems to be a highly populated type of sample, yet only 30 cells were quantified. What prevents the inclusion of the same number of cells to be quantified in each group for a more systematic evaluation?

      We thank the Reviewer for this comment.

      Cell culture experiments included two independent replicates for each cell line, the data from which were then combined. In animal experiments, measurements were made in three mice (numbered 1-3 in Figure 2C) for each tumor type. We have quantified more than 100 additional cells of the CaCo2 cell line; in the revised version, the number of CaCo2 cells is 146.

      The text of the Manuscript was revised accordingly.

      (5) Regarding references: Some claims throughout the text would benefit from an additional reference. For example: line 70 "Metabolic heterogeneity [...] is believed to have prognostic value"; line 121 " [...] the uniformity of cell metabolism in a culture, which is consistent with the general view on standard cell lines [...]". The clinical translational aspect (i.e., paragraph in line 255) warrants the inclusion of the efforts already done with FLIM imaging in the clinical setting both in vivo and ex vivo with point-spectroscopy and macroscopy imaging (e.g., Jo Lab, Marcu Lab, French Lab, and earlier work by Mycek and Richards-Kortum in colorectal cancer to name a few).

      Additional references were added.

      Reviewer #2 (Recommendations For The Authors):

      (1) In the Introduction, line 85, the authors mention that "Specifically, the unbound state of NAD(P)H has a short lifetime (~0.4 ns) and is associated with glycolysis, while the protein-bound state has a long lifetime (~1.7-3.0 ns) and is associated with OXPHOS". I do not think this claim is appropriate. One cannot simply say that the unbound state is associated with glycolysis, nor that the bound state is associated with OXPHOS; both unbound and bound state are associated with almost all the metabolic pathways. Instead, the expression of "glycolytic/ OXPHOS shift", as authors used in other sections of this manuscript, is a more appropriate one in this case.

      The text of the Introduction was revised.

      (2) What are the biological implications of the bimodality index (BI)? Please provide specific insights.

      A bimodal distribution indicates that there are two separate and independent peaks in the population data. In the metabolic FLIM data, this indicates that there are two sub-populations of cells with different metabolic phenotypes. Previously, we observed a bimodal distribution in a population of chemotherapy-treated cancer cells, where one sub-population was responsive (shifted metabolism) and the other non-responsive (unchanged metabolism) [Shirshin et al., PNAS, 2022]. In a naive tumor, a number of factors affect cellular metabolism, including genetic features and the microenvironment, so it is difficult to determine which of them resulted in bimodality. Our data show no association between bimodality (BI) and the clinical characteristics of the tumors. What really matters is the width of the parameter spread in the population. The early-stage tumors (T1, T2) were metabolically more heterogeneous than the late-stage ones (T3, T4). The degree of heterogeneity was also associated with differentiation state, a stage-independent prognostic factor in colorectal cancer in which a lower grade correlates with a better prognosis. The early-stage (T1, T2) and high-grade (G3) tumors had significantly higher dispersion of NAD(P)H-a1 than the late-stage (T3, T4) and low-grade (G1, G2) ones. In terms of the biological significance of heterogeneity, this means that under the stressful and unfavorable conditions to which tumor cells are exposed, it is the spread of the parameter distribution in the population, rather than the presence of several distinct clusters (modes), that matters for adaptation and survival. A high diversity of cellular metabolic phenotypes provides a survival advantage and was therefore observed in the more aggressive (undifferentiated or poorly differentiated) and the least advanced tumors.

      The discussion has been expanded on this account.

      (3) Have you run statistics in Figure 1B? If yes, do you find any significance? The same question also applies to Figures 2C and 3C.

      We performed statistical analysis to compare different cell lines in in vitro and in vivo models, the results obtained are presented in Table S4.

      (4) Line 119, why is the BI threshold set at 1.1?

      When setting the BI threshold at 1.1, we relied on the work by Wang et al., Cancer Informatics, 2009. The authors recommended the 1.1 cutoff as more reliable for selecting bimodally expressed genes. Further, we validated this BI threshold for identifying chemotherapy-responsive and non-responsive sub-populations of cancer cells (Shirshin et al., PNAS, 2022).
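
      For readers unfamiliar with the metric, the following is a minimal sketch of how a bimodality index can be computed once a two-component mixture has been fitted. It assumes the definition commonly attributed to Wang et al. (2009), BI = sqrt(p(1 - p)) * |mu1 - mu2| / sigma, with a standard deviation shared by both components. The function names, the example values, and the choice of a strict "greater than" comparison against 1.1 are illustrative assumptions and are not taken from the manuscript; the mixture fitting itself is assumed to be done elsewhere.

```python
import math

# Cutoff quoted in the response above (Wang et al., 2009).
BI_THRESHOLD = 1.1


def bimodality_index(proportion, mean_1, mean_2, shared_sd):
    """Return sqrt(p * (1 - p)) * |mu1 - mu2| / sigma.

    proportion -- fraction of cells assigned to the first component (0 < p < 1)
    mean_1, mean_2 -- component means of the FLIM parameter
    shared_sd -- standard deviation assumed common to both components
    """
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    if shared_sd <= 0.0:
        raise ValueError("shared_sd must be positive")
    separation = abs(mean_1 - mean_2) / shared_sd
    return math.sqrt(proportion * (1.0 - proportion)) * separation


def is_bimodal(bi, threshold=BI_THRESHOLD):
    """Assumed decision rule: call a distribution bimodal when BI exceeds the cutoff."""
    return bi > threshold


# Hypothetical mixture parameters for a lifetime-derived parameter (in %).
example_bi = bimodality_index(proportion=0.4, mean_1=75.0, mean_2=82.0, shared_sd=2.5)
print(f"BI = {example_bi:.2f}, bimodal: {is_bimodal(example_bi)}")
```

      Because the index is a ratio of a mean difference to a standard deviation, it is dimensionless, which is why it can be compared across different decay parameters.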

      (5) Line 123, what does the high BI of mean lifetime stand for? Please provide biological implications and insights.

      The sentence was removed because inclusion of additional CaCo2 cells (n=146) in the quantification of NAD(P)H FLIM data showed no bimodality in this cell culture.

      (6) In the legend for Figure 2C, the authors mention that "the bimodality index (BI-a1) is shown above each box"; however, I do not see such values. It is also true for Figure 3C.

      The legends for Fig. 2 and 3 were corrected.

      (7) In Figure 2, t1-t3 were not explained and mentioned in the main text. What do they mean? Do they mean different time points or different tumors?

      t1-t3 means different tumors in a group. Changes have been made to the figure - individual tumors are indicated by numbers.

      (8) In Figure 3, what do p13, p15 and p16 mean? It is not clearly explained. If they just represent patients numbered 13, 15, and 16, then why are these patients chosen as representatives? Do they represent different stages or are they just chosen randomly?

      Figure 3 was revised. Representative images were changed and a short description for each representative sample was included. In the revised version, representatives have been selected to show different stages and grades.

      (9) In Figure 3, instead of showing the results for each patient, I would suggest that authors show representative results from tumors at different stages; or, at least, clearly indicate the specific information for each patient. I do not think that providing the patient number only without any patient-specific information is helpful.

      Figure 3 was revised.

      (10) The sample number (21 patients) is very limited. I wonder how the limited patient number could lead to reliable diagnosis and prognosis.

      Eight additional samples were added. The text, figures and tables were revised accordingly.

      (11) In Discussion, it would be helpful to compare the BI index used in this study with the previously developed OMI-index (Line 275).

      We believe that the BI and the OMI index describe different things and are therefore hard to compare. While the BI is used to describe the degree of metabolic heterogeneity, the OMI index is an integral parameter that includes the redox ratio and the mean fluorescence lifetimes of NAD(P)H and FAD, and rather indicates the metabolic state of a cell. In this sense it is more relevant to compare it with the conventional redox ratio or the Fluorescence Lifetime Redox Ratio (FLIRR) (H. Wallrabe et al., Segmented cell analyses to measure redox states of autofluorescent NAD(P)H, FAD & Trp in cancer cells by FLIM, Sci. Rep. 2018; 8: 79). The assessment of the heterogeneity of FLIM parameters has previously been reported using the weighted heterogeneity (wH) index (Amy T. Shah et al., In Vivo Autofluorescence Imaging of Tumor Heterogeneity in Response to Treatment, Neoplasia 17, pp. 862–870 (2015)). To the best of our knowledge, this is the only other metric to date that quantifies metabolic heterogeneity on the basis of FLIM data. A comparison of the BI with the wH-index showed that the wH-index provides results similar to the BI in the heterogeneity evaluation, as demonstrated in our earlier paper (E.A. Shirshin et al., Label-free sensing of cells with fluorescence lifetime imaging: The quest for metabolic heterogeneity, PNAS 119 (9) e2118241119 (2022)). Yet the BI provides a dimensionless estimate of the inherent heterogeneity of a sample, and therefore it can be used to compare heterogeneity assessed by different decay parameters and FLIM data analysis methods. The limitation of using the OMI index for FLIM data analysis is the low intensity of the FAD signal, which was the case in our experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      We would like to see the major conclusions constrained to better fit the data presented in the manuscript. Speed is only a single performance metric of a very complicated, very diverse system of locomotion.

      If the authors would like to maintain the broader conclusions, the study should be repeated with a number of different performance metrics to shore up the manuscript's results. Particularly with efficiency, speed is not a reliable measure of efficiency to begin with, so this needs to be explored in a more targeted and appropriate manner.

      We agree with Reviewer 1 that we should be more precise about the fitness metrics used and more constrained about the conclusions. Considering the points raised in each paragraph, we’ve modified the text as follows:

      - [line 17] “... to test the necessity of both traits for sustained and effective displacement on the ground.”

      - [starting on line 105] “We generate the robot’s sample using an artificial evolutionary process that selects for better locomotion ability - defined as higher average speed as it is a proxy for organisms with sustained and effective displacement.”

      - [starting on line 287] “We also found that different gravitational environments require different shape structures to optimize locomotion average speed.”

      - [starting on line 311] “This consistency is evidence that a small number of sparsely connected modules is a morphological computation principle for an organism’s optimized average speed.”

      - [starting on line 348] “Beyond that, extending the tests for other important aspects of locomotion behavior - as noise on the ground, energetic costs, and maneuverability - by using other locomotion metrics - as energy efficiency, stability margin, and dissipated power (Paez and Melo, 2014; Aoi et al., 2016 ) - would also be relevant to evaluate the principle’s robustness.”

      - [starting on line 524] “As the robots with the highest average speed are the ones that succeed in maximizing displacement and having robust dynamics (they will not tumble with time), we defined $\bar s$ as the fitness value using it as a proxy of successful directed locomotion. Selecting for bodies that maximize speed is a common locomotion bias in natural selection, as both predators and prey and thus fecundity and mortality depend on it (Alexander, 2006). Other measures - such as energy efficiency - can capture distinct important aspects of the locomotion complexity (Paez and Melo, 2014) and would be worthy of investigating in future work.”

      Paper Premise/Mission Statement: As defined in the abstract and also called out in the text starting on line 59 is "investigate whether symmetry and modularity are features of an organism's shape need [authors italics] to have for better-directed locomotion..."

      If we understood correctly the reviewer is asking for more precision in the statement. We modified the respective sentence in the following way:

      - [line 62] “... need to have for optimizing average speed on the ground,”

      Reviewer #2 (Recommendations For The Authors):

      i) a lot of details that are in the captions should be moved in the main text;

      Thank you for this comment. We reviewed all the captions and text making modifications to ensure that all the information in the captions is also present in the main text. Below, we highlighted some of the changes:

      - [line 57] “Thus, locomotion on the ground is present in phylogenetically distant species (such as the maned wolf and frogfish in Figure 1A) and depends upon … “

      - [starting on line 64] “Figure 1B shows a schematic representation of symmetry and modularity on the maned wolf and frogfish bodies.”

      - [starting on line 277] “There is a negative correlation between the proportion of feet voxels and the robot’s locomotion transference capability when the robots go to an environment with higher gravity, i.e., water to mars (dark blue in Figure 5C), water to earth (light blue), and mars to earth (red) - with a Spearman correlation coefficients of r = -0.39, r = -0.43, and r = -0.32, respectively, all with p < 1e-08.”

      ii) hypotheses should be spelled out more clearly;

      We verified the experiments and certified that every experiment had a clear hypothesis statement in the original manuscript. Before each section defining the hypothesis and describing the experiment, we added the following statement:

      - [starting on line 119] “ With this sample, we tested the hypotheses about the relationships between locomotion performance and body modularity and symmetry (Figure 1I).”

      iii) performance metrics and other features should be better defined using mathematical terms if possible (for example, instability);

      Thank you for the comment. We added a definition for instability in the text:

      - [starting on line 218] “Nonetheless, locomotion requires a minimum instability - the dynamic possibility of translating the center of mass - in the direction axis to generate the necessary forward displacement (Bruijn et al., 2013; Nagarkar et al., 2021).”

      Despite the different definitions of instability in literature (Bruijn et al., 2013, Paez and Melo, 2014; Aoi et al., 2016, Nagarkar et al., 2021), we didn’t find one mathematical definition that fits perfectly in our context.

      Following the reviewer's comment, when necessary we expanded the definition for other features:

      - [starting on line 199] “... the distribution of body weight. As the robots do not have sensory feedback abilities, the weight balance is defined as the body’s movement due to gravity forces (consequences of the weight distribution and surface contact points) (Benda et al., 1994). We hypothesized that the robots with the best directed locomotion ability would tend to have a symmetric body shape. A robot with a low XY shape symmetry (XY shape symmetry < 0.5) has a higher chance of having a poor weight balance, increasing the chance of the body tipping over, thus leading it to a lousy locomotion performance (blue dotted line in Figure 3C). “

      iv)  more details regarding the simulations should be included;

      We thank the reviewer for this comment. If we understood correctly, Reviewer 2 is asking for more details regarding: “a) the adequacy of the spatial resolution, whereby I failed to see a compelling argument regarding the completeness of 64 voxels; b) the realism of the oscillatory patterns, whereby all the voxels are set to oscillate at the same, constant, frequency of 2Hz; and c) the accuracy of simulations in water where added mass effects seem to be neglected.” We modified the text to better address these concerns:

      a) [starting on line 96] “We choose to first explore exhaustively the $4^3$ space dimension, as it is the minimal possible space that allows meaningful body plans. We also did control experiments within $6^3$ and $8^3$ to check for dimension size effects.”

      - [starting on line 432] “We did control experiments with robots within 6³ and 8³ dimensions to check for dimension size effects - and we found that the results found in 4³ remained valid. We choose to focus our analysis in the 4³ design space because we consider it the minimum coarse-grain to approach the biological question about the contingency of shape outcomes pressured for locomotion. Smaller spaces do not allow sufficient complexity in the body structures, and increasing spatial resolution reduces the extensiveness of the investigated search space.”

      b) [starting on line 451] “… we used a fixed oscillation frequency of 𝑓 = 2 Hz (Kriegman et al.,2020). A fixed frequency value reduces the number of degrees of freedom in the search for solutions, but in return, it narrows the direct connection between the simulated organisms and animals. Exploring different frequency values in future work would be important to investigate the impact of varied oscillatory frequencies in the shape solutions for directed locomotion.”

      c) The environment we call “water” is not an accurate model of aquatic habitats, as we didn’t simulate essential forces such as drag effects. This choice is explained in the text starting on line 110: “In the water-like environment the bodies have nullifying body weight but do not have drag effects. We did not add drag in our simulations because our aim is to study just the body weight influences in locomotion independently of other forces.”

      v) a full paragraph about limitations should be included in the discussions, focusing on both simulation aspects (for example, the use of simple spring elements in the voxels) and theoretical assumptions (for example, addressing the potential role of non-locomotion-related aspects).

      We thank the reviewer for the comment. We edited some paragraphs of the discussion section to make more explicit some limitations of our work:

      [starting on line 398] “We expect that including other important aspects of an animal's body as a developmental process and sensory functions could influence the shape's outcomes with other layers of principles. Although we based our simulations on an already successful transference of in silico behavior to organisms made of biological tissue (Kriegman et al., 2020), there is an intrinsic gap between spring-mass robots modeling and animal’s bodies that is worthy of exploring to ensure the generality of our results. Other methods, such as the inclusion of rigid body elements in the simulation (possible in Voxelyze), the use of finite element modeling (FEM) (Coevoet et al., 2019), and the construction of physical robots (Aguilar et al., 2016), are important complements to this work. Beyond that, principles on other scales as in the genotypes (Johnston et al., 2022) and in other behavioral phenotypes (Gomez-Marin et al., 2016) could also be investigated.”

      To address the potential role of non-locomotion-related aspects, we revised the section “Discussion - Contingency of evolutionary outcomes”, where we discussed other functional and biological roles:

      [starting on line 354 ] “Here we investigate how a specific functional cause - optimization of average speed during directed locomotion on the ground - externally defines the phenotypic space of shape possibilities.”

      [starting on line 359] “For simplification purposes, we choose to not explicitly control other important factors of locomotion (i.e., energy consumption, maneuverability) that nonlinearly interact during locomotion. In future studies, it would be important to conduct similar studies on a wider range of factors to study the shape and dynamic principles in different conditions.“

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors developed an extension to the pairwise sequentially Markov coalescent model that allows to simultaneously analyze multiple types of polymorphism data. In this paper, they focus on SNPs and DNA methylation data. Since methylation markers mutate at a much faster rate than SNPs, this potentially gives the method better power to infer size history in the recent past. Additionally, they explored a model where there are both local and regional epimutational processes. Integrating additional types of heritable markers into SMC is a nice idea which I like in principle. However, a major caveat to this approach seems to be a strong dependence on knowing the epimutation rate. In Fig. 6 it is seen that, when the epimutation rate is known, inferences do indeed look better; but this is not necessarily true when the rate is not known. (See also major comment #1 below about the interpretation of these plots.) A roughly similar pattern emerges in Supp. Figs. 4-7; in general, results when the rates have to be estimated don't seem that much better than when focusing on SNPs alone. This carries over to the real data analysis too: the interpretation in Fig. 7 appears to hinge on whether the rates are known or estimated, and the estimated rates differ by a large amount from earlier published ones.

      Overall, this is an interesting research direction, and I think the method may hold more promise as we get more and better epigenetic data, and in particular better knowledge of the epigenetic mutational process. At the same time, I would be careful about placing too much emphasis on new findings that emerge solely by switching to SNP+SMP analysis.

      Major comments:

      - For all of the simulated demographic inference results, only plots are presented. This allows for qualitative but not quantitative comparisons to be made across different methods. It is not easy to tell which result is actually better. For example, in Supp. Fig. 5, eSMC2 seems slightly better in the ancient past, and times the trough more effectively, while SMCm seems a bit better in the very recent past. For a more rigorous approach, it would be useful to have accompanying tables that measure e.g. mean-squared error (along with confidence intervals) for each of the different scenarios, similar to what is already done in Tables 1 and 2 for estimating $r$.

      We believe this comment was addressed in the previous revision (Sup Table 6-10) by adding Root Mean Square Errors for the demographic estimates (and RMSE for recent versus past portions of the demography). 

      - 434: The discussion downplays the really odd result that inputting the true value of the mutation rate, in some cases, produces much worse estimates than when they are learned from data (SFig. 6)! I can't think of any reason why this should happen other than some sort of mathematical error or software bug. I strongly encourage the authors to pin down the cause of this puzzling behaviour. (Comment addressed in revision. Still, I find the explanation added at 449ff to be somewhat puzzling -- shouldn't the results of the regional HMM scan only improve if the true mutation rate is given?)

      We do understand that our results and explanation can appear counter-intuitive. As acknowledged by the reviewer, in the previous round of revision we clarified this puzzling behaviour at length: it stems from a discrepancy in how methylation regions are assessed by the region-level HMM, which differs from the HMM used for the SMC inference. We are happy to clarify further in response to the new question from Reviewer #1:

      If Reviewer #1 means the SNP mutations (e.g. A → T), knowing the true mutation rate does not help the HMM to recover the region-level methylation status.

      If Reviewer #1 means the epimutations (whether at the region level, the site level, or both), knowing the true epimutation rates could theoretically help the HMM to recover the region-level methylation status. However, at present, our method does not leverage information from epimutation rates to infer the region-level methylation status. As inferring the epimutation rates is one of the goals of the SMC inference in this study, and the region-level methylation status is required to infer those rates, we suspect that using epimutation rates to infer the region-level methylation status could be statistically inappropriate (generating some kind of circular estimation). Instead, our HMM uses only the proportion of methylated and unmethylated sites (estimated from the genome) to determine whether a region is most likely to be methylated or unmethylated. We now state this fact explicitly in the description of the HMM for methylation regions in the Methods section.

      We acknowledge that our HMM to infer region level methylation status could be improved, but this would be a complete project and study on its own (due to the underlying complexity of the finite site and the lack of a consensus model for epimutations at evolutionary time scale). We believe our HMM to have been the best compromise with what was known from methylation and our goals when the study was conducted, and future work is definitely worth conducting on the estimation of the methylation regions.

      - As noted at 580, all of the added power from integrating SMPs/DMRs should come from improved estimation of recent TMRCAs. So, another way to study how much improvement there is would be to look at the true vs. estimated/posterior TMRCAs. Although I agree that demographic inference is ultimately the most relevant task, comparing TMRCA inference would eliminate other sources of differences between the methods (different optimization schemes, algorithmic/numerical quirks, and so forth). This could be a useful addition, and may also give you more insight into why the augmented SMC methods do worse in some cases. (Comment addressed in revision via Supp. Table 7.).

      - A general remark on the derivations in Section 2 of the supplement: I checked these formulas as best I could. But a cleaner, less tedious way of calculating these probabilities would be to express the mutation processes as continuous time Markov chains. Then all that is needed is to specify the rate matrices; computing the emission probabilities needed for the SMC methods reduces to manipulating the results of some matrix exponentials. In fact, because the processes are noninteracting, the rate matrix decomposes into a Kronecker sum of the individual rate matrices for each process, which is very easy to code up. And this structure can be exploited when computing the matrix exponential, if speed is an issue.

      We believe this comment was acknowledged in the previous revision (line 649), and we thank the reviewer for this interesting insight.
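
      The Kronecker-sum structure described in the reviewer's comment can be illustrated with a short, self-contained sketch. It is not the authors' implementation: it assumes two independent two-state continuous-time Markov chains (a deliberately simplified stand-in for a SNP process and a methylation process), uses hypothetical rates, and replaces a library matrix exponential with a small Taylor-series routine so that it runs with the standard library only. It checks numerically that exponentiating the joint generator A ⊕ B gives the same transition matrix as the Kronecker product of the two marginal transition matrices.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product; rows and columns are indexed by (state of a, state of b)."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def kron_sum(a, b):
    """Kronecker sum A (+) B = A (x) I + I (x) B, the joint generator of two independent chains."""
    return mat_add(kron(a, identity(len(b))), kron(identity(len(a)), b))


def expm(q, t, squarings=10, terms=12):
    """Approximate exp(q * t) with a truncated Taylor series plus scaling and squaring."""
    scale = t / 2 ** squarings
    scaled = [[x * scale for x in row] for row in q]
    result = identity(len(q))
    term = identity(len(q))
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = mat_add(result, term)
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def two_state_generator(forward, backward):
    """Rate matrix of a two-state chain; each row sums to zero."""
    return [[-forward, forward], [backward, -backward]]


# Hypothetical rates and branch length, chosen only for illustration.
snp_q = two_state_generator(0.02, 0.02)
meth_q = two_state_generator(0.7, 0.3)
t = 1.5

joint_direct = expm(kron_sum(snp_q, meth_q), t)            # exponentiate the 4x4 joint generator
joint_factored = kron(expm(snp_q, t), expm(meth_q, t))     # exponentiate each 2x2 generator, then combine

max_gap = max(abs(x - y)
              for row_d, row_f in zip(joint_direct, joint_factored)
              for x, y in zip(row_d, row_f))
print(f"largest entrywise difference: {max_gap:.2e}")
```

      The factored route only ever exponentiates the small per-process matrices, which is the speed advantage the reviewer alludes to when several processes are combined.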

      - Most (all?) of the SNP-only SMC methods allow for binning together consecutive observations to cut down on computation time. I did not see binning mentioned anywhere, did you consider it? If the method really processes every site, how long does it take to run?

      We believe this comment was addressed in the previous revision and was added to the manuscript in the Methods section (subsection: SMC optimization function).

      - 486: The assumed site and region (de)methylation rates listed here are several OOM different from what your method estimated (Supp. Tables 5-6). Yet, on simulated data your method is usually correct to within an order of magnitude (Supp. Table 4). How are we to interpret this much larger difference between the published estimates and yours? If the published estimates are not reliable, doesn't that call into question your interpretation of the blue line in Fig. 7 at 533? (Comment addressed in revision.)

      Reviewer #2 (Public Review):

      A limitation in using SNPs to understand recent histories of genomes is their low mutation frequency. Tellier et al. explore the possibility of adding hypermutable markers to SNP-based methods for better resolution over short time frames. In particular, they hypothesize that epimutations (CG methylation and demethylation) could provide a useful marker for this purpose. Individual CGs in Arabidopsis tend to be either close to 100% methylated or close to 0%, and are inherited stably enough across generations that they can be treated as genetic markers. Small regions containing multiple CGs can also be treated as genetic markers based on their cumulative methylation level. In this manuscript, Tellier et al. develop computational methods to use CG methylation as a hypermutable genetic marker and test them on theoretical and real data sets. They do this both for individual CGs and small regions. My review is limited to the simple question of whether using CG methylation for this purpose makes sense at a conceptual level, not at the level of evaluating specific details of the methods. I have a small concern in that it is not clear that CG methylation measurements are nearly as binary in other plants and other eukaryotes as they are in Arabidopsis. However, I see no reason why the concept of this work is not sound. Especially in the future, as new sequencing technologies provide both base calling and methylation calling capabilities, using CG methylation in addition to SNPs could become a useful and feasible tool for population genetics in situations where SNPs are insufficient.

      We again thank Reviewer #2 for the positive comments.

      Reviewer #3 (Public Review):

      I very much like this approach and the idea of incorporating hypervariable markers. The method is intriguing, and the ability to e.g. estimate recombination rates, the size of DMRs, etc. is a really nice plus. I am not able to comment on the details of the statistical inference, but from what I can evaluate it seems reasonable and in principle the inclusion of highly mutable sites is a nice advance. This is an exciting new avenue for thinking about inference from genomic data. I remain a bit concerned about how well this will work in systems where much less is understood about methylation.

      The authors include some good caveats about applying this approach to other systems, but I think it would be helpful to empiricists outside of thaliana or perhaps mammalian systems to be given some indication of what to watch out for. In maize, for example, there is a nonbimodal distribution of CG methylation (35% of sites are greater than 10% and less than 90%) but this may well be due to mapping issues. The authors solve many of the issues I had concerns with by using gene body methylation, but this is only briefly mentioned on line 659. I'm assuming the authors' hope is that this method will be widely used, and I think it worth providing some guidance to workers who might do so but who are not as familiar with these kinds of data.

      We thank Reviewer #3 for the positive comments. We agree with Reviewer #3 concerning the application to data: our approach needs to be carefully considered before it is applied. Our results clearly show that methylation processes are not yet understood well enough to apply our approach as we initially (maybe naively) designed it. Further investigations need to be conducted and appropriate theoretical models need to be developed before reliable results can be obtained, and we hope that our discussion points this out. However, our approach, the theoretical models and the additional tools contained in this study can help researchers investigate whether or not to use different genomic markers to build a common (potentially more reliable) ancestral history. In this second revision we enhanced the discussion by also clarifying the use of methylation from genic regions to avoid confusion (lines 700-731).

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):

      In added Supp. Table 7, I don't think these are in log10 units as stated in the caption.

      Well spotted! Indeed, the RMSE is not in log10 scale; we corrected the caption. We also added that the TMRCA used for the RMSE calculations is in units of generations to avoid potential confusion.

      Reviewer #3 (Recommendations for The Authors):

      I very much appreciate the authors' attention to previous questions. I would ask that a bit more is spent in the discussion on concerns/approaches empiricists should keep in mind -- I am wary of this being uncritically applied to data from non-model species. It was not clear to me, for example (only mentioned on line 659 in the discussion) that the thaliana data is only using gene-body methylation. This poses potential issues with background selection that the authors acknowledge appropriately, but also assuages many of my concerns about using genome-wide data. I think text with recommendations for data/filtering/etc or at least cautions of assumptions empiricists should be aware of would help.

      We apologize for the confusion at line 659. As written in other sections of the manuscript, we meant CG sites in genic regions (and not only gene body methylated regions).

      Due to the manuscript’s structure, the data from Arabidopsis thaliana are only described at the very end of the manuscript (line 900+), although a brief description can also be found at lines 291-296. We have added a sentence in the introduction (line 128) for clarity.

      We agree with the comment made by Reviewer #3 concerning the application to data. In the discussion we pointed out the risk of applying our approach to ill-understood (or ill-prepared) data and stressed the current need for studies of epimutation processes at the evolutionary time scale (i.e. at the Ne time scale) (lines 700-703).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary:

      Clostridium thermocellum serves as a model for consolidated bioprocessing (CBP) in lignocellulosic ethanol production, yet it faces limitations in the solid contents and ethanol titers achieved by engineered strains thus far. The primary ethanol production pathway involves the enzyme aldehyde-alcohol dehydrogenase (AdhE), which forms long oligomeric structures known as spirosomes, previously characterized via the 3.5 Å resolution E. coli AdhE structure using single-particle cryoEM. The present study describes the cryo-EM structure of the C. thermocellum ortholog, sharing 62% sequence identity with E. coli AdhE, resolved at 3.28 Å resolution. Detailed comparative structural analysis, including the Vibrio cholerae AdhE structure, was conducted. Integrating cryoEM data with molecular dynamics simulations indicated that the aldehyde intermediate resides longer in the channel of the extended form, supporting the hypothesis that the extended spirosome represents the active form of AdhE.

      Strengths: 

      The study conducts a comprehensive structural comparative analysis of oligomerization interfaces and the acetaldehyde channel across compact and extended conformations. Structural and computational results suggest the extended spirosome as the most likely active state of AdhE. 

      Weaknesses: 

      The overall resolution of the C. thermocellum structure is similar to the E. coli ortholog, which shares 62% sequence identity, and the oligomerization interfaces and the acetaldehyde channel were previously described. 

      Reviewer #2 (Public Review): 

      Summary: 

      The manuscript by Ziegler et al., entitled "Structural characterization and dynamics of AdhE ultrastructure from Clostridium thermocellum: A containment strategy for toxic intermediates?", presents the atomic resolution cryo-EM structure of C. thermocellum AdhE, showing that it predominantly adopts an extended form while E. coli AdhE predominantly adopts a compact form. With comparative analysis of their C. thermocellum structure and the previous E. coli AdhE structure, they tried to reveal the mechanism by which C. thermocellum and E. coli show different dominant conformations. In addition, they also analyzed the substrate channel by comparative and computational approaches. Lastly, their computational analysis using CryoDRGN reveals conformational heterogeneity in the sample. Although this manuscript suggests a potential mechanism for the different features of AdhEs, it is very descriptive and does not provide sufficient data to support the authors' conclusions, which may be due to the lack of experimental data to support their findings from the computational analysis.

      Strengths: 

      This manuscript provides the first C. thermocellum (Ct) AdhE structure and compares it with the E. coli AdhE structure.

      Weaknesses: 

      Their main conclusions obtained mostly by computational and comparative analysis are not supported by experimental data. 

      Reviewer #3 (Public Review): 

      This study describes the first structure of Gram-positive bacterial AdhE spirosomes that are in a native extended conformation. All the previous structures of AdhE spirosomes obtained come from Gram-negative bacterial species with native compact spirosomes (E. coli, V. cholerae). In E. coli, AdhE spirosomes can be found in two different conformational states, compact and extended, depending on the substrates and cofactors they are bound to.

      The high-resolution cryoEM structure of the extended C. thermocellum AdhE spirosomes produced in E. coli in an apo state (without any substrate or cofactors) is compared to the E. coli extended and compact AdhE spirosomes structures previously published. The authors have modeled (in Swiss-Model) the structure of compact C. thermocellum AdhE spirosomes, using E. coli compact AdhE spirosome conformation as a template, and performed molecular dynamics simulations. They have identified a channel in which the toxic reaction intermediate aldehyde could transit from the aldehyde dehydrogenase active site to the alcohol dehydrogenase active site, in an analogous manner to E. coli spirosomes. These findings are in line with the hypothesis that the extended spirosomes could correspond to the active form of the enzyme. 

      In this work, the authors speculate that the C. thermocellum AdhE spirosomes could switch from the native extended conformation to a compact conformation, in a way that is the inverse of E. coli spirosomes. Although attractive, this hypothesis is not supported by the literature. Amazingly, in some Gram-positive bacterial species (S. pneumoniae, S. sanguinis or C. difficile...), AdhE spirosomes are natively extended and have never been observed in a compact conformation. By contrast, native AdhE spirosomes from E. coli (and other Gram-negative bacteria) are compact and are able to switch to an extended conformation in the presence of the cofactors (NAD+, CoA, and iron). The data as currently presented are not convincing enough to confirm the existence of C. thermocellum AdhE spirosomes in a compact conformation.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Major points: 

      (1) The claim of achieving the highest resolution AdhE structure lacks strong support since the E. coli structure was solved at 3.5 Å, whereas the C. thermocellum structure was solved at 3.28 Å. Conducting a local resolution analysis could provide insights into distinct structural interpretations, enhancing the strength of the claim.

      We have modified the sentence claiming this as the highest resolution AdhE structure to say, “In this study, we presented and analyzed a high-resolution structure of the AdhE spirosome from C. thermocellum.” We have included the local resolution map in Figure 2C – all structural analysis was performed in regions from the center of the molecule, where the highest resolution information was determined.

      (2) The comparative structural analysis of the oligomerization interface is thorough, yet it could benefit from greater conciseness. Focusing on highlighting major findings would streamline the presentation and enhance clarity. 

      We altered a few places in the comparative structural analysis in response to other reviewers. We also divided the main structure section into two subsections (spirosome interfaces and AdhE active sites) to enhance clarity.

      Reviewer #2 (Recommendations For The Authors): 

      (1) The authors should change the title containing "?". Does it mean that the conclusions that the authors made are still in question?

      We have removed the question mark to indicate that our results point to a channeling mechanism.

      (2) Figure 1B: Clarify Ct Fwd. Is this adding NADH, and Ct Rev adding NAD+? 

      This information is described in the text in lines 98-100. It is also at the bottom of figure 1B.

      (3) Line 131: Please revise accordingly for clarity: "The extended dimer interfaces" → "The extended E. coli dimer interface".

      This has been edited for clarity. We have added the following sentence to indicate which interfaces are being discussed: “Both the E. coli and C. thermocellum extended dimer interfaces bury ~5000 Å². While the compact C. thermocellum compact dimer interface buries a similar surface area of ~4800 Å², the E. coli dimer interface buries ~3800 Å².”

      (4) Line 133-136: Why does that not seem to be the case? It is not clear exactly what the authors mean in these sentences.

      We altered the text to say, “One would expect the compact structure in E. coli to have a larger buried surface area due to it being the predominant form when it is examined without additives, but that is not the case; further corroborating that factors other than buried surface area must impact the apo state of the spirosome.” We hope this clarifies our intent.

      (5) Line 138-145: The authors should provide a rationale for how the different distribution of the charged residues would change the form of AdhE. It may just be a different distribution that has nothing to do with the conformational change.

      After further analysis of the interface amino acid distribution, we agree that the distribution may have nothing to do with the conformational change. We have changed this section to end with the sentence “Analysis of the residues buried in these interfaces reveals that while many of the residues are identical in the C. thermocellum and E. coli extended structures, there are some differences in amino acid type distribution, although nothing that directly indicates control of conformer state (Supplemental Figure 3).”

      (6) Line 169: Kim et al. → Cho et al.

      We have corrected this error.

      (7) Line 122-235: The whole section just describes the differences between Ct and Ec AdhE, suggesting without any evidence that these differences may contribute to the conformational difference. The authors cannot say that the differences in the interface, active sites, cofactor pockets, etc. explain why the two AdhEs (Ct, Ec) have different domain conformers unless they provide experimental data.

      We did not conclude that any differences we observed structurally were responsible for the conformational change. The purpose of this section was solely to compare the structures to determine if we could find a structural basis for the difference between the E. coli and C. thermocellum conformations; we stated a few times throughout the section and in the discussion that there were no immediate structural reasons for this difference in shape. We have added a few sentences in the discussion to address whether Gram-positive vs. Gram-negative origin influences the shape, as addressed in reviewer #3 comment #4.

      (8) Line 237: The whole section "Identification..." analyzed the substrate channel by computational analysis. The author should provide experimental evidence that these residues identified are critical for channeling by generating mutants and measuring their activity. 

      We agree that mutagenesis is the next logical step for these results; however, it is outside the scope of this paper, as such a study will not be straightforward. We have included a sentence in the discussion to indicate our plans for further investigation of the channel: “Future mutagenesis studies will be needed to confirm whether the spirosome exists to control the reaction flux in high-reactant conditions.”

      Reviewer #3 (Recommendations For The Authors): 

      (1) The capacity of C. thermocellum AdhE spirosomes to switch from a natively extended conformation to a compact conformation is not demonstrated in this manuscript, as it is now. Because this would be the first time that Gram-positive bacterial AdhE spirosomes are observed in a compact conformation, the authors should provide a clear demonstration of their existence by presenting reliable and good images of C. thermocellum compact spirosomes. 

      We have modified Figure 1A to zoom in on one compact and extended spirosome that we have identified from each C. thermocellum sample. We have included triangles of the same size and shape to indicate the proximity of a turn of a helix, showing that the identified compact spirosomes have a tighter conformation than extended spirosomes.

      (2) The authors should show at least an image of the compact C. thermocellum spirosomes that they claim to observe in the presence of NADH or in the forward reaction conditions mentioned in Figure 1. The authors have added different reactants to the extended C. thermocellum spirosomes and visualized their conformation by negative stain. An image of each condition tested would be valuable and would nicely complete the distribution of compact versus extended spirosomes presented in Figure 1.

      We have created a new supplemental figure with spirosomes circled for all of the experimental conditions for C. thermocellum (Supplemental figure 1). We have added a reference to supplemental figure 1 in the text to direct the reader to these images.

      (3) The cryoEM classes presented in Figure 8 are not convincing and could correspond to dimers or rosettes of AdhE or to E. coli endogenous AdhE. CryoEM classes showing longer compact C. thermocellum spirosomes should be shown. The percentage of these compact spirosomes visualized in the micrographs should be added and discussed in the text as it would increase confidence in these findings and confirm that C. thermocellum compact spirosomes exist. Heterologous production of C. thermocellum AdhE in E. coli depleted for its endogenous AdhE would be required to definitively prove that these are compact C. thermocellum AdhE spirosomes in the cryoEM. 

      We included the pictures of the theoretical compact spirosomes, as generated from the 8-mer of E. coli AdhE (6AHC), to address the possibility of rosettes. We have now indicated in the text that 6.7% of the particles were in the compact conformation, which is less than seen by negative stain. We further mentioned that the compact spirosome is less compact than that seen in E. coli. We added a sentence to the discussion about the possibility of contaminating E. coli spirosomes (though this is very unlikely) in our compact spirosome analysis: “While these compact spirosomes could result from expression in E. coli, though this is very unlikely, we also identified compact spirosomes in a native C. thermocellum lysate, which would not have similar contamination issues.”

      (4) The authors should include and discuss in the text previous findings (among which Laurenceau et al., 2015...) describing the differences between Gram-positive and Gram-negative spirosomes. AdhE spirosomes are natively extended in most Gram-positive bacterial species (S. pneumoniae, S. sanguinis or C. difficile...), and have never been observed in a compact conformation. By contrast, native AdhE spirosomes from E. coli (and other Gram-negative bacteria) are compact and are able to switch to an extended conformation in the presence of the cofactors (NAD+, CoA, and iron).

      We have added the following sentences to the discussion to address this comment: “This could potentially be due to the differences between Gram-positive and Gram-negative bacteria. In previous studies, compact spirosomes have only been isolated from Gram-negatives while solely extended spirosomes have been isolated from Gram-positives. Furthermore, while the compact spirosomes can transition to extended in the presence of cofactors, the reverse has not been previously observed with an extended spirosome.”

      (5) The authors have spotted some differences between the E. coli and C. thermocellum structures that they believe could explain the intrinsic capacity of these spirosomes to be natively extended or compact. It would be interesting to confirm this hypothesis by measuring C. thermocellum extended AdhE spirosome activity and comparing it to that of E. coli extended spirosomes. An assessment of the impact of mutations in the regions proposed by the authors to be important for the capacity of C. thermocellum AdhE to be extended (especially the GxGxxG motif and the D494 position) would be appreciated to confirm this hypothesis.

      We agree that this would be an interesting avenue of research although it is currently outside the scope of this paper. We are looking into experiments that we can perform where we can track both activity and conformation but have not found an ideal experiment at this time.

      (6) Many statements and result interpretations are overstated in several parts of the manuscript and would need to be rewritten to balance the absence of clear evidence of C. thermocellum compact spirosomes. 

      We have shown that we have identified compact spirosomes, as addressed in multiple comments above. We have adjusted the language of the paper to indicate more uncertainty, which will be followed up in future mutagenesis experiments. However, these mutations are not simple to identify, and this research would require a fairly large study that is better suited to a follow-up manuscript.

      (7) The Figure 7 legend would need to be corrected.

      We are unsure as to what needs to be corrected in the figure 7 legend based on this comment.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Strengths:

      (1) In my assessment, the data sufficiently demonstrate that a modified version of Pertuzumab can bind both the wild-type and S310 mutant forms of ERBB2.

      (2) The engineering strategy employed is rational and effectively combines computational and experimental techniques.

      (3) Given the clinical activity of HER2-targeting ADCs, antibodies unaffected by ERBB2 mutations would be desired.

      Weaknesses:

      (1) There are no data showing that the engineered antibody is as specific as Pertuzumab, i.e., that it does not bind to other (non-ERBB2) proteins.

      Showing the specificity of the engineered antibodies is indeed important. We did not address it in the current ms, but it can be tested in the future.

      (2) There is no data showing that the engineered antibody has the desired pharmacokinetics/pharmacodynamics properties or efficacy in vivo.

      In this ms we did not conduct in-vivo experiments. When moving forward, pharmacokinetics/pharmacodynamics properties and efficacy will be tested as well.

      (3) Computational approaches are only used to design a phage-screen library, but not used to prioritize mutations that are likely to improve binding (e.g. based on predicted impact on the stability of the interaction). A demonstration of how computational pre-screening or lead optimization can improve the time-intensive process would be a welcome advance.

      Thank you for this important comment. In the present ms we indeed used a computational approach for prioritizing residues to be mutated, but we did not prioritize the mutations that are likely to improve binding. In the initial library design, we did prioritize the mutations. However, due to limitations of the experimental approach regarding codon selection for the library, we decided to allow all possible residues in each position, knowing that the selection would remove non-binding variants.

      Context:

      The conflict of interest statement is inadequate. Most authors of the study (but not the first author) are employees of Biolojic, a company developing multi-specific antibodies, but the statements do not clarify whether the presented antibodies represent Biolojic IP, whether the company sponsored the research, and whether the company is further developing the specific antibodies presented.

      The Conflict-of-Interest statement will be revised as such: The Biolojic Design authors are employees of Biolojic Design and have stock options in Biolojic Design. The company did not sponsor the research, does not hold IP for the presented antibodies, and is not further developing the presented antibodies.

      Reviewer #2 (Public Review):

      Strengths:

      (1) Deep computational analyses of large datasets of clinical data provide useful information about HER2 mutations and their potential relevance to antibody therapy resistance.

      (2) There is valuable information analyzing the residues within or near the interface between the antigen HER2 and the Pertuzumab antibody (heavy chain). The experimental antibody library screening obtained 90+ clones from 3.86×10¹¹ sequences for further functional validation.

      Weaknesses:

      (1) There is a lack of assessment of antibody variant functions in cancer cell phenotypes in vitro (proliferation, cell death, motility) or in vivo (tumor growth and animal survival). The only assay was the western blotting of phospho-HER3 in Figure 4. However, HER2 levels and phospho-HER2 were not analyzed.

      We indeed did not assess the engineered antibodies’ function in cancer cells. While a complete signaling assessment obviously requires functional assessment as well, due to the complexity of this assay, papers in this field (for example [1-3]) measure the signaling activation following HER2-HER3 dimerization by measuring pHER3, and we relied on them in this ms.

      (2) The title, which refers to computational engineering of a therapeutic antibody, and the statement in the abstract "we designed a multi-specific version of Pertuzumab that retains original function while also binding these HER2 variants" give a misleading impression, for a few reasons:

      a. The primary method used for variant antibody identification for HER2 mutant binding is rather traditional experimental screening based on yeast display instead of the computational design of a multi-specific version of Pertuzumab.

      b. Computation plays an insufficient role, or no role, in the antibody design or in prioritizing variant residues for the library construction of 3.86×10¹¹ sequences. It seems to be random combinations from 6 residues out of 4 groups with 20 amino acid options.

      c. The final version of the tri-binding variant is a combination of screened antibody clones instead of computation design from scratch.

      d. There is incomplete experimental evidence about the therapeutic values of newly obtained antibody clones.

      Thank you for this relevant comment. When addressing relevant residues to be mutated, the number of potential variants is enormous. The computational approach was aimed at identifying the most preferable residues, in which variation can improve binding and is not likely to harm important interactions. Although a smaller initial number of residues could have been chosen, we decided to broaden our view and create a larger library, with the aim of combining the computational selection with an experimental selection. This indeed is not a computational design from scratch, but rather an interplay between computation and experiment that yielded the presented results.

      (3) Figures can be improved with better labeling and organization. Some essential pieces of data such as Supplementary Figure 1B on HER2 mutations in S310 that abrogated its binding to Pertuzumab should be placed in the main figures.

      Thank you for this comment, the relevant figures were moved to the main text, and the labels were revised.

      (4) It is recommended to provide a clear rationale or flowchart overview into the main Figure 1. Figure 2A can be combined with Figure 1 to the list of targeted residues.

      Figures 1 and 2 were divided differently, and the rationale was moved to the main text.

      (5) The quality of Figures such as Figure 2B-C flow data needs to be improved.

      High-quality figures were submitted with the revised ms.

      Reviewer #1 (Recommendations for The Authors):

      Major:

      (1) It should be clarified whether the S310 somatic mutations represent resistance mutations to Pertuzumab (i.e. emerge post-therapy) or are general mutations that activate HER2. This is important because mutations that specifically "evade" the binding of an antibody may be substantially more difficult to overcome than mutations that only by chance occur in the antibody binding site. This concern should be addressed in the introduction and discussion as it changes the interpretation of the data.

      This is a very important note. To the best of our knowledge, these mutations were not identified as resistance mutations that emerged post-therapy. However, as mentioned in the introduction, these mutations form hydrophobic interactions that stabilize HER2 dimerization. Moreover, cells expressing these mutations show hyperphosphorylation of HER2 and an increase in the subsequent activation of signaling pathways. Thus, these mutations do not necessarily evade Pertuzumab binding, but benefit cancer growth. This point was clarified in the introduction of the revised text.

      (2) While the authors claim that S310 germline pathogenic variants exist, I could not find evidence that this is the case. The dbGAP ID does not provide any evidence (either in the form of a citation or prevalence). The variants do not exist in GnomAD. A recent article discussing pathogenic ERBB2 germline variants only mentions S310 as a somatic variant https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8268839/ and I could not find evidence for S310 being a germline variant in the references provided by the author (https://www.nature.com/articles/nbt.3391) - where it is only mentioned as a somatic mutation. I could not find evidence of a cancer predisposition syndrome associated with this variant.

      Thank you for highlighting this matter. We had assumed that the presence of the variant in dbSNP means it is also a germline mutation, which may not be correct. However, we did find some evidence of this mutation as germline in ClinVar, and this was edited in the revised ms. https://www.ncbi.nlm.nih.gov/clinvar/RCV001311879.7.

      (3) The authors should consider experiments that show that the modified Pertuzumab has the same mechanism of action as the original Pertuzumab in preventing dimerization of the ERBB2 homodimer and/or interactions with ERBB3. I cannot recommend a specific approach, but at present it is not clear whether the mechanism or just the effect (phosphorylation of ERBB3) is the same.

      As mentioned above, for the assessment of HER2-HER3 binding and HER3 signaling, in this ms we relied on previous works [1-3] that also measured the signaling activation following HER2-HER3 dimerization by measuring pHER3.

      (4) The authors should perform in vitro experiments to demonstrate that the engineered antibody has similar on-target specificity not only sensitivity. I don't know what the ideal experiments would be, but should probably probe native epitopes. Western blots, immunoprecipitation of cell lysates?

      As mentioned above, showing the specificity of the engineered antibodies is indeed important. We did not address it in the current ms, but it can be tested in future work.

      Minor:

      (1) The introduction should review better the literature on the computational/rational design of antibodies, especially multi-specific - and likely de-emphasize small molecules (and mutations associated with the resistance thereof) as the presented research does not inform the design of mutation-agnostic small molecules.

      Thank you for these comments, the introduction was revised accordingly.

      (2) The authors should better present the fact that the lack of binding of Pertuzumab to HER2 S310 was previously known, thus the whole strategy of searching COSMIC and computationally predicting the binding impact of the mutations was unnecessary. Rather it would be helpful to learn how many other COSMIC hotspots could have a similar effect on other clinical antibodies.

      The lack of binding was indeed previously known, as mentioned in the introduction. However, we did not start our analysis targeting HER2 specifically, but we rather found these mutations because they were located in the binding pocket, which enabled our strategy to compensate for these mutations with alteration of the original Pertuzumab. Regarding other potential hotspots, the numbers appeared in Supplementary Table 1, and were moved to the main text.

      Stylistic:

      (1) Avoid using the term "drug" for an antibody.

      The term was changed to “antibody therapeutics” in the revised text.

      (2) Avoid repetition in the introduction.

      Thank you, we revised the introduction with this comment in mind.

      Reviewer #2 (Recommendations For The Authors):

      The quality of Figure 2B-C flow data needs to be improved:

      a. The diagonal populations suggest inappropriate color compensation or indicate cells are derived from unhealthy populations.

      We believe there may be some confusion here. The figures you are referring to show a very diverse library. The selected clones show nice diagonals, as shown in Supplementary Figure 5.

      b. Additional round 3 and round 4 did not seem to improve the enrichment of targeted clones but rather had similar binding profiles to each of the three proteins over and over.

      Two sets of the fourth round of selection were done, each originating from a different sub-population in round 3: (1) clones that bind the S310Y mutation and (2) clones that bind the S310F mutation. The aim of R4 was to examine these binders against the second mutation and canonical HER2 in the search for multi-specificity. Additional clarification of this point will be added to the main text.

      c. Figure legends are vague with non-specific descriptions of cells and conditions, and unclear statements of "FACS results...".

      The legends were edited in the revised version.

      d. Text fonts are in low resolution.

      High-quality figures were submitted with the revised ms.

      (1) Diwanji, D., et al., Structures of the HER2-HER3-NRG1β complex reveal a dynamic dimer interface. Nature, 2021. 600(7888): p. 339-343.

      (2) Yamashita-Kashima, Y., et al., Mode of action of pertuzumab in combination with trastuzumab plus docetaxel therapy in a HER2-positive breast cancer xenograft model. Oncol Lett, 2017. 14(4): p. 4197-4205.

      (3) Kang, J.C., et al., Engineering multivalent antibodies to target heregulin-induced HER3 signaling in breast cancer cells. MAbs, 2014. 6(2): p. 340-53.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The development of effective computational methods for protein-ligand binding remains an outstanding challenge to the field of drug design. This impressive computational study combines a variety of structure prediction (AlphaFold2) and sampling (RAVE) tools to generate holo-like protein structures of three kinases (DDR1, Abl1, and Src kinases) for binding to type I and type II inhibitors. Of central importance to the work is the conformational state of the Asp-Phe-Gly "DFG motif" where the Asp points inward (DFG-in) in the active state and outward (DFG-out) in the inactive state. The kinases bind to type I or type II inhibitors when in the DFG-in or DFG-out states, respectively.

      It is noted that while AlphaFold2 can be effective in generating ligand-free apo protein structures, it is ineffective at generating holo-structures appropriate for ligand binding. Starting from the native apo structure, structural fluctuations are necessary to access holo-like structures appropriate for ligand binding. A variety of methods, including reduced multiple sequence alignment (rMSA), AF2-cluster, and AlphaFlow may be used to create decoy structures. However, those methods can be limited in the diversity of structures generated and lack a physics-based analysis of Boltzmann weight critical to their relative evaluation.

      To address this need, the authors combine AlphaFold2 with the Reweighted Autoencoded Variational Bayes for Enhanced Sampling (RAVE) method, to explore metastable states and create a Boltzmann ranking. With that variety of structures in hand, grid-based docking methods Glide and Induced-Fit Docking (IFD) were used to generate protein-ligand (kinase-inhibitor) complexes.

      The authors demonstrate that using AlphaFold2 alone, there is a failure to generate DFG-out structures needed for binding to type II inhibitors. By applying the AlphaFold2 with rMSA followed by RAVE (using short MD trajectories, SPIB-based collective variable analysis, and enhanced sampling using umbrella sampling), metastable DFG-out structures with Boltzmann weighting are generated enabling protein-ligand binding. Moreover, the authors found that the successful sampling of DFG-out states for one kinase (DDR1) could be used to model similar states for other proteins (Abl1 and Src kinase). The AF2RAVE approach is shown to result in a set of holo-like protein structures with a 50% rate of docking type II inhibitors.

      Overall, this is excellent work and a valuable contribution to the field that demonstrates the strengths and weaknesses of state-of-the-art computational methods for protein-ligand binding. The authors also suggest promising directions for future study, noting that potential enhancements in the workflow may result from the use of binding site prediction models and free energy perturbation calculations.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the utility of AlphaFold2 (AF2) and the authors' own AF2-RAVE method for drug discovery. As has been observed elsewhere, the predictive power of docking against AF2 structures is quite limited, particularly for proteins like kinases that have non-trivial conformational dynamics. However, using enhanced sampling methods like RAVE to explore beyond AF2 starting structures leads to a significant improvement.

      Strengths:

      This is a nice demonstration of the utility of the authors' previously published RAVE method.

      Weaknesses:

      My only concern is the authors' discussion of induced fit. I'm quite confident the structures discussed are present in the absence of ligand binding, consistent with conformational selection. It seems the authors' own data also argue for an important role for conformational selection. It would be nice to acknowledge this instead of going along with the common practice in drug discovery of attributing any conformational changes to induced fit without thoughtful consideration of conformational selection.

      The reviewer is correct. We aim to highlight the significant role of conformational selection. To clarify this, we have expanded the discussion on conformational selection in the introduction.

      Reviewer #3 (Public Review):

      In this manuscript, the authors aim to enhance AlphaFold2 for protein conformation-selective drug discovery through the integration of AlphaFold2 and physics-based methods, focusing on improving the accuracy of predicting protein structures ensemble and small molecule binding of metastable protein conformations to facilitate targeted drug design.

      The major strength of the paper lies in the methodology, which includes the innovative integration of AlphaFold2 with all-atom enhanced sampling molecular dynamics and induced fit docking to produce protein ensembles with structural diversity. Moreover, the generated structures can be used as reliable crystal-like decoys to enrich metastable conformations of holo-like structures. The authors demonstrate the effectiveness of the proposed approach in producing metastable structures of three different protein kinases and perform docking with their type I and II inhibitors. The paper provides strong evidence supporting the potential impact of this technology in drug discovery. However, limitations may exist in the generalizability of the approach across other structures, especially complex structures such as protein-protein or DNA-protein complexes.

      Proteins undergo thermodynamic fluctuations and can occasionally reach metastable configurations. It can be assumed that other biomolecules, such as proteins and DNA, stabilize these metastable states when forming protein-protein or protein-DNA complexes. Since our method has the potential to identify these metastable states, it shows promise for designing drugs targeting proteins in allosteric configurations induced by other biomolecules.

      The authors largely achieved their aims by demonstrating that the AF2RAVE-Glide workflow can generate holo-like structure candidates with a 50% successful docking rate for known type II inhibitors. This work is likely to have a significant impact on the field by offering a more precise and efficient method for predicting protein structure ensembles, which is essential for designing targeted drugs. The utility of the integrated AF2RAVE-Glide approach may streamline the drug discovery process, potentially leading to the development of more effective and specific medications for various diseases.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggestions

      (1) The computational protocol is found to be insufficient to generate precise values of the relative free energies between structures generated. The authors note in the Conclusion that an enhancement in the workflow might result from the addition of free energy calculations. Can the authors comment on the prospects for generating more accurate estimates of the free energy that might be used to qualitatively evaluate poses and the free energy landscape surrounding putative metastable states? What are the principal challenges and what might help overcome them? What would the most effective computational protocol be?

      More accurate estimates of the free energy can theoretically be achieved by increasing the number of umbrella sampling windows and extending the simulation length until the PMF converges. However, there is always a trade-off between PMF accuracy and computational cost, so we have chosen to stick with the current setup. Metadynamics is another method for obtaining a more accurate free energy profile, and we have used it in previous versions of AlphaFold2-RAVE. For the specific systems we investigated, however, it struggled to achieve back-and-forth transitions given the highly entropic nature of the activation loop. Research in enhanced sampling methods and dimensionality reduction techniques for reaction coordinates is continually evolving and will play a critical role in alleviating this problem.
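      The convergence criterion mentioned above can be illustrated with a minimal sketch. It is not the authors' protocol: it assumes the biased umbrella-sampling data have already been reweighted (for example by WHAM) into an unbiased histogram along the reaction coordinate, and the function names, the 300 K default, and the kcal/mol units are hypothetical choices for illustration. The PMF is taken as F = -kT ln P, and convergence is judged by how much the profile changes between a partial and a full data set.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def pmf_from_counts(counts, temperature=300.0):
    """Return a PMF (kcal/mol), shifted so its minimum is 0, from unbiased bin counts.

    Assumption: `counts` is an already-unbiased histogram along the
    reaction coordinate. Empty bins get an infinite free energy.
    """
    total = sum(counts)
    kT = KB_KCAL * temperature
    free = [(-kT * math.log(c / total)) if c > 0 else math.inf for c in counts]
    fmin = min(free)
    return [f - fmin for f in free]


def max_pmf_change(counts_partial, counts_full, temperature=300.0):
    """Largest per-bin PMF difference between a partial and a full data set.

    Bins that are empty in either histogram are skipped. A small value
    suggests (but does not prove) that the profile has stopped changing.
    """
    a = pmf_from_counts(counts_partial, temperature)
    b = pmf_from_counts(counts_full, temperature)
    diffs = [abs(x - y) for x, y in zip(a, b)
             if math.isfinite(x) and math.isfinite(y)]
    return max(diffs, default=0.0)
```

      In practice one might call `max_pmf_change` on histograms built from the first half and from the whole of each trajectory and extend the simulations until the value falls below a chosen tolerance; the tolerance itself is a judgment call that trades accuracy against cost, as the response notes.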

      (2) I was surprised that there was not more correlation of a funnel-like shape in Figures S16 and S18, showing a stronger correlation between low RMSD and better docking score. This is true for both the ponatinib and imatinib applications in DDR1 and Abl1. That also seems true for the trimmed results for Src kinase in Figure S19. I was also surprised that there are structures with very large RMSD but docking scores comparable to the best structures of the lowest RMSD. Might something be done to make the docking score a more effective discriminator?

      The docking algorithm and docking score are used to filter out highly improbable docking poses. False positives in predicted docking poses are a common issue across all docking methods as described for instance in:

      Fan, Jiyu, Ailing Fu, and Le Zhang. "Progress in molecular docking." Quantitative Biology 7 (2019): 83-89.

      Ferreira, R.S., Simeonov, A., Jadhav, A., Eidam, O., Mott, B.T., Keiser, M.J., McKerrow, J.H., Maloney, D.J., Irwin, J.J. and Shoichet, B.K., 2010. "Complementarity between a docking and a high-throughput screen in discovering new cruzain inhibitors." Journal of medicinal chemistry, 53(13), pp.4891-4905.

      Moreover, there is always a trade-off between docking accuracy and computational cost. While employing more accurate docking methods may decrease false positives, it can also be resource-intensive. In such scenarios, our approach to enriching holo-structures can be impactful by reducing the number of pocket structures in the input ensembles and significantly enhancing docking efficiency.

      (3) I think that it is fine to identify one structure as "IFD winner" but also feel that its significance is overstressed, especially given that it can be identified only in a retrospective analysis rather than through de novo prediction.

      We agree with the reviewer. We did not intend to emphasize the specific structure "IFD winner". Rather, we aimed to demonstrate that our method can enrich promising candidates for holo-structures. We verified this by showing that our holo-structure candidates performed well in retrospective docking using IFD, which we previously referred to as "IFD winner". We have now revised this term to "holo-model".

      Minor Points

      p. 3 "DymanicBind" should be "DynamicBind"

      p. 3 Change "We chosen" to "We have chosen" or "we chose."

      p. 3 In identifying the Schrödinger software Glide and IFD, I recommend removing the subjective modifier "industry-leading."

      Modifications done.

      Reviewer #2 (Recommendations For The Authors):

      In the view of this reviewer, the writing is 'choppy'.

      We have tried to improve the writing.

      Reviewer #3 (Recommendations For The Authors):

      (1) In Figure 1, the workflow labels (i) to (iv) are not shown on the figures, making it difficult for readers to follow. Consider adding these labels to the figures.

      Modifications done.

      (2) Explain how Boltzmann ranks were calculated based on unbiased MD simulations to guide the enrichment of holo-like structures in metastable states.

      The Methods section is now updated for clarification.

      (3) The authors could clarify how the classical DFG-out decoys in the DDR1 rMSA AF2 ensemble are transferred to Abl1 kinase in the Methods section.

      The Methods section is now updated for clarification.

      (4) The authors can clarify the methodology section by providing more detailed explanations about how the unbiased MD simulations are performed, including which MD simulation software was used and whether energy minimization and equilibrium steps were needed as in conventional MD simulations, and other setup details.

      The Methods section is now updated for clarification.

      (5) The validation of the proposed approach in this work used three kinase proteins. The authors can enhance the discussion section by addressing other types of protein structure prediction that can use the proposed approach in drug discovery, beyond the three kinase proteins tested.

      The proposed approach is theoretically applicable to other types of proteins, such as GPCRs, where both conformational selection and the induced-fit effect are crucial. We have expanded the discussion on the generalization of our protocol in the Conclusion section.

      (6) The authors should add appropriate citations for the software and tools used in the manuscript. For example, a reference should be added for the Glide XP docking experiments that utilized the Maestro software. Double-check all related software citations.

      We have now updated the citations for docking experiments based on the instruction of the Maestro Glide User manual and IFD User manual.

      (7) The authors should consider offering a comprehensive list of software tools and databases utilized in the study to assist in replicating the experiments and further validating the results.

      We have now added a summary of tools used in the Methods section.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The authors present evidence suggesting that MDA5 can substitute as a sensor for triphosphate RNA in a species that naturally lacks RIG-I. The key findings are potentially important for our understanding of the evolution of innate immune responses. Compared to an earlier version of the paper, the strength of evidence has improved but it is still partially incomplete due to a few key missing experiments and controls.

      We would like to thank the editorial team for their positive comments and constructive suggestions on improving our manuscript. We have made further improvements based on the valuable suggestions of the reviewers, and we are pleased to send you the revised manuscript now. After revising the manuscript and further supplementing with experiments, we think that our existing data can support our claims.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study offers valuable insights into host-virus interactions, emphasizing the adaptability of the immune system. Readers should recognize the significance of MDA5 in potentially replacing RIG-I and the adversarial strategy employed by 5'ppp-RNA SCRV in degrading MDA5 through m6A modification in different species, further indicating that m6A is a conserved process in the antiviral immune response.

      However, caution is warranted in extrapolating these findings universally, given the dynamic nature of host-virus dynamics. The study provides a snapshot into the complexity of these interactions, but further research is needed to validate and extend these insights, considering potential variations across viral species and environmental contexts. Additionally, it is noted that the main claims put forth in the manuscript are only partially supported by the data presented.

      After meticulous revisions of the manuscript, including adjustments to the title, abstract, results, and discussion, the main claim of our study is now the arms race between the MDA5 receptor and the SCRV virus in a lower vertebrate fish, M. miiuy. This comprises two parts. First, the MDA5 of M. miiuy can detect virus invasion and initiate the host immune response by recognizing the triphosphate structure of SCRV. Second, as an adversarial strategy, the 5’ppp-RNA SCRV virus can exploit the m6A mechanism to degrade MDA5 in M. miiuy. Based on the reviewer's suggestions, we have further supplemented the critical experiments (Figure 3F-3G, Figure 4D, Figure 5G) and provided a more detailed and accurate explanation of the experimental conclusions. We believe that the current manuscript supports our main claims. In addition, because virus-host coevolution complicates the derivation of universal conclusions, we will further expand our insights in future research.

      Reviewer #2 (Public Review):

      This manuscript by Geng et al. aims to demonstrate that MDA5 compensates for the loss of RIG-I in certain species, such as the teleost fish miiuy croaker. The authors use Siniperca chuatsi rhabdovirus (SCRV) and poly(I:C) to demonstrate that these RNA ligands induce an IFN response in an MDA5-dependent manner in M. miiuy-derived cells. Furthermore, they show that MDA5 requires its RD domain to directly bind to SCRV RNA and to induce an IFN response. They use in vitro synthesized RNA with a 5'triphosphate (or lacking a 5'triphosphate as a control) to demonstrate that MDA5 can directly bind to 5'-triphosphorylated RNA. The second part of the paper is devoted to m6A modification of MDA5 transcripts by SCRV as an immune evasion strategy. The authors demonstrate that the modification of MDA5 with m6A is increased upon infection and that this causes increased decay of MDA5 and consequently a decreased IFN response.

      One critical caveat in this study is that it does not address whether ppp-SCRV RNA induces IRF3-dimerization and type I IFN induction in an MDA5 dependent manner. The data demonstrate that mmiMDA5 can bind to triphosphorylated RNA (Fig. 4D). In addition, triphosphorylated RNA can dimerize IRF3 (4C). However, a key experiment that ties these two observations together is missing.

      Specifically, although Fig. 4C demonstrates that 5'ppp-SCRV RNA induces dimerization (unlike its dephosphorylated or capped derivatives), this does not prove that this happens in an MDA5-dependent manner. This experiment should have been done in WT and siMDA5 MKC cells side-by-side to demonstrate that the IRF3 dimerization that is observed here is mediated by MDA5 and not by another (unknown) protein. The same holds true for Fig. 4J.

      Thank you for the referee's professional suggestions. In fact, we have transfected SCRV RNA into WT and si-MDA5 MKC cells, and subsequently assessed the dimerization of IRF3 and the IFN response (Figure 2P-2Q). The results indicated that knockdown of MDA5 prevents immune activation of SCRV RNA. However, considering the potential for SCRV RNA to activate immunity independent of the triphosphate structure, this experimental observation does not comprehensively establish the MDA5-dependent induction of IRF3 dimer by 5’ppp-RNA. Accordingly, in accordance with the referee's recommendation, we proceeded to investigate the inducible activity of 5'ppp-SCRV on IRF3 dimerization in WT and si-MDA5 MKC cells, revealing that 5'ppp-SCRV indeed elicits immunity in an MDA5-dependent manner (Figure 4D). Additionally, poly(I:C)-HMW, a known ligand for MDA5, demonstrated a residual, albeit attenuated, activation of IRF3 following MDA5 knockdown, potentially attributed to its capacity to stimulate immunity through alternative pathways such as TLR3.

      - Fig 1C-D: these experiments are not sufficiently convincing, i.e. the difference in IRF3 dimerization between VSV-RNA and VSV-RNA+CIAP transfection is minimal.

      We have reconstituted the necessary materials and repeated the pertinent experiments depicted in Fig 1C-1D. The results demonstrate that SCRV-RNA+CIAP and VSV-RNA+CIAP exhibit a mitigating effect on the induction activity of SCRV-RNA and VSV-RNA on IRF3 dimerization, albeit without complete elimination (Figure 1C and 1D). These findings suggest the presence of receptors within M. miiuy and G. gallus capable of recognizing the viral triphosphate structure; however, it is worth noting that RNA derived from SCRV and VSV viruses does not exclusively depend on the triphosphate structure to activate the host's antiviral response.

      Fig. 2N and 2O: why did the authors decide to use overexpression of MDA5 to assess the impact of STING on MDA5-mediated IFN induction? This should have been done in cells transfected with SCRV or polyIC (as in 2D-G) or in infected cells (as in 2H-K). In addition, it is a pity that the authors did not include an siMAVS condition alongside siSTING, to investigate the relative contribution of MAVS versus STING to the MDA5-mediated IFN response. Panel O suggests that the IFN response is completely dependent on STING, which is hard to envision.

      In our previous laboratory investigations, we have substantiated the induction effect of STING on IFN under SCRV infection or poly(I:C) stimulation, as documented in the relevant literature (10.1007/s11427-020-1789-5), which we have referenced in our manuscript (lines 177-178). While we did assess the impact of STING on MDA5-mediated IFN induction in SCRV-infected cells, as indicated in the figure legends, we have revised Figure 2N-2O for improved clarity, and similarly, Figure 1H-1I has also been updated. Furthermore, considering that RNA virus infection can activate the cGAS/STING axis (10.3389/fcimb.2023.1172739) and the significant role of MAVS in sensing RNA virus invasion in the NLR pathway (10.1038/ni.1782), it is challenging to ascertain the respective contributions of STING and MAVS to the immune signaling cascade mediated by MDA5 during RNA virus infection. We intend to explore this aspect further in future research endeavors.

      Fig. 3F and 3G: where are the mock-transfected/infected conditions? Given that ectopic expression of hMDA5 is known to cause autoactivation of the IFN pathway, the baseline ISG levels should be shown (ie. In absence of a stimulus or infection). Normalization of the data does not reveal whether this is the case and is therefore misleading.

      Based on the reviewer's suggestions, we have rerun the experiment. We examined the effects of MDA5 and MDA5-ΔRD on antiviral factors in uninfected, SCRV-infected, and poly(I:C)-HMW-stimulated MKC cells. The results showed that overexpression of either MDA5 or MDA5-ΔRD stimulated the expression of antiviral genes. However, when cells were infected with SCRV or stimulated with poly(I:C)-HMW, only the overexpression of MDA5, not MDA5-ΔRD, significantly increased the expression of antiviral genes (Figure 3F-3I).

      Fig. 4F and 4G: can the authors please indicate in the figure which area of the gel is relevant here? The band that runs halfway the gel? If so, the effects described in the text are not supported by the data (i.e. the 5'OH-SCRV and 5'pppGG-SCRV appear to compete with Bio-5'ppp-SCRV as well as 5'ppp-SCRV).

      Apologies for any confusion. The relevant areas in the gel pertaining to the experimental findings were denoted with asterisks and elaborated upon in the figure legends (Figure 4G, 4H, and 4M). The findings indicated that 5'ppp-SCRV, in contrast to 5'OH-SCRV and 5'pppGG-SCRV, demonstrated the ability to compete with bio-5'ppp-SCRV.

      My concerns about Fig. 5 remain unaltered. The fact that MDA5 is an ISG explains its increased expression and increased methylation pattern. The authors should at the very least mention in their text that MDA5 is an ISG and that their observations may be partially explained by this fact.

      First, as our m6A change analysis pipeline controls for changes in gene expression, these data should represent true changes in m6A modification rather than changes in the expression of m6A-modified transcripts (10.1038/s41598-020-63355-3). Similar studies demonstrated that m6A modification of RIOK3 and CIRBP mRNAs is altered following Flaviviridae infection (10.1016/j.molcel.2019.11.007). The specific calculation method is as follows: the relative m6A level for each transcript was calculated as the percent of input in each condition normalized to that of the respective positive control spike-in. Fold change of enrichment was calculated with mock samples normalized to 1. Therefore, changes in the expression level of MDA5 can partially explain the increase in m6A modification on all MDA5 mRNA in cells, but they cannot account for changes in m6A modification on each MDA5 transcript. We have added the calculation method to the manuscript and cited the relevant literature (lines 606-608). In addition, we have noted in the experimental results that MDA5 is an ISG (lines 260-261), and emphasized its compatibility with enhanced m6A modification of MDA5 in the discussion section (lines 405-409).
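      The normalization described above can be written out as a short worked sketch. This is only an illustration of the arithmetic as stated in the response (percent of input, normalized to the spike-in, then expressed relative to mock); the function names and the numbers in the example are hypothetical and are not taken from the manuscript.

```python
def percent_input(ip_signal, input_signal):
    """m6A-immunoprecipitated signal as a percentage of the input signal."""
    return 100.0 * ip_signal / input_signal


def relative_m6a(ip, inp, spike_ip, spike_inp):
    """Percent of input for a transcript, normalized to the positive-control spike-in."""
    return percent_input(ip, inp) / percent_input(spike_ip, spike_inp)


def fold_change(condition_level, mock_level):
    """Relative m6A level in a condition, with the mock sample normalized to 1."""
    return condition_level / mock_level


# Hypothetical example: transcript abundance (input) doubles on infection,
# but the IP signal quadruples, so the per-transcript m6A level doubles.
mock = relative_m6a(ip=5.0, inp=100.0, spike_ip=10.0, spike_inp=100.0)
infected = relative_m6a(ip=20.0, inp=200.0, spike_ip=10.0, spike_inp=100.0)
m6a_fold = fold_change(infected, mock)
```

      Because the IP signal is divided by the input signal of the same condition, a rise in transcript abundance alone leaves the ratio unchanged; only a change in the methylated fraction moves the fold change away from 1.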

      Reviewer #3 (Public Review):

      In this manuscript, the authors explored the interaction between the pattern recognition receptor MDA5 and 5'ppp-RNA in the Miiuy croaker. They found that MDA5 can serve as a substitute for RIG-I in detecting 5'ppp-RNA of Siniperca chuatsi rhabdovirus (SCRV) when RIG-I is absent in Miiuy croaker. Furthermore, they observed MDA5's recognition of 5'ppp-RNA in chickens (Gallus gallus), a species lacking RIG-I. Additionally, the authors documented that MDA5's functionality can be compromised by m6A-mediated methylation and degradation of MDA5 mRNA, orchestrated by the METTL3/14-YTHDF2/3 regulatory network in Miiuy croaker during SCRV infection. This impairment compromises the innate antiviral immunity of fish, facilitating SCRV's immune evasion. These findings offer valuable insights into the adaptation and functional diversity of innate antiviral mechanisms in vertebrates.

      We extend our sincere appreciation for your professional comments and insightful suggestions on our manuscript, as they have significantly contributed to enhancing its quality.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The interpretation of Figures 1H and I, along with the captions, seems unclear. Particularly, understanding the meaning of the X-axis in Figure I is challenging. Additionally, the designation of "H2O = 1" on the Y-axis in Figure 1E lacks clarity. It would be helpful if the author could revise and clarify these figures for better comprehension.

      We appreciate your reminder and have corrected and clarified these figures and figure legends (lines 768-772). We have changed the Y-axis label of Figure 1I from "Relative IFN-1 expression" to "Relative mRNA expression" (Figure 1I). In addition, we have added an explanation of "H2O=1" in the legend of Figure 1E.

      (2) The interpretation of Figure 5 in section 2.5 seems incomplete. The author mentioned that both m6A levels and MDA5 expression levels are increased (lines 256-257), prompting questions about the relationship between m6A and MDA5 expression. If higher m6A levels typically lead to MDA5 mRNA instability and lower MDA5 expression, observing both increasing simultaneously appears contradictory. Considering the dynamic changes shown in Figure 5, it would be more appropriate to propose an alteration in both m6A levels and MDA5 expression levels. Given the fluctuating nature of these changes, definitively labeling them as solely "increased" is challenging. Therefore, offering a nuanced interpretation of the results and clarifying this aspect would bolster the study's conclusions.

      While changes in m6A modification and the expression of m6A-modified transcripts are biologically relevant, identifying bona fide m6A alterations during viral infection will allow us to understand how m6A modification of cellular mRNA is regulated. As our m6A change analysis pipeline controls for changes in gene expression, these data should represent true changes in m6A modification rather than changes in the expression of m6A-modified transcripts (10.1038/s41598-020-63355-3). Similar studies demonstrated that m6A modification of RIOK3 and CIRBP mRNAs is altered following Flaviviridae infection (10.1016/j.molcel.2019.11.007). The specific calculation method is as follows: the relative m6A level for each transcript was calculated as the percent of input in each condition normalized to that of the respective positive control spike-in. Fold change of enrichment was calculated with mock samples normalized to 1. Therefore, the upregulation of MDA5 expression can partially explain the increase in m6A modification on all MDA5 mRNA in cells, but it cannot account for changes in m6A modification on each MDA5 transcript. We have added the calculation method to the manuscript and cited the relevant literature. We hope this addresses the reviewer's concern.

      In addition, although higher m6A levels often lead to unstable MDA5 mRNA and lower MDA5 expression, SCRV can affect MDA5 expression through multiple pathways. For example, since MDA5 is an interferon-stimulated gene, SCRV infection can cause strong expression of interferon and thereby indirectly induce high-level expression of MDA5. Therefore, the increase in MDA5 expression does not contradict the simultaneous increase in MDA5 m6A modification (24 h). To further strengthen our experimental conclusions, we added a dual-fluorescence reporter experiment. The results indicate that SCRV infection inhibits the fluorescence activity of MDA5-exon1 reporter plasmids, which contain the m6A sites but not the promoter sequence of the MDA5 gene, and that this inhibitory effect can be counteracted by cycloleucine (CL, an amino acid analogue that inhibits m6A modification) (Figure 5G). This further indicates that SCRV can reduce the expression of MDA5 through the m6A pathway.

      Finally, in light of the fluctuations in MDA5 expression levels, we have changed the subheadings of Results 2.5 section and provided a more comprehensive and precise elucidation of the experimental outcomes. We are grateful for your valuable feedback.

      (3) In the discussion section, it would indeed be advantageous for the author to explore the novelty of this work more comprehensively, moving beyond merely acknowledging the widespread loss of RIG-I and suggesting MDA5 as a compensatory mechanism. Considering the well-established roles of MDA5 and m6A in host-virus interactions, the findings of this study may seem familiar in light of previous research. To enhance the discussion, it would be valuable for the author to delve into the implications of this evolutionary model. For instance, does the compensation or loss of RIG-I impact a species' susceptibility to specific types of viruses? Exploring such questions would provide insight into the broader significance of this compensation model and its potential effects on host-virus interactions, thus adding depth to the study's contribution.

      We appreciate the expert advice provided by the referee. In response, we have expanded our discussion in the relevant section, addressing the potential influence of RIG-I deficiency and MDA5 compensation on the antiviral immune system in vertebrates (lines 371-376). Furthermore, we underscore the significance of exploring the impact of SCRV infection on MDA5 m6A modification, considering its compatibility with MDA5 as an ISG gene, in elucidating the host response to viral infection (lines 405-409).

      (4) To improve the manuscript, it would be beneficial if the editors could aid the author in refining the language. Many descriptions in the article are overly redundant, and there should be appropriate differentiation between experimental methods and results.

      We appreciate the reviewer’s comment. We have carefully revised the manuscript and removed redundant descriptions in the experimental results and methods.

      Reviewer #3 (Recommendations For The Authors):

      The authors have addressed all of my concerns.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews

      Reviewer 1 summarized that: In this revised version of the manuscript, the authors have made important modifications in the text, inserted new data analyses, and incorporated additional references, as recommended by the reviewers. These modifications have significantly improved the quality of the manuscript.

      We are grateful for the reviewer's positive recognition of our revisions.

      Reviewer 2 noted that:

      (1) The authors do not show if the PVT mediates dPAG to BLA communication with any functional behavioral assay.

      We appreciate the reviewer’s suggestion to include a functional assay to investigate the role of the PVT in mediating communication between the dPAG and BLA. Our primary objective was to confirm the upstream role of the dPAG in processing and relaying naturalistic predatory threat information to the BLA, thereby broadening our current understanding of the dPAG-BLA relationship based on Pavlovian fear conditioning paradigms.

      Given previous anatomical findings indicating the absence of direct monosynaptic projections from the dPAG to the BLA (Cameron et al. 1995, McNally, Johansen, and Blair 2011, Vianna and Brandao 2003), we employed both anterograde and retrograde tracers, supplemented by c-Fos expression analysis following predatory threats, to explore possible routes through which threat signals may be conveyed from the dPAG to the BLA. Our findings indicated significant activity within the midline thalamic regions, particularly the PVT as a mediator of dPAG-BLA interactions, corroborating the possibility of dPAG→BLA information flow.

      Investigating the PVT's functional role appropriately would require single-unit recordings, correlation analysis of PVT neuronal responses with dPAG and BLA neuronal responses, and pathway-specific causal techniques, involving other midline thalamic regions for controls. This comprehensive study would represent an independent study.

      In response to previous feedback, we have carefully revised our manuscript to moderate the emphasis on the PVT's role. The Abstract, Results, and Discussion now refer more broadly to "midline thalamic regions" and “The midline thalamus” (subheading) rather than specifically to the PVT. In the Introduction, we mention that the PVT "may be part of a network that conveys predatory threat information from the dPAG to the BLA." Our conclusions about the functional interaction between the dPAG and BLA, which broaden the view of Pavlovian fear conditioning, are not contingent on confirming a specific intermediary role for the PVT.

      (2) The authors also do not thoroughly characterize the activity of BLA cells during the predatory assay.

      Our previous studies have extensively detailed BLA cell firing characteristics, including their responsiveness to food and/or a robot predator during the predatory assay (Kim et al. 2018, Kong et al. 2021), and compared these findings to other predator studies (Amir et al. 2019, Amir et al. 2015). In the current study, out of 85 BLA cells, 3 were food-specific and 4 responded to both the pellet and the robot, with none of these 7 cells responding to dPAG stimulation.

      Given our earlier findings of the immediate responses of BLA neurons to robot activation, we specifically examined whether robot-responsive BLA neurons receive signals from the dPAG. For this analysis, we excluded all food-related cells (pellet cells and BOTH cells) and focused on the time window immediately after robot activation (within 500 ms after robot onset). This approach enabled us to avoid potential confounds from residual effects of robot-induced immediate BLA responses during the animals’ flight and nest entry behaviors.

      Furthermore, as previously described, the robot is programmed to move forward a fixed distance and then return, repeatedly triggering foraging behavior. This setup facilitates the analysis of neural changes during food approach and predator avoidance conflicts. However, animals quickly adapt to the robot, reducing freezing and stretch-attend behaviors, making time-stamped analysis of these behaviors unfeasible.

      We would like to highlight that the present study explicitly focused on demonstrating whether BLA neurons that responded to intrinsic dPAG optogenetic stimulation also responded to extrinsic predatory robot activation, and compared their firing characteristics to those BLA neurons that did not respond to dPAG stimulation (Figure 3). This targeted analysis provides insights into the responsiveness of BLA neurons to both intrinsic and extrinsic stimuli, furthering our understanding of the dPAG-BLA interaction in the context of predatory threats.

      Reviewer 3 also raised no concerns and stated that: The series of experiments provide a compelling case for supporting their conclusions. The study brings important concepts revealing dynamics of fear-related circuits particularly attractive to a broad audience, from basic scientists interested in neural circuits to psychiatrists.

      We sincerely thank the reviewer for the positive feedback on our revisions.

      Recommendations for the Authors

      Reviewer 1: There are a few minor concerns that the authors may want to fix:

      (1) Point 5) The sentence: "The complexity of targeting the dPAG, which includes its dorsomedial, dorsolateral, lateral, and ventrolateral subdivisions" is hard to follow because the ventrolateral subdivision is not part of the dPAG. The authors may want to say specific subregions of the PAG instead. It is also unclear why transgenic animals would be needed for these projection-defined manipulations. The combination of retrograde Cre-recombinase virus with an inhibitory opsin or chemogenetic approach may be sufficient.

      We appreciate the reviewer’s insightful feedback regarding our description of the dPAG and the use of transgenic mice in future studies. As suggested, we have corrected the manuscript to exclude the 'ventrolateral' subdivision from the dPAG description, now accurately aligning with pioneering studies (Bandler, Carrive, and Zhang 1991, Bandler and Keay 1996, Carrive 1993) that designated dPAG as including the dorsomedial (dmPAG), dorsolateral (dlPAG) and lateral (lPAG) regions, as cited in our revised manuscript.

      We acknowledge the reviewer’s helpful suggestion regarding the use of retrograde Cre-recombinase virus with inhibitory opsins or chemogenetic approaches as viable alternatives. These methods have been incorporated into our discussion (pages 14-15): “While our findings demonstrate that opto-stimulation of the dPAG is sufficient to trigger both fleeing behavior and increased BLA activity, we have not established that the dPAG-PVT circuit is necessary for the BLA’s response to predatory threats. To establish causality and interregional relationships, future studies should employ methods such as pathway-specific optogenetic inhibition (using retrograde Cre-recombinase virus with inhibitory opsins; Lavoie and Liu 2020, Li et al. 2016, Senn et al. 2014) or chemogenetics (Boender et al. 2014, Roth 2016) in conjunction with single unit recordings to fully characterize the dPAG-PVT-BLA circuitry’s (as opposed to other midline thalamic regions for controls) role in processing predatory threat-induced escape behavior. If inactivating the dPAG-PVT circuits reduces the BLA's response to threats, this would highlight the central role of the dPAG-PVT pathway in this defense mechanism. Conversely, if the BLA's response remains unchanged despite dPAG-PVT inactivation, it could suggest the existence of multiple pathways for antipredatory defenses.”

      This revision addresses the critique by clarifying the anatomical description of the dPAG and emphasizing the feasibility of using targeted viral approaches without the necessity for transgenic animals.

      (2) Point 6e) The authors mentioned that "pellet retrieval" was indicated by the animal entering a designated zone 19 cm from the pellet, driven by hunger. Entering the area within 19 cm of the pellet should be labeled as food approach rather than food retrieval, because on many occasions the animals may be some seconds away from grabbing the pellet.

      We agree and have incorporated the change (pg. 22).

      (3) Point 11) We strongly recommend that the authors replace the term "looming" with "approaching" to avoid confusion with several previous studies looking at defensive behaviors in response to looming induced by the shadow of an object moving closer to the eyes.

      Done.

      (4) Point 17) The authors mentioned that "A total of three rats were utilized for the robot testing experiments depicted in Fig. 2 G-J." However, the figure indicates a total of 9 ChR2 and 4 controls.

      We apologize for the confusion in our previous author responses. To examine the optical stimulation effects on behavior in Fig. 2G-J, we used a total of 9 ChR2 and 4 EYFP rats. The experimental sequence is detailed in the previously revised manuscript (pg. 20): “For optical stimulation and behavioral experiments, the procedure included 3 baseline trials with the pellet placed 75 cm away, followed by 3 dPAG stimulation trials with the pellet locations sequentially set at 75 cm, 50 cm, and 25 cm. During each approach to the pellet, rats received 473-nm light stimulation (1-2 s, 20-Hz, 10-ms width, 1-3 mW) through a laser (Opto Engine LLC) and a pulse generator (Master-8; A.M.P.I.). Additional testing to examine the functional response curves was conducted over multiple days, with incremental adjustments to the stimulation parameters (intensity, frequency, duration) after confirming that normal baseline foraging behavior was maintained. For these tests, one parameter was adjusted incrementally while the others were held constant (intensity curve at 20 Hz, 2 s; frequency curve at 3 mW, 2 s; duration curve at 20 Hz, 3 mW). If the rat failed to procure the pellet within 3 min, the gate was closed, and the trial was concluded.”

      This clarification ensures that the actual number of animals used is accurately reflected and aligns with the figure data, addressing the reviewer's concern.
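
      As a rough illustration of the stimulation timing quoted above (20 Hz, 10-ms pulse width, 1-2 s trains), the following minimal sketch computes the pulse schedule such parameters imply. It is not the authors' pulse-generator configuration; the function names `pulse_train` and `duty_cycle` are hypothetical and used only for illustration.

```python
def pulse_train(freq_hz, pulse_width_ms, duration_s):
    """Return (onset_ms, offset_ms) pairs for a square-pulse train.

    Hypothetical helper: assumes evenly spaced pulses starting at t = 0.
    """
    period_ms = 1000.0 / freq_hz
    n_pulses = int(round(duration_s * freq_hz))
    return [(i * period_ms, i * period_ms + pulse_width_ms)
            for i in range(n_pulses)]


def duty_cycle(freq_hz, pulse_width_ms):
    """Fraction of each period during which the light is on."""
    return freq_hz * pulse_width_ms / 1000.0


if __name__ == "__main__":
    train = pulse_train(freq_hz=20, pulse_width_ms=10, duration_s=2)
    print(len(train))          # number of pulses in a 2-s train
    print(train[1])            # onset/offset of the second pulse, in ms
    print(duty_cycle(20, 10))  # fraction of time the laser is on
```

      Under these assumptions, a 2-s train at 20 Hz contains 40 pulses spaced 50 ms apart, and the light is on for 20% of the train.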

      Reviewer 2: The authors made important changes in the text to address study limitations, including citations requested by the Reviewers and additional discussions about how this work fits into the existing literature. These changes have strengthened the manuscript.

      (1) However, the authors did not perform new experiments to address any of the issues raised in the previous round of reviews. For example, they did not make optogenetic manipulations of the pathway including the PVT, and did not add any loss of function experiments. The justification that these experiments are better suited for future reports using mice is not convincing, because hundreds of papers performing these types of circuit dissection assays have been performed in rats.

      We appreciate the reviewer's comments regarding the experimental scope of our study. Our study’s primary objective was to explore the dPAG’s upstream functional role in processing and conveying naturalistic predatory threat information to the BLA, extending our current understanding of the dPAG-BLA relationship based on Pavlovian fear conditioning paradigms. We believe that our findings effectively address this goal.

      Our use of anterograde and retrograde tracers, supplemented by c-Fos expression analysis in response to predatory threats, was primarily conducted to verify the possibility of the dPAG→BLA information flow during predator encounters. This involved exploring potential routes through which threat signals might be conveyed from the dPAG to the BLA, given the lack of direct monosynaptic projections from the dPAG to BLA neurons (Cameron et al. 1995, McNally, Johansen, and Blair 2011, Vianna and Brandao 2003). This methodology helped us identify a potential structure, PVT, for more in-depth future studies. A thorough examination of the PVT's role would require single-unit recordings and causal techniques, incorporating other midline thalamic regions as controls, representing a significant and separate study on its own.

      In response to prior feedback, we have carefully revised our manuscript to generally address the role of "midline thalamic regions" rather than focusing specifically on the PVT. We wish to emphasize that our findings, which illustrate unique functional interactions between the dPAG and BLA in response to a predatory imminence, remain compelling and informative even without definitive evidence of the PVT’s involvement.

      Reviewer 3: In the revised version of the manuscript, the authors addressed adequately all the concerns raised by the reviewers. 

      We thank the reviewer for the thoughtful feedback on the earlier version of our manuscript and for reexamining the revisions we have made.

      References

      Amir, A., P. Kyriazi, S. C. Lee, D. B. Headley, and D. Pare. 2019. "Basolateral amygdala neurons are activated during threat expectation." J Neurophysiol 121 (5):1761-1777.

      Amir, A., S. C. Lee, D. B. Headley, M. M. Herzallah, and D. Pare. 2015. "Amygdala Signaling during Foraging in a Hazardous Environment." J Neurosci 35 (38):12994-3005.

      Bandler, R., P. Carrive, and S. P. Zhang. 1991. "Integration of somatic and autonomic reactions within the midbrain periaqueductal grey: viscerotopic, somatotopic and functional organization." Prog Brain Res 87:269-305.

      Bandler, R., and K. A. Keay. 1996. "Columnar organization in the midbrain periaqueductal gray and the integration of emotional expression." Prog Brain Res 107:285-300.

      Boender, A. J., J. W. de Jong, L. Boekhoudt, M. C. Luijendijk, G. van der Plasse, and R. A. Adan. 2014. "Combined use of the canine adenovirus-2 and DREADD-technology to activate specific neural pathways in vivo." PLoS One 9 (4):e95392.

      Cameron, A. A., I. A. Khan, K. N. Westlund, and W. D. Willis. 1995. "The efferent projections of the periaqueductal gray in the rat: a Phaseolus vulgaris-leucoagglutinin study. II. Descending projections." J Comp Neurol 351 (4):585-601.

      Carrive, P. 1993. "The periaqueductal gray and defensive behavior: functional representation and neuronal organization." Behav Brain Res 58 (1-2):27-47.

      Kim, E. J., M. S. Kong, S. G. Park, S. J. Y. Mizumori, J. Cho, and J. J. Kim. 2018. "Dynamic coding of predatory information between the prelimbic cortex and lateral amygdala in foraging rats." Sci Adv 4 (4):eaar7328.

      Kong, M. S., E. J. Kim, S. Park, L. S. Zweifel, Y. Huh, J. Cho, and J. J. Kim. 2021. "'Fearful-place' coding in the amygdala-hippocampal network." Elife 10.

      Lavoie, A., and B. H. Liu. 2020. "Canine Adenovirus 2: A Natural Choice for Brain Circuit Dissection." Front Mol Neurosci 13:9.

      Li, Y., L. Hickey, R. Perrins, E. Werlen, A. A. Patel, S. Hirschberg, M. W. Jones, S. Salinas, E. J. Kremer, and A. E. Pickering. 2016. "Retrograde optogenetic characterization of the pontospinal module of the locus coeruleus with a canine adenoviral vector." Brain Res 1641 (Pt B):274-90.

      McNally, G. P., J. P. Johansen, and H. T. Blair. 2011. "Placing prediction into the fear circuit." Trends Neurosci 34 (6):283-92.

      Roth, B. L. 2016. "DREADDs for Neuroscientists." Neuron 89 (4):683-94.

      Senn, V., S. B. Wolff, C. Herry, F. Grenier, I. Ehrlich, J. Grundemann, J. P. Fadok, C. Muller, J. J. Letzkus, and A. Luthi. 2014. "Long-range connectivity defines behavioral specificity of amygdala neurons." Neuron 81 (2):428-37.

      Vianna, D. M., and M. L. Brandao. 2003. "Anatomical connections of the periaqueductal gray: specific neural substrates for different kinds of fear." Braz J Med Biol Res 36 (5):557-66.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      The author presents the discovery and characterization of CAPSL as a potential gene linked to Familial Exudative Vitreoretinopathy (FEVR), identifying one nonsense and one missense mutation within CAPSL in two distinct patient families afflicted by FEVR. Cell transfection assays suggest that the missense mutation adversely affects protein levels when overexpressed in cell cultures. Furthermore, conditionally knocking out CAPSL in vascular endothelial cells leads to compromised vascular development. The suppression of CAPSL in human retinal microvascular endothelial cells results in hindered tube formation, a decrease in cell proliferation, and disrupted cell polarity. Additionally, transcriptomic and proteomic profiling of these cells indicates alterations in the MYC pathway. 

      Strengths: 

      The study is nicely designed with a combination of in vivo and in vitro approaches, and the experimental results are good quality. 

      We thank the reviewer for the conclusion and positive comments.

      Weaknesses: 

      My reservations lie with the main assertion that CAPSL is associated with FEVR, as the genetic evidence from human studies appears relatively weak. Further careful examination of human genetics evidence in both patient cohorts and the general population will help to clarify. In light of human genetics, more caution needs to be exercised when interpreting results from mice and cell models and how is it related to the human patient phenotype. 

      We thank the reviewer for the careful reading and constructive suggestions. We added several experiments and analyses to address the reviewer's concerns, as follows:

      (1) The pLI score of the LOF allele of CAPSL is based on the general population, in which Europeans account for ~77% and East Asians make up less than 3%. Since the FEVR families in this article all come from China, the pLI score may not be accurate. We will continue to collect FEVR pedigrees.

      (2) We evaluated the phenotype of Capsl heterozygous mice at P5, and the results showed no overt difference in vascular progression, vessel density, or branchpoints compared with littermate wild-type controls (Fig.S4). The lack of a pronounced phenotype in Capsl heterozygous mice may be due to differences in sensitivity between humans and mice. A similar example is LRP5, mutations in which are associated with FEVR. Heterozygous mutations in LRP5 have been reported in FEVR patients in multiple populations (PMID: 16929062, 33302760, 27486893, 35918671, 36411543). However, heterozygous Lrp5 knockout mice exhibited no visible angiogenic phenotype (PMID: 18263894). A corresponding description was added to the manuscript on page 6.

      (3) We further assessed the angiogenic phenotype at P21, when angiogenesis is almost complete, and the results revealed no difference between Ctrl and CapsliECKO/iECKO mice (Fig.S5). A corresponding description was added to the manuscript on page 7.

      (4) We evaluated the expression of MYC downstream genes in vivo using lung tissue from P35 Ctrl and CapsliECKO/iECKO mice (Fig.S8). Consistent with the in vitro results from HRECs, CapsliECKO/iECKO mice showed downregulated expression of MYC targets. A corresponding description was added to the manuscript on page 11.

      Reviewer #2 (Public Review): 

      Summary: 

      This work identifies two variants in CAPSL in two-generation familial exudative vitreoretinopathy (FEVR) pedigrees, and using a knockout mouse model, they link CAPSL to retinal vascular development and endothelial proliferation. Together, these findings suggest that the identified variants may be causative and that CAPSL is a new FEVR-associated gene. 

      Strengths: 

      The authors' data provides compelling evidence that loss of the poorly understood protein CAPSL can lead to reduced endothelial proliferation in mouse retina and suppression of MYC signaling in vitro, consistent with the disease seen in FEVR patients. The study is important, providing new potential targets and mechanisms for this poorly understood disease. The paper is clearly written, and the data generally support the author's hypotheses. 

      We thank the reviewer for the conclusion and positive comments.

      Weaknesses: 

      (1) Both pedigrees described appear to suggest that heterozygosity is sufficient to cause disease, but authors have not explored the phenotype of Capsl heterozygous mice. Do these animals have reduced angiogenesis similar to KOs? Furthermore, while the p.R30X variant protein does not appear to be expressed in vitro, a substantial amount of p.L83F was detectable by western blot and appeared to be at the normal molecular weight. Given that the full knockout mouse phenotype is comparatively mild, it is unclear whether this modest reduction in protein expression would be sufficient to cause FEVR - especially as the affected individuals still have one healthy copy of the gene. Additional studies are needed to determine if these variants alter protein trafficking or localization in addition to expression, and if they can act in a dominant negative fashion. 

      We thank the reviewer for the suggestion. We evaluated the phenotype of Capsl heterozygous mice at P5 (Fig.S4), and the results showed no overt difference in angiogenesis compared with littermate control mice.

      We transfected the CAPSL wild-type, p.R30X mutant, and p.L83F mutant plasmids into 293T cells to assess whether the intracellular localization of the CAPSL mutant proteins was altered (Fig.S1). The results showed that the point mutation did not affect the localization of the mutated protein, and a corresponding description was added to the manuscript on page 5.

      (2) The manuscript nicely shows that loss of CAPSL leads to suppressed MYC signaling in vitro. However, given that endothelial MYC is regulated by numerous pathways and proteins, including FOXO1, VEGFR2, ERK, and Notch, and reduced MYC signaling is generally associated with reduced endothelial proliferation, this finding provides little insight into the mechanism of CAPSL in regulating endothelial proliferation. It would be helpful to explore the status of these other pathways in knockdown cells but as the authors provide only GSEA results and not the underlying data behind their RNA seq results, it is difficult for the reader to understand the full phenotype. Volcano plots or similar representations of the underlying expression data in Figures 6 and 7 as well as supplemental datasets showing the differentially regulated genes should be included. In addition, while the paper beautifully characterizes the delayed retinal angiogenesis phenotype in CAPSL knockout mice, the authors do not return to that model to confirm their in vitro findings. 

      We thank the reviewer for the suggestion. Although endothelial MYC can be regulated by the FOXO1, VEGFR2, ERK, and Notch signaling pathways, these pathways are not enriched in the RNA-seq data of CAPSL-depleted HRECs. This suggests that the downregulated MYC targets may not be influenced by the signaling pathways mentioned above. RNA-seq raw data have been uploaded to the Genome Sequence Archive (https://ngdc.cncb.ac.cn/gsa/browse/HRA010305), and proteomic profiling raw data have been uploaded to the PRIDE archive (https://www.ebi.ac.uk/pride/archive) under the assigned accession number PXD051696. A corresponding description was added to the manuscript on pages 20-21. The datasets representing the differentially regulated genes in Figures 6 and 7 are listed in Dataset S1 and S2.

      (3) In Figure S2D, the result of this vascular leak experiment is unconvincing as no dye can be seen in the vessels. What are the kinetics for biocytin tracers to enter the bloodstream after IP injection? Why did the authors choose the IP instead of the IV route for this experiment? Differences in the uptake of the eye after IP injection could confound the results, especially in the context of a model with vascular dysfunction as here. 

      We thank the reviewer for the suggestion. In Figure S2D (now Fig.S6D), we had used a non-representative image to show vascular leakage, and we have replaced the images with more representative ones. Unfortunately, we do not know the kinetics with which biocytin tracers enter the bloodstream after IP injection. Because the experiment was carried out at P5, IV injection was not feasible in these neonatal mice. We followed the methods described in a previous study involving mice of the same age (PMID:35361685).

      (4) In Figure 5, it is unclear how filipodia and tip cells were identified and selected for quantification. The panels do not include nuclear or tip cell-specific markers that would allow quantification of individual tip cells, and in Figure 5C it appears that some filipodia are not highlighted in the mutant panel. 

      We thank the reviewer for the comments. In Figure 5, we used HRECs to examine cell proliferation, migration, and polarity in vitro, and therefore there is no distinction between tip cells and stalk cells. The quantification of filopodia/lamellipodia was performed as described in previous studies (PMID: 30783090, PMID: 28805663). Briefly, a wound scratch was made on confluent layers of transfected HRECs, and 9 hours after initiating cell migration by the scratch, cells were fixed and stained with phalloidin. Cells at the edge of the wound were considered leader cells, and their filopodia/lamellipodia were counted.

      Reviewer #3 (Public Review): 

      Summary: 

      This manuscript by Liu et al. presents a case that CAPSL mutations are a cause of familial exudative vitreoretinopathy (FEVR). Attention was initially focused on the CAPSL gene from whole exome sequence analysis of two small families. The follow-up analyses included studies in which CAPSL was manipulated in endothelial cells of mice and multiple iterations of molecular and cellular analyses. Together, the data show that CAPSL influences endothelial cell proliferation and migration. Molecularly, transcriptomic and proteomic analyses suggest that CAPSL influences many genes/proteins that are also downstream targets of MYC and may be important to the mechanisms. 

      Strengths: 

      This multi-pronged approach found a previously unknown function for CAPSLs in endothelial cells and pointed at MYC pathways as high-quality candidates in the mechanism. 

      Weaknesses: 

      Two issues shape the overall impact for me. First, the unreported population frequency of the variants in the manuscript makes it unclear if CAPSL should be considered an interesting candidate possibly contributing to FEVR, or possibly a cause. Second, it is unclear if the identified variants act dominantly, as indicated in the pedigrees. The studies in mice utilized homozygotes for an endothelial cell-specific knockout, leaving uncertainty about what phenotypes might be observed if mice heterozygous for a ubiquitous knockout had instead been studied. 

      In my opinion, the following scientific issues are specific weaknesses that should be addressed: 

      (1) Please state in the manuscript the number of FEVR families that were studied by WES. Please also describe if the families had been selected for the absence of known mutations, and/or what percentage lack known pathogenic variants. 

      We thank the reviewer for the thoughtful comments. A total of 120 FEVR families were studied by WES, and we added a corresponding description to the manuscript on page 4.

      (2) A better clinical description of family 3104 would enhance the manuscript, especially the father. It is unclear what "manifested with FEVR symptoms, according to the medical records" means. Was the father diagnosed with FEVR? If the father has some iteration of a mild case, please describe it in more detail. If the lack of clinical images in the figure is indicative of a lack of medical documentation, please note this in the manuscript. 

      We thank the reviewer for the thoughtful comments. The father of family 3104 has also been identified as a carrier of this heterozygous variant and, according to the medical records, manifested FEVR symptoms. However, clinical examination images are presently unavailable. We added a corresponding description to the manuscript on page 5.

      (3) The TGA stop codon can in some instances also influence splicing (PMID: 38012313). Please add a bioinformatic assessment of splicing prediction to the assays and report its output in the manuscript. 

      We thank the reviewer for the thoughtful comments. We assessed the potential splicing effect of the c.88C>T variant of CAPSL using MaxEntScan (http://hollywood.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html) and SpliceTool (https://rddc.tsinghua-gd.org/ai) (Fig.S2). Both tools were used to predict whether the TGA stop codon introduced by the c.88C>T variant creates a cryptic donor splice site.

      (4) More details regarding utilizing a "loxp-flanked allele of CAPSL" are needed. Is this an existing allele, if so, what is the allele and citation? If new (as suggested by S1), the newly generated CAPSL mutant mouse strain needs to be entered into the MGI database and assigned an official allele name - which should then be utilized in the manuscript and who generated the strain (presumably a core or company?) must be described. 

      We added a detailed description of the Capsl floxed allele to the Methods section on pages 14-15: “Capslloxp/+ model was generated using the CRISPR/Cas9 nickase technique by Viewsolid Biotechology (Beijing, China) in C57BL/6J background and named Capslem1zxj. The genomic RNA (gRNA) sequence was as follows: Capsl-L gRNA: 5’-CTATCCCAA TTGTGCTCCTGG-3’; Capsl-R gRNA: 5’-TGGGACTCATGGTTCTAGAGG-3’. ”

      (5) The statement in the methods "All mice used in the study were on a C57BL/6J genetic background," should be better defined. Was the new allele generated on a pure C57BL/6J genetic background, or bred to be some level of congenic? If congenic, to what generation? If unknown, please either test and report the homogeneity of the background, or consult with nomenclature experts (such as available through MGI) to adopt the appropriate F?+NX type designation. This also pertains to the Pdgfb-iCreER mice, which reference 43 describes as having been generated in an F2 population of C57BL/6 X CBA and did not designate the sub-strain of C57BL/6 mice. It is important because one of the explanations for missing heritability in FEVR may be a high level of dependence on genetic background. From the information in the current description, it is also not inherently obvious that the mice studied did not harbor confounding mutations such as rd1 or rd8. 

      We thank the reviewer for the suggestion. We added the following description to the “Mouse model and genotyping” method section on page 14: “Capslloxp/+ model was generated using the CRISPR/Cas9 nickase technique by Viewsolid Biotechology (Beijing, China) in C57BL/6J background and named Capslem1zxj. The genomic RNA (gRNA) sequence was as follows: Capsl-L gRNA: 5’-CTATCCCAA TTGTGCTCCTGG-3’; Capsl-R gRNA: 5’-TGGGACTCATGGTTCTAGAGG-3’. Pdgfb-iCreER[43] transgenic mice on a mixed background of C57BL/6 and CBA were obtained from Dr. Marcus Fruttiger and backcrossed to background for 6 generations. Capslloxp/+ mice were bred with Pdgfb-iCreER[43] transgenic mice to generate Capslloxp/loxp, Pdgfb-iCreER mice.” Sanger sequencing was performed on the experimental mice to determine whether they harbor confounding mutations such as those in Pde6b or Crb1. The results showed that the mice did not harbor these confounding mutations (Fig.S9), and a corresponding description was added to the manuscript on page 15.

      (6) In my opinion, more experimental detail is needed regarding Figures 2 and 3. How many fields, of how many retinas and mice were analyzed in Figure 2? How many mice were assessed in Figure 3? 

      We thank the reviewer for the thoughtful comments. This information is already presented in the manuscript; please refer to the “Methods-Quantification of retinal parameters” section for experimental details.

      (7) I suggest adding into the methods whether P-values were corrected for multiple tests. 

      We thank the reviewer for the suggestion. Statistical analysis was performed using an unpaired Student’s t-test for comparisons between two groups, or one-way ANOVA followed by Dunnett's multiple comparison test for comparisons among multiple groups. This description was added to the “Methods-Image acquisition and statistical analysis” section for clarity.
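
      For readers unfamiliar with the two-group test named above, the following minimal sketch shows how an unpaired (pooled-variance) Student's t statistic is computed. It is illustrative only and is not the authors' analysis code; the function name `student_t` and the example values are hypothetical. The multi-group case (one-way ANOVA followed by Dunnett's test) is not shown and would normally be run in a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Unpaired Student's t statistic assuming equal variances.

    Returns (t, degrees_of_freedom). Hypothetical helper for illustration.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled sample variance, weighted by each group's degrees of freedom
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df


if __name__ == "__main__":
    # Made-up measurements, e.g. a retinal parameter in two groups of mice
    control = [1.0, 2.0, 3.0]
    knockout = [2.0, 3.0, 4.0]
    t, df = student_t(control, knockout)
    print(f"t = {t:.4f}, df = {df}")
```

      The resulting t statistic is then compared against a t distribution with the stated degrees of freedom to obtain a P-value.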

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors): 

      In summary, addressing the reviewers' concerns as outlined below could bolster the evidence from "solid" to "convincing" and further strengthen the study's impact. 

      (1) Analysis of the phenotype in CAPSL heterozygous mice, as highlighted by all 3 reviews. 

      We thank the editor for thoughtful comments. The phenotype analysis of Capsl heterozygous mice was added to Fig.S4, with the corresponding description provided at page 6.

      (2) Analysis of Capsl KO mice to determine if the pathways identified in vitro are modified (as suggested by reviewers 1 & 2). 

      We thank the editor for the suggestion. In Fig.S7, RT-qPCR was performed on lung tissues from Capsl Ctrl and KO mice to validate the expression of MYC targets in vivo. The results indicated that the downstream targets of MYC signaling were also downregulated in vivo, consistent with the in vitro findings.

      (3) Additional description of the genetic pedigrees and variants to address the points raised by reviewer #3. 

      We thank the editor for the suggestion. The father of family 3104 has also been identified as a carrier of this heterozygous variant and, according to the medical records, manifested FEVR symptoms. However, clinical examination data are presently unavailable. We added a corresponding description to the manuscript on page 5.

      (4) Validation of the identified protein variants, especially L83F, which appears to be expressed at a near-normal level. Are these proteins mislocalized, do the variants interfere with sites of known or predicted protein-protein interactions, could they act in a dominant-negative fashion by aggregation with co-expressed WT protein, etc.? Given the comparatively weak genetic data, additional validation is required to establish the plausibility of CAPSL as a FEVR gene. 

      We thank the editor for the suggestion. As a substantial amount of p.L83F was detectable at the normal molecular weight, we further investigated whether this variant affects protein localization. As shown in Fig.S1, immunocytochemistry results indicated that this variant does not affect the subcellular localization of the protein.

      (5) Improved description of experimental details and statistical analyses as outlined by reviewer #3. 

      We thank the editor for the suggestion. More detailed information about the Capsl mice was added to the manuscript on pages 14-15. The experimental details regarding Figure 2 and Figure 3 are already presented in the “Methods-Quantification of retina parameters” section of the manuscript on pages 19-20. Statistical analysis was performed using an unpaired Student’s t-test for comparisons between two groups, or one-way ANOVA followed by Dunnett's multiple comparison test for comparisons among multiple groups. This description was added to the “Methods-Image acquisition and statistical analysis” section on page 21 for clarity.

      Reviewer #1 (Recommendations For The Authors): 

      My reservations lie with the main assertion that CAPSL is associated with FEVR, as the genetic evidence from human studies appears relatively weak. My concerns are as follows: 

      (1) The molecular characterization of the identified mutations suggests a loss of function (LOF). Notably, in one family, both the father and son exhibit the FEVR phenotype and share the LOF mutation, suggesting a dominant mode of inheritance. However, the prevalence of the LOF allele of CAPSL in the general population is high, and its pLI score is 0, according to the GNOMAD database. This raises doubts about the LOF variant of CAPSL being causative for FEVR. 

      We thank the reviewer for the recommendation. The pLI score of the LOF allele of CAPSL is based on the general population, in which Europeans account for ~77% and East Asians make up less than 3%. Since the FEVR families in this article all come from China, the pLI score may not be accurate. We will continue to collect FEVR pedigrees and screen for CAPSL mutations.

      (2) In the conditional knockout study, a delay in vascular development is observed in the retina up to P14. What does the phenotype look like in adult mice, and does it replicate the human FEVR phenotype? 

      We thank the reviewer for the recommendation. We further assessed the phenotype at P21, when angiogenesis is almost complete, and the results showed no difference between Ctrl and CapsliECKO/iECKO mice (Fig.S5). A corresponding description was added to the manuscript on page 7.

      (3) The conditional knockout mice lack both alleles of CAPSL. The phenotype resulting from the knockout of a single allele needs investigation to align with observed human phenotypes and genetic data. 

      We thank the reviewer for the recommendation. Capsl heterozygous mice at P5 showed no overt difference in vascular progression, vessel density, or branchpoints compared with littermate wild-type controls (Fig.S4). The lack of a pronounced phenotype in Capsl heterozygous mice may be due to differences in sensitivity between humans and mice. A similar example is LRP5, mutations in which are associated with FEVR. Heterozygous mutations in LRP5 have been reported in FEVR patients in multiple populations. However, heterozygous Lrp5 mice exhibited no visible angiogenic phenotype (PMID: 18263894).

      (4) The MYC pathway has been identified as influenced by CAPSL. Is MYC downregulation also observed in the mouse model in vivo? 

      We thank the reviewer for the recommendation. MYC expression was assessed at both the mRNA and protein levels in Figure S8, and a corresponding description was added to the manuscript on page 11.

      Reviewer #2 (Recommendations For The Authors): 

      Minor comments: 

      (1) While the authors note that little is known about CAPSL protein function, more introductory detail about the protein (structure, domains, intracellular localization, etc.) and additional discussion on potential mechanisms would aid the reader in interpreting the findings and model.

      We thank the reviewer for the recommendation. The CAPSL protein is distributed in both the nucleus and cytoplasm (https://www.proteinatlas.org/). Immunochemistry analysis confirmed that the CAPSL protein is expressed in both the cell nucleus and cytoplasm (Fig. S1). A corresponding description was added to the manuscript on page 5.

      (2) Pg 7 states that Capsl knockout mainly leads to "...defects in retinal vascular ECs rather than other vascular cells.". Consider rephrasing to describe "other vasculature-associated cells", as no vascular cells outside the retina were examined in the manuscript. 

      We thank the reviewer for the recommendation. We rephrased "...defects in retinal vascular ECs rather than other vascular cells." as "...defects in retinal vascular ECs rather than other vasculature-associated cells" on page 8.

      (3) The manuscript is well written but contains numerous typos. E.g. "" (Pg 14), "MCY signaling axis" (figure 6 legend), "shCAPAL" (figure 5 K). Please correct these, and search carefully for others. 

      We are sorry for the careless mistakes we made. We have checked the manuscript and corrected these mistakes.

      Reviewer #3 (Recommendations For The Authors): 

      The following are somewhat grammatical, but significant issues, that I feel should be addressed before making the pre-print final: 

      (1) Perhaps the largest issue with the manuscript to me is whether CAPSL is an interesting candidate (as stated repeatedly) or causative of FEVR. Within the scope of what is feasible, this is a challenging problem. Since the publication of the pre-print, it would be great if another group independently reported the detection of mutations specifically in FEVR patients. That lacking, meaningful additions to the manuscript that I'd recommend are the inclusion of a paragraph on caveats of the study and reporting the allele frequencies based on public databases. As the authors know the data better than anyone and will have invested thought into the implications, they are the ones best positioned to alert the field to the study's limitations - amongst them- the factors that might practically distinguish whether CAPSL is a candidate or cause.

      We thank the reviewer for the recommendation. We will collect more samples from FEVR families and screen for other mutation sites within the CAPSL gene in further studies.

      (2) It is unclear why the modeling with mice did not attempt to recapitulate the observations in humans, i.e., why were heterozygotes for a ubiquitous knockout not studied? Any data with heterozygotes, or ubiquitous alleles (which would be easier to generate than the strain studied) should be shared in the manuscript. If no such data exists, this reviewer would find it a worthwhile new experiment to add, but it is appreciated that new experiments are sometimes beyond the scope of what is possible. At the least, this would be worthwhile to discuss in the requested caveats paragraph of the discussion. 

      We thank the reviewer for the recommendation. We evaluated the phenotype of Capsl heterozygous mice at P5, and the results showed no overt difference in vascular progression, vessel density, or branchpoints compared with littermate wild-type controls (Fig. S4). The lack of a pronounced phenotype in heterozygous mice may be due to differing sensitivity between humans and mice. For example, heterozygous Lrp5 mice exhibited no visible angiogenic phenotype (PMID: 18263894). A corresponding description was added to the manuscript on page 6.

      (3) The statement in the Abstract "which provides invaluable information for genetic counseling and prenatal diagnosis of FEVR" should be toned down, better supported, or rephrased. This appears to be the 18th disease-associated gene for FEVR, with variants identified in 4 patients of the same ethnicity. In my opinion, the word "invaluable" is currently overstated. 

      We thank the reviewer for the recommendation. We have changed "which provides invaluable information for genetic counseling and prenatal diagnosis of FEVR" to "which provides valuable information for genetic counseling and prenatal diagnosis of FEVR" in the abstract.

      (4) The transcriptomic and proteomic data should be deposited into a public repository and accession numbers added to the manuscript. 

      We thank the reviewer for the recommendation. We have uploaded the raw transcriptomic and proteomic data to the Genome Sequence Archive (https://ngdc.cncb.ac.cn/gsa/browse/HRA010305) and the PRIDE Archive (https://www.ebi.ac.uk/pride/archive), respectively.

      (5) The links to MYC are over-stated in the title "through the MYC axis", the abstract "CAPSL function causes FEVR through MYC axis", and the discussion "we demonstrated that the defects in CAPSL affect EC function by down-regulating the MYC signaling cascade". The links to MYC are entirely by association, there were no experiments testing that the transcriptomic and proteomic changes observed were determinative of the CAPSL-mediated phenotype. It seems appropriate to conjecture that these changes are important, but the above statements all need to be altered and conjectures need to be clearly identified as such. 

      We are sorry for overstating the link between the CAPSL-mediated phenotype and the MYC axis in the abstract and discussion sections, and we have altered the statements in these sections to make them more logical. For example, in the abstract we changed “This study also reveals that compromised CAPSL function causes FEVR through MYC axis, shedding light on the potential involvement of MYC signaling in the pathogenesis of FEVR.” to “This study also reveals that compromised CAPSL function causes FEVR may through MYC axis, shedding light on the potential involvement of MYC signaling in the pathogenesis of FEVR.” In the discussion, we changed “…cause FEVR through inactivating MYC signaling, expanding FEVR-involved signaling pathway and providing a potential therapeutic target for the intervention of FEVR” to “…cause FEVR may through inactivating MYC signaling, expanding FEVR-involved signaling pathway and providing a potential therapeutic target for the intervention of FEVR”.

      (6) Finally, I suggest that the following grammatical issues in the pre-print be corrected before making the pre-print final: 

      We have checked the manuscript and corrected these mistakes.

      (a) p2. Suggest rewriting the sentence "Nevertheless, the molecular mechanisms by which CAPSL regulates cell processes and signaling cascades have yet to be elucidated." The preceding sentences only state that CAPSL is a candidate in another disease - the word "nevertheless" seems to reflect a logic that isn't described.

      We have checked the manuscript and corrected these mistakes.

      (b) p5. Please correct the grammar "We, generated an inducible" 

      We corrected this mistake.

      (c) p5. Suggest rephrasing "impairing CAPSL expression." The word "expression" is often used in reference to transcription. To avoid confusion, something such as "eliminating or reducing protein abundance" might be better. 

      We corrected this mistake.

      (d) p6. Please correct the grammar "As expected, the radial vascular growth, as well as vessel density and vascular branching, are dramatically reduced in..." - note subject-verb agreement issue 

      We corrected this mistake.

      (e) Figure 3 legend - correct "(A) Hyloaid vessels"

      We corrected this mistake.

  2. Jul 2024
    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Fita-Torró et al. study the toxic effects of the intermediary lipid degradation product trans-2-hexadecenal (t-2-hex) on yeast mitochondria and suggest a mechanism by which Hfd1 safeguards Tom40 from lipidation by t-2-hex and its consequences, such as mitochondrial protein import inhibition, cellular proteostasis deregulation, and stress-responses. 

      The authors aimed to dissect a mechanism for the apoptotic consequences of t-2-hex in yeast and they suggest it is via lipidation of Tom40, but under the tested conditions everything seems lipidated. Thus, it is unclear whether Tom40 is the crucial causal target. They also do not provide many biochemical experiments to investigate this phenomenon further functionally. Tom40 is one possible and perhaps, given the cellular consequences, a reasonable candidate but not validated beyond in vitro lipidation by exogenous t-2-hex.

      In the revised version of our manuscript, we have now included extensive new experimentation, which shows that protein import at the TOM complex is a physiologically important target of the pro-apoptotic lipid t-2-hex and that enzymes such as the Hfd1 dehydrogenase sensitively regulate this inhibition. In vitro chemoproteomic experiments have now been performed at more physiological t-2-hex concentrations of 10µM, which is lower than published data in human cell models. Consistently, several TOM and TIM subunits are enriched in these in vitro lipidation studies (new Fig. 8B). Tom40 lipidation alone is not sufficient to explain t-2-hex toxicity, as a cysteine-free version of Tom40 does not confer tolerance to the apoptotic lipid (new Fig. 8D). Importantly however, the loss of function of nonessential accessory Tom subunits 70 or 20 confers t-2-hex tolerance (new Fig. 8D) indicating that pre-protein import at the TOM complex is a physiological target of t-2-hex most likely dependent on lipidation of more Tom subunits than just the essential Tom40 pore. Moreover, we now show that mitochondrial protein import is inhibited by the lipid at low physiological doses of 10µM and that this inhibition is modulated by the gene dose of the t-2-hex degrading Hfd1 enzyme (new Fig. 5G).

      Strengths: 

      The effects of lipids and their metabolic intermediates on protein function are understudied thus the authors' research contributing to elucidating direct effects of a single lipid is appreciated. It is particularly unknown by which mechanism t-2-hex causes cell death in yeast. The authors elegantly use modulation of the levels of enzyme Hfd1 that endogenously catabolizes t-2-hex as an approach to studying t-2-hex stress. Understanding the cause and consequences of this stress is relevant for understanding fundamental regulation mechanisms, and also to human health since the human homolog of Hfd1, ALDH3A2, is mutated in Sjögren-Larsson Syndrome. The application of a variety of global transcriptomic, functional genomic, and chemoproteomic approaches to study t-2-hex stress targets in the yeast model is laudable.

      Weaknesses: 

      -  The extent of the contribution of Tom40 lipidation to the general t-2-hex stress phenotype is unclear. Is Tom40 lipidation alone enough to cause the phenotype? An alteration of the cysteine residue in question could help answer this key question. 

      Deletion of all four cysteine residues in Tom40 is not sufficient to confer resistance to t-2-hex stress. This result had been included in the original manuscript, but was somehow hidden in the Discussion. The revised manuscript now includes t-2-hex tolerance assays for the Tom40 cysteine free mutant in new Figure 8. As a result, cysteine lipidation of Tom40 alone is not sufficient to confer t-2-hex toxicity. This implies most likely other lipidation targets within the TOM and TIM complexes, as indicated by our in vitro lipidation studies. We therefore included the non-essential adaptor proteins Tom70 and Tom20 of the TOM complex and tested the tolerance of the respective deletion mutants in t-2-hex tolerance assays. As shown in new Figure 8, the absence of Tom70 and Tom20 function significantly increases tolerance to t-2-hex and the tom20 mutant accumulates less Aim17 pre-protein upon t-2-hex stress, indicating that the TOM complex is a physiologically important target of the proapoptotic lipid, which acts most likely via lipidation of more subunits than the Tom40 import channel.

      -  It is unclear whether the exogenously applied amounts of t-2-hex (concentrations chosen between 25-200 uM) are physiologically relevant in yeast cells. For comparison, Chipuk et al. (2012) used at most 1 uM on mitochondria of human cells, while Jarugumilli et al. (2018) considered 25 uM a 'lower dose' on human cells. Since the authors saw responses below 10 uM (Fig. 3B) and at the lowest selected concentration of 25 uM (Fig. 8), why were no lower, likely more specific, concentrations applied for the global transcriptomic and chemoproteomic experiments? Key experiments have to be repeated with the lower concentrations. 

      We have now performed several experiments with lower t-2-hex concentrations. A new chemoproteomic study with 10µM t-2-hex-alkyne has been conducted and the new results added to the supplementary information, combining 10µM and 100µM in vitro lipidation studies (Suppl. Table 6). Many subunits of the TOM and TIM complexes are consistently and significantly enriched in both chemoproteomic experiments. These new data are summarized in revised Figure 8. Additionally, we have performed in vivo pre-protein assays with lower t-2-hex concentrations. As shown in new Figure 5, Aim17 mitochondrial import is already inhibited by t-2-hex doses as low as 10µM in a wild type strain, and this inhibition is enhanced in a hfd1 mutant and alleviated in a Hfd1 overexpressor. It is important to note that a dose of 10µM of external t-2-hex addition is significantly lower than doses applied to human cell cultures such as in Jarugumilli et al. (2018). It proves that mitochondrial protein import is a sensitive and physiologically relevant t-2-hex target in our yeast models and that t-2-hex detoxification by enzymes such as the Hfd1 dehydrogenase sensitively regulates this specific inhibition.

      -  The amount of t-2-hex applied is especially important to consider in light of over 1300 proteins lipidated to an extent equal to or greater than Tom40 (Supp. Table 6). This chemoproteomic experiment (Fig. 8B, Supp. Table 6) is also weakened by the inclusion of only 2 replicates, thus precluding assessment of statistical significance. The selection of targets in Fig. 8B as "among the best hits" is neither immediately comprehensible nor further explained and represents at best cherry-picking. Further evidence based on statistical significance or validation by other means should be provided.

      We performed the chemoproteomic screens as described by Jarugumilli et al. (2018) with 2 replicates of mock treated versus 2 replicates of t-2-hex-alkyne treated cell extracts. A new chemoproteomic study with 10µM t-2-hex-alkyne has been conducted and the new results added to the supplementary information, combining 10µM and 100µM in vitro lipidation studies (Suppl. Table 6). Differential enrichment analysis of the proteomic data was performed with the amica software (Didusch et al., 2022). Proteins were ranked according to their log2 fold induction comparing lipid- and mock-treated samples with a threshold of ≥1.5, and the adjusted p-value was calculated. Several TOM and TIM subunits were consistently identified as differentially enriched proteins, which is summarized in new Figure 8B.
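
      The ranking step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the amica implementation: the function names, the protein names, and the intensities are hypothetical, the ≥1.5 threshold is assumed to apply to the log2 value, and Benjamini-Hochberg is assumed as the p-value adjustment.

```python
import math

def log2_fold_change(treated, mock):
    """Log2 ratio of mean lipid-treated intensity over mean mock intensity."""
    return math.log2((sum(treated) / len(treated)) / (sum(mock) / len(mock)))

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def rank_enriched(proteins, threshold=1.5):
    """Rank proteins by log2 fold change, keeping those at or above threshold.

    proteins maps a name to (treated intensities, mock intensities, raw p-value).
    Returns (name, log2 fold change, adjusted p-value) tuples, best hit first.
    """
    names = list(proteins)
    adj = benjamini_hochberg([proteins[n][2] for n in names])
    rows = [(n, log2_fold_change(proteins[n][0], proteins[n][1]), q)
            for n, q in zip(names, adj)]
    hits = [row for row in rows if row[1] >= threshold]
    return sorted(hits, key=lambda row: row[1], reverse=True)

# Hypothetical example data (two replicates per condition); not from the study.
example = {
    "prot_A": ([8.0, 8.0], [2.0, 2.0], 0.01),
    "prot_B": ([3.0, 3.0], [2.0, 2.0], 0.04),
    "prot_C": ([16.0, 16.0], [2.0, 2.0], 0.03),
}
ranked = rank_enriched(example)
```

      With only two replicates per condition, the adjusted p-values in such a ranking carry limited statistical power, which is the concern the reviewer raises.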

      - The authors unfortunately also underuse the possible contribution of mass spectrometry technology to in addition determine the extent and localization of lipidation on a global scale (especially relevant since Cohen et al. (2020) suggest site-specific mechanisms). 

      We agree that site-specific modifications of t-2-hex will be most likely important in the inhibition or other type of regulation of specific target proteins. Our collective data show that in the case of the inhibition of mitochondrial protein import, several lipidation events on TOM and TIM are involved. Dissection of individual cysteine lipidations on those subunits will be interesting, but we feel that this is out of the scope of the present work.

      - The general novelty of studying t-2-hex stress is lowered in light of existing literature in humans (see e. g. Chipuk et al., 2012; Cohen et al., 2020; Jarugumilli et al., 2018), and in yeast by the same authors (Manzanares-Estreder et al., 2017) and as the authors comment themselves, a significant part of the manuscript may represent rather a confirmation of the already described consequences of t-2-hex stress 

      We do not agree and we have not commented that our present study is a mere confirmation of t-2-hex stress previously applied in yeast and human models. In humans, t-2-hex has been identified as an efficient pro-apoptotic lipid, which causes mitochondrial dysfunction via direct lipidation of Bax, however the studies of Jarugumilli et al. (2018) revealed that many other direct t-2-hex targets exist, which remained uninvestigated to date. This work continues our previous studies (Manzanares-Estreder et al., 2017), where we show that t-2-hex is a universal proapoptotic lipid applicable in yeast models and contributes important novel findings, such as the massive transcriptional response resembling proteostatic defects caused by t-2-hex, mitochondrial protein import as a physiologically important and direct target of t-2-hex, the function of detoxifying enzymes such as Hfd1 in modulating lipid-mediated inhibition of mitochondrial protein import and general proteostasis. Additionally, we provide transcriptomic, chemoproteomic and functional genomic data to the scientific community, which will be a rich source for future studies on yet undiscovered pro-apoptotic mechanisms employed by t-2-hex. 

      Reviewer #2 (Public Review): 

      This study elucidates the toxic effects of the lipid aldehyde trans-2-hexadecenal (t-2-hex). The authors show convincingly that t-2-hex induces a strong transcriptional response, leads to proteotoxic stress, and causes the accumulation of mitochondrial precursor proteins in the cytosol. 

      The data shown are of high quality and well controlled. The genetic screen for mutants that are hyper-and hypo-sensitive to t-2-hex is elegant and interesting, even if the mechanistic insights from the screen are rather limited. The last part of the study is less convincing. The authors show evidence that t-2-hex affects subunits of the TOM complex. However, they do not formally demonstrate that the lipidation of a TOM subunit is responsible for the toxic effect of t-2-hex. A t-2-hex-resistant TOM mutant was not identified. Moreover, it is not clear whether the concentrations of t-2-hex in this study are physiological. This is, however, a critical aspect. The literature is full of studies claiming the toxic effects of compounds such as H2O2; even if such studies are technically sound, they are misleading if nonphysiological concentrations of such compounds were used.

      Nevertheless, this is an interesting study of high quality. A few specific aspects should be addressed.

      We have now performed t-2-hex toxicity assays using several mutants in Tom subunits, the cysteine free mutant of the essential Tom40 core channel and deletion mutants in the accessory subunits Tom70 and Tom20 (new Figure 8). As a result, cysteine lipidation of Tom40 alone is not sufficient to confer t-2-hex toxicity. This implies most likely other lipidation targets within the TOM and TIM complexes, as indicated by our in vitro lipidation studies. Indeed, as shown in new Figure 8, the absence of Tom70 and Tom20 function significantly increases tolerance to t-2-hex indicating that the TOM complex is a physiologically important target of the proapoptotic lipid, which acts most likely via lipidation of more subunits than the Tom40 import channel.

      We have now performed several experiments with lower t-2-hex concentrations. A new chemoproteomic study with 10µM t-2-hex-alkyne has been conducted and the new results added to the supplementary information combining 10µM and 100µM in vitro lipidation studies (Suppl. Table 6). Many subunits of the TOM and TIM complexes consistently are enriched significantly in both chemoproteomic experiments. These new data are summarized in revised Figure 8.

      Additionally, we have performed in vivo pre-protein assays with lower t-2-hex concentrations. As shown in new Figure 5, Aim17 mitochondrial import is already inhibited by t-2-hex doses as low as 10µM in a wild type strain, and this inhibition is enhanced in a hfd1 mutant and alleviated in a Hfd1 overexpressor. It is important to note that a dose of 10µM of external t-2-hex addition is significantly lower than doses applied to human cell cultures such as in Jarugumilli et al. (2018). It proves that mitochondrial protein import is a sensitive and physiologically relevant t-2-hex target in our yeast models and that t-2-hex detoxification by enzymes such as the Hfd1 dehydrogenase sensitively regulates this specific inhibition.

      Reviewer #3 (Public Review): 

      Summary: The authors investigate the effect of the lipid aldehyde trans-2-hexadecenal (t-2-hex) in yeast using multiple omic analyses that show that a large range of cellular functions across all compartments are affected, e.g. transcriptomic changes affect 1/3 of all genes. The authors provide additional analyses, from which they built a model that mitochondrial protein import caused by modification of Tom40 is blocked.

      Strengths: Global analyses (transcriptomic and functional genomics approach) to obtain an unbiased overview of changes upon t-2-hex treatment. 

      Weaknesses: It is not clear why the authors decided to focus on mitochondria, as only 30 genes assigned to the GO term "mitochondria" are increasing, and also the follow-up analyses using SATAY is not showing a predominance for mitochondrial proteins (only 4 genes are identified as hits). The provided additional experimental data do not support the main claims as neither protein import is investigated nor is there experimental evidence that lipidation of Tom40 occurs in vivo and impacts on protein translocation. 

      30 mitochondrial gene functions are very strongly (>10 fold) up-regulated by t-2-hex. However, when genes up-regulated (>2 log2FC) or down-regulated (<-2 log2FC) by t-2-hex were selected and subjected to GO category enrichment analysis, we found that “Mitochondrial organization” was the most numerous GO group activated by t-2-hex, while it was “Ribosomal subunit biogenesis” for t-2-hex repression (new data in Suppl. Tables 1 and 2). 
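
      The two cut-offs in this response sit on different scales, which a short sketch makes explicit. This is an illustration only: the function names and gene labels are hypothetical, and the strict inequalities follow the ">2" and "<-2" wording above. A log2FC of 2 corresponds to a 4-fold change, whereas a 10-fold change corresponds to a log2FC of about 3.32.

```python
import math

def fold_to_log2(fold):
    """Convert a linear fold change to the log2 scale."""
    return math.log2(fold)

def split_by_log2fc(log2fc_by_gene, cutoff=2.0):
    """Return (up, down) gene lists using strict log2FC cut-offs."""
    up = sorted(g for g, fc in log2fc_by_gene.items() if fc > cutoff)
    down = sorted(g for g, fc in log2fc_by_gene.items() if fc < -cutoff)
    return up, down

# Hypothetical log2 fold changes; gene labels are placeholders.
example = {"gene_1": 3.5, "gene_2": -2.4, "gene_3": 0.3, "gene_4": 2.0}
up, down = split_by_log2fc(example)
```

      Each list would then be passed separately to a GO enrichment tool, so that activated and repressed categories are reported independently.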

      In the revised version of our manuscript, we have now included extensive new experimentation, which shows that protein import at the TOM complex is a physiologically important target of the pro-apoptotic lipid t-2-hex and that enzymes such as the Hfd1 dehydrogenase sensitively regulate this inhibition. In vitro chemoproteomic experiments have now been performed at more physiological t-2-hex concentrations of 10µM, which is lower than published data in human cell models. Consistently, several TOM and TIM subunits are enriched in these in vitro lipidation studies (new Fig. 8B). Tom40 lipidation alone is not sufficient to explain t-2-hex toxicity, as a cysteine-free version of Tom40 does not confer tolerance to the apoptotic lipid (new Fig. 8D). Importantly however, the loss of function of nonessential accessory Tom subunits 70 or 20 confers t-2-hex tolerance (new Fig. 8D) indicating that pre-protein import at the TOM complex is a physiological target of t-2-hex most likely dependent on lipidation of more Tom subunits than just the essential Tom40 pore. Moreover, we now show that mitochondrial protein import is inhibited by the lipid at low physiological doses of 10µM and that this inhibition is modulated by the gene dose of the t-2-hex degrading Hfd1 enzyme (new Fig. 5G).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Private recommendations for the authors 

      - On the existing data from Supp. Table 6, the authors may include a global assessment of whether or not the protein included a cysteine (the likely site for lipidation). 

      Although free cysteines in target proteins are the most frequent sites of modification by LDEs such as t-2-hex, other amino acids such as lysines or histidines can be lipidated by these lipid derivatives. Therefore we would like to exclude this information from our chemoproteomic data.

      - What determines whether a gene is labeled in Fig. 6B other than fold change? Why is MAC1 with the highest FC not shown? 

      We analyzed the potential anti-apoptotic SATAY hits with a log2 < -0.75 according to expected detoxification pathways (heat shock response, pleiotropic drug response), to their function in the ER (the intracellular site where t-2-hex is generated) or in mitochondria (the major t-2-hex target identified so far). This is now better described in the text. As for the potential pro-apoptotic SATAY hits, we analyzed gene functions with a log2 > 1.5 and marked the predominant groups “Cytosolic ribosome and translation” and “Amino acid metabolism”. In any case, the interested reader has all SATAY data available in supplemental tables 4 and 5 to find alternative gene functions with a potential role in cellular adaptation to t-2-hex.

      - Supplementary Table numbering should be double-checked.

      Ok, numbering has been double-checked.

      Reviewer #2 (Recommendations For The Authors): 

      Major points 

      (1) Identification of the t-2-hex target. Neither Tom70, Tom20 nor the cysteine in Tom40 is essential. If one of these components is critical for the t-2-hex-mediated toxicity, mutants should be t-2-hex-resistant. This is a straight-forward, simple, and critical experiment. 

      We have now performed t-2-hex toxicity assays in the cysteine free Tom40 mutant, and tom20 and tom70 deletion mutants. As shown in new Figure 8, cysteine lipidation of Tom40 alone is not sufficient to confer t-2-hex toxicity. However, the absence of Tom70 and Tom20 function significantly increases tolerance to t-2-hex indicating that the TOM complex is a physiologically important target of the proapoptotic lipid, which acts most likely via lipidation of more subunits than the Tom40 import channel.

      (2) The authors claim that t-2-hex blocks the TOM complex. Since in vitro import assays with yeast mitochondria are a well established and simple technique, the authors should isolate mitochondria from their cells and perform import experiments. It is expected that those mitochondria show reduced import rates, however, swelling of these mitochondria to mitoplasts should suppress the import defect. 

      We agree that our study does not investigate a direct effect of t-2-hex on the import capacity of purified mitochondria. However, we determine the in vivo accumulation of several mitochondrial precursor proteins, which is widely used to assay for the efficiency of mitochondrial protein import. For example, the recent hallmark paper discovering the mitoCPR protein import surveillance pathway exclusively uses epitope-tagged mitochondrial precursors to determine the regulation of mitochondrial protein import (Weidberg and Amon, Science 2018 360(6385)). Additionally, our new results that mutants in accessory TOM subunits 20 and 70 are hyperresistant to t-2-hex (Figure 8D) and that deletion of TOM20 decreases the t-2-hex induced pre-protein accumulation (Suppl. Figure 1) identify the TOM complex and hence protein import at the outer mitochondrial membrane as a physiologically important t-2-hex target.

      (3) The first part of the study is very strong. The last figure is also of good quality, however, it is not clear whether the effects on TOM subunits are really causal for the observed t-2-hex effect on gene expression. The authors might cure this by improved data or by avoiding bold statements such as: 'Hfd1 associates with the Tom70 subunit of the TOM complex and t-2-hex covalently lipidates the central Tom40 channel, which altogether indicates that transport of mitochondrial precursor proteins through the outer mitochondrial membrane is directly inhibited by the pro-apoptotic lipid and thus represents a hotspot for pro- and anti-apoptotic signaling.' (Abstract). 

      We now show that several TOM and TIM subunits are lipidated in vitro by physiological low t-2-hex concentrations, that loss of function of accessory subunits Tom20 or Tom70 rescues t-2-hex toxicity (new Figure 8) and that the gene dose of Hfd1 determines the degree of mitoprotein import block (new Figure 5). These data identify the TOM complex as a physiologically important target of the pro-apoptotic lipid. The Abstract has been modified accordingly.

      (4) If the t-2-hex levels are in a physiological range, one would expect that overexpression of Hfd1 prevents the t-2-hex-induced import arrest.

      We have now confirmed that overexpression of Hfd1 indeed prevents inhibition of mitochondrial protein import by t-2-hex. As shown in new Figure 5, Aim17 mitochondrial import is already inhibited by t-2-hex doses as low as 10µM in a wild type strain, and this inhibition is enhanced in a hfd1 mutant and alleviated in a Hfd1 overexpressor.

      (5) The authors claim that Fmp52 is a t-2-hex-detoxifying enzyme, but do not show evidence. They should rewrite this sentence and be more cautious, or they should show that increased Fmp52 levels indeed deplete t-2-hex from mitochondria.  

      We show that loss of Fmp52 function leads to a strong t-2-hex sensitivity. Fmp52 belongs to the NAD-binding short-chain dehydrogenase/reductase (SDR) family and localizes to highly purified mitochondrial outer membranes (Zahedi et al, 2006). These are the indications that suggest that Fmp52 participates in the enzymatic detoxification of t-2-hex in addition to Hfd1. The Results section has been modified accordingly.

      Minor points: 

      (6) Aim17 was recently identified as a characteristic constituent of cytosolic protein aggregates named MitoStores (Krämer et al., 2023, EMBO J). The authors might test whether the cytosolic Aim17 protein colocalizes with the Hsp104-GFP granules that accumulate upon t-2-hex exposure as shown in Fig. 4A. 

      We agree that determining the fate of unimported mitochondrial precursors upon t-2-hex stress would be interesting. We have made some attempts to co-visualize Aim17-dsRed and Hsp104-GFP upon t-2-hex treatment, but we still have some technical issues. While we clearly see that Aim17 accumulates in cytoplasmic foci upon prolonged t-2-hex exposure, we are not able to determine colocalization with Hsp104, in great part because t-2-hex causes mitochondrial fragmentation, which leads to the appearance of Aim17-stained foci in the cytosol independently of protein aggregates. Since so far we are not able to localize Aim17 unambiguously in Hsp104 containing aggregates (MitoStores) upon lipid stress, we would like to move the manuscript forward without those experiments.

      (7) In Fig. 1A, the figures of the different lines are difficult to distinguish. Lines of one color with different intensities would be better suited. 

      We have previously worked with dose-response profiles generated by the destabilized luciferase system and found that the color-coded representation of the plots is the most effective way to represent the data, see for example Fita-Torró et al. Mol Ecol. 2023 32(13):3557-3574, Pascual-Ahuir et al. BBA 2019 1862(4):457-471, Rienzo et al., Mol Cell Biol. 2015 35(21):3669-83, and several other publications. Therefore we want to keep the format of the Figure.

      (8) A title page should be added to each of the supplemental data files with short descriptions of the information that is provided in the columns of the tables.

      Explanatory title pages have now been added to the supplemental data files.

      Reviewer #3 (Recommendations For The Authors): 

      Figure 5A: The authors aim to assess protein import, however, their experimental set-up is not suited and does not allow conclusions about protein translocation into mitochondria. The authors monitor protein steady state levels, which does not reflect import capacity. For this e.g. pulse-chase experiments coupled to coIP or in organello import assays with radiolabeled substrate proteins would be required. In addition, the authors lack a non-treated control to show that no precursor accumulates in the absence of CCCP and t-2-hex. At the moment, the conclusion of blocked import cannot be made, as there are many other explanations for the observed steady state levels, e.g. the TAP tag interfered with the import competence of the precursor or t-2-hex could impact on MPP function (in particular as Figure 8B shows that also intra-mitochondrial proteins undergo modification by t-2-hex). 

      We agree that our study does not investigate a direct effect of t-2-hex on the import capacity of purified mitochondria. However, we determine the in vivo accumulation of several mitochondrial precursor proteins, an approach that is widely used to assay the efficiency of mitochondrial protein import. For example, the recent landmark paper that discovered the mitoCPR protein import surveillance pathway relied exclusively on epitope-tagged mitochondrial precursors to determine the regulation of mitochondrial protein import (Weidberg and Amon, Science 2018 360(6385)). Figure 5 contains several non-treated control experiments, which show that no (or fewer, in the case of Ilv6) precursors of Tap-tagged Aim17, Cox5a, Ilv6, or Sdh4 accumulate in the absence of CCCP or t-2-hex. This is shown in Figure 5A for untreated cells and in Figure 5B and new Figure 5G for solvent (DMSO) treated cells. This demonstrates that the Tap-tag does not interfere with the import of the respective precursors. Additionally, our new results showing that mutants in the accessory TOM subunits Tom20 and Tom70 are hyperresistant to t-2-hex (Figure 8D) identify the TOM complex, and hence protein import at the outer mitochondrial membrane, as a physiologically important t-2-hex target.

      Figure 8: The conclusion that Tom40 is directly lipidated comes from an in vitro assay, with the conclusion that Tom40 is the main target, because it is the only Tom protein with a cysteine (Tom70 as not being part of the Tom core is excluded, however, lack of Tom70 function would also have detrimental consequences for mitochondrial protein import). However, there is no experiment showing a modification of Tom40 and a consequence for protein import. The proposed model is therefore very far-fetched and several aspects are speculation but not supported by experimental data. To propose such a model, the authors need to show experimental evidence, e.g. by generating a yeast strain in which the cysteines in Tom40 are replaced by e.g. serine residues, and then assessing whether protein import (e.g. by pulse-chase assays) is no longer affected upon addition of t-2-hex. 

      Deletion of all four cysteine residues in Tom40 is not sufficient to confer resistance to t-2-hex stress. This result had been included in the original manuscript, but was somewhat hidden in the Discussion. The revised manuscript now includes t-2-hex tolerance assays for the Tom40 cysteine-free mutant in new Figure 8D. Thus, cysteine lipidation of Tom40 alone cannot by itself explain t-2-hex toxicity. This most likely implies other lipidation targets within the TOM and TIM complexes, as indicated by our in vitro lipidation studies. We therefore included the non-essential adaptor proteins Tom70 and Tom20 of the TOM complex and tested the respective deletion mutants in t-2-hex tolerance assays. As shown in new Figure 8D, the absence of Tom70 and Tom20 function significantly increases tolerance to t-2-hex, indicating that the TOM complex is a physiologically important target of the pro-apoptotic lipid, which most likely acts via lipidation of more subunits than the Tom40 import channel.

      Figure 8A: The pulldown experiments lack positive (other Tom subunits) and negative controls and were performed with (large) tags on all proteins, which can easily result in false positive interactions. The conclusion that Hfd1 interacts with Tom70 and Tom22 cannot be made. Also, the conclusion if an interaction is robust or not cannot be made as the pull-down lacks control fractions, it is also not clear how much of the eluate was loaded. Finally, Hfd1-HA was not expressed from its endogenous promoter, likely resulting in over-expression, which again strongly hampers conclusions about bona fide interaction partners. 

      We agree that our pulldown studies are done in an artificial context, such as the Hfd1 overexpression needed to obtain sufficient protein levels for detection or the use of Tap-fusion proteins. However, the conclusion that Tom70 is a potential interactor of Hfd1 can be made based on the following observations: Hfd1-HA is preferentially pulled down from total protein extracts containing Tom70-Tap, but not from extracts containing no Tap-protein, and significantly less from extracts containing Tom22-Tap, another TOM-associated subunit. The pulldown assay has now been repeated several times, and the efficiency of Hfd1 pulldown has been quantified and statistically analyzed with respect to the quantity of purified Tom protein, which is shown in modified Figure 8A. 

      Figure 4A and C: Depletion of proteasomal activity results in larger aggregates in Figure 4A. However, the addition of t-2-hex blocks proteasomal activity (Figure 4C). How can proteasome inhibition result in bigger aggregates if the proteasomal activity is lost upon t-2-hex addition?

      The negative effect of t-2-hex on proteasomal activity is most likely an indirect effect caused by protein aggregation (Bence et al., Science 2001 292-1552) and occurs in wild type and rpn4 mutant cells with reduced proteasomal activity (Fig. 4C). t-2-hex causes cytosolic protein aggregation in wild type cells, which is aggravated (more and larger protein aggregates) in rpn4 mutants because of their lower levels of active proteasome (Fig. 4A). The observed protein aggregates will further diminish proteasomal activity, which is confirmed in Fig. 4C. 

      Figure 1B: The authors use a reporter to determine HFD1 expression that consists of the promoter region of HFD1 fused to luciferase. These fusion constructs have been shown to often not reflect the bona fide expression levels of genes (Yoneda et al., J Cell Sci 2004). qPCR analysis of transcript levels should be included to support the induction of HFD1. 

      We agree that the live cell luciferase reporters used here are not suitable for the determination of absolute mRNA levels. However, the aim of these reporter experiments is to quantify the inducibility of different genes (HFD1, GRE2) dependent on increasing stress doses. These dose-response profiles cannot be obtained by qPCR analysis, whereas the destabilized reporters are an excellent tool for this purpose and have been used to accurately describe numerous dynamic stress responses (for example: Dolz-Edo et al. 2013 MCB 33:2228-40, Rienzo et al. 2015 MCB 35:3669-83, Pascual-Ahuir et al. 2019 BBA 1862:457-471). Additionally, the induction of HFD1 mRNA levels by salt (NaCl) and oxidative (menadione) stress determined by qPCR has been published before (Manzanares-Estreder et al. 2017 Oxid Med Cell Longevity 2017:2708345).

      The authors conclude from Figure 1 that entry into apoptotic cell death is modulated by efficient t-2-hex detoxification. However, this is based on growth curves and no analysis of apoptotic cell death is performed. The data show that the addition of hexadecenal results in a growth arrest, that is overcome likely upon degradation of t-2-hex (depending on the amount of Hfd1). 

      We agree that our experiments measure growth inhibition and not specifically apoptotic cell death. The text has been changed accordingly.  

      Figure 4A: Microscopy images show between 1-2 yeast cells. Either more cells need to be shown or quantifications of the aggregates are required. In addition, it is not clear if the control received the same DMSO concentration as the treated cells and also the time point for the control is not specified. 

      We have now quantified the number of aggregates across cell populations in new Figure 4A in DMSO, t-2-hex and t-2-hex-H2 treated wt and rpn4 mutants. These data show specific aggregate induction by t-2-hex and not by DMSO or the saturated t-2-hex-H2 control, which is aggravated in rpn4 mutants and avoided by CHX pretreatment.

      Figure 5: Western blots in figure 5A, B, D, E and F lack a loading control. Without this, conclusions about increases in protein abundance cannot be made.

      We have now included additional panels with the loading controls for the Western blots in new figure 5, except figure 5B, where the appearance or not of the pre-protein can be compared to the amount of mature protein in the same blot.

      Figure 2B: Complex II assembly factors SDH5,6,9 are described here as ETC complexes. As the proteins are not part of the mature complex II, the heading should be modified into ETC complexes and ETC assembly.

      Figure 2B has been revised and the classification of ETC proteins changed accordingly.

    1. Author response

      Reviewer #1 (Public Review):

      The authors use neural recordings from three different brain areas to assess whether the type of evidence accumulation dynamics in those regions are (1) similar to one another, and (2) similar to best-fitting evidence accumulation dynamics to behavioral choice alone. This is an important theoretical question because it relates to the 'linking hypothesis' that relates neurophysiological data to psychological phenomena. Although the standard evidence accumulation dynamic in describing choice has been the gradual accumulation of evidence, the authors find that those dynamics are not represented equally in all brain regions. Such results suggest that more nuanced computational models are needed to explain how brain areas interact to produce decisions, and the focus of theoretical development should shift away from explaining behavioral patterns alone and more toward explaining both brain and behavioral interactions. Given that the authors simply test the assumption that the same dynamics that best explain behavior should also explain neural data, they accomplish their objective using a sophisticated methodology and find evidence *against* this assumption: they find that each region was best described by a distinct accumulation model, which all differed from the model that best described the rat's choices.

      I thought this was an excellent paper with a clear scientific objective, direct analysis to achieve that objective, and a very strong methodological approach to leave little doubt that the conclusions they drew from their analyses were as reasonable and accurate as possible.

      We thank the reviewer for their time and appreciate their generous comments.

      Reviewer #2 (Public Review):

      The neural dynamics underlying decision-making have long been studied across different species (e.g., primates and rodents) and brain areas (e.g., parietal cortex, frontal eye fields, striatum). The key question is to what extent neural firing rates covary with evidence accumulation processes as proposed by evidence accumulation models. It is often assumed that the evidence-accumulation process at the neural level should mirror the evidence-accumulation process at the behavioral level. The current paper shows that the neural dynamics of three rat brain regions (the FOF, ADS, and PCC) all show signatures of evidence accumulation, but in distinct ways. Especially the role of the FOF appears to be distinct, due to its dependence on early evidence compared to the other regions. This sheds new light and a new interpretation of the role of the FOF in decision-making - previously, it has been described as a region encoding the choice that is currently being committed to; this new analysis suggests it is instead strongly influenced by early evidence.

      A major strength of the paper is that the results are achieved through joint modelling of the behavioral and neural data, combined with information on the physical stimulus at hand. Joint models were shown to provide more information on the underlying processes compared to behavioral or neural models alone. Especially the inclusion of the neural data seemed to have greatly improved the quality of inferences. This is a key contribution that illustrates that the sophisticated modelling of multiple sources of data at the same time, pays off in terms of the quality of inferences. Yet, it should be added here, that due to the nature of the task, the behavioral data contained only choices, and not response times, which tend to contain more information regarding the evidence accumulation process than choice alone. It would be interesting to additionally discuss how choice decision times can be modeled with the proposed modelling framework.

      We thank the reviewer for their generous views on our work. We agree that adding decision times, which could readily be added to our framework, will likely further constrain the inference of the latent model. We are currently pursuing such topics using this framework and appropriate data. We have altered a passage in our Discussion, where we note the various extensions of our model one could pursue, to include response time within the set of behavioral measurements one might include.

      A main limitation of the paper is that it does not appear to address a seemingly logical follow-up question: If these three brain regions individually accumulate evidence in distinct manners, how do these multiple brain regions then each contribute to a final choice? The joint models fit each region's data separately, so how well does each region individually 'explain' or 'predict' behavior, and how does the combined neural activity of the regions lead to manifest behavior? I would be very interested in the authors' perspectives on these questions.

      We share the reviewer's interest in this question and could not be more excited about it! Unfortunately, the experiments necessary for answering this question in a satisfying way have not yet been performed (e.g. simultaneous multi-region population recordings). Additionally, our analysis approach, as presented currently, would require some technical alterations to deal with data at that scale. Both efforts are underway, but we feel that the current manuscript describes the basic modeling framework one would need to address these questions if and when such data exist. We have added some text to the Discussion to highlight these exciting future directions:

      “An exciting future application of our modeling framework is to model multiple, independent accumulators in several brain regions which collectively give rise to the animal’s behavior. Such a model would provide incredible insight into how the brain collectively gives rise to behavioral choices.”

      There are some remaining questions regarding the specific models used, that I was hoping the authors could clarify. Specifically, in equations 10-11, I was wondering to what extent there might be a collinearity issue. Equation 10 proposes that the firing rates of neurons can vary across time due to two mechanisms: (1) The dependence of the firing rate on the accumulated evidence, and (2) a time-varying trial average (as detailed in Equation 11). If firing rates of the neuron indeed covary with the accumulated evidence and therefore increase across time, how can the effects of mechanisms 1 and 2 be disentangled? Relatedly, the independent noise models model each neuron separately and thereby include many more parameters, each informed by less data. Is it possible that the relatively poor cross-validation of the independent noise model may be a consequence of the overfitting of the independent noise model?

      Thank you for this important observation. Please see our response to the essential revisions above which addresses this issue. In short, although it is true that firing rates increase with time (with accumulating evidence) they do so in a way that depends on the stimulus, and so just as often as they increase with time, they decrease.

      Regarding the poor cross-validation of the independent noise model, we apologize for the confusion here: the shared and independent noise models have exactly the same number of parameters. They differ only in that the latent process for a trial contains a unique noise instantiation per trial for the independent noise model and the same instantiation for the shared model. See above for our response to this issue, and how the manuscript was modified in light of this confusion.

      Another related question is how robust the parameter recovery properties of these models are under a wider range of data-generating parameter settings. I greatly appreciate the inclusion of a parameter recovery study (Figure S1C) using a single synthetic dataset, but it could be made even stronger by simulating multiple datasets with a wider range of parameter settings. Such a simulation study would help understand how robust and reliable the estimated parameters of all models are. Similarly, it would be helpful if also the \theta_{y} parameters are shown, which aren't shown in Figure S1C.

      We agree that understanding the model fitting behavior under a wider set of parameter settings is valuable. We fit our model to additional sets of parameter settings and included an additional supplemental figure (Figure 1 — figure supplement 2) to illustrate these results. In short, we found that parameter recovery was robust across the different parameter settings we tested. We also updated Figure S1C with the neural parameters. We included the following in the Results to note that parameter recovery was robust:

      “We verified that our method was able to recover the parameters that generated synthetic physiologically-relevant spiking and choices data (Figure 1 — figure supplement 1), and that parameter recovery was robust across a range of parameter values (Figure 1 — figure supplement 2).” 

      An aspect of the paper that initially raised confusion with me is that the models fit on the choice data and stimulus information alone, make different predictions for the evidence accumulation dynamics in different regions (e.g., Figure 5A, 6A) and also led to different best-fitting parameters in different regions (Figure S9A). It took me a while to realize that this is due to the data being pooled across different rats and sessions - as such, the behavioral choice data are not the same across regions, and neither is the resulting fit models. This could easily be clarified by adding a few notes in the captions of the relevant figures.

      Thanks for pointing this out. We agree that this tends to be a point of confusion, and we have added clarification prior to Fig 3, where the choice model is first introduced:

      “We stress that because of this, each fitted choice model uses different behavioral choice data, and thus the fitted parameters vary from fitted model to fitted model.”

      Combined, this manuscript represents an interesting and welcome contribution to an ongoing debate on the neural dynamics of decision-making across different brain regions. It also introduced new joint modelling techniques that can be used in the field and raised new questions on how the concurrent activity of neurons across different brain regions combined leads to behavior.

      We appreciate the very generous views on our work!

    1. Author response:

      eLife assessment

      This useful study reports on the discovery of an antimicrobial agent that kills Neisseria gonorrhoeae. Sensitivity is attributed to a combination of DedA assisted uptake of oxydifficidin into the cytoplasm and the presence of a oxydifficidin-sensitive RplL ribosomal protein. Due to the narrow scope, the broader antibacterial spectrum remains unclear and therefore the evidence supporting the conclusions is incomplete with key methods and data lacking. This work will be of interest to microbiologists and synthetic biologists.

      General comment about narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The main focus of this study is on its previously unreported potent anti-gonococcal activity and mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      We are troubled by the statement that our paper is narrow in scope and that evidence supporting our conclusions is incomplete. We do not feel the reviews as presented substantiate drawing this conclusion about our work.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Kan et al. report the serendipitous discovery of a Bacillus amyloliquefaciens strain that kills N. gonorrhoeae. They use TnSeq to identify that the anti-gonococcal agent is oxydifficidin and show that it acts at the ribosome and that one of the dedA gene products in N. gonorrhoeae MS11 is important for moving the oxydifficidin across the membrane.

      Strengths:

      This is an impressive amount of work, moving from a serendipitous observation through TnSeq to characterize the mechanism by which Oxydifficidin works.

      Weaknesses:

      (1) There are important gaps in the manuscript's methods.

      The requested additions to the method describing bacterial sequencing and anti-gonococcal activity screening will be made. However, we do not think the absence of these generic methods reduces the significance of our findings.

      (2) The work should evaluate antibiotics relevant to N. gonorrhoeae.

      (1) It is not clear to us why reevaluating the activity of well-characterized antibiotics against known N. gonorrhoeae clinical strains would add value to this manuscript. The activity of clinically relevant antibiotics against antibiotic-resistant N. gonorrhoeae clinical isolates is well described in the literature. Our use of antibiotics in this study was intended to aid in the identification of oxydifficidin’s mode of action. This is true for both Tables 1 and 2.

      (2) If the reviewer insists, we would be happy to include MIC data for the following clinically relevant antibiotics: ceftriaxone (cephalosporin/beta-lactam), gentamicin (aminoglycoside), azithromycin (macrolide), and ciprofloxacin (fluoroquinolone).

      (3) The genetic diversity of dedA and rplL in N. gonorrhoeae is not clear, neither is it clear whether oxydifficidin is active against more relevant strains and species than tested so far.

      (1) We thank the reviewer for this suggestion. We aligned the DedA sequence from strain MS11 with DedA proteins from 220 N. gonorrhoeae strains that have high-quality assemblies in NCBI. The result showed that there are no amino acid changes in this protein. Using the same method, we observed several single amino acid changes in RplL. This included changes at A64, G25 and S82 in 4 strains with one change per strain. These sites differ from R76 and K84, where we identified changes that provide resistance to oxydifficidin. Notably, in a similar search of representative Escherichia, Chlamydia, Vibrio, and Pseudomonas NCBI deposited genomes, we did not identify changes in RplL at position R76 or K84.

      (2) While the usefulness of screening more clinically relevant antibiotics against clinical isolates as suggested in comment 2 was not clear to us, we agree that screening these strains for oxydifficidin activity would be beneficial. We have ordered Neisseria gonorrhoeae strain AR1280, AR1281 (CDC), and Neisseria meningitidis ATCC 13090. They will be tested when they arrive.
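      The residue-by-residue comparison described in item (1) above can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the sequences have already been aligned to a common length, and the reference and strain sequences below are made-up toy strings, not DedA or RplL. The function names `substitutions` and `tally_changes` are hypothetical.

```python
from collections import Counter


def substitutions(reference: str, other: str) -> list[str]:
    """List amino acid changes between two aligned, equal-length sequences.

    Positions are 1-based and counted along the ungapped reference;
    columns with a gap ('-') in either sequence are skipped.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    position = 0
    for ref_aa, alt_aa in zip(reference, other):
        if ref_aa != "-":
            position += 1
        if ref_aa != alt_aa and ref_aa != "-" and alt_aa != "-":
            changes.append(f"{ref_aa}{position}{alt_aa}")
    return changes


def tally_changes(reference: str, aligned: dict[str, str]) -> Counter:
    """Count how many strains carry each substitution."""
    counts = Counter()
    for sequence in aligned.values():
        counts.update(substitutions(reference, sequence))
    return counts


# Toy data (hypothetical sequences, for illustration only).
REFERENCE = "MKTAYIAKQR"
STRAINS = {
    "strain_a": "MKTAYIAKQR",  # identical to the reference
    "strain_b": "MKTSYIAKQR",  # one change at position 4
    "strain_c": "MKTAYIAKQK",  # one change at position 10
}
CHANGES = tally_changes(REFERENCE, STRAINS)
print(dict(CHANGES))
```

      A tally of this kind makes it easy to check whether any observed change falls on a position of interest, such as the resistance-associated sites named in the response.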

      Reviewer #2 (Public Review):

      Summary:

      Kan et al. present the discovery of oxydifficidin as a potential antimicrobial against N. gonorrhoeae, including multi-drug resistant strains. The authors show the role of DedA flippase-assisted uptake and the specificity of RplL in the mechanism of action for oxydifficidin. This novel mode of action could potentially offer a new therapeutic avenue, providing a critical addition to the limited arsenal of antibiotics effective against gonorrhea.

      Strengths:

      This study underscores the potential of revisiting natural products for antibiotic discovery of modern-day-concerning pathogens and highlights a new target mechanism that could inform future drug development. Indeed there is a recent growing body of research utilizing AI and predictive computational informatics to revisit potential antimicrobial agents and metabolites from cultured bacterial species. The discovery of oxydifficidin interaction with RplL and its DedA-assisted uptake mechanism opens new research directions in understanding and combating antibiotic-resistant N. gonorrhoeae. Methodologically, the study is rigorous employing various experimental techniques such as genome sequencing, bioassay-guided fractionation, LCMS, NMR, and Tn-mutagenesis.

      Weaknesses:

      The scope is somewhat narrow, focusing primarily on N. gonorrhoeae. This limits the generalizability of the findings and leaves questions about its broader antibacterial spectrum. Moreover, while the study demonstrates the in vitro effectiveness of oxydifficidin, there is a lack of in vivo validation (i.e., animal models) for assessing pre-clinical potential of oxydifficidin. Potential SNPs within dedA or RplL raise concerns about how quickly resistance could emerge in clinical settings.

      (1) Spectrum/narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The focus of this study is on its previously unreported potent anti-gonococcal activity and its mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      (2) Animal models: We acknowledge the reviewer’s insight regarding the importance of in vivo validation to enhance oxydifficidin’s pre-clinical potential. However, due to the labor-intensive process needed to isolate oxydifficidin, obtaining a sufficient quantity for animal studies is beyond the scope of this study. Our future work will focus on optimizing the yield of oxydifficidin and developing a topical mouse model for subsequent investigations.

      (3) Potential SNPs: Please see our response to Reviewer #1’s comment 3. We acknowledge that potential SNPs within dedA and rplL raise concerns regarding clinical resistance, which is a common issue for protein-targeting antibiotics. Yet, as pointed out in the manuscript, obtaining mutants in the lab was a very low yield endeavor.

      Reviewer #3 (Public Review):

      Summary:

      The authors have shown that oxydifficidin is a potent inhibitor of Neisseria gonorrhoeae. They were able to identify the target of action to rplL and showed that resistance could occur via mutation in the DedA flippase and RplL.

      Strengths:

      This was a very thorough and clearly argued set of experiments that supported their conclusions.

      Weaknesses:

      There was no obvious weakness in the experimental design. Although it is promising that the DedA mutations resulted in attenuation of fitness, it remains an open question whether secondary rounds of mutation could overcome this selective disadvantage which was untried in this study.

      We thank the reviewer for the positive comment. We agree that investigating factors that could compensate for the fitness attenuation caused by DedA mutation would enhance our understanding of the role of DedA.

    1. Author response:

      We thank you for the opportunity to provide a concise response. The criticisms are accurately summarized in the eLife assessment:

      the study fails to engage prior literature that has extensively examined the impact of variance in offspring number, implying that some of the paradoxes presented might be resolved within existing frameworks.

      The essence of our study is to propose the adoption of the Haldane model of genetic drift, based on the branching process, in lieu of the Wright-Fisher (WF) model, based on sampling, usually binomial.  In addition to some extensions of the Haldane model, we present 4 paradoxes that cannot be resolved by the WF model. The reviews suggest that some of the paradoxes could be resolved by the WF model, if we engage prior literature sufficiently.

      We certainly could not review all the literature on genetic drift, as there must be thousands of papers. Nevertheless, the literature we do not cover is based on the WF model and therefore has the general properties that all modifications of the WF model share.  (We should note that all such modifications share the sampling aspect of the WF model. To model such sampling, N is imposed from outside of the model, rather than being generated within the model.  Most importantly, these modifications are mathematically valid but biologically untenable, as will be elaborated below. Thus, in concept, the WF and Haldane models are fundamentally different.)

      In short, our proposal is general with the key point that the WF model cannot resolve these (and many other) paradoxes.  The reviewers disagree (apparently only partially) and we shall be specific in our response below.

      We shall first present the 4th paradox, which is about multi-copy gene systems (such as rRNA genes and viruses, see the companion paper). Viruses evolve both within and between hosts. In both stages, there are severe bottlenecks.  How does one address the genetic drift in viral evolution? How can we model the effective population sizes both within- and between- hosts?  The inability of the WF model in dealing with such multi-copy gene systems may explain the difficulties in accounting for the SARS-CoV-2 evolution. Given the small number of virions transmitted between hosts, drift is strong which we have shown by using the Haldane model (Ruan, Luo, et al. 2021; Ruan, Wen, et al. 2021; Hou, et al. 2023). 

      As the reviewers suggest, it is possible to modify the WF model to account for some of these paradoxes. However, the modifications are often mathematically convenient but biologically dubious. Much of the debate is about the progeny number, k.  (We shall use a haploid model for this purpose, but diploidy does not pose a problem, as stated in the main text.) The modifications relax the constraint of V(k) = E(k) inherent in the WF sampling.  One would then ask how V(k) can differ from E(k) under WF sampling, even though such a difference is mathematically feasible (but biologically dubious).  Kimura and Crow (1963) may have been the first to offer a biological explanation.  If one reads it carefully, Kimura's modification makes the WF model resemble the Haldane model. Then, why don't we use the Haldane model in the first place, with its two parameters, E(k) and V(k), instead of the one-parameter WF model?
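      The constraint V(k) = E(k) can be checked directly under one common reading of WF sampling, which is an assumption of this sketch rather than a statement from the manuscript: in a haploid population of constant size N, each parent's progeny number k follows a Binomial(N, 1/N) distribution. In that case E(k) = 1 and V(k) = 1 - 1/N, so the two are nearly equal for any sizeable N. The function name `wf_offspring_moments` is hypothetical.

```python
from math import comb


def wf_offspring_moments(n: int) -> tuple[float, float]:
    """Exact mean and variance of k ~ Binomial(n, 1/n).

    Computed from the probability mass function; intended for
    modest n (up to about 1000) so the binomial coefficients stay
    within floating-point range.
    """
    p = 1.0 / n
    pmf = [comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * weight for k, weight in enumerate(pmf))
    variance = sum((k - mean) ** 2 * weight for k, weight in enumerate(pmf))
    return mean, variance


for size in (10, 100, 1000):
    mean, variance = wf_offspring_moments(size)
    print(size, round(mean, 6), round(variance, 6))
```

      Because the mean and variance are tied together by a single sampling rule, any departure of V(k) from E(k) has to be introduced by modifying that rule, which is the point at issue in the paragraph above.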

      The Haldane model is conceptually simpler. It allows the variation in population size, N, to be generated from within the model, rather than artificially imposed from outside of the model.  This brings us to the first paradox, the density-dependent Haldane model. When N is increasing exponentially as in bacterial or yeast cultures, there is almost no drift when N is very low and drift becomes intense as N grows to near the carrying capacity.  We do not see how the WF model can resolve this paradox, which can otherwise be resolved by the Haldane model.

      The second and third paradoxes are about how much mathematical models of population genetics can be detached from biological mechanisms. The second paradox, about sex chromosomes, is rooted in the realization that V(k) ≠ E(k). Since E(k) is the same between sexes but V(k) is different, how does the WF sampling give rise to V(k) ≠ E(k)? We are asking a biological question that troubled Kimura and Crow (1963), as alluded to above. The third paradox is acknowledged by two reviewers. Genetic drift manifested in the fixation probability of an advantageous mutation is 2s/V(k). It is thus strange that the fundamental parameter of drift in the WF model, N (or Ne), is missing. In the Haldane model, drift is determined by V(k) with N being a scaling factor; hence 2s/V(k) makes perfect biological sense.
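
      A minimal numerical sketch, not taken from the manuscript, can illustrate why the 2s/V(k) form arises in a branching process. It assumes a single mutant lineage whose offspring number has mean 1 + s and variance V(k), and it computes the survival probability as one minus the smallest fixed point of the offspring probability generating function. The Poisson and negative binomial offspring distributions, and all function names, are illustrative assumptions.

```python
import math


def extinction_probability(pgf, tol=1e-14, max_iter=1_000_000):
    """Smallest fixed point q = pgf(q) in [0, 1], found by iterating from q = 0."""
    q = 0.0
    for _ in range(max_iter):
        q_next = pgf(q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def poisson_pgf(mean):
    """PGF of a Poisson offspring distribution (variance equals the mean)."""
    return lambda z: math.exp(mean * (z - 1.0))


def negbin_pgf(mean, variance):
    """PGF of a negative binomial offspring distribution; requires variance > mean."""
    r = mean * mean / (variance - mean)  # dispersion parameter (assumed parameterization)
    return lambda z: (1.0 + (mean / r) * (1.0 - z)) ** (-r)


s = 0.01            # selective advantage (illustrative value)
mean_k = 1.0 + s    # E(k) of the advantageous lineage

cases = {
    "Poisson, V(k) = E(k)": (poisson_pgf(mean_k), mean_k),
    "Negative binomial, V(k) = 5": (negbin_pgf(mean_k, 5.0), 5.0),
}

results = {}
for label, (pgf, var_k) in cases.items():
    survival = 1.0 - extinction_probability(pgf)
    approx = 2.0 * s / var_k
    results[label] = (survival, approx)
    print(f"{label}: survival = {survival:.6f}, 2s/V(k) = {approx:.6f}")
```

      In this sketch N never enters the calculation; only s and V(k) do, which is the point made in the paragraph above. The agreement is an approximation that holds for small s.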

      We now answer the obvious question: If the model is fundamentally about the Haldane model, why do we call it the WF-Haldane model? The reason is that most results obtained by the WF model are pretty good approximations, and there may be no need to constantly re-derive them under the branching process. At least, one can use the WF results to see how well they fit into the Haldane model. In our earlier study (Chen et al. 2017; Fig. 3), we show that the approximations can be very good in many (or most) settings.

      We would like to use the modern analogy of gas-engine cars vs. electric-motor ones. The Haldane model and the WF model are as fundamentally different concepts as the driving mechanisms of gas-powered vs electric cars.  The old model is now facing many problems and the fixes are often not possible.  Some fixes are so complicated that one starts thinking about simpler solutions. The reservations are that we have invested so much in the old models which might be wasted by the switch. However, we are suggesting the integration of the WF and Haldane models. In this sense, the WF model has had many contributions which the new model gratefully inherits. This is true with the legacy of gas-engine cars inherited by EVs.

      The editors also issue the instruction: while the modified model yields intriguing theoretical predictions, the simulations and empirical analyses are incomplete to support the authors' claims. 

      We are thankful to the editors and reviewers for the thoughtful comments and constructive criticisms. We also appreciate the publishing philosophy of eLife that allows exchanges, debates and improvements, which are the true spirits of science publishing.

      References for the provisional author responses

      Chen Y, Tong D, Wu CI. 2017. A New Formulation of Random Genetic Drift and Its Application to the Evolution of Cell Populations. Mol. Biol. Evol. 34:2057-2064.

      Hou M, Shi J, Gong Z, Wen H, Lan Y, Deng X, Fan Q, Li J, Jiang M, Tang X, et al. 2023. Intra- vs. Interhost Evolution of SARS-CoV-2 Driven by Uncorrelated Selection-The Evolution Thwarted. Mol. Biol. Evol. 40.

      Kimura M, Crow JF. 1963. The measurement of effective population number. Evolution:279-288.

      Ruan Y, Luo Z, Tang X, Li G, Wen H, He X, Lu X, Lu J, Wu CI. 2021. On the founder effect in COVID-19 outbreaks: how many infected travelers may have started them all? Natl. Sci. Rev. 8:nwaa246.

      Ruan Y, Wen H, He X, Wu CI. 2021. A theoretical exploration of the origin and early evolution of a pandemic. Sci Bull (Beijing) 66:1022-1029.

      Review comments

      eLife assessment 

      This study presents a useful modification of a standard model of genetic drift by incorporating variance in offspring numbers, claiming to address several paradoxes in molecular evolution.

      It is unfortunate that the study fails to engage prior literature that has extensively examined the impact of variance in offspring number, implying that some of the paradoxes presented might be resolved within existing frameworks.

      We do not believe that the paradoxes can be resolved.

      In addition, while the modified model yields intriguing theoretical predictions, the simulations and empirical analyses are incomplete to support the authors' claims. 

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors present a theoretical treatment of what they term the "Wright-Fisher-Haldane" model, a claimed modification of the standard model of genetic drift that accounts for variability in offspring number, and argue that it resolves a number of paradoxes in molecular evolution. Ultimately, I found this manuscript quite strange.

      The notion of effective population size as inversely related to the variance in offspring number is well known in the literature, and not exclusive to Haldane's branching process treatment. However, I found the authors' point about variance in offspring changing over the course of, e.g. exponential growth fairly interesting, and I'm not sure I'd seen that pointed out before.

      Nonetheless, I don't think the authors' modeling, simulations, or empirical data analysis are sufficient to justify their claims. 

      Weaknesses: 

      I have several outstanding issues. First of all, the authors really do not engage with the literature regarding different notions of an effective population. Most strikingly, the authors don't talk about Cannings models at all, which are a broad class of models with non-Poisson offspring distributions that nonetheless converge to the standard Wright-Fisher diffusion under many circumstances, and to "jumpy" diffusions/coalescents otherwise (see e.g. Mohle 1998, Sagitov (2003), Der et al (2011), etc.). Moreover, there is extensive literature on effective population sizes in populations whose sizes vary with time, such as Sano et al (2004) and Sjodin et al (2005).

      Of course in many cases here the discussion is under neutrality, but it seems like the authors really need to engage with this literature more. 

      The most interesting part of the manuscript, I think, is the discussion of the Density Dependent Haldane model (DDH). However, I feel like I did not fully understand some of the derivation presented in this section, which might be my own fault. For instance, I can't tell if Equation 5 is a result or an assumption - when I attempted a naive derivation of Equation 5, I obtained E(K_t) = 1 + r/c*(c-n)*dt. It's unclear where the parameter z comes from, for example. Similarly, is equation 6 a derivation or an assumption? Finally, I'm not 100% sure how to interpret equation 7. Is that a variance effective size at time t? Is it possible to obtain something like a coalescent Ne or an expected number of segregating sites or something from this? 

      Similarly, I don't understand their simulations. I expected that the authors would do individual-based simulations under a stochastic model of logistic growth, and show that you naturally get variance in offspring number that changes over time. But it seems that they simply used their equations 5 and 6 to fix those values. Moreover, I don't understand how they enforce population regulation in their simulations---is N_t random and determined by the (independent) draws from K_t for each individual? In that case, there's no "interaction" between individuals (except abstractly, since logistic growth arises from a model that assumes interactions between individuals). This seems problematic for their model, which is essentially motivated by the fact that early during logistic growth, there are basically no interactions, and later there are, which increases variance in reproduction. But their simulations assume no interactions throughout! 

      The authors also attempt to show that changing variance in reproductive success occurs naturally during exponential growth using a yeast experiment. However, the authors are not counting the offspring of individual yeast during growth (which I'm sure is quite hard). Instead, they use an equation that estimates the variance in offspring number based on the observed population size, as shown in the section "Estimation of V(K) and E(K) in yeast cells". This is fairly clever, however, I am not sure it is right, because the authors neglect covariance in offspring between individuals. My attempt at this derivation assumes that I_t | I_{t-1} = \sum_{i=1}^{I_{t-1}} K_{i,t-1} where K_{i,t-1} is the number of offspring of individual i at time t-1. Then, for example, E(V(I_t | I_{t-1})) = E(V(\sum_{i=1}^{I_{t-1}} K_{i,t-1})) = E(I_{t-1})V(K_{t-1}) + E(I_{t-1}(I_{t-1}-1))*Cov(K_{i,t-1},K_{j,t-1}). The authors have the first term, but not the second, and I'm not sure the second can be neglected (in fact, I believe it's the second term that's actually important, as early on during growth there is very little covariance because resources aren't constrained, but at carrying capacity, an individual having offspring means that another individual has to have fewer offspring - this is the whole notion of exchangeability, also neglected in this manuscript). As such, I don't believe that their analysis of the empirical data supports their claim. 
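
      The role of the covariance term in the decomposition above can be checked with a small exact calculation. The sketch below is illustrative only and is not part of the review or the manuscript. It assumes Wright-Fisher sampling with n parents and n offspring (a fixed total, as at carrying capacity), enumerates every possible assignment of offspring to parents, and compares n*V(K) alone with n*V(K) + n(n-1)*Cov(K_i, K_j). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def wf_offspring_moments(n):
    """Exact Var(K_1) and Cov(K_1, K_2) when each of n offspring picks one of n parents uniformly."""
    outcomes = list(product(range(n), repeat=n))  # parent chosen by each offspring
    weight = Fraction(1, len(outcomes))
    e1 = e2 = e11 = e12 = Fraction(0)
    for parents in outcomes:
        k1 = parents.count(0)  # offspring number of parent 0
        k2 = parents.count(1)  # offspring number of parent 1
        e1 += weight * k1
        e2 += weight * k2
        e11 += weight * k1 * k1
        e12 += weight * k1 * k2
    return e11 - e1 * e1, e12 - e1 * e2


def variance_of_total(n, var_k, cov_k):
    """Var(sum of K_i) for n exchangeable individuals."""
    return n * var_k + n * (n - 1) * cov_k


n = 4
var_k, cov_k = wf_offspring_moments(n)
without_cov = n * var_k
with_cov = variance_of_total(n, var_k, cov_k)
print(f"V(K) = {var_k}, Cov = {cov_k}")
print(f"n*V(K) alone = {without_cov}, with covariance term = {with_cov}")
```

      With a fixed total, the negative covariance exactly cancels the first term, so an estimate of V(K) that uses only the first term would be biased in this regime. For independent reproduction the covariance is zero and the first term alone suffices.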

      Thus, while I think there are some interesting ideas in this manuscript, I believe it has some fundamental issues:

      first, it fails to engage thoroughly with the literature on a very important topic that has been studied extensively. Second, I do not believe their simulations are appropriate to show what they want to show. And finally, I don't think their empirical analysis shows what they want to show. 

      References: 

      Möhle M. Robustness results for the coalescent. Journal of Applied Probability. 1998;35(2):438-447. doi:10.1239/jap/1032192859 

      Sagitov S. Convergence to the coalescent with simultaneous multiple mergers. Journal of Applied Probability. 2003;40(4):839-854. doi:10.1239/jap/1067436085 

      Der, Ricky, Charles L. Epstein, and Joshua B. Plotkin. "Generalized population models and the nature of genetic drift." Theoretical population biology 80.2 (2011): 80-99 

      Sano, Akinori, Akinobu Shimizu, and Masaru Iizuka. "Coalescent process with fluctuating population size and its effective size." Theoretical population biology 65.1 (2004): 39-48 

      Sjodin, P., et al. "On the meaning and existence of an effective population size." Genetics 169.2 (2005): 1061-1070 

      Reviewer #2 (Public Review): 

      Summary: 

      This theoretical paper examines genetic drift in scenarios deviating from the standard Wright-Fisher model. The authors discuss Haldane's branching process model, highlighting that the variance in reproductive success equates to genetic drift. By integrating the Wright-Fisher model with the Haldane model, the authors derive theoretical results that resolve paradoxes related to effective population size. 

      Strengths: 

      The most significant and compelling result from this paper is perhaps that the probability of fixing a new beneficial mutation is 2s/V(K). This is an intriguing and potentially generalizable discovery that could be applied to many different study systems. 

      The authors also made a lot of effort to connect theory with various real-world examples, such as genetic diversity in sex chromosomes and reproductive variance across different species. 

      Weaknesses: 

      One way to define effective population size is by the inverse of the coalescent rate. This is where the geometric mean of Ne comes from. If Ne is defined this way, many of the paradoxes mentioned seem to resolve naturally. If we take this approach, one could easily show that a large N population can still have a low coalescent rate depending on the reproduction model. However, the authors did not discuss Ne in light of the coalescent theory. This is surprising given that Eldon and Wakeley's 2006 paper is cited in the introduction, and the multiple mergers coalescent was introduced to explain the discrepancy between census size and effective population size, superspreaders, and reproduction variance - that said, there is no explicit discussion or introduction of the multiple mergers coalescent. 

      The Wright-Fisher model is often treated as a special case of the Cannings 1974 model, which incorporates the variance in reproductive success. This model should be discussed. It is unclear to me whether the results here have to be explained by the newly introduced WFH model, or could have been explained by the existing Cannings model. 

      The abstract makes it difficult to discern the main focus of the paper. It spends most of the space introducing "paradoxes". 

      The standard Wright-Fisher model makes several assumptions, including hermaphroditism, non-overlapping generations, random mating, and no selection. It will be more helpful to clarify which assumptions are being violated in each tested scenario, as V(K) is often not the only assumption being violated. For example, the logistic growth model assumes no cell death at the exponential growth phase, so it also violates the assumption about non-overlapping generations. 

      The theory and data regarding sex chromosomes do not align. The fact that \hat{alpha'} can be negative does not make sense. The authors claim that a negative \hat{alpha'} is equivalent to infinity, but why is that? It is also unclear how theta is defined. It seems to me that one should take the first principle approach e.g., define theta as pairwise genetic diversity, and start with deriving the expected pair-wise coalescence time under the MMC model, rather than starting with assuming theta = 4Neu. Overall, the theory in this section is not well supported by the data, and the explanation is insufficient. 

      {Alpha and alpha' can both be negative.  X^2 = 0.47 would yield x = -0.7}

      Reviewer #3 (Public Review): 

      Summary: 

      Ruan and colleagues consider a branching process model (in their terminology the "Haldane model") and the most basic Wright-Fisher model. They convincingly show that offspring distributions are usually non-Poissonian (as opposed to what's assumed in the Wright-Fisher model), and can depend on short-term ecological dynamics (e.g., variance in offspring number may be smaller during exponential growth). The authors discuss branching processes and the Wright-Fisher model in the context of 3 "paradoxes": (1) how Ne depends on N might depend on population dynamics; (2) how Ne is different on the X chromosome, the Y chromosome, and the autosomes, and these differences do match the expectations based on simple counts of the number of chromosomes in the populations; (3) how genetic drift interacts with selection. The authors provide some theoretical explanations for the role of variance in the offspring distribution in each of these three paradoxes. They also perform some experiments to directly measure the variance in offspring number, as well as perform some analyses of published data. 

      Strengths: 

      (1) The theoretical results are well-described and easy to follow. 

      (2) The analyses of different variances in offspring number (both experimentally and analyzing public data) are convincing that non-Poissonian offspring distributions are the norm. 

      (3) The point that this variance can change as the population size (or population dynamics) change is also very interesting and important to keep in mind. 

      (4) I enjoyed the Density-Dependent Haldane model. It was a nice example of the decoupling of census size and effective size. 

      Weaknesses: 

      (1) I am not convinced that these types of effects cannot just be absorbed into some time-varying Ne and still be well-modeled by the Wright-Fisher process. 

      (2) Along these lines, there is well-established literature showing that a broad class of processes (a large subset of Cannings' Exchangeable Models) converge to the Wright-Fisher diffusion, even those with non-Poissonian offspring distributions (e.g., Mohle and Sagitov 2001). E.g., equation (4) in Mohle and Sagitov 2001 shows that in such cases the "coalescent Ne" should be (N-1) / Var(K), essentially matching equation (3) in the present paper. 

      (3) Beyond this, I would imagine that branching processes with heavy-tailed offspring distributions could result in deviations that are not well captured by the authors' WFH model. In this case, the processes are known to converge (backward-in-time) to Lambda or Xi coalescents (e.g., Eldon and Wakeley 2006 or again in Mohle and Sagitov 2001 and subsequent papers), which have well-defined forward-in-time processes. 

      (4) These results that Ne in the Wright-Fisher process might not be related to N in any straightforward (or even one-to-one) way are well-known (e.g., Neher and Hallatschek 2012; Spence, Kamm, and Song 2016; Matuszewski, Hildebrandt, Achaz, and Jensen 2018; Rice, Novembre, and Desai 2018; the work of Lounès Chikhi on how Ne can be affected by population structure; etc...) 

      (5) I was also missing some discussion of the relationship between the branching process and the Wright-Fisher model (or more generally Cannings' Exchangeable Models) when conditioning on the total population size. In particular, if the offspring distribution is Poisson, then conditioned on the total population size, the branching process is identical to the Wright-Fisher model. 

      (6) In the discussion, it is claimed that the last glacial maximum could have caused the bottleneck observed in human populations currently residing outside of Africa. Compelling evidence has been amassed that this bottleneck is due to serial founder events associated with the out-of-Africa migration (see e.g., Henn, Cavalli-Sforza, and Feldman 2012 for an older review - subsequent work has only strengthened this view). For me, a more compelling example of changes in carrying capacity would be the advent of agriculture ~11kya and other more recent technological advances. 
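
      Points (2) and (5) above lend themselves to a short numerical check. The following sketch is illustrative and not part of the review. It takes the expression (N-1)/Var(K) as quoted in point (2), and for point (5) it compares the joint probability of independent Poisson offspring counts, conditioned on their total, with the multinomial probability that defines Wright-Fisher sampling. Function names and example values are assumptions.

```python
import math
from fractions import Fraction


def coalescent_ne(n, var_k):
    """Coalescent effective size (N - 1) / Var(K), as quoted in point (2)."""
    return Fraction(n - 1) / var_k


def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)


def conditioned_poisson_prob(counts, lam=1.0):
    """P(K_1..K_n = counts | sum K_i) for independent Poisson(lam) offspring numbers."""
    joint = math.prod(poisson_pmf(k, lam) for k in counts)
    total = poisson_pmf(sum(counts), lam * len(counts))  # a sum of Poissons is Poisson
    return joint / total


def multinomial_prob(counts):
    """Wright-Fisher probability: each offspring picks one of the parents uniformly."""
    coef = math.factorial(sum(counts))
    for k in counts:
        coef //= math.factorial(k)
    return coef / len(counts) ** sum(counts)


# Point (2): binomial offspring numbers under Wright-Fisher have Var(K) = 1 - 1/N.
n = 1000
ne_wf = coalescent_ne(n, Fraction(n - 1, n))
ne_high_variance = coalescent_ne(n, Fraction(5))
print(f"Ne (WF) = {ne_wf}, Ne (Var(K) = 5) = {float(ne_high_variance):.1f}")

# Point (5): four parents leaving 2, 0, 1 and 1 offspring (total fixed at 4).
counts = (2, 0, 1, 1)
p_branching = conditioned_poisson_prob(counts, lam=1.3)
p_wf = multinomial_prob(counts)
print(f"conditioned Poisson = {p_branching:.6f}, multinomial = {p_wf:.6f}")
```

      The Poisson rate cancels in the conditional probability, so the equality holds for any value of lam; it does not hold for non-Poisson offspring distributions.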

      Recommendations for the authors: 

      Reviewing Editor Comments: 

      The reviewers recognize the value of this model and some of the findings, particularly results from the density-dependent Haldane model. However, they expressed considerable concerns with the model and overall framing of this manuscript.

      First, all reviewers pointed out that the manuscript does not sufficiently engage with the extensive literature on various models of effective population size and genetic drift, notably lacking discussion on Cannings models and related works.

      Second, there is a disproportionate discussion on the paradoxes, yet some of the paradoxes might already be resolved within current theoretical frameworks. All three reviewers found the modeling and simulation of the yeast growth experiment hard to follow or lacking justification for certain choices. The analysis approach of sex chromosomes is also questioned. 

      The reviewers recommend a more thorough review of relevant prior literature to better contextualize their findings. The authors need to clarify and/or modify their derivations and simulations of the yeast growth experiment to address the identified caveats and ensure robustness. Additionally, the empirical analysis of the sex chromosome should be revisited, considering alternative scenarios rather than relying solely on the MSE, which only provides a superficial solution. Furthermore, the manuscript's overall framing should be adjusted to emphasize the conclusions drawn from the WFH model, rather than focusing on the "unresolved paradoxes", as some of these may be more readily explained by existing frameworks. Please see the reviewers' overall assessment and specific comments. 

      Reviewer #2 (Recommendations For The Authors): 

      In the introduction -- "Genetic drift is simply V(K)" -- this is a very strong statement. You can say it is inversely proportional to V(K), but drift is often defined based on changes in allele frequency. 

      Page 3 line 86. "sexes is a sufficient explanation."--> "sex could be a sufficient explanation" 

      The strongest line of new results is about 2s/V(K). Perhaps, the paper could put more emphasis on this part and demonstrate the generality of this result with a different example. 

      The math notations in the supplement are not intuitive, e.g., using i_k and j_k as probabilities. I also recommend using E[X] and V[X] for expectation and variance rather than \italic{E(X)} to improve the readability of many equations. 

      Eq A6, A7, While I manage to follow, P_{10}(t) and P_{10} are not defined anywhere in the text. 

      Supplement page 7, the term "probability of fixation" is confusing in a branching model. 

      Eq. A28. It is unclear whether Eq. A1 could be used here directly. Some justification would be nice. 

      Supplement page 17. "the biological meaning of negative..". There is no clear justification for this claim. As a reader, I don't have any intuition as to why that is the case.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment:

      Franke et al. explore and characterize the color response properties in the mouse primary visual cortex, revealing specific color opponent encoding strategies across the visual field. The data is solid; however, the evidence supporting some conclusions is incomplete. In its current form, the paper makes a useful contribution to how color is coded in mouse V1. Significance would be enhanced with some additional analyses and a clearer discussion of the limitations of the data presented.

      We thank the reviewers for appreciating our manuscript. We have rewritten the conclusions of the paper to be more conservative and now more explicitly focus on color processing in mouse V1, rather than comparing V1 to the retina. Additionally, we discuss the limitations of our approach in detail in the Discussion section. Finally, we have addressed all comments from the reviewers below.

      Referee 1 (Remarks to the Author):

      In this study, Franke et al. explore and characterize color response properties across primary visual cortex, revealing specific color opponent encoding strategies across the visual field. The authors use awake 2P imaging to define the spectral response properties of visual interneurons in layer 2/3. They find that opponent responses are more pronounced at photopic light levels, and that diversity in color opponent responses exists across the visual field, with green ON/ UV OFF responses more strongly represented in the upper visual field. This is argued to be relevant for the detection of certain features that are more salient when using chromatic space, possibly due to noise reduction. In the revised version, Franke et al. have addressed the potential pitfalls in the discussion, which is an important point for the non-expert reader. Thus, this study provides a solid characterization of the color properties of V1 and is a valuable addition to visual neuroscience research.

      My remaining concerns are based more on the interpretation. I’m still not convinced by the statement "This type of color-opponency in the receptive field center of V1 neurons was not present in the receptive field center of retinal ganglion cells and, therefore, is likely computed by integrating center and surround information downstream of the retina." and I would suggest rewording it in the abstract.

      As discussed previously and now nicely added to the discussion, it is difficult to make a direct comparison given the different stimulus types used to characterize the retina and V1 recordings and the different levels of adaptation in both tissues. I will leave this point to the discussion, which allows for a more nuanced description of the phenomenon. Why do I think this is important? In the introduction, the authors argue that "the discrepancy [of previous studies] may be due to differences in stimulus design or light levels." However, while different light levels can be tested in V1, this cannot be done properly in the retina with 2P experiments. To address this, one would have to examine color-opponency in RGC terminals in vivo, which is beyond the scope of this study. Addressing these latter points directly in the discussion would, in my opinion, only strengthen the study.

      We thank the reviewer for the feedback. We removed the sentence mentioned by the reviewer from the abstract, as well as from the summary of our results in the Introduction. Additionally, we now phrase the interpretation of the retinal results more conservatively and specifically highlight in the Discussion that comparing ex-vivo retinal to in-vivo cortical data is challenging. With these changes, we believe that the focus of the paper is explicitly defined to be on the neuronal representation of color in mouse visual cortex, rather than on the comparison of retinal and cortical color processing.

      Minor:

      In the abstract, the second sentence says that we already know the mechanisms in primates.

      Unfortunately, I do not think this is true. First, primates refers to an order with several species, which might have adaptations in their color processing. Second, I’m aware of several characterizations in "primates" that have led to convincing models (as referenced), but in my opinion, this is far from a true understanding of the mechanisms, especially since very little is known about foveal color processing due to the difficulties of these experiments. Similarly in the introduction, "Primates" is indirectly defined as a species. Perhaps some rewording is needed here as well, since we know how different cone distributions can be in rodents (see Peichl’s work).

      Thanks. We have reworded the Abstract and Introduction towards indicating that many studies have been performed in primate species, without suggesting that the mechanisms are described.

      The legend in Fig. 2 has a "Fig. ???"

      Fixed.

      Referee 2 (Remarks to the Author):

      Franke et al. characterize the representation of color in the primary visual cortex of mice, highlighting how this changes across the visual field. Using calcium imaging in awake, head-fixed mice, they characterize the properties of V1 neurons (layer 2/3) using a large center-surround stimulation where green and ultra-violet colors were presented in random combinations. Clustering of responses revealed a set of functional cell-types based on their preference to different combinations of green and UV in their center and surround. These functional types were demonstrated to have different spatial distributions across V1, including one neuronal type (Green-ON/UV-OFF) that was much more prominent in the posterior V1 (i.e. upper visual field). Modelling work suggests that these neurons likely support the detection of predator-like objects in the sky.

      Strengths: The large-scale single-cell resolution imaging used in this work allows the authors to map the responses of individual neurons across large regions of the visual cortex. Combining this large dataset with clustering analysis enabled the authors to group V1 neurons into distinct functional cell types and demonstrate their relative distribution in the upper and lower visual fields. Modelling work demonstrated the different capacity of each functional type to detect objects in the sky, providing insight into the ethological relevance of color opponent neurons in V1.

      We thank the reviewer for appreciating our study.

      Weaknesses: While the study presents convincing evidence about the asymmetric distribution of color-opponent neurons in V1, the paper would greatly benefit from a more in-depth discussion of the caveats related to the conclusions drawn about their origin. This is particularly relevant regarding the conclusion drawn about the contribution of color opponent neurons in the retina. The mismatch between retinal color opponency and V1 color opponency could imply that this feature is not solely inherited from the retina, however, there are other plausible explanations that are not discussed here. Direct evidence for this statement remains weak.

      Thanks for this comment. We removed the retinal findings from the abstract, as well as from the summary of our results in the Introduction. In addition, we now phrase the interpretation of the retinal results more conservatively and specifically highlight in the Discussion that comparing ex-vivo retinal to in-vivo cortical data is challenging. With these changes, we believe that the focus of the paper is explicitly defined to be on the neuronal representation of color in mouse visual cortex, rather than on the comparison of retinal and cortical color processing.

      In addition, the paper would benefit from adding explicit neuron counts or percentages to the quadrants of each of the density plots in Figures 2-5. The variance explained by the principal components does not capture the percentage of color opponent cells. Additionally, there appear to be some remaining errors in the figure legend and labels that have not been addressed (e.g. ’??’ in Fig 2 legend).

      Thank you for this suggestion. We believe that adding the numbers or percentages to the figure panels would make them too crowded. Instead, we have now mentioned in the Results section and the legends that the percentages of variance explained by the color (off-diagonal) and luminance axis (diagonal) correlate with the number of neurons located in the color (top left and bottom right) and luminance contrast quadrants (top right and bottom left), respectively. Together with the number of neurons in each plot stated in the legends and the scale bar indicating the number of neurons per gray level, we hope this approach provides clarity for the reader to interpret the panels. Additionally, we have fixed the broken reference in the legend of Fig. 2.

      Overall, this study will be a valuable resource for researchers studying color vision, cortical processing, and the processing of ethologically relevant information. It provides a useful basis for future work on the origin of color opponency in V1 and its ethological relevance.

      General Suggestions:

      -  Please add possible caveats of using the ETA method to the discussion section. For example, it is unclear to what extent ON/OFF cells are being overlooked by using the ETA method.

      We now discuss the limitations of the ETA approach in the Discussion section.

      - The caveats of using the percentage of variance explained in the retina as evidence against V1 solely inheriting color-opponency from retinal output neurons are not adequately addressed. For example, could the mismatch in explained variance of the color axis between V1 and RGCs be explained by a subset of non-color opponent RGCs projecting elsewhere (not dLGN-V1) or that color opponent cells project to a larger number of neurons in V1 than non-color opponent cells? We suggest adding a paragraph to the discussion to address this issue.

      We have removed these conclusions from the paper, more carefully interpret the retinal results and mention that comparing ex-vivo retina data with in-vivo cortical data is challenging.

      - Please clarify how the different response types shown in Figure 5e-f lead to differences in noise detection and thereby differences in predator discriminability. For example, why does Gon/UVoff not respond to the noise scene while Goff/UVoff does?

      We added this to the Results section.

      - Please clarify the relationship between ETA amplitude, neural response probability, and neural response amplitude. For example, do color-opponent cells have equal absolute neural response amplitudes to the different colors?

      Thank you for bringing up this point. The ETA is obtained by summing the stimulus sequences that elicit an event (i.e., response), weighted by the amplitude of the response. Consequently, the absolute amplitude of the ETA correlates with the calcium amplitude. Importantly, the ETA amplitudes of different stimulus conditions are comparable because they were estimated on the same normalized calcium trace. Therefore, comparing the absolute amplitudes of ETAs of color-opponent neurons reveals the response magnitude of the cells to different colors. We have now included this information in the Results section.
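
      The computation described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name, the event list format, and the choice to take the frames immediately preceding each event are assumptions made for the example.

```python
def event_triggered_average(stimulus, events, window):
    """Amplitude-weighted sum of the stimulus segments preceding each event.

    stimulus: list of floats, one value per frame for one stimulus condition.
    events:   list of (frame_index, amplitude) pairs (hypothetical format),
              e.g. events detected on a normalized calcium trace.
    window:   number of stimulus frames kept before each event (assumption).
    """
    eta = [0.0] * window
    for frame, amplitude in events:
        if frame < window:
            continue  # not enough stimulus history before this event
        segment = stimulus[frame - window:frame]
        for i, value in enumerate(segment):
            # Each segment is weighted by the response amplitude, so larger
            # responses contribute more to the ETA.
            eta[i] += amplitude * value
    return eta


# Toy inputs, invented purely for illustration.
toy_stimulus = [1.0, -1.0, 1.0, 1.0, -1.0, 1.0]
toy_events = [(2, 2.0), (5, 1.0)]
toy_eta = event_triggered_average(toy_stimulus, toy_events, window=2)
```

      Because the segments are summed with amplitude weights and not rescaled per condition, doubling every response amplitude doubles the ETA, which is the sense in which ETA amplitude tracks response amplitude in this sketch.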

      Abstract: - "more than a third of neurons in mouse V1 are color-opponent in their receptive field center". It is unclear what data supports this statement. Can you please provide a statement in the manuscript that supports this directly using the number of neurons?

      We added the following sentence to the Results section: Nevertheless, a substantial fraction of neurons (33.1%) preferred color-opponent stimuli and scattered along the off-diagonal in the upper left and lower right quadrants, especially for the RF center.

      Figure 2: - There is a ?? in the figure legend. Which figure should this refer to? - please provide explicit neuron counts/percentages for each quadrant in b.

      We fixed the figure reference. We believe that adding the numbers or percentages to the figure panels would make them too crowded. Instead, we have now mentioned in the Results section and the legends that the percentages of variance explained by the color (off-diagonal) and luminance axis (diagonal) correlate with the number of neurons located in the color (top left and bottom right) and luminance contrast quadrants (top right and bottom left), respectively. Together with the number of neurons in each plot stated in the legends and the scale bar indicating the number of neurons per gray level, we hope this approach provides clarity for the reader to interpret the panels.

      Figure 3: - Fig 3: Color scheme makes it very difficult to differentiate the different conditions, especially when printed.

      Thanks, we changed the color scheme.

      - Add explicit neuron counts/percentages for each quadrant in b.

      See above.

      Figure 4: - Add explicit neuron counts/percentages for each quadrant in b.

      See above.

      Figure 5: - Add explicit neuron counts/percentages for each quadrant in c.

      See above.

      Methods: - "we modeled each response type to have a square RF with 10 degrees visual angle in diameter". There appears to be a mismatch between this statement and Figure 5e where 18 degrees is reported.

      Thanks, we fixed that.

      Referee 3 (Remarks to the Author):

      This paper studies chromatic coding in mouse primary visual cortex. Calcium responses of a large collection of cells are measured in response to a simple spot stimulus. These responses are used to estimate chromatic tuning properties - specifically sensitivity to UV and green stimuli presented in a large central spot or a larger still surrounding region. Cells are divided based on their responses to these stimuli into luminance or chromatic sensitive groups. The results are interesting and many aspects of the experiments and conclusions are well done; several technical concerns, however, limit the support for several main conclusions.

      Limitations of stimulus choice:

      The paper relies on responses to a large (37.5 degree diameter) modulated spot and surround region. This spot is considerably larger than the receptive fields of both V1 cells and retinal ganglion cells (it is twice the area of the average V1 receptive field). As a result, the spot itself is very likely to strongly activate both center and surround mechanisms, and responses of cells are likely to depend on where the receptive fields are located within the spot (and, e.g., how much of the true neural surround samples the center spot vs the surround region). Most importantly, the surrounds of most of the recorded cells will be strongly activated by the central spot. This brings into question statements in the paper about selective activation of center and surround (e.g. page 2, right column). This in turn raises questions about several subsequent analyses that rely on selective center and surround activation.

      Thank you for this comment. A similar point was raised by a reviewer in the first round of revision. We agree with the reviewers that it is critical to discuss both the rationale behind our stimulus design and its limitations to facilitate better interpretation by the reader.

      To be able to record from many V1 neurons simultaneously, we used a stimulus size of 37.5 degree visual angle in diameter, which is slightly larger than center RFs of single V1 neurons (between 20 - 30 degrees visual angle depending on the stimulus, see here). The disadvantage of this approach is that the stimulus is only roughly centered on the neurons’ center RFs. To reduce the impact of potential stimulus misalignment on our results, we used the following steps:

      - For each recording, we positioned the monitor such that the mean RF across all neurons lies within the center of the stimulus field of view.

      - We confirmed that this procedure results in good stimulus alignment for the large majority of recorded neurons within individual recording fields by using a sparse noise stimulus (Suppl. Fig. 1a-c). Specifically, we found that for 83% of tested neurons, more than two thirds of their center RF, determined by the sparse noise stimulus, overlapped with the center spot of the color noise stimulus.

      - For analysis, we excluded neurons without a significant center STA, which may be caused by misalignment of the stimulus.

      Together, we believe these points strongly suggest that the center spot and the surround annulus of the noise stimulus predominantly drive center (i.e. classical RF) and surround (i.e. extraclassical RF), respectively, of the recorded V1 neurons. This is further supported by the fact that color response types identified using an automated clustering method were robust across mice (Suppl. Fig. 6c), indicating consistent stimulus centering.

      Nevertheless, we cannot exclude the possibility that the stimulus was misaligned for a subset of the recorded neurons used in our analysis. We agree with the reviewer that such misalignment might have caused the center stimulus to partially activate the surround. To further address this issue beyond the controls we have already implemented, one could compare the results of our approach with an approach that centers the stimulus on individual neurons. However, we believe that performing these additional experiments is beyond the scope of the current study.

      To acknowledge the experimental limitations of our study and the concerns brought up by the reviewer, we have added the steps we perform to reduce the effects of stimulus misalignment in the Results section and discuss the problem of stimulus alignment in the Discussion in a separate section. With this, we believe our manuscript explains both the rationale behind our stimulus design as well as important limitations of the approach.

      Comparison with retina:

      A key conclusion of the paper is that the chromatic tuning in V1 is not inherited from retinal ganglion cells. This conclusion comes from comparing chromatic tuning in a previously-collected data set from retina with the present results. But the retina recordings were made using a considerably smaller spot, and hence it is not clear that the comparison made in the paper is accurate. For example, the stimulus used for the V1 experiments almost certainly strongly stimulates both center and surround of retinal ganglion cells. The text focuses on color opponency in the receptive field centers of retinal ganglion cells, but center-surround opponency seems at least as relevant for such large spots. This issue needs to be described more clearly and earlier in the paper.

      Thanks for this comment. We removed the retinal findings from the abstract, as well as from the summary of our results in the Introduction. In addition, we now phrase the interpretation of the retinal results more conservatively and specifically highlight in the Discussion that comparing ex-vivo retinal to in-vivo cortical data is challenging. With these changes, we believe that the focus of the paper is explicitly defined to be on the neuronal representation of color in mouse visual cortex, rather than on the comparison of retinal and cortical color processing.

      Limitations associated with ETA analysis:

      One of the reviewers in the previous round of reviews raised the concern that the ETA analysis may not accurately capture responses of cells with nonlinear receptive field properties such as On/Off cells. This possibility and whether it is a concern should be discussed.

      Thanks for this comment. We now discuss the limitation of using an ETA analysis in the Discussion section.

      Discrimination performance poor:

      Discriminability of color or luminance is used as a measure of population coding. The discrimination performance appears to be quite poor - with 500-1000 neurons needed to reliably distinguish light from dark or green from UV. Intuitively I would expect that a single cell would provide such discrimination. Is this intuition wrong? If not, how do we interpret the discrimination analyses?

      Thank you for raising this point. The plots in Fig. 2c (and Figs. 3-5) show discriminability in bits, with the discrimination accuracy in % highlighted by the dotted horizontal lines. For 500 neurons, the discriminability is approx. 0.8 bits, corresponding to 95% accuracy. Even for 50 neurons, the accuracy is significantly above chance level. We now mention in the legends that the dotted lines indicate decoding accuracy in %.

    1. Author response:

      The following is the authors’ response to the current reviews.

      (1) Though we cannot survey all mutants, our observation that 774 genetically diverse adaptive mutants converge at the level of phenotype is important. It adds to growing evidence (see PMID33263280, PMID37437111, PMID22282810, PMID25806684) that the genetic basis of adaptation is not as diverse as the phenotypic basis. This convergence could make evolution more predictable.

      (2) Previous fitness competitions using this specific barcode system have been run for greater than 25 generations (PMID33263280, PMID27594428, PMID37861305, PMID27594428). We measure fitness per cycle, rather than per generation, so our fitness advantages are comparable to those in the aforementioned studies, including Venkataram and Dunn et al. (PMID27594428).

      (3) Our results remain the same upon removing the ~150 lineages with the noisiest fitness inferences, including those the reviewer mentions (see Figure S7).

      (4) We agree that there are likely more than the 6 clusters that we validated with follow-up studies (see Discussion). The important point is that we see a great deal of convergence in the behavior of diverse adaptive mutants.

      (5) The growth curves requested by the reviewer were included in our original manuscript; several more were added in the revision (see Figures 5D, 5E, 7D, S11B, S11C).


      The following is the authors’ response to the original reviews.

      Public Reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      In their manuscript, Schmidlin, Apodaca, et al try to answer fundamental questions about the evolution of new phenotypes and the trade-offs associated with this process. As a model, they use yeast resistance to two drugs, fluconazole and radicicol. They use barcoded libraries of isogenic yeasts to evolve thousands of strains in 12 different environments. They then measure the fitness of evolved strains in all environments and use these measurements to examine patterns in fitness trade-offs. They identify only six major clusters corresponding to different trade-off profiles, suggesting the vast genotypic landscape of evolved mutants translates to a highly constrained phenotypic space. They sequence over a hundred evolved strains and find that mutations in the same gene can result in different phenotypic profiles.  

      Overall, the authors deploy innovative methods to scale up experimental evolution experiments, and in many aspects of their approach tried to minimize experimental variation. 

      We thank the reviewer for this positive assessment of our work. We are happy that the reviewer noted what we feel is a unique strength of our approach: we scaled up experimental evolution by using DNA barcodes and by exploring 12 related selection pressures. Despite this scaling up, we still see phenotypic convergence among the 774 adaptive mutants we study.

      Weaknesses: 

      (1) One of the objectives of the authors is to characterize the extent of phenotypic diversity in terms of resistance trade-offs between fluconazole and radicicol. To minimize noise in the measurement of relative fitness, the authors only included strains with at least 500 barcode counts across all time points in all 12 experimental conditions, resulting in a set of 774 lineages passing this threshold. This corresponds to a very small fraction of the starting set of ~21 000 lineages that were combined after experimental evolution for fitness measurements. 

      This is a misunderstanding that we clarified in this revision. Our starting set did not include 21,000 adaptive lineages. The total number of unique adaptive lineages in this starting set is much lower than 21,000 for two reasons. 

      First, ~21,000 represents the number of single colonies we isolated in total from our evolution experiments. Many of these isolates possess the same barcode, meaning they are duplicates. Second, and perhaps more importantly, most evolved lineages do not acquire adaptive mutations, meaning that many of the 21,000 isolates are genetically identical to their ancestor. In our revised manuscript, we explicitly stated that these 21,000 isolated lineages do not all represent unique, adaptive lineages. We changed the word “lineages” to “isolates” where relevant in Figure 2 and the accompanying legend. And we have added the following sentence to the figure 2 legend (line 212), “These ~21,000 isolates do not represent as many unique, adaptive lineages because many either have the same barcode or do not possess adaptive mutations.”

      More broadly speaking, several previous studies have demonstrated that diverse genetic mutations converge at the level of phenotype and have suggested that this convergence makes adaptation more predictable (PMID33263280, PMID37437111, PMID22282810, PMID25806684). Most of these studies survey fewer than 774 mutants. Further, our study captures mutants that are overlooked in previous studies, such as those that emerge across subtly different selection pressures (e.g., 4 µg/ml vs. 8 µg/ml flu) and those that are undetectable in evolutions lacking DNA barcodes. Thus, while our experimental design misses some mutants (see next comment), it captures many others. We therefore feel that “our work – showing that 774 mutants fall into a much smaller number of groups” is important because it “contributes to growing literature suggesting that the phenotypic basis of adaptation is not as diverse as the genetic basis (lines 176 - 178).”

      As the authors briefly remark, this will bias their datasets for lineages with high fitness in all 12 environments, as all these strains must be fit enough to maintain a high abundance. 

      We now devote 19 lines of text to discussing this bias (on lines 160 - 162, 278-284, and in more detail on 758 - 767).

      We walk through an example of a class of mutants that our study misses. On lines 759 - 763, we say, “our study is underpowered to detect adaptive lineages that have low fitness in any of the 12 environments. This is bound to exclude large numbers of adaptive mutants. For example, previous work has shown some FLU resistant mutants have strong tradeoffs in RAD (Cowen and Lindquist 2005). Perhaps we are unable to detect these mutants because their barcodes are at too low a frequency in RAD environments, thus they are excluded from our collection of 774.”

      In our revised version, we added more text earlier in the manuscript that explicitly discusses this bias. Lines 278 – 283 now read, “The 774 lineages we focus on are biased towards those that are reproducibly adaptive in multiple environments we study. This is because lineages that have low fitness in a particular environment are rarely observed >500 times in that environment (Figure S4). By requiring lineages to have high-coverage fitness measurements in all 12 conditions, we may be excluding adaptive mutants that have severe tradeoffs in one or more environments, consequently blinding ourselves to mutants that act via unique underlying mechanisms.”

      Note that while we “miss” some classes of mutants, we “catch” other classes that may have been missed in previous studies of convergence. For example, we observe a unique class of FLU-resistant mutants that primarily emerged in evolution experiments that lack FLU (Figure 3). Thus, we think that the unique design of our study, surveying 12 environments, allows us to make a novel contribution to the study of phenotypic convergence.

      One of the main observations of the authors is phenotypic space is constrained to a few clusters of roughly similar relative fitness patterns, giving hope that such clusters could be enumerated and considered to design antimicrobial treatment strategies. However, by excluding all lineages that fit in only one or a few environments, they conceal much of the diversity that might exist in terms of trade-offs and set up an inclusion threshold that might present only a small fraction of phenotypic space with characteristics consistent with generalist resistance mechanisms or broadly increased fitness. This has important implications regarding the general conclusions of the authors regarding the evolution of trade-offs. 

      We agree and discussed exactly the reviewer’s point about our inclusion threshold in the 19 lines of text mentioned previously (lines 160 - 162, 278-284, and 758 - 767). To add to this discussion, and avoid the misunderstanding the reviewer mentions, we added the following strongly-worded sentence to the end of the paragraph on lines 749 – 767 in our revised manuscript: “This could complicate (or even make impossible) endeavors to design antimicrobial treatment strategies that thwart resistance”. 

      More generally speaking, we set up our study around Figure 1, which depicts a treatment strategy that works best if there exists but a single type of adaptive mutant. Despite our inclusion threshold, we find there are at least 6 types of mutants. This diminishes hopes of designing simple multidrug strategies like Figure 1. Our goal is to present a tempered and nuanced discussion of whether and how to move forward with designing multidrug strategies, given our observations. On one hand, we point out how the phenotypic convergence we observe is promising. But on the other hand, we also point out how there may be less convergence than meets the eye for various reasons including the inclusion threshold the reviewer mentions (lines 749 - 767).

      We have made several minor edits to the text with the goal of providing a more balanced discussion of both sides. For example, we added the words, “may yet” to the following sentences on lines 32 – 36 of the abstract: “These findings, on one hand, demonstrate the difficulty in relying on consistent or intuitive tradeoffs when designing multidrug treatments. On the other hand, by demonstrating that hundreds of adaptive mutations can be reduced to a few groups with characteristic tradeoffs, our findings may yet empower multidrug strategies that leverage tradeoffs to combat resistance.”

      (2) Most large-scale pooled competition assays using barcodes are usually stopped after ~25 generations to avoid noise due to the emergence of secondary mutations.

      The rate at which new mutations enter a population is driven by various factors such as the mutation rate and population size, so choosing an arbitrary threshold like 25 generations is difficult. 

      We conducted our fitness competition following previous work using the Levy/Blundell yeast barcode system, in which the number of generations reported varies from 32 to 40 (PMID33263280, PMID27594428, PMID37861305, see PMID27594428 for detailed calculation of the fraction of lineages biased by secondary mutations in this system). 

      The authors measure fitness across ~40 generations, which is almost the same number of generations as in the evolution experiment. This raises the possibility of secondary mutations biasing abundance values, which would not have been detected by the whole genome sequencing as it was performed before the competition assay. 

      Previous work has demonstrated that in this evolution platform, most mutations occur during the transformation that introduces the DNA barcodes (Levy et al. 2015). In other words, these mutations are already present and do not accumulate during the 40 generations of evolution. Therefore, the observation that we collect a genetically diverse pool of adaptive mutants after 40 generations of evolution is not evidence that 40 generations is enough time for secondary mutations to bias abundance values.

      We have added the following sentence to the main text to highlight this issue (lines 247 - 249): “This happens because the barcoding process is slightly mutagenic, thus there is less need to wait for DNA replication errors to introduce mutations (Levy et al. 2015; Venkataram et al. 2016).”

      We also elaborate on this in the method section entitled, “Performing barcoded fitness competition experiments,” where we added a full paragraph to clarify this issue (lines 972 - 980).

      (3) The approach used by the authors to identify and visualize clusters of phenotypes among lineages does not seem to consider the uncertainty in the measurement of their relative fitness. As can be seen from Figure S4, the inter-replicate difference in measured fitness can often be quite large. From these graphs, it is also possible to see that some of the fitness measurements do not correlate linearly (ex.: Med Flu, Hi Rad Low Flu), meaning that taking the average of both replicates might not be the best approach.  Because the clustering approach used does not seem to take this variability into account, it becomes difficult to evaluate the strength of the clustering, especially because the UMAP projection does not include any representation of uncertainty around the position of lineages. This might paint a misleading picture where clusters appear well separate and well defined but are in fact much fuzzier, which would impact the conclusion that the phenotypic space is constricted. 

      Our noisiest fitness measurements correspond to barcodes that are the least abundant and thus suffer the most from stochastic sampling noise. These are also the barcodes that introduce the nonlinearity the reviewer mentions. We removed these from our dataset by increasing our coverage threshold from 500 reads to 5,000 reads. The clusters did not collapse, which suggests that they were not capturing this noise (Figure S7B).
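
      The coverage threshold discussed here can be illustrated with a short sketch. It assumes that the threshold applies to the total read count of a barcode summed over time points within each condition, and that a lineage must pass in every condition; the data layout, barcode names, and function names are hypothetical.

```python
def passes_coverage(counts_by_condition, min_reads):
    """True if the lineage reaches min_reads total reads in every condition.

    counts_by_condition maps a condition name to a list of read counts,
    one per sequenced time point (hypothetical layout).
    """
    return all(sum(timepoints) >= min_reads
               for timepoints in counts_by_condition.values())


def filter_lineages(lineages, min_reads=500):
    """Keep only the barcodes that pass the coverage threshold everywhere."""
    return {barcode: counts for barcode, counts in lineages.items()
            if passes_coverage(counts, min_reads)}


# Toy counts, invented for illustration (two conditions instead of 12).
toy_lineages = {
    "bc_A": {"no_drug": [300, 250], "low_flu": [400, 200]},
    "bc_B": {"no_drug": [300, 250], "low_flu": [100, 50]},
}
kept_at_500 = filter_lineages(toy_lineages, min_reads=500)
kept_at_5000 = filter_lineages(toy_lineages, min_reads=5000)
```

      In this toy case a lineage that is rare in a single condition (bc_B in low_flu) is dropped even though it is well covered elsewhere, which is the source of the bias toward broadly fit lineages discussed earlier in the response.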

      More importantly, we devoted 4 figures and 200 lines of text to demonstrating that the clusters we identified capture biologically meaningful differences between mutants (and not noise). We have modified the main text to point readers to figures 5 through 8 earlier, such that it is more apparent that the clustering analysis is just the first piece of our data demonstrating convergence at the level of phenotype.

      (4) The authors make the decision to use UMAP and a gaussian mixed model to cluster and represent the different fitness landscapes of their lineages of interest. Their approach has many caveats. First, compared to PCA, the axis does not provide any information about the actual dissimilarities between clusters. Using PCA would have allowed a better understanding of the amount of variance explained by components that separate clusters, as well as more interpretable components. 

      The components derived from PCA are often not interpretable. It’s not obvious that each one, or even the first one, will represent an intuitive phenotype, like resistance to fluconazole.  Moreover, we see many non-linearities in our data. For example, fitness in a double drug environment is not predicted by adding up fitness in the relevant single drug environments. Also, there are mutants that have high fitness when fluconazole is absent or abundant, but low fitness when mild concentrations are present. These types of nonlinearities can make the axes in PCA very difficult to interpret, plus these nonlinearities can be missed by PCA, thus we prefer other clustering methods. 

      Still, we agree that confirming our clusters are robust to different clustering methods is helpful. We have included PCA in the revised manuscript, plotting PC1 vs PC2 as Figure S9 with points colored according to the cluster assignment in figure 4 (i.e. using a gaussian mixture model). It appears the clusters are largely preserved.

      Second, the advantages of dimensional reduction are not clear. In the competition experiment, 11/12 conditions (all but the no drug, no DMSO conditions) can be mapped to only three dimensions: concentration of fluconazole, concentration of radicicol, and relative fitness. Each lineage would have its own fitness landscape as defined by the plane formed by relative fitness values in this space, which can then be examined and compared between lineages. 

      We worry that the idea stems from a priori notions of what the important dimensions should be. The biology of our system is unfortunately not intuitive. For example, it seems like this idea would miss important nonlinearities such as our observation that low fluconazole behaves more like a novel selection pressure than a dialed-down version of high fluconazole.

      Third, the choice of 7 clusters as the cutoff for the multiple Gaussian model is not well explained. Based on Figure S6A, BIC starts leveling off at 6 clusters, not 7, and going to 8 clusters would provide the same reduction as going from 6 to 7. This choice also appears arbitrary in Figure S6B, where BIC levels off at 9 clusters when only highly abundant lineages are considered. 

      We agree. We did not rely on the results of BIC alone to make final decisions about how many clusters to include. We also considered follow-up genotyping and phenotyping studies that confirm biologically meaningful differences between the mutants in each cluster (Figures 5 – 8). We now state this explicitly. Here is the modified paragraph where we describe how we chose a model with 7 clusters, from lines 436 – 446 of the revised manuscript:

      “Beyond the obvious divide between the top and bottom clusters of mutants on the UMAP, we used a gaussian mixture model (GMM) (Fraley and Raftery, 2003) to identify clusters. A common problem in this type of analysis is the risk of dividing the data into clusters based on variation that represents measurement noise rather than reproducible differences between mutants (Mirkin, 2011; Zhao et al., 2008). One way we avoided this was by using a GMM quality control metric (BIC score) to establish how splitting out additional clusters affected model performance (Figure S6). Another factor we considered were follow-up genotyping and phenotyping studies that demonstrate biologically meaningful differences between mutants in different clusters (Figures 5 – 8). Using this information, we identified seven clusters of distinct mutants, including one pertaining to the control strains, and six others pertaining to presumed different classes of adaptive mutant (Figure 4D). It is possible that there exist additional clusters, beyond those we are able to tease apart in this study.”
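
      For readers unfamiliar with the BIC score mentioned in this paragraph, the sketch below shows how it penalizes additional clusters. It assumes a Gaussian mixture with full covariance matrices and the "lower is better" sign convention; some packages report BIC with the opposite sign. The function names are illustrative and are not taken from the software used in the study.

```python
import math


def gmm_free_parameters(n_components, n_dims):
    """Number of free parameters in a full-covariance Gaussian mixture."""
    means = n_components * n_dims
    covariances = n_components * n_dims * (n_dims + 1) // 2
    weights = n_components - 1  # mixture weights sum to one
    return means + covariances + weights


def bic(log_likelihood, n_params, n_samples):
    """BIC = k * ln(n) - 2 * ln(L); lower values indicate a better trade-off."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def bic_penalty_for_extra_cluster(n_components, n_dims, n_samples):
    """How much the log-likelihood term must improve to justify one more cluster."""
    extra = (gmm_free_parameters(n_components + 1, n_dims)
             - gmm_free_parameters(n_components, n_dims))
    return extra * math.log(n_samples)
```

      Under these assumptions, each added cluster must improve the fit by more than a fixed penalty that grows with the number of dimensions and with the logarithm of the number of lineages, which is why a BIC curve tends to level off once further splits mostly capture noise.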

      This directly contradicts the statement in the main text that clusters are robust to noise, as a more stringent inclusion threshold appears to increase and not decrease the optimal number of clusters. Additional criteria to BIC could have been used to help choose the optimal number of clusters or even if mixed Gaussian modeling is appropriate for this dataset.

      We are under the following impression: If our clustering method was overfitting, i.e. capturing noise, the optimal number of clusters should decrease when we eliminate noise. It increased. In other words, the observation that our clusters did not collapse (i.e. merge) when we removed noise suggests these clusters were not capturing noise.

      Most importantly, our validation experiments, described below, provide additional evidence that our clusters capture meaningful differences between mutants (and not noise).  

      (5) Large-scale barcode sequencing assays can often be noisy and are generally validated using growth curves or competition assays. 

      Some types of bar-seq methods, in particular those that look at fold change across two time points, are noisier than others that look at how frequency changes across multiple timepoints (PMID30391162). Here, we use the less noisy method. We also reduce noise by using a stricter coverage threshold than previous work (e.g., PMID33263280), and by excluding batch effects by performing all experiments simultaneously, since we found this to be effective in our previous work (PMID37237236). 
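
      The difference between the two kinds of estimate can be sketched with a simple log-linear fit. This is a deliberately simplified stand-in, not the fitness inference procedure used in the cited studies: it assumes fitness is the slope of the log frequency ratio between a lineage and a reference, measured per growth cycle, and all names and numbers are illustrative.

```python
import math


def log_ratio_slope(lineage_freqs, reference_freqs, times):
    """Least-squares slope of ln(lineage / reference) over all time points."""
    ys = [math.log(f / r) for f, r in zip(lineage_freqs, reference_freqs)]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    return sxy / sxx


def two_point_estimate(lineage_freqs, reference_freqs, times):
    """Same quantity estimated from the first and last time points only."""
    first = math.log(lineage_freqs[0] / reference_freqs[0])
    last = math.log(lineage_freqs[-1] / reference_freqs[-1])
    return (last - first) / (times[-1] - times[0])


# Toy trajectory, invented for illustration: a lineage that gains 0.5 in
# log frequency ratio per cycle relative to a constant reference.
times = [0, 1, 2, 3]
reference = [0.1, 0.1, 0.1, 0.1]
lineage = [0.01 * math.exp(0.5 * t) for t in times]
slope_all_points = log_ratio_slope(lineage, reference, times)
slope_two_points = two_point_estimate(lineage, reference, times)
```

      On noise-free data the two estimates agree. With sampling noise, an error in the first or last time point passes directly into the two-point estimate, whereas the regression spreads it over all sequenced time points.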

      Perhaps also relevant is that the main assay we use to measure fitness has been previously validated (PMID27594428) and no subsequent study using this assay validates using the methods suggested above (see PMID37861305, PMID33263280, PMID31611676, PMID29429618, PMID37192196, PMID34465770, PMID33493203). Similarly, bar-seq has been used, without the suggested validation, to demonstrate that the way some mutants’ fitness changes across environments is different from other mutants (PMID33263280, PMID37861305, PMID31611676, PMID33493203, PMID34596043). This is the same thing that we use bar-seq to demonstrate.

      For all of the reasons above, we are hesitant to run additional experiments whose only purpose is to confirm bar-seq itself as a valid way to infer fitness; this already seems to be accepted as a standard in our field. However, please see below.

      Having these types of results would help support the accuracy of the main assay in the manuscript and thus better support the claims of the authors. 

      While we don’t agree that fitness measurements obtained from this bar-seq assay generally require validation, we do agree that it is important to validate whether the mutants in each of our 6 clusters indeed are different from one another in meaningful ways.

      Our manuscript has 4 figures (5 - 8) and over 200 lines of text dedicated to validating whether our clusters capture reproducible and biologically meaningful differences between mutants. In the revised manuscript, we added additional validation experiments, such that three figures (Figures 5, 7 and S11) now involve growth curves, as the reviewer requested. 

      Below, we walk through the different types of validation experiments that are present in our manuscript, including those that were added in this revision.

      (1) Mutants from different clusters have different growth curves: In our original manuscript, we measured growth curves corresponding to a fitness tradeoff that we thought was surprising. Mutants in clusters 4 and 5 both have fitness advantages in single drug conditions. While mutants from cluster 4 also are advantageous in the relevant double drug conditions, mutants from cluster 5 are not! We validated these different behaviors by studying growth curves for a mutant from each cluster (Figures 7 and S11), finding that mutants from different clusters have different growth curves. In the revised manuscript, we added growth curves for 6 additional mutants (3 from cluster 1 and 3 from cluster 3), demonstrating that only the cluster 1 mutants have a tradeoff in high concentrations of fluconazole (see Figure 5D & 5E). In sum, this work demonstrates that mutants from different clusters have predictable differences in their growth phenotypes.

      (2) Mutants from different clusters have different evolutionary origins: In our original manuscript, we came up with a novel way to ask whether the clusters capture different types of adaptive mutants. We asked whether the mutants in each cluster originate from different evolution experiments. They often do (see pie charts in Figures 5, 6, 7, 8). In the revised manuscript, we extended this analysis to include mutants from cluster 1. Cluster 1 is defined by high fitness in low fluconazole that declines with increasing fluconazole. In our revised manuscript, we show that cluster 1 lineages were overwhelmingly sampled from evolutions conducted in our lowest concentration of fluconazole (see pie chart in new Figure 5A). No other cluster’s evolutionary history shows this pattern (compare to pie charts in figures 6, 7, and 8).

      These pie charts also provide independent confirmation supporting the fitness tradeoffs observed for each cluster in figure 4E. For example, mutants in cluster 5 appear to have a tradeoff in a particular double drug condition (HRLF), and the pie charts confirm that they rarely originate from that evolution condition. This differs from cluster 4 mutants, which do not have a fitness tradeoff in HRLF, and are more likely to originate from that environment (see purple pie slice in figure 7). Additional cases where results of evolution experiments (pie charts) confirm observed fitness tradeoffs are discussed in the manuscript on lines 320 – 326, 594 – 598, 681 – 685.

      (3) Mutants from each cluster often fall into different genes: We sequenced many of these mutants and show that mutants in the same gene are often found in the same cluster. For example, all 3 IRA1 mutants are in cluster 6 (Fig 8), both GPB2 mutants are in cluster 4 (Figs 7 & 8), and 35/36 PDR mutants are in either cluster 2 or 3 (Figs 5 & 6). 

      (4) Mutants from each cluster have behaviors previously observed in the literature: We compared our sequencing results to the literature and found congruence. For example, PDR mutants are known to provide a fitness benefit in fluconazole and are found in clusters that have high fitness in fluconazole (lines 485 - 491). Previous work suggests that some mutations to PDR have different tradeoffs than others, which corresponds to our finding that PDR mutants fall into two separate clusters (lines 610 - 612). IRA1 mutants were previously observed to have high fitness in our “no drug” condition and are found in the cluster that has the highest fitness in the “no drug” condition (lines 691 - 696). Previous work even confirms the unusual fitness tradeoff we observe where IRA1 and other cluster 6 mutants have low fitness only in low concentrations of fluconazole (lines 702 - 704).

      (5) Mutants largely remain in their clusters when we use alternate clustering methods:  In our original manuscript, we performed various different re-clustering and/or normalization approaches on our data (Fig 6, S5, S7, S8, S10). The clusters of mutants that we observe in figure 4 do not change substantially when we re-cluster the data. In our revised manuscript, we added another clustering method: principal component analysis (PCA) (Fig S9).  Again, we found that our clusters are largely preserved.
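
      One way to put a number on "largely preserved" is to compare two sets of cluster labels pair by pair. The sketch below computes the Rand index, a standard agreement score. It is an illustrative assumption, not the comparison used in the manuscript, and the labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of mutant pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings place the two mutants
    together, or both place them apart. Cluster names need not match.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same mutants")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two mutants")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical labels for six mutants under two clustering methods.
original = [1, 1, 2, 2, 3, 3]
reclustered = ["a", "a", "b", "b", "b", "c"]
score = rand_index(original, reclustered)
```

      A score of 1.0 means the two methods group every pair of mutants identically. Lower scores mean that more pairs are grouped together by one method but split by the other.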

      While these experiments demonstrate meaningful differences between the mutants in each cluster, important questions remain. For example, a long-standing question in biology centers on the extent to which every mutation has unique phenotypic effects versus the extent to which scientists can predict the effects of some mutations from other similar mutations. Additional studies on the clusters of mutants discovered here will be useful in deepening our understanding of this topic and more generally of the degree of pleiotropy in the genotype-phenotype map.

      Reviewer #2 (Public Review): 

      Summary: 

      Schmidlin & Apodaca et al. aim to distinguish mutants that resist drugs via different mechanisms by examining fitness tradeoffs across hundreds of fluconazole-resistant yeast strains. They barcoded a collection of fluconazole-resistant isolates and evolved them in different environments with a view to having relevance for evolutionary theory, medicine, and genotype-phenotype mapping. 

      Strengths: 

      There are multiple strengths to this paper, the first of which is pointing out how much work has gone into it; the quality of the experiments (the thought process, the data, the figures) is excellent. Here, the authors seek to induce mutations in multiple environments, which is a really large-scale task. I particularly like the attention paid to isolates which are resistant to low concentrations of FLU. So often these are overlooked in favour of those conferring MIC values >64/128 etc. What was seen is different genotype and fitness profiles. I think there's a wealth of information here that will actually be of interest to more than just the fields mentioned (evolutionary medicine/theory). 

      We are grateful for this positive review. This was indeed a lot of work! We are happy that the reviewer noted what we feel is a unique strength of our manuscript: that we survey adaptive isolates across multiple environments, including low drug concentrations.  

      Weaknesses: 

      Not picking up low fitness lineages - which the authors discuss and provide a rationale as to why. I can completely see how this has occurred during this research, and whilst it is a shame I do not think this takes away from the findings of this paper. Maybe in the next one! 

      We thank the reviewer for these words of encouragement and will work towards catching more low fitness lineages in our next project.

      In the abstract the authors focus on 'tradeoffs' yet in the discussion they say the purpose of the study is to see how many different mechanisms of FLU resistance may exist (lines 679-680), followed up by "We distinguish mutants that likely act via different mechanisms by identifying those with different fitness tradeoffs across 12 environments". Whilst I do see their point, and this is entirely feasible, I would like a bit more explanation around this (perhaps in the intro) to help lay-readers make this jump. The remainder of my comments on 'weaknesses' are relatively fixable, I think: 

      We have expanded the introduction, in particular lines 129 – 157 of the revised manuscript, to walk readers through the connection between fitness tradeoffs and molecular mechanisms. For example, here is one relevant section of new text from lines 131 - 136: “The intuition here is as follows. If two groups of drug resistant mutants have different fitness tradeoffs, it could mean that they provide resistance through different underlying mechanisms. Alternatively, both could provide drug resistance via the same mechanism, but some mutations might also affect fitness via additional mechanisms (i.e. they might have unique “side-effects” at the molecular level) resulting in unique fitness tradeoffs in some environments.”

      In the introduction I struggle to see how this body of research fits in with the current literature, as the literature cited is a hodge-podge of bacterial and fungal evolution studies, which are very different! For example, the authors state "previous work suggests that mutants with different fitness tradeoffs may affect fitness through different molecular mechanisms" (lines 129-131) and then cite three papers, only one of which is a fungal research output. However, the next sentence focuses solely on literature from fungal research. Citing bacterial work as a foundation is fine, but as you're using yeast for this I think tailoring the introduction more to what is and isn't known in fungi would be more appropriate. It would also be great to then circle back around and mention monotherapy vs combination drug therapy for fungal infections as a rationale for this study. The study seems to be focused on FLU-resistant mutants, which is the first-line drug of choice, but many (yeast) infections have acquired resistance to this and combination therapy is the norm. 

      We ourselves are broadly interested in the structure of the genotype-phenotype-fitness map (PMID33263280, PMID32804946). For example, we are interested in whether diverse mutations converge at the level of phenotype and fitness. Figure 1A depicts a scenario with a lot of convergence in that all adaptive mutations have the same fitness tradeoffs.

      The reason we cite papers from yeast, as well as bacteria and cancer, is that we believe general conclusions about the structure of the genotype-phenotype-fitness map apply broadly. For example, the sentence the reviewer highlights, “previous work suggests that mutants with different fitness tradeoffs may affect fitness through different molecular mechanisms” is a general observation about the way genotype maps to fitness. So, we cited papers from across the tree of life to support this sentence.  And in the next sentence, where we cite 3 papers focusing solely on fungal research, we cite them because they are studies about the complexity of this map. Their conclusions, in theory, should also apply broadly, beyond yeast.

      On the other hand, because we study drug resistant mutations, we hope that our dataset and observations are of use to scientists studying the evolution of resistance. We use our introduction to explain how the structure of the genotype-phenotype-fitness map might influence whether a multidrug strategy is successful (Figure 1).

      We are hesitant to rework our introduction to focus more specifically on fungal infections as this is not our primary area of expertise.

      Methods: Line 769 - which yeast? I haven't even seen mention of which species is being used in this study; different yeast employ different mechanisms of adaptation for resistance, so could greatly impact the results seen. This could help with some background context if the species is mentioned (although I assume S. cerevisiae). 

      In the revised manuscript, we have edited several lines (line 95, 186, 822) to state the organism this work was done with is Saccharomyces cerevisiae. 

      In which case, should aneuploidy be considered as a mechanism? This is mentioned briefly on line 556, but with all the sequencing data acquired this could be checked quickly? 

      We like this idea and we are working on it, but it is not straightforward. The reviewer is correct in that we can use the sequencing data that we already have. But calling aneuploidy with certainty is tough because its signal can be masked by noise. In other words, some regions of the genome may be sequenced more than others by chance.

      Given this is not straightforward, at least not for us, this analysis will likely have to wait for a subsequent paper. 
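
      To illustrate the kind of analysis under discussion, here is a minimal sketch of a coverage-based screen for whole-chromosome copy-number changes. It is not the authors' pipeline. The input format (read depth per fixed-size window), the function names, and the 0.25 tolerance are all assumptions made for illustration.

```python
from statistics import median


def chromosome_copy_ratios(depth_by_chrom):
    """Return each chromosome's median depth divided by the genome-wide median.

    depth_by_chrom maps a chromosome name to a list of read depths, one per
    fixed-size window (hypothetical input format). Medians are used so that a
    few windows sequenced unusually deeply by chance do not dominate.
    """
    all_windows = [d for depths in depth_by_chrom.values() for d in depths]
    genome_median = median(all_windows)
    return {
        chrom: median(depths) / genome_median
        for chrom, depths in depth_by_chrom.items()
    }


def flag_candidate_aneuploidies(depth_by_chrom, tolerance=0.25):
    """List chromosomes whose copy ratio differs from 1.0 by more than tolerance.

    The tolerance is an illustrative threshold, not a calibrated one.
    """
    ratios = chromosome_copy_ratios(depth_by_chrom)
    return sorted(c for c, r in ratios.items() if abs(r - 1.0) > tolerance)


# Toy example: chrII carries an extra copy, chrI has one noisy window.
toy_depths = {
    "chrI": [30, 31, 29, 90, 30],
    "chrII": [60, 61, 59, 62, 60],
    "chrIII": [30, 29, 31, 30, 30],
}
candidates = flag_candidate_aneuploidies(toy_depths)
```

      In this toy input the single deep window on chrI does not trigger a call, because the per-chromosome median absorbs it. Real data would also need handling for systematic coverage biases and for strains in which much of the genome is duplicated, which is part of why confident calls are hard.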

      I think the authors could be bolder and try and link this to other (pathogenic) yeasts. What are the implications of this work on say, Candida infections? 

      Perhaps because our background lies in general study of the genotype-phenotype map, we are hesitant about making bold assertions about how our work might apply to pathogenic yeasts. We are hopeful that our work will serve as a stepping-stone such that scientists from that community can perhaps make (and test) such statements.   

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      I found the ideas and the questions asked in this manuscript to be interesting and ambitious. The setup of the evolution and fitness competition experiments was well poised to answer them, but the analysis of the data is not currently enough to properly support the claims made. I would suggest revising the analysis to address the weaknesses raised in the public review and if possible, adding some more experimental validations. As you already have genome sequencing data showing the causal mutation for many mutants across the different clusters, it should be possible for you to reconstruct some of the strains and test validate their phenotypes and cluster identity. 

      Yes, this is possible. We added more validation experiments (see figure 5). We already had quite a few validation experiments (figures 5 - 8 and lines 479 - 718), but we did not clearly highlight the significance of these analyses in our original manuscript. Therefore, we modified the text in our revised manuscript in various places to do so. For example, we now make clearer that we jointly use BIC scores as well as validation experiments to decide how many clusters to describe (lines 436 - 446). We also make clearer that our clustering analysis is only the first step towards identifying groups of mutants with similar tradeoffs by using words and phrases like, “we start by” (line 411) and “preliminarily” (line 448) when discussing the clustering analysis.  We also point readers to all the figures describing our validation experiments earlier (line 443), and list these experiments out in the discussion (lines 738 - 741).

      Also, please deposit your genome sequencing data in a public database (I am not sure I saw it mentioned anywhere). 

      We have updated line 1088 of the methods section to include this sentence: “Whole genome sequences were deposited in GenBank under SRA reference PRJNA1023288.”

      Reviewer #2 (Recommendations For The Authors):

      I don't think the figures or experiments can be improved upon, they are excellent. There are a few times I feel things are written in a rather confusing way and could be explained better, but also I feel there are places the authors jump from one thing to another really quickly and the reader (who might not be an expert in this area) will struggle to keep up. For example: 

      Explaining what RAD is - it is introduced in the methods, but what it is, is not really explained. 

      Since the introduction is already very long, we chose not to explain radicicol’s mechanism of action here. Instead, we bring this up later on lines 614 – 621 when it becomes relevant.

      More generally, in response to this advice and that from reviewer 1, we also added text to various places in the manuscript to help explain our work more clearly. In particular, we clarified the significance of our validation experiments and various important methodological details (see above). We also better explained the connection between fitness tradeoffs and mechanisms (see above) and added more details about the potential use cases of our approach (lines 142 – 150).

      The abstract states "some of the groupings we find are surprising. For example, we find some mutants that resist single drugs do not resist their combination, and some mutants to the same gene have different tradeoffs than others". Firstly, this sentence is a bit confusing to read but if I've read it as intended, then is it really surprising? It's difficult for organisms (bacteria and fungi) to develop multiple beneficial mutations conferring drug resistance on the same background, hence why combination antifungal drug therapy is often used to treat infections. 

      This is a place where brevity got in the way of clarity. We added a bit of text to make clear why we were surprised. Specifically, we were surprised because not all mutants behave the same. Some resist single drugs AND their combination. Some resist single drugs but not their combination. The sentence in the abstract now reads, “For example, we find some mutants that resist single drugs do not resist their combination, while others do. And some mutants to the same gene have different tradeoffs than others.”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Responses to recommendations for the authors: 

      Reviewer #1 (Recommendations For The Authors):

      The manuscript would be strengthened with the following key revisions mostly having to do with image quality: 

      (1) It is very difficult in Figure 4B to see which nuclei actually have evidence of mitochondrial transcripts. It might be helpful to provide arrows to specific cells and also to provide some estimate of the percentage of cells with nuclear mt-transcripts as measured by ISH compared to the 3-6% of cortex cell estimate seen in the snRNAseq analysis. 

      As suggested, we have added arrows to help readers see the signals in nuclei. Because ISH and single-nucleus RNA-seq have different detection thresholds, an ISH-based estimate of the percentage of PT-Mito cells would not be reliable.

      (2) The phospho-PKR images provided as evidence of C16 activity (Supplemental Figure 1) are too dim to be very useful. Could brighter images be provided? 

      We have now adjusted the LUTs of images in Supplemental Figure 1.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      This study is convincing because they performed time-resolved X-ray crystallography under different pH conditions using active/inactive metal ions and PpoI mutants, as with the activity measurements in solution in conventional enzymatic studies. Although the reaction mechanism is simple and may be a little predictable, the strength of this study is that they were able to validate that PpoI catalyzes DNA hydrolysis through "a single divalent cation" because time-resolved X-ray study often observes transient metal ions which are important for catalysis but are not predictable in previous studies with static structures such as enzyme-substrate analog-metal ion complexes. The discussion of this study is well supported by their data. This study visualized the catalytic process and mutational effects on catalysis, providing new insight into the catalytic mechanism of I-PpoI through a single divalent cation. The authors found that His98, a candidate of proton acceptor in the previous experiments, also affects the Mg2+ binding for catalysis without the direct interaction between His98 and the Mg2+ ion, suggesting that "Without a proper proton acceptor, the metal ion may be prone for dissociation without the reaction proceeding, and thus stable Mg2+ binding was not observed in crystallo without His98". In future, this interesting feature observed in I-PpoI should be investigated by biochemical, structural, and computational analyses using other metal-ion dependent nucleases. 

      We appreciate the reviewer for the positive assessment as well as all the comments and suggestions.

      Reviewer #2 (Public Review): 

      Summary: 

      Most polymerases and nucleases use two or three divalent metal ions in their catalytic functions. The family of His-Me nucleases, however, use only one divalent metal ion, along with a conserved histidine, to catalyze DNA hydrolysis. The mechanism has been studied previously but, according to the authors, it remained unclear. Using time-resolved X-ray crystallography, this work convincingly demonstrated that only one M2+ ion is involved in the catalysis of the His-Me I-PpoI nuclease, and proposed concerted functions of the metal and the histidine. 

      Strengths: 

      This work performs mechanistic studies, including the number and roles of metal ion, pH dependence, and activation mechanism, all by structural analyses, coupled with some kinetics and mutagenesis. Overall, it is a highly rigorous work. This approach was first developed in Science (2016) for a DNA polymerase, in which Yang Cao was the first author. It has subsequently been applied to just 5 to 10 enzymes by different labs, mainly to clarify two versus three metal ion mechanisms. The present study is the first one to demonstrate a single metal ion mechanism by this approach. 

      Furthermore, on the basis of the quantitative correlation between the fraction of metal ion binding and the formation of product, as well as the pH dependence, and the data from site-specific mutants, the authors concluded that the functions of Mg2+ and His are a concerted process. A detailed mechanism is proposed in Figure 6. 

      Even though there are no major surprises in the results and conclusions, the time-resolved structural approach and the overall quality of the results represent a significant step forward for the His-Me family of nucleases. In addition, since the mechanism is unique among different classes of nucleases and polymerases, the work should be of interest to readers in DNA enzymology, or even mechanistic enzymology in general. 

      Thank you very much for your comments and suggestions.

      Weaknesses: 

      Two relatively minor issues are raised here for consideration: 

      p. 4, last para, lines 1-2: "we next visualized the entire reaction process by soaking I-PpoI crystals in buffer....". This is a little over-stated. The structures being observed are not reaction intermediates. They are mixtures of substrates and products in the enzyme-bound state. The progress of the reaction is limited by the progress of the soaking of the metal ion. Crystallography has just been used as a tool to monitor the reaction (and provide structural information about the product). It would be more accurate to say that "we next monitored the reaction progress by soaking....". 

      We appreciate the clarification regarding the description of our experimental approach. We agree that our structures do not represent reaction intermediates but rather mixtures of substrate and product states within the enzyme-bound environment. We have revised the text accordingly to more accurately reflect our methodology.

      p. 5, the beginning of the section. The authors on one hand emphasized the quantitative correlation between Mg ion density and the product density. On the other hand, they raised the uncertainty in the quantitation of Mg2+ density versus Na+ density, thus they repeated the study with Mn2+ which has distinct anomalous signals. This is a very good approach. However, there is still no metal ion density shown in the key Figure 2A. It will be clearer to show the progress of metal ion density in a figure (in addition to just plots), whether it is Mg or Mn. 

      Thank you for your insightful comments. We recognize the importance of visualizing metal ion density alongside product density data. To address this, we included Figure S4, which presents the Mg2+/Mn2+ and product densities together.

      Reviewer #1 (Recommendations For The Authors): 

      (1) Figure 6. I understand that pre-reaction state (left panel) and Metal-binding state (two middle panels) are in equilibrium. But can we state that the Metal-binding state (two middle panels) and the product state (right panel) are in equilibrium and connected by two arrows? 

      Thank you for your comments. We agree that the DNA hydrolysis reaction process may not be reversible within the I-PpoI active site. To clarify, we removed the backward arrows between the metal-binding state and product state. We also thank the reviewer for naming the middle state and agree that it should be labeled. We added the metal-binding state label in the revised Figure 6 and also added “on the other hand, optimal alignment of a deprotonated water and Mg2+ within the active site, labeled as metal-binding state, leads to irreversible bond breakage (Fig. 6a)” within the text.

      (2) The section on DNA hydrolysis assay (Materials and Methods) is not well described. In this section, the authors should summarize the methods for the experiments in Figure 4 AC, Figure 5BC, Figure S3C, Figure S4EF, and Figure S6AB. The authors presented some graphs for the reactions. For clarity, the author should state in the legends which experiments the results are from (in crystallo or in solution). Please check and modify them. 

      Thank you for the suggestion. We have added four paragraphs to detail the experimental procedures for experiments in these figures. In addition, we have checked all of the figure legends and labeled them as “in crystallo or in solution.” To clarify, we also added “in crystallo” or “solution” in the corresponding panels.

      (3) The authors showed the anomalous signals of Mn2+ and Tl+. The authors should mention which wavelength of X-rays was used in the data collections to calculate the anomalous signals. 

      Thank you for the suggestion. We have included the wavelength of the X-ray in the figure legends that include anomalous maps, which were all determined at an X-ray wavelength of 0.9765 Å.

      (4) The full names of "His-Me" and "HNH" are necessary for a wide range of readers. 

      Thank you for the suggestion. We have included the full nomenclature for His-Me (histidine-metal) nucleases and HNH (histidine-asparagine-histidine) nuclease.

      (5) The authors should add the side chain of Arg61 in Figure 1E because it is mentioned in the main text. 

      Thank you for the suggestion. We have added Arg61 to Figure 1E.

      (6) Figure 5D. For clarity, the electron densities should cover the Na+ ion. The same request applies to WatN in Figure S3B.

      Thank you for catching this detail. We have added the electron density for the Na+ ion in Figure 5D and WatN in Figure S3B.

      (7) At line 269 on page 8, what is "previous H98A I-PpoI structure with Mn2+"? Is the structure 1CYQ? If so, it is a complex with Mg2+. 

      Thank you for catching this detail. We have edited the text to “previous H98A I-PpoI structure with Mg2+.”

      (8) At line 294 on page 9, "and substrate alignment or rotation in MutT (66)." I think "alignment of the substrate and nucleophilic water" is preferred rather than "substrate alignment or rotation". 

      Thank you for the suggestion. We have edited the text to “alignment of the substrate and nucleophilic water.”

      (9) At line 305 on page 9, "Second, (58, 69-71) single metal ion binding is strictly correlated with product formation in all conditions, at different pH and with different mutants (Figure 3a and Supplementary Figure 4a-c) (58)". The references should be cited in the correct positions. 

      Thank you for catching this typo. We have removed the references.

      (10) At line 347 on page 10, "Grown in a buffer that contained (50 g/L glucose, 200 g/L α-lactose, 10% glycerol) for 24 hrs." Is this sentence correct? 

      Thank you for catching this detail. We have corrected the sentence.

      (11) At line 395 on page 11, "The His98Ala I-PpoI crystals of first transferred and incubated in a pre-reaction buffer containing 0.1M MES (pH 6.0), 0.2 M NaCl, 1 mM MgCl2 or MnCl2, and 20% (w/v) PEG3350 for 30 min." In the experiments using this mutant, does a pre-reaction buffer contain MgCl2 or MnCl2? 

      Thank you for bringing this to our attention. We performed two sets of experiments: 1) metal ion soaking in 1 mM Mn2+, which was performed as for WT and did not include Mn2+ in the pre-reaction buffer; 2) imidazole soaking, in which 1 mM Mn2+ was included in the pre-reaction buffer. We reasoned that Mn2+ would not bind or promote reaction with His98Ala I-PpoI, but that pre-incubation might help populate Mn2+ within the lattice for better imidazole binding. However, neither Mn2+ nor imidazole was observed. We have added experimental details for both experiments with His98Ala I-PpoI.

      (12) In the figure legends of Figure 1, is the Fo-Fc omit map shown in yellow not in green? Please remove (F) in the legends. 

      We have changed the Fo-Fc map to be shown in violet. We have also removed (f) from the figure legends.

      (13) I found descriptions of "MgCl". Please modify them to "MgCl2". 

      Thank you for catching these details. We have modified all “MgCl” to “MgCl2.”

      (14) References 72 and 73 are duplicated. 

      We have removed the duplicated reference.

      Reviewer #2 (Recommendations For The Authors): 

      p. 9, first paragraph, last three lines: "Thus, we suspect that the metal ion may play a crucial role in the chemistry step to stabilize the transition state and reduce the electronegative buildup of DNA, similar to the third metal ion in DNA polymerases and RNaseH." This point is significant but the statement seems a little uncertain. You are saying that the single metal plays the role of two metals in polymerase, in both the ground state and the transition state. I believe the sentence can be stronger and more explicit. 

      Thank you for raising this point. We suspect the single metal ion in I-PpoI is different from the A-site or B-site metal ion in DNA polymerases and RNaseH, but similar to the third metal ion in DNA polymerases and nucleases. As we stated in the text,

      (1) the metal ion in I-PpoI is not required for substrate alignment. The water molecule and substrate can be observed in place even in the absence of the metal ion. In contrast, the A-site or B-site metal ion in DNA polymerases and RNaseH are required for aligning the substrates.

      (2) Moreover, the appearance of the metal ion is strictly correlated with product formation, similar to the third metal ion in DNA polymerase and RNaseH.

      To emphasize our point, we have revised the sentence as

      “Thus, similar to the third metal ion in DNA polymerases and RNaseH, the metal ion in I-PpoI is not required for substrate alignment but is essential for catalysis. We suspect that the single metal ion helps stabilize the transition state and reduce the electronegative buildup of DNA, thereby promoting DNA hydrolysis.”

      Minor typos: 

      p. 2, line 4 from bottom: due to the relatively low resolution... 

      Thank you for catching this. We have edited the text to “due to the relatively low resolution.”

      Figure 4F: What is represented by the pink color? 

      The structures are color-coded as 320 s at pH 6 (violet), 160 s at pH 7 (yellow), and 20 s at pH 8 (green). We have included the color information in the figure legend and made the labeling clearer in the panel.

      p. 9, first paragraph, last line: ...similar to the third... 

      Thank you for catching this. We have edited the text.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The study answers the important question of whether the conformational dynamics of proteins are slaved by the motion of solvent water or are intrinsic to the polypeptide. The results from neutron scattering experiments, involving isotopic labelling, carried out on a set of four structurally different proteins are convincing, showing that protein motions are not coupled to the solvent. A strength of this work is the study of a set of proteins using spectroscopy covering a range of resolutions. A minor weakness is the limited description of computational methods and analysis of data. The work is of broad interest to researchers in the fields of protein biophysics and biochemistry.

      We thank the editors and reviewers for the positive and encouraging comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Zheng et al. study the 'glass' transition that occurs in proteins at ca. 200 K using neutron diffraction and differential isotopic labeling (hydrogen/deuterium) of the protein and solvent. To overcome limitations in previous studies, this work is conducted in parallel with 4 proteins (myoglobin, cytochrome P450, lysozyme and green fluorescent protein) and experiments were performed at a range of instrument time resolutions (1 ns - 10 ps). The authors' data look compelling and suggest that transitions in the protein and solvent behavior are not coupled and that, contrary to some previous reports, the apparent water transition temperature is a 'resolution effect', i.e. limited by the instrument response. This is likely to be important in the field, as solvent 'slaving' and the role of the hydration shell in protein dynamics should be reassessed in light of these findings.

      Strengths:

      The use of multiple proteins and instruments with a range of energy resolutions/timescales.

      We thank the reviewer for highlighting our key findings.

      Weaknesses:

      The paper could be organised to better allow the comparison of the complete dataset collected. The extent of hydration clearly influences the protein transition temperature. The authors suggest that "water can be considered here as lubricant or plasticizer which facilitates the motion of the biomolecule." This may be the case, but the extent of hydration may also alter the protein structure.

      Following the reviewer’s suggestion, we studied the secondary structure content and tertiary structure of CYP protein at different hydration levels (h = 0.2 and 0.4) through molecular dynamics simulation. As shown in Table S2 and Fig. S6, the extent of hydration does not alter the protein secondary structure content and overall packing. Thus, this result also suggests that water molecules have more influence on protein dynamics than on protein structure. We added the above results in the revised SI.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript entitled "Decoupling of the Onset of Anharmonicity between a Protein and Its Surface Water around 200 K" by Zheng et al. presents a neutron scattering study trying to elucidate whether water and protein motions are coupled at the dynamical transition temperature. The origin of the dynamical transition temperature, and specifically its relation to hydration, has been highly debated for decades.

      Strengths:

      The study is rather well conducted, with a lot of effort to acquire the perdeuterated proteins, and some results are interesting.

      We thank the reviewer for highlighting our key findings.

      Weaknesses:

      The MD data presented appear to be missing a description of the methods used.

      If these data support the authors' claim that different levels of hydration do not affect the protein structure, careful analysis of the MD simulation data should be presented to show that the systems are properly equilibrated under each condition. Additionally, methods are needed to describe the MD parameters and methods used, and for how long the simulations were run.

      We have now added the methods of MD simulation into the revised SI.

      “The initial structure of protein cytochrome P450 (CYP) for simulations was taken from the PDB crystal structure (2ZAX). Two protein monomers were placed in a cubic box. 1013 and 2025 water molecules were inserted into the box randomly to reach a mass ratio of 0.2 and 0.4 gram water/1 gram protein, respectively, which mimics the experimental condition. Then 34 sodium counter ions were added to keep the system neutral in charge. The CHARMM 27 force field in the GROMACS package was used for CYP, whereas the TIP4P/Ew model was chosen for water. The simulations were carried out at a broad range of temperatures from 360 K to 100 K, with a step of 5 K. At each temperature, after a 5000-step energy-minimization procedure, a 10 ns NVT simulation was conducted. After that, a 30 ns NPT simulation was carried out at 1 atm with periodic boundary conditions. As shown in Fig. S7, 30 ns is sufficient to equilibrate the system. The temperature and pressure of the system were controlled by the velocity rescaling method and the method of Parrinello and Rahman, respectively. All bonds of water in all the simulations were constrained with the LINCS algorithm to maintain their equilibrium length. In all the simulations, the system was propagated using the leap-frog integration algorithm with a time step of 2 fs. The electrostatic interactions were calculated using the Particle Mesh Ewald (PME) method. A non-bonded pair-list cutoff of 1 nm was used and the pair-list was updated every 20 fs. All MD simulations were performed using the GROMACS 4.5.1 software package.”
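      The numbers quoted in the methods above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the authors' workflow: the function names are hypothetical, and the molar mass of water (18.015 g/mol) is a standard value supplied here as an assumption. It estimates the protein mass implied by each water count and hydration level, the number of temperatures in the 360 K to 100 K ladder, and the number of integration steps per run at a 2 fs time step.

```python
# Consistency check on the simulation parameters quoted above.
# All helper names are illustrative; only the numeric inputs come from the text.

WATER_MOLAR_MASS = 18.015  # g/mol, standard value for H2O (assumed, not from the text)


def implied_protein_mass(n_water, h):
    """Protein mass (g/mol) implied by n_water molecules at hydration h (g water / g protein)."""
    return n_water * WATER_MOLAR_MASS / h


def n_temperatures(t_high, t_low, step):
    """Number of temperatures in an inclusive ladder from t_high down to t_low."""
    return (t_high - t_low) // step + 1


def n_steps(duration_ns, dt_fs):
    """Number of integration steps for a run of duration_ns at a time step of dt_fs."""
    return round(duration_ns * 1_000_000 / dt_fs)


mass_h02 = implied_protein_mass(1013, 0.2)   # two CYP monomers at h = 0.2
mass_h04 = implied_protein_mass(2025, 0.4)   # two CYP monomers at h = 0.4
ladder = n_temperatures(360, 100, 5)         # 360 K down to 100 K in 5 K steps
nvt_steps = n_steps(10, 2)                   # 10 ns NVT at 2 fs
npt_steps = n_steps(30, 2)                   # 30 ns NPT at 2 fs
pairlist_interval = 20 // 2                  # pair-list update every 20 fs, in steps

print(f"implied protein mass: {mass_h02:.0f} vs {mass_h04:.0f} g/mol")
print(f"temperatures: {ladder}, NVT steps: {nvt_steps}, NPT steps: {npt_steps}")
print(f"pair-list update interval: {pairlist_interval} steps")
```

      Both water counts imply nearly the same total protein mass (about 91 kDa for the two monomers), so the two hydration levels are internally consistent with each other.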

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Response to author's changes:

      See public review: The MD data presented appear to be missing a description of the methods used.

      If these data support the authors' claim that different levels of hydration do not affect the protein structure, careful analysis of the MD simulation data should be presented to show that the systems are properly equilibrated under each condition. Additionally, methods are needed to describe the MD parameters and methods used, and for how long the simulations were run.

      We have now added the methods of MD simulation into the revised SI. Please see Reply 5.

      Reviewer #2 (Recommendations For The Authors):

      The authors answered my questions and substantially improved the manuscript.

      We thank the reviewer for the encouraging comments.

    1. Author response:

      We thank the reviewers for their helpful comments and criticisms of our manuscript and are pleased by the overall positive nature of the comments. For the eLife Version of Record, we plan to carry out the following experiments to address reviewer comments:

      - We will use genetic approaches (e.g., driving p35 in glia to block apoptosis) and molecular markers, such as phospho-Histone H3, to assess whether reduced glial proliferation or increased glial apoptosis contributes to reduced glial cell number.

      - We will assess the ability of glial-specific expression of the Drosophila or Human ifc/DEGS1 transgenes to rescue the ifc lethal phenotype to adulthood.

      - We will replicate key phenotypic findings with additional ifc alleles.

      - We will enhance our characterization of 3xP3 RFP transgenes with respect to glial subtypes both for the insert we used in our study and at least one independent insert.

      - We will edit the text of the manuscript to clarify additional points raised by the reviewers.

      Once we complete the above approaches, we will modify our manuscript accordingly and submit a full response to the reviews to eLife along with the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      This important study explores the potential influence of physiologically relevant mechanical forces on the extrusion of vesicles from C. elegans neurons. The authors provide compelling evidence to support the idea that uterine distension can induce vesicular extrusion from adjacent neurons. The work would be strengthened by using an additional construct (preferably single-copy) to demonstrate that the observed phenotypes are not unique to a single transgenic reporter. Overall, this work will be of interest to neuroscientists and investigators in the extracellular vesicle and proteostasis fields. 

      We now include supporting data using a single copy alternate fluorescent reporter expressed in touch neurons (Fig. 3H).

      In brief, we examined the induction of exophergenesis in an alternative single-copy transgene strain that expresses mKate fluorescent protein specifically in touch receptor neurons. As compared to the multi-copy transgene that is broadly used in this study and expresses mCherry fluorescent protein specifically in touch receptor neurons, the mKate single-copy transgene is associated with a much lower frequency of exophergenesis. However, increasing uterine distension via blocking egg-laying can increase the exophergenesis of the mKate single-copy transgenic line from 0% to approximately 60% on adult day 1, indicating that the observed response is not tied to a single reporter.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors sought to understand the stage-dependent regulation of exophergenesis, a process thought to contribute to promoting neuronal proteostasis in C. elegans. Focusing on the ALMR neuron, they show that the frequency of exopher production correlates with the timing of reproduction. Using many genetic tools, they dissect the requirements of this pathway to eventually find that occupancy of the uterus acts as a signal to induce exophergenesis. Interestingly, the physical proximity of neurons to the egg zone correlates with exophergenesis frequency. The authors conclude that communication between the uterus and proximal neurons occurs through the sensing of mechanical forces of expansion normally provided by egg occupancy to coordinate exophergenesis with reproduction. 

      Strengths: 

      The genetic data presented is thorough and solid, and the observation is novel. 

      Weaknesses: 

      The main weakness of the study is that the detection of exophers is based on the overexpression of a fluorescent protein in touch neurons, and it is not clear whether this process is actually stimulated in wild-type animals, or if neurons have accumulated damaged proteins in relatively young day 2 animals. 

      We now include data using a single copy alternate fluorescent reporter expressed in touch neurons. Although baseline exopher levels are low in this strain, we demonstrate that inducing egg retention in this background markedly increases exopher generation from a baseline of near zero to ~60% (new Fig. 3H), supporting that uterine distention, rather than reporter identity, is associated with early life exopher elevation. These data also add to our observations indicating that high protein-expressing strains generally produce higher baseline levels of exophers in early adulthood (for example, Melentijevic et al. (PMID 28178240) documented that mCherry RNAi knockdown in the strain primarily studied here can lower exopher levels).

      The second point raised here, regarding the occurrence and physiological role of early-adult exophers in “native” non-stressed neurons is a fascinating question that we are beginning to address in continuing experiments. Readers will appreciate that quantifying relatively rare, “invisible” touch receptor neuron exophergenesis accurately without expressing a fluorescent reporter is technically challenging. Our speculation, outlined now a bit more clearly in the Discussion here, is that certain molecular and organelle debris that cannot readily be degraded in cells during larval development may be stored until release to more capable degradative neighbors or to the coelomocytes for later management, as one component of the early adult transition in proteostasis (see J. Labbadia and R. I. Morimoto, PMID 24592319). Receiving cells may be primed for this at a particular timepoint, possibly analogous to the “bulky garbage” collection of over-sized difficult-to-dispose-of household items that a town will address with specialized action only at specific times. The prediction is that we should be able to detect some mass protein aggregation through early development, and at least partial elimination by adult day 3; this elimination should be impaired when eggs are eliminated. Initial testing is underway.

      Reviewer #2 (Public Review): 

      Summary: 

      This paper reports that mechanical stress from egg accumulation is a biological stimulus that drives the formation of extruded vesicles from the neurons of C. elegans ALMR touch neurons. Using powerful genetic experiments only readily available in the C. elegans system, the authors manipulate oocyte production, fertilization, embryo accumulation, and egg-laying behavior, providing convincing evidence that exopher production is driven by stretch-dependent feedback of fertilized, intact eggs in the adult uterus. Shifting the timing of egg production and egg laying alters the onset of observed exophers. Pharmacological manipulation of egg laying has the predicted effects, with animals retaining fewer eggs having fewer exophers and animals with increased egg accumulation having more. The authors show that egg production and accumulation have dramatic consequences for the viscera, and moving the ALMR process away from eggs prevents the formation of exophers. This effect is not unique to ALMR but is also observed in other touch neurons, with a clear bias toward neurons whose cell bodies are adjacent to the filled uterus. Embryos lacking an intact eggshell with reduced rigidity have impaired exopher production. Acute injection into the uterus to mimic the stretch that accompanies egg production causes a similar induction of exopher release. Together these results are consistent with a model where stretch caused by fertilized embryo accumulation, and not chemical signals from the eggs themselves or egg release, underlies ALMR exopher production seen in adult animals. 

      Strengths: 

      Overall, the experiments are very convincing, using a battery of RNAi and mutant approaches to distinguish direct from indirect effects. Indeed, these experiments provide a model generally for how one would methodically test different models for exopher production. The paper is well-written and easy to understand. I had been skeptical of the origin and purpose of exophers, concerned they were an artefact of imaging conditions, caused by deranged calcium activity under stressful conditions, or as evidence for impaired animal health overall. As this study addresses how and when they form in the animal using otherwise physiologically meaningful manipulations, the stage is now set to address at a cellular level how exophers like these are made and what their functions are. 

      Weaknesses: 

      Not many. The experiments are about as good as could be done. Some of the n's on the more difficult-to-work strains or experiments are comparatively low, but this is not a significant concern because of the number of different, complementary approaches used. The microinjection experiment in Figure 7 is very interesting, but there are missing details that would confirm whether this is a sound experiment. 

      We expanded the description of the microinjection experiment in both the figure legend and the Methods section to enhance clarity and substantiate the approach.

      Reviewer #3 (Public Review): 

      Summary: 

      In this paper, the authors use the C. elegans system to explore how already-stressed neurons respond to additional mechanical stress. Exophers are large extracellular vesicles secreted by cells, which can contain protein aggregates and organelles. These can be a way of getting rid of cellular debris, but as they are endocytosed by other cells can also pass protein, lipid, and RNA to recipient cells. The authors find that when the uterus fills with eggs or otherwise expands, a nearby neuron (ALMR) is far more likely to secrete exophers. This paper highlights the importance of the mechanical environment in the behavior of neurons and may be relevant to the response of neurons exposed to traumatic injury. 

      Strengths: 

      The paper has a logical flow and a compelling narrative supported by crisp and clear figures. 

      The evidence that egg accumulation leads to exopher production is strong. The authors use a variety of genetic and pharmacological methods to show that increasing pressure leads to more exopher production, and reducing pressure leads to lower exopher production. For example, egg-laying defective animals, which retain eggs in the uterus, produce many more exophers, and hyperactive egg-laying is accompanied by low exopher production. The authors even inject fluid into the uterus and observe the production of exophers. 

      Weaknesses: 

      The main weakness of the paper is that it does not explore the molecular mechanism by which the mechanical signals are received or responded to by the neuron, but this could easily be the subject of a follow-up study. 

      We agree that the molecular mechanisms operative are of considerable interest, and our initial pursuit suggests that a comprehensive study will be required for satisfactory elaboration of how mechanical signals are received or responded to by the neuron.

      I was intrigued by this paper, and have many questions. I list a few below, which could be addressed in this paper or which could be the subject of follow-up studies. 

      - Why do such a low percentage of ALMR neurons produce exophers (5-20%)? Does it have to do with the variability of the proteostress? 

      We do not yet understand why some ALMR neurons within the same genotype will produce exophers and some will not. We know that in addition to the uterine occupation we report here, proteostasis compromise, feeding status, oxidative stress, and osmotic stress can elevate exopher numbers (PMID 34475208); cell autonomous influences on exopher levels include aggresome-associated biology (PMID 37488107) and expression levels of the mCherry protein (PMID 28178240). Turek reports that social interaction on plates can influence muscle exopher levels (PMID 34288362). Thus, although variable proteostress experienced by neurons is likely a factor, we have not yet experimentally defined specific trigger rules. We suspect that the summation of internal proteostasis crisis and environmental conditions, including particular force vectors/frequency, will underlie the variable exopher production phenomenon.

      - Why does the production of exophers lag the peak in progeny production by 24-48 hours? Especially when the injection method produces exophers right away?

      The progeny production can track well with exopher production (Fig. 1B), although the nature of egg counts (permanent, one-time events) vs. exophers (which are slowly degraded) can skew the peak scores apart. We synchronized animals at the L4 stage. 24 hours later was adult day 1, and we measured then and every subsequent 24 hours. The daily progeny count reflects the total number of progeny produced every 24 hours; exopher events were scored once a day, but exophers can persist such that the daily exopher count can partially reflect slow degradation, with some exophers being counted on two days. We now explain our scoring details better in the Methods section.

      The rapid appearance of exophers, as early as ~10 minutes after sustained injection, is fascinating and probably holds mechanistic implications for exopher biology. For one thing, we can infer that in the mCherry Ag2 background, touch neurons can be poised to extrude exophers, but that the pressure/push acts to trigger or license final expulsion. It is interesting that we found we needed to administer sustained injection of two minutes to observe an increase in exophers (now better emphasized in the expanded Methods section). We speculate that multiple pressure events, or a sustained force vector, might be critical (like an egg slowly passing through?). Minimally, this assay may help us assign molecular roles to pathway components as we identify them moving forward. 

      - As mentioned in the discussion, it would be interesting to know if PEZO-1/PIEZO is required for uterine stretching to activate exophergenesis. pezo-1 animals accumulate crushed oocytes in the uterus. 

      We have begun to test the hypothesis that PEZO-1 is a signaling component for ALMR exophergenesis, initially using the N and C terminal pezo-1 deletion mutants as in Bai et al. (PMID 32490809). These pezo-1 mutants have a mild decrease in ALMR exophergenesis under normal conditions. However, vulva-less conditions in pezo-1N and pezo-1C increased ALMR exophergenesis from approximately 10% to 60%, similar to the response of wild-type worms to high mechanical stress, data that suggest PEZO-1 is not a required player in mediating mechanical force-induced ALMR exophergenesis. We are currently testing genetic requirements for other known mechanosensors. We intend a comprehensive investigation of the molecular mechanisms of mechanical signaling in a future study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      -The study would be significantly strengthened by the addition of data detecting regulation of exophergenesis by uterine forces in a more physiological context, in the absence of overexpression of a toxic protein. In other words, is this a process that occurs naturally during reproduction, or is it specific to proteotoxic stress induced by overexpression? Perhaps the authors could repeat key experiments using a single copy transgene, and challenge the animals with exogenous proteotoxic stress if necessary.

      We now include data using a single copy alternate fluorescent reporter expressed in touch neurons. Although baseline exopher levels are low in this strain, we demonstrate that inducing egg retention in this background markedly increases exopher generation from a baseline of near zero to ~60% (Fig. 3H), supporting that uterine distention, rather than reporter identity or over-expression alone, drives early life exopher elevation.

      Also noteworthy is that we find exophergenesis in the single-copy transgenic line is only approximately 0.3% on adult day 2 (average in three trials, data not shown), which is much lower than the 5-20% exophergenesis rate typically observed in the multi-copy high expression mCherry transgenic line. Therefore, consequences of overexpression of mCherry likely potentiate exophergenesis.

      -The authors mention that exophergenesis has been described in muscle cells. Is this also dependent on the proximity to the uterus? It would have been interesting to include data on other cell types in the vicinity of the reproductive system.

      Yes, in interesting work on exophers produced by muscle, Turek et al. reported that muscle exopher events are mostly located in a region proximal to the uterus. Moreover, this work also documented that sterile hermaphrodites are associated with approximately 0% muscle exophergenesis, and egg retention in the uterus strongly increases muscle exophergenesis (PMID: 34288362).  

      -Is exophergenesis also induced by other forms of mechanical stress? For example, swimming.

      We have looked at crude treatments such as centrifugation or vortexing without observing changes in exopher levels. Our preliminary work indicates that swimming can increase exophergenesis, and this effect depends on the presence of eggs in the uterus. We appreciate the question, and expect to include documentation of alternative pressure screening in our planned future paper on molecular mechanisms.

      -In Figure 1E, the profile of exopher production for the control condition at 25 °C is very similar to the profile observed at 20 °C in Figure 1B. However, the profile of progeny production at 25 °C is known to have an earlier peak of progeny production. Perhaps egg retention is differently correlated with progeny production at this temperature? The authors could easily test this.

      Overall, exophers (which degrade with time) and progeny counts (a fixed number) have slightly different temporal features, anchored in part by how long exophers or their “starry night” debris persist. Most exophers start to degrade within 1-6 hours (PMID: 36861960), but exopher debris can persist for more than 24 hours. An exopher event observed on day 1 may thus also be recorded at the day 2 time point, which leads to a higher frequency of exopher events on day 2 as compared to day 1.
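      The scoring effect described above can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the authors' scoring procedure: the function name, the daily event counts, and the persistence fraction are all hypothetical values chosen for illustration. It shows how once-daily scoring inflates the day 2 count when some day 1 exophers (or their debris) are still visible a day later, even if the number of new events is unchanged.

```python
# Toy model of once-daily scoring with carry-over from the previous day.
# All numbers are hypothetical and are not taken from the study.

def observed_counts(new_events_per_day, persist_fraction):
    """Return the count scored on each day when a fraction of the previous
    day's new events is still visible and is therefore counted again."""
    observed = []
    previous_new = 0
    for new in new_events_per_day:
        observed.append(new + persist_fraction * previous_new)
        previous_new = new
    return observed


new_events = [10, 10, 5]          # hypothetical new exopher events on days 1-3
counts = observed_counts(new_events, persist_fraction=0.5)
print(counts)                     # day 2 exceeds day 1 despite equal new events
```

      In this toy case the scored count peaks on day 2 although the same number of new events occurs on days 1 and 2, which is the kind of offset between progeny counts and exopher scores described in the response.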

      We have previously published on the impact of temperature on exopher number (Supplemental Figure 2 in PMID 34475208). In brief, increasing culture temperature for animals that are raised over constant lifetime temperature modestly increases exopher number; a greater increase in exophers is observed under conditions in which animals were switched to a higher temperature in adult life, suggesting changes in temperature (a mandatory part of the ts mutant studies) engage complex biology that modulates exopher production. Our previous data show that in a temperature shift to 25 °C, the peak of exophers was at adult day 1. Here, Fig. 1B is constant temperature, 20 °C; Fig. 1E has a temperature shift from 15 to 25 °C. That egg retention might be temperature-influenced is a plausible hypothesis, but given the complexities of temperature shifts for some mutants, we elected to defer drill-down on the temperature-exopher-egg relationship. 

      -It is not clear how to compare panels A and B in Figure 3. In panel A the males are present throughout the adult life of the hermaphrodites whereas in panel B the males are added in later life. Therefore, the effect of later-life mating on progeny production is not shown and the title of panel A in the legend is misleading. The authors need to perform a progeny count in the same conditions of mating presented in Figure 3B to allow direct comparison.

      As Reviewer 1 suggested, we performed a new progeny count now presented in new Fig. 3A, which more appropriately matches the study presented in Fig. 3B; legends adjusted.

      -On page 12, the authors state that the baseline of exophergenesis in rollers is 71%, but then attribute the 71% in Figure 4F to exophergenesis specifically in ALMR that is posterior to AVM. The authors need to clarify this point.

      Good catch on our error. The baseline of exophergenesis in rollers is ~40%, and we corrected the main text.

      -Considering the conclusion of Figure 2 that blocking embryonic events past the 4-cell stage does not impact exopher production, it would have been interesting to compare the uterine length for emb-8 and for mex-3, since it is quite intriguing that the former suppresses exopher production while the latter has no effect.

      We repeated the emb-8 and mex-3 RNAi for these studies and encountered variability in outcome for 2-cell stage disruption via emb-8 RNAi, which is consistent with the range of published endpoints for emb-8 RNAi. We elected to include these emb-8 findings in the legend of Figure 2G, but removed the RNAi data from the main text figure. mex-3 uterine measures are added to revised panels 5H, 6I.

      Reviewer #2 (Recommendations For The Authors): 

      -Leaving the worms in halocarbon oil for too long (e.g. 10 min) can desiccate and kill them. Did the authors take them out of the oil before analyzing exopher production? The authors refer to these as 'sustained injections' without much description beyond that. As the worms are very small, the flow rate needed for a sustained injection over 2 minutes must be very low - so low that the needle is in danger of being clogged. Do the authors have an estimate of how much fluid was injected or the overall flow rate? I realize the flow rate measured outside of the worm may not compare directly to that of a pressurized worm, but such estimates would be instructive, particularly if they can be related to the relative volume of the eggs the injection is trying to mimic.

      After injection or mock injection, we removed the animal from the oil and flipped it if necessary to observe the ALMR neuron on the NGM-agar plate. We have now expanded the description of the experimental details of injection, including the estimated flow rate, in the revised Methods section.

      - The authors describe the ALMR neurons as "proteostressed", but I am not clear on whether these neurons were treated in a unique procedure to induce such a state or if the authors are merely building on other observations that egg-laying adults are dedicating significant resources to egg production, so they must be proteostressed. If they are not inducing a proteostressed state in their experiments, the authors should refrain from describing their neurons and effects as depending on such a state.

      We revised the text to feature more explicitly the published evidence that the ALMR neurons we track with mCherryAg2 bz166 are likely proteostressed. Overexpression of mCherry in bz166 is associated with enlargement of lysosomes and formation of large mCherry foci that often correspond to LAMP::GFP-positive structures in ALMR neurons (PMID: 28178240; PMID: 37488107). Marked changes in ultrastructure reflect TN stress in this background. These cellular features are not seen in wild type animals. We previously published that over-expression of mCherry, polyQ74, polyQ128, or Aβ1-42 (each of which enhances proteostress) increases exophers (PMID: 28178240). Likewise, most genetic compromise of different proteostasis branches (heat shock chaperones, proteasome, and autophagy) promotes exophergenesis, supporting exophergenesis as a response to proteostress. In sum, the mCherryAg2 bz166 neurons appear markedly stressed relative to a non-over-expressing line and produce more exophers. RNAi knockdown of the mCherry lowers exopher levels (PMID: 28178240).

      In response to reviewer comment, we added a study with a single copy mKate reporter (new data Fig. 3H). We find a very low baseline of exophers in this background. This would support that high autonomous compromise associated with over-expression influences exopher levels. Interestingly, however, we found that ALMR neurons expressing mKate under a single-copy transgene still exhibit excessive exopher production (>60%) under high mechanical stress (Fig. 3H). These data are consistent with ideas that mechanical stresses can enhance exopher production, and may markedly lower the threshold for exophergenesis in close-to-native stress level neurons.

      - The authors should include more details on the source and use of the RNAi, for example, if the clones were from the Ahringer RNAi library, made anew for this study, or both.

      We now add this information in the methods section.

      - I would be curious if the authors would similarly see an induction in exopher production after acute vulval muscle silencing with histamine. I'm not suggesting this experiment, but it may offer a way to induce exophers in a more controlled manner.

      This is a great suggestion that we will try in future studies.

      - I am not sure if Figure 5 needs to be a main figure in the paper or if it would be more appropriate as a supplement.

      We considered this suggestion, but we think that the strikingly strong correlation of uterus length and exopher levels is a major point of the story, and these data establish a metric that we will use moving forward to distinguish whether an exopher modulation disruption is more likely to act by modulation of reproduction or modulation of touch neuron biology. For this reason we elected to keep Figure 5 in the main text.

      Reviewer #3 (Recommendations For The Authors): 

      -The Statistics section in the methods should be expanded to describe the statistics used in the experiments that aren't nominal, of which there are many.

      We have updated and expanded the statistics section.

      -P.2 Line 49 spelling 'que' should be queue (I remember this by the useless queue of letters lined up after the 'q').

      Corrected 

      -The introduction has a bit too much information about oocyte maturation, not relevant to the study.

      We agree that the information about oocyte maturation is not critical for laying out the related experiments and cut this section to improve focus.

      -p.3 line 22: Some exophers are seen on Day 3, so this should be restated for accuracy.

      Corrected

      -p.3 line 26. Explain here why sperm is necessary (oocytes don't mature or ovulate effectively without sperm).

      We added this clarifying explanation.

      -p.3 line 44 Clarify that in spe-44 the oocytes are in the oviduct (not the uterus). Might be helpful to include a DIC image to accompany the helpful diagram in Figure 1D. 

      We added a sentence describing the impact of sperm absence on oocyte maturation, progression into the uterus, and retention in the gonad, with reference to PMID: 17472754. We were able to add a DIC image to the tightly packed Figure 1.

      In Supplemental Figure 6, we now include a field picture of oocyte retention in the sem-2 mutant and upon lin-39(RNAi) treatment.

      -p.5 line 3 in the Figure 1D legend; recommend delete 'light with' which is confusing and just refer to the sperm as dark dots. 

      Corrected

      -p.6 line 22-24 Check for alignment of the statements with Figure 2 (2F is cited, but it should be 2G).

      Corrected

      -p12 line 13-15; Many ALMRs not in the egg zone (70%) did not produce exophers - this is still quite a lot. It would be good to state this section in a more straightforward way (less leading the reader) and if possible to give a possible explanation.

      We modified the text to be less leading: “Thus, although ALMR soma positioning in the egg zone does not guarantee exophergenesis in the mCherryAg2 strain, the neurons that did make exophers were nearly always in the egg zone.”

      -p.15 paragraph 3 - clarify how uterine length was controlled for the overall body length of the worm.

      We did not systematically measure body length, but rather focused on uterine distention. It would be of interest to determine whether body length correlates with uterine size, and then to address how that relationship translates to exopher production, but here our attention came to rest on the striking correlation of uterine length and number of exophers.

      -p.17 line 23-25; Could be stated more simply. 

      We adjusted the text: “Moreover, the oocyte retention was similarly efficacious in elevating exopher production to egg retention, increasing ALMR exophergenesis to approximately 80% in the sem-2(rf) mutant (Fig. 6C)”.

      -p.23 Line 4. I think by the time the reader reaches this sentence, the egg-coincident exophergenesis will not be 'puzzling'. 

      Agreed, corrected.

      -p.26, Line 22, Male 'mating', not 'matting'.

      Corrected.

      -Throughout, leave space between number and unit (this is not required for degree or percent, but be consistent). 

      Corrected.

    1. Author Response:

      We thank the reviewers for their careful reading of the manuscript and for their comments. Generally, we agree with the reviewers on the strengths and weaknesses of our manuscript. It is true that this work is a first step towards understanding the molecular mechanisms underlying TNT formation, and that further biochemical and biophysical analyses will be necessary to elucidate CD9 and CD81 roles. It also provides a toolbox for the future identification of important TNT factors, and perhaps biological markers.

      However, we would like to better explain our choice of focusing on CD9 and CD81 in TNTs, given the fact that they are also expressed in EVPs. First, both were among the most abundant integral membrane proteins in TNTs, and overexpression of CD9 was previously shown to increase TNT number. However, a recent work directed by our coauthor E. Rubinstein clearly showed that the absence of CD9, CD81 or even both has minimal impact on the production or composition of EVs in MCF7 (Fan et al, Differential proteomics argues against a general role for CD9, CD81 or CD63 in the sorting of proteins into extracellular vesicles, J. Extracell Vesicles, 2023;12:12352. https://doi.org/10.1002/jev2.12352). This is in line with another recent publication (Tognoli, Commun biol 2023) and with our results showing that the concentration of EVPs was the same when CD9 was overexpressed, i.e. in conditions where TNT number and vesicle transfer were increased. Therefore, it is highly probable that the role of CD9 and CD81 in TNT vs. EVP formation is different, even if we cannot completely exclude a crosstalk between the two pathways.

      Regarding the importance of CD9 and CD81 in TNT formation, our results are consistent with a non-exclusive regulation of TNTs by these tetraspanins, and/or with partial compensatory mechanisms, mediated by as yet unknown factors, occurring in their absence. Interestingly, to our knowledge, none of the TNT regulators described in the literature has a complete inhibitory effect when knocked out. These results confirm that several pathways can converge to regulate TNTs and are consistent with cellular plasticity. So it is hard to say whether factors like CD9 and CD81, which regulate TNTs and have other functions in cells, are “key” or simply “important”.

      Finally, the model we present in Figure 7 is a schematic working model of possible CD9/CD81 roles, which is obviously simplified for ease of understanding. It is important to note that when we write “no TNT” above an empty space between 2 cells, this describes what is drawn, and corresponds to real conditions where fewer TNTs are detected. It was never our intention to over-interpret our data, but rather to make it clearer with this diagram, and we hope that reading the article will make this clear.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      We thank the reviewer for the time and effort in reviewing our revised manuscript and are grateful for their constructive comments and for acknowledging the significance of our work.

      Summary: 

      Their findings elucidate the mechanisms underlying 2-AA-mediated reduction of pyruvate transport into mitochondria, which impairs the interaction between ERRα and PGC1α, consequently suppressing MPC1 expression and reducing ATP production in tolerized macrophages. While the data presented is intriguing and the paper is well-written, there are several points that warrant consideration. The authors should enhance the clarity, relevance, and impact of their study. 

      Strengths: 

      This paper presents a novel discovery regarding the mechanisms through which PA regulates the bioenergetics of tolerized macrophages. 

      Weaknesses: 

      The relevance of the in vivo model to support the conclusions is questionable. Further clarification is needed on this point. 

      We appreciate the reviewer’s comment. Our conclusion that 2-AA decreases bioenergetics while sustaining bacterial burden is further supported by the additional in vivo data we now present in Fig. S5. To strengthen the relevance of our in vivo data, we performed additional in vivo experiments. In this set of in vivo studies, mice received the first exposure to 2-AA by injection of 2-AA only and the second exposure through infection with PA14 or ΔmvfR four days post-2-AA injection. As shown in supplementary Figure S5, the levels of ATP and acetyl-CoA in the spleen of infected animals and the bacterial counts were similar between the PA14- and ΔmvfR-infected mice that received the first 2-AA exposure, and agree with the “one-shot infection” findings presented in Figure 5 with the PA14 or ΔmvfR+2-AA infected mice or those receiving 2-AA only. These results are consistent with our previous findings showing that 2-AA impedes the clearance of PA14 (Bandyopadhaya et al. 2012; Bandyopadhaya et al. 2016; Tzika et al. 2013) and provide compelling evidence that the metabolic alterations identified may favor PA persistence in infected tissues.

      Reviewer #2 (Public Review): 

      We thank the reviewer for the time and effort in reviewing our revised manuscript and are grateful for their constructive comments and for acknowledging the significance of our work.

      Summary: 

      The study tries to connect energy metabolism with immune tolerance during bacterial infection. The mechanism details the role of pyruvate transporter expression via ERRalpha-PGC1 axis, resulting in pro-inflammatory TNF alpha signalling responsible for acquired infection tolerance. 

      Strengths: 

      Overall, the study is an excellent addition to the role of energy metabolism during bacterial infection. The mechanism-based approach in dissecting the roles of metabolic coactivator, transcription factor, mitochondrial transporter, and pro-inflammatory cytokine during acquired tolerance towards infections indicates a detailed and well-written study. The in vivo studies in mice nicely corroborate with the cell line-based data, indicating the requirement for further studies in human infections with another bacterial model system. 

      Weaknesses:

      The authors have involved various mechanisms to justify their findings. However, they have missed out on certain aspects which connect the mechanism throughout the paper. For example, they measured ATP and acetyl COA production linked with bacterial re-exposures and added various targets like MCP1, ERR alpha, PGC1 alpha, and TNF alpha. However, they skipped PGC1 alpha levels, ATP and acetyl COA in various parts of the paper. Including the details would make the work more comprehensive. 

      We appreciate the reviewer’s comments and apologize for omitting the PGC-1α levels.  Per the reviewer’s suggestion, we have added the PGC-1α transcript levels (Figure 4C) in the section describing 2-AA-mediated dysregulation of the ERRα and MPC1 transcription (lines 243-252). Moreover, we have added Figure S5, which shows additional ATP and acetyl CoA levels in vivo. In our view, ATP and acetyl-CoA levels are shown in all appropriate settings, interrogating the bioenergetics, including in the presence of bacteria and in their absence, where only 2-AA is added. Please see Figures 1 and 5 and the newly added Figure S5.

      The use of public data sets to support their claim on immune tolerance is missing. Including various data sets of similar studies will strengthen the findings independently. 

      If we understand correctly the reviewer’s comment regarding public data sets on immune tolerance, we are referring to our own data, since there are no published data from other groups on 2-AA tolerization and because the outcome of the 2-AA effect on the bacterial burden differs from that of LPS. Therefore, a comparison with published LPS data was not considered in this study.

      Reviewer #1 (Recommendations For The Authors): 

      (1) Animal model: The authors appropriately initiated the study with an in vitro tolerization model involving 2-AA re-exposure, providing foundational insights for further investigation. However, the rationale for the one-shot injection in the in vivo model lacks clarity. To strengthen the relevance of the in vivo data, the authors should consider establishing a model involving bacterial re-exposure, such as a two-challenge paradigm with antibiotic treatment in between. This approach would allow for the examination of peritoneal macrophages harvested from mice, assessing ATP levels, acetyl CoA, TNF production, and bacterial counts. Such an approach would better align the in vivo findings with the in vitro experiments, confirming the role of tolerized macrophages in controlling PA infection in the presence of 2-AA. 

      We thank the reviewer for this comment. Indeed, we have performed a similar two-challenge paradigm study in which the first exposure to 2-AA is achieved by injecting 2-AA, and the second exposure through infection with PA14 or ΔmvfR four days post-2-AA injection. The results of Figure S5 can be directly compared with those of the in vivo studies in Fig 5. As shown in supplementary Figure S5, the levels of ATP and acetyl-CoA in the spleen of infected animals and the bacterial counts agree with the “one-shot infection” results presented in Fig 5 (PA14 or ΔmvfR+2-AA). Although the Figure S5 study was not included initially in order to simplify data presentation, it was performed in parallel with the Fig 5 study, and thus the two can be directly compared. 

      (2) Exogenous ATP treatment: It is crucial to explore whether 2-AA re-exposure suppresses inflammasome activation and whether this suppression can be reversed by exogenous ATP treatment. Specifically, the authors should investigate whether NLRP3 inflammasome activation is inhibited in tolerized macrophages and whether such activation is necessary for host defense. Clarifying these points would provide valuable insights into the mechanisms underlying macrophage tolerization induced by 2-AA. 

      Excellent point. We agree; indeed, this is planned for the near future.

      (3) Figures 4C and D: The authors should exercise care in describing these figures. For instance, line 263 states that "UK5099 had no effect on the PA14 burden in macrophages," which requires correction for accuracy. 

      We apologize and have rephrased this sentence and the other sentences referring to Fig 4D and 4E in this section. Please see the highlighted sentences in the results section referring to Fig 4. For example, “The addition of the UK5099 inhibitor strongly enhanced the bacterial intracellular burden in ΔmvfR infected macrophages compared to the non-inhibited ΔmvfR infected cells, reaching a similar burden to those infected with PA14 (Fig. 4D)”.

      (4) ERRα expression: While the study intriguingly demonstrates a decrease in ERRα levels in tolerized macrophages following exposure to 2-AA, the discussion of this finding is lacking. It is worth exploring the possibility of increasing ERRα expression to counteract the tolerization induced by 2-AA and enhance clearance of PA infection. This avenue should be thoroughly discussed in the manuscript's Discussion section, offering insights into potential therapeutic strategies to mitigate the effects of 2-AA on macrophage function. 

      Thank you so much for this additional comment.  We have now included this point in the discussion section (lines 373-376).

      Reviewer #2 (Recommendations For The Authors): 

      Overall, the study is an excellent addition to the role of energy metabolism during bacterial infection. The mechanism-based approach in dissecting the roles of metabolic coactivator, transcription factor, mitochondrial transporter, and pro-inflammatory cytokine during acquired tolerance indicates a detailed and well-written study. However, connecting the mechanisms often was not reflected in some of the experiments, and answering a few concerns/suggestions will undoubtedly improve the study's readability, appeal, and overall impact on a broader audience. 

      (1) The authors should rephrase the title if possible. The title indicates 2AA as a bacterial quorum sensing signal; however, throughout the manuscript, there are no studies associated with actual quorum sensing in bacteria. 

      Thank you for this comment. However, the title indicates 2-AA as a quorum sensing molecule because the synthesis of this signaling molecule is uniquely regulated by quorum sensing. Because of its importance in the virulence of Pseudomonas aeruginosa and its regulation by quorum sensing, we feel that it is appropriate to refer to it as such.

      (2) The authors generalised immunotolerance and memory of 2AA-exposed cells to broad-spectrum microbial exposure by just testing with LPS exposure. I would suggest they test at least 2 more heterologous microbial products known to elicit a response and confirm their claim from Figure 1. 

      We appreciate the reviewer’s comment. We do not intend to generalize immunotolerance and memory of 2-AA-exposed cells to broad-spectrum microbial exposure. Moreover, since the manuscript is not focused on comparing other bacterial molecules to 2-AA, and multiple studies have focused on LPS tolerance, we tested only LPS in the manuscript.

      (3) LPS triggers ATP production through glycolysis in nitric oxide (NO) dependent mechanisms in various immune and non-immune cells. The authors should study the concentrations of NO, Glucose, and Pyruvate levels to clarify the mechanism of energy dynamics and the source of ATP and Acetyl CoA generated/scavenged during primary and secondary exposures to both 2AA and LPS. 

      We agree that a cross-tolerization experiment using 2-AA and LPS would reveal interesting insights into the immune response during PA infections. However, this is outside the scope of this article. Please note that the mechanisms of 2-AA and LPS tolerization are distinct: they rely on different HDAC enzymes, and LPS tolerization predominantly involves changes in H3K27 acetylation (Lauterbach et al. 2019), whereas 2-AA tolerization involves H3K18 modifications (Bandyopadhaya, Tsurumi, and Rahme 2017). For this reason, the complexity of such interactions would require a comprehensive set of experiments that are not part of the focus of this study.

      (4) Immunogenic triggers often rapidly alter mitochondrial membrane potential, which alters oxygen consumption rates. However, the authors tend to generalize energy homeostasis and claim the deregulation of OXPHOS-inducing quiescent phenotype depending upon OCR measurements from Figure 1D. The authors must evaluate mitochondrial health and membrane potential during first and second exposure in a time-dependent manner to strengthen their theory of mitochondrial dysfunction. The authors should also check the phenomena in vivo (mice exposed to infection) if possible. 

      Thank you for this suggestion. We now include electron microscopy images of mitochondria isolated from macrophages exposed to 2-AA. The results reveal that 2-AA alters mitochondrial morphology and cristae, supporting the mitochondrial dysfunction caused by 2-AA. These results are shown in Figure S4 and described in lines 185-188.

      (5) Since both MCP1 and MCP2 transporters are known to transport pyruvate to mitochondria, checking both MCP1 and 2 at transcript and protein levels in exposed cells will be essential. I suggest authors use MCP inhibitors or use RNA interference against MCPs to check the effect on tolerance of the cells exposed for a second time. 

      To our understanding, the mitochondrial pyruvate carrier proteins MPC1 and MPC2 form a hetero-oligomeric complex in the inner mitochondrial membrane to facilitate pyruvate import into mitochondria (McCommis and Finck 2015). We also used UK5099, an MPC inhibitor, when enumerating the bacterial load in macrophages in Figure 4, and observed an effect similar to that of 2-AA, suggesting a similar mechanism of action.

      (6) The pyruvate levels of mitochondria in Figure 2A are shallow, and the authors claim statistical significance within a 1.5-fold change. The authors should cross-check the number of mitochondria they are isolating while estimating pyruvate from only mitochondrial fractions. Another point is, correlating mitochondrial pyruvate with the burst of ATP during first exposure in comparison to second exposure, one can argue that the number of mitochondria is variable between the exposures leading to a change in pyruvate amount (mitochondria number increases to compensate for the first exposure and decreases quickly to maintain homeostasis and remains quiescent during a second exposure due to activation of compensatory immune mechanism towards primary exposure). How do authors address the issue? 

      Our electron microscopy studies indicate that, although no reduction in mitochondrial numbers is observed in macrophages after 2-AA exposure, alterations in mitochondrial morphology and cristae are observed. Please also see our answer to point #4.

      (7) The authors claim that ERR alpha regulates MCP1 transcription via activation of ERRalpha-PGC1 alpha axis and tolerization in cells to second exposure is due to impairment of the axis (Figure 3). PGC1 alpha is known to be induced during various metabolic, physiological, and immune-challenge-related stress in a tissue-dependent manner. In this context, one should expect changes in transcript and protein levels of PGC1 alpha. The authors must study PGC1 alpha levels with time-dependent exposures. LPS was shown to induce oscillations in PGC1 alpha levels in a tissue-specific manner. In experiments, authors should verify if such oscillations persist during time-dependent exposure, emphasising mitochondrial uncoupling that might get dampened during re-exposures to microbial challenges. 

      We appreciate the suggestion. We have now included PGC-1α (Figure 4C) transcript levels, which show the same profile as the transcript levels of ERRα and MPC1. Please note that PGC-1α is only one of several ERRα co-activators; therefore, the amount of ERRα protein is the most relevant assessment regarding the activation of the MPC1 transcription.

      (8) The authors claim that ERRalpha induces MCP1 through ChIP data in Figure 3. However, the physical verifications at mRNA levels and mutational/inhibitor-based experiments are missing. The authors should study the alterations of MCP1 mRNA in relation to exposures and inhibitors of ERRalpha and PGC1 alpha to strengthen their work. 

      This is an interesting approach; however, this experiment exceeds the scope of our manuscript. We will certainly consider this suggestion in our future experiments. Thank you.

      (9) Publicly available data sets with LPS exposures should be analyzed for gene sets pertaining to mitochondrial OXPHOS, metabolism, immune response, etc. This will support the authors' work and provide a global overview of transcriptome associated with immune tolerance. 

      We appreciate the reviewer’s comment. For the reasons explained in point #3, and because the bacterial burden outcome of the 2-AA effect is different from that of LPS, a comparison with published LPS data was not considered in this study. We agree that in the future, a comprehensive comparison of whole-genome transcriptome studies between LPS and 2-AA may reveal important insights that may also help to better understand and potentially classify the immune tolerance triggered by 2-AA.

      (10) In Figure 4, the authors study the role of MCP1 and associated pyruvate-dependent bacterial clearance during tolerization and associate them with a decrease in TNF alpha. I would suggest the addition of an ERR alpha inhibitor in these experiments. It is not clear as to why (mechanism) TNF alpha transcription was affected via pyruvate transport during bacterial exposure. I would suggest that the authors clarify the mechanism of TNF alpha activation/inactivation and its association with energy metabolism during acquired tolerance. 

      This is an excellent suggestion, given that a similar effect of ERRα on TNF-α was observed by other researchers (Chaltel-Lima et al. 2023).  Here, to clarify the mechanism of TNF alpha activation/inactivation and its association with energy metabolism, we elaborate on this aspect in the discussion section.

      Lines 388-393. The text reads:

      Previously, we reported that 2-AA tolerization induces histone deacetylation via HDAC1, reducing H3K18ac at the TNF-α promoter (Bandyopadhaya et al. 2016). The findings with acetyl-CoA reduction, the primary substrate of histone acetylation, and the TNF-α transcription using UK5099 and ATP in 2-AA treated macrophages are in support of the bioenergetics disturbances observed in macrophages and their link to epigenetic modifications we have shown to be promoted by 2-AA (Bandyopadhaya et al. 2016).

      (11) It is surprising that authors specifically target TNF alpha as a pro-inflammatory cytokine during tolerance. Various reports of cytokines and immune modulatory factors play a vital role in immune tolerance upon bacterial exposure. I would suggest authors perform cytokine profiling or check public data sets to specify their reason for choosing TNF alpha. 

      The choice of TNF-α is based on the results obtained in our previous study (Bandyopadhaya et al. 2016).

      Bandyopadhaya, A., M. Kesarwani, Y. A. Que, J. He, K. Padfield, R. Tompkins, and L. G. Rahme. 2012. 'The quorum sensing volatile molecule 2-amino acetophenon modulates host immune responses in a manner that promotes life with unwanted guests', PLoS pathogens, 8: e1003024.

      Bandyopadhaya, A., A. Tsurumi, D. Maura, K. L. Jeffrey, and L. G. Rahme. 2016. 'A quorum-sensing signal promotes host tolerance training through HDAC1-mediated epigenetic reprogramming', Nat Microbiol, 1: 16174.

      Bandyopadhaya, A., A. Tsurumi, and L. G. Rahme. 2017. 'NF-kappaBp50 and HDAC1 Interaction Is Implicated in the Host Tolerance to Infection Mediated by the Bacterial Quorum Sensing Signal 2-Aminoacetophenone', Front Microbiol, 8: 1211.

      Chaltel-Lima, L., F. Domínguez, L. Domínguez-Ramírez, and P. Cortes-Hernandez. 2023. 'The Role of the Estrogen-Related Receptor Alpha (ERRa) in Hypoxia and Its Implications for Cancer Metabolism', Int J Mol Sci, 24.

      Lauterbach, M. A., J. E. Hanke, M. Serefidou, M. S. J. Mangan, C. C. Kolbe, T. Hess, M. Rothe, R. Kaiser, F. Hoss, J. Gehlen, G. Engels, M. Kreutzenbeck, S. V. Schmidt, A. Christ, A. Imhof, K. Hiller, and E. Latz. 2019. 'Toll-like Receptor Signaling Rewires Macrophage Metabolism and Promotes Histone Acetylation via ATP-Citrate Lyase', Immunity, 51: 997-1011 e7.

      McCommis, K. S., and B. N. Finck. 2015. 'Mitochondrial pyruvate transport: a historical perspective and future research directions', Biochem J, 466: 443-54.

      Tzika, A. A., C. Constantinou, A. Bandyopadhaya, N. Psychogios, S. Lee, M. Mindrinos, J. A. Martyn, R. G. Tompkins, and L. G. Rahme. 2013. 'A small volatile bacterial molecule triggers mitochondrial dysfunction in murine skeletal muscle', PloS one, 8: e74528.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      This study by Paoli et al. used a resonant scanning multiphoton microscope to examine olfactory representation in the projection neurons (PNs) of the honeybee with improved temporal resolution. PNs were classified into 9 groups based on their response patterns. The authors found that excitatory responses in the PNs precede the inhibitory responses by ~40 ms, and ~50% of PN responses contain inhibitory components. They built a neural circuit model of the mushroom body (MB) with evolutionarily conserved features such as sparse representation, global inhibition, and a plasticity rule. This MB model, fed with the experimental data, could reproduce a number of phenomena observed in experiments using bees and other insects, including dynamical representations of odor onset and offset by different populations of Kenyon cells, prolonged representations of after-smell, different levels of odor specificity for early/delay conditioning, and a shift of behavioral timing in delay conditioning. Trace conditioning was not modeled or tested experimentally. Also, the experimental result itself is largely confirmatory of preceding studies using other organisms. Nonetheless, the experimental data and the model provide a solid basis for future studies.  

      We thank the reviewer for summarizing the value of our study and recognizing its generality and significance. As suggested, in a revised version of the manuscript, we will discuss the implications of our approach for trace conditioning. The model we presented hinges on the learning-induced plasticity of KC-to-MBON synapses recruited during the learning window (i.e., the simulated US arrival). In the case of trace conditioning, the model predicts that the behavioral response time should match the expected US arrival. Contrary to this prediction, preliminary analyses of empirical measurements of PER latency upon trace conditioning indicate that this is not the case. In a revised version of the manuscript, we will discuss the differences between the predictions of the model and the experimental observations in a trace conditioning paradigm.

      Reviewer #2 (Public Review):

      The study presented by Paoli et al. explores temporal aspects of neuronal encoding of odors and their perception, using bees as a general model for insects. The neuronal encoding of the presence of an odor is not a static representation; rather, its neuronal representation is partly encoded by the temporal order in which parallel olfactory pathways participate and are combined. This aspect is not novel, and its relevance in odor encoding and recognition has been discussed for more than 20 years. 

      The temporal richness of the olfactory code and its significance have traditionally been driven by results obtained based on electrophysiological methods with temporal resolution, allowing the identification and timing of the action potentials in the different populations of neurons whose combination encodes the identity of an odor. On the other hand, optophysiological methods that enable spatial resolution and cell identification in odor coding lack the temporal resolution to appreciate the intricacies of olfactory code dynamics. 

      (1) In this context, the main merit of Paoli et al.'s work is achieving an optical recording that allows for spatial registration of olfactory codes with greater temporal detail than the classical method and, at the same time, with greater sensitivity to measure inhibitions as part of the olfactory code. 

      The work clearly demonstrates how the onset and offset of odor stimulation trigger a dynamic code at the level of the first interneurons of the olfactory system that changes at every moment as a natural consequence of the local inhibitory interactions within the first olfactory neuropil, the antennal lobe. This gives rise to the interesting theory that each combination of activated neurons along this temporal sequence corresponds to the perception of a different odor. The extent to which the corresponding postsynaptic layers integrate this temporal information to drive the perception of an odor, or whether this sequence is, in a sense, a journey through different perceptions, is challenging to address experimentally. 

      In their work, the authors propose a computational approach and olfactory learning experiments in bees to address these questions and evaluate whether the sequence of combinations drives a sequence of different perceptions. In my view, it is a highly inspiring piece of work that still leaves several questions unanswered. 

      We thank the reviewer for considering that our work has an inspiring nature. Below we have tried to answer the questions raised by the following comments, and we will include part of these answers in the revised version of our manuscript.

      (2) In my opinion, the detailed temporal profile of the response of projection neurons and their respective probabilities of occurrence provide valuable information for understanding odor coding at the level of neurons transferring information from the antennal lobes to the mushroom bodies. An analysis of these probabilities in each animal, rather than in the population of animals that were measured, would aid in better comprehending the encoding function of such temporal profiles. Being able to identify the involved glomeruli and understanding the extent to which the sequence of patterns and inhibitions is conserved for each odor across different animals, as it is well known for the initial excitatory burst of activity observed in previous studies without the fine temporal detail, would also be highly significant. 

      We thank the reviewer for recognizing the relevance of the findings in understanding the logic of olfactory coding. We agree about the importance of establishing if the different glomerular response profiles are evenly distributed across individuals or have individual biases. In the revised version of the manuscript, we will provide data on the distribution of response profiles for each animal and for different olfactory stimuli. Also, we fully agree on the importance of assessing to what extent such response profiles - largely determined by the local network of AL interneurons - are glomerulus-specific and conserved across individuals.

      In my view, the computational approach serves as a useful tool to inspire future experiments; however, it appears somewhat simplistic in tackling the complexity of the subject. One question that I believe the researchers do not address is to what extent the inhibitions recorded in the projection neurons are integrated by the Kenyon cells and are functional for generating odor-specific patterns at that level. 

      The model we proposed represents, indeed, a simplification of olfactory signal processing throughout the honey bee olfactory circuit. Still, it shows that simple but realistic rules can be sufficient to grasp some fundamental aspects of olfactory coding. However, we agree with the reviewer and believe that such a minimalistic model can provide a basis for designing future experiments in which complexity can be increased by adding relevant features, such as the learning-induced plasticity of PN-to-KC synapses or the divergence of multiple PNs from the same glomerulus to different KCs.

      Concerning the reviewer's question on the involvement of inhibitory inputs in generating odor-specific patterns at the level of the KCs, the short answer is yes: they contribute to the summed input of a target KC, and thus to the odor representation. In designing the model, we considered that a given glomerulus provides maximal input at maximal excitation and minimal input (=0 input) at maximal inhibition. For this reason, an inhibited glomerulus contributes less (to KC action potential probability) than a glomerulus showing baseline activity. This, in turn, contributes less than an excited glomerulus. From the modeling point of view, normalizing the signal between 0 and 1 (i.e., setting maximal inhibition to 0 and maximal excitation to 1) would yield a result similar to that of the current approach, where values range from -25% to +30% ΔF/F. We will amend the model's description to clarify this point.
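
      The rescaling described above can be illustrated with a minimal sketch. It assumes a linear min-max mapping of the reported response range (-25% to +30%) onto [0, 1] and a hypothetical KC that simply sums its rescaled inputs. The function names, the summation rule, and the example values are illustrative assumptions, not the implementation used in the manuscript.

```python
# Illustrative only: the range limits come from the text above;
# the function names and the summation rule are assumptions.
F_MIN = -25.0  # maximal inhibition, percent signal change
F_MAX = 30.0   # maximal excitation, percent signal change


def normalize(response: float) -> float:
    """Linearly rescale a response so that F_MIN maps to 0 and F_MAX maps to 1."""
    clipped = max(F_MIN, min(F_MAX, response))
    return (clipped - F_MIN) / (F_MAX - F_MIN)


def kc_input(responses: list[float]) -> float:
    """Hypothetical summed drive onto one Kenyon cell from its input glomeruli."""
    return sum(normalize(r) for r in responses)


# An inhibited glomerulus contributes less than one at baseline,
# which in turn contributes less than an excited one.
inhibited, baseline, excited = normalize(-25.0), normalize(0.0), normalize(30.0)
```

      Under this mapping the ordering inhibited < baseline < excited is preserved, which is why the rescaled and unscaled versions of such a model would be expected to behave similarly.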

      Lastly, the behavioral result indicating a difference in conditioned response latency after early or delayed learning protocol is interesting. However, it does not align with the expected time for the neuronal representation that was theoretically rewarded in the delayed protocol. This final result does not support the authors' interpretation regarding the existence of a smell and an after-smell as separate percepts that can serve as conditioned stimuli.

      Considering that our odor stimulus lasted 5 seconds, glomerular activity is highly variable at odor onset (i.e., within the first 1s) because of short excitatory response profiles and the delayed and slower onset of inhibitory responses. After the initial phase, the neural representation of the stimulus becomes more stable. Consequently, a neural signature learned in the case of delay conditioning, i.e., with the US appearing towards the end of the olfactory stimulation (t = 4 - 5s), may present itself much earlier (t = 1.5s), triggering a behavioral response that largely anticipates the expected US arrival time. 

      In the model, we observe an early decrease in action potential probability even in the case of delay conditioning. This occurs because the synapses recruited during the last second of olfactory stimulation (within the learning window during which CS and US overlap) become inactive. Because odorant-induced activity recruits highly overlapping synaptic populations between 1.5 and 5 s from the onset, a learning-induced inactivation of part of these synapses will result in a reduced action-potential probability in the modeled MBON. Importantly, this event will not be governed by time but by the appearance of the learned synaptic configuration. 
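
      The argument above can be sketched with a toy example. The synapse indices, time bins, and the rule that the MBON drive equals the fraction of active synapses that remain functional are all invented for illustration and are not taken from the manuscript's model.

```python
# Toy illustration; synapse indices and time bins are invented.
# Each entry maps a time bin (seconds from odor onset) to the set of
# PN-to-KC synapses assumed to be active in that bin.
active_synapses = {
    0.5: {0, 1, 2, 3},      # onset: transient excitatory responses
    1.5: {2, 3, 4, 5, 6},   # representation begins to stabilize
    3.0: {3, 4, 5, 6, 7},
    4.5: {3, 4, 5, 6, 7},   # learning window (CS and US overlap)
}


def learned_synapses(window_bins):
    """Synapses active during the CS-US overlap; assumed inactivated by learning."""
    learned = set()
    for t in window_bins:
        learned |= active_synapses[t]
    return learned


def mbon_drive(t, inactivated):
    """Fraction of the synapses active at time t that still drive the MBON."""
    active = active_synapses[t]
    return len(active - inactivated) / len(active)


inactivated = learned_synapses([4.5])
drive = {t: mbon_drive(t, inactivated) for t in sorted(active_synapses)}
```

      In this toy case the drive falls at 1.5 s, well before the learning window, because the synapses active at 1.5 s already overlap strongly with the learned set. The drop follows the appearance of the learned configuration rather than elapsed time.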

      We will add a new section to the revised version of the manuscript to clarify this concept and perform further analyses to characterize the contribution of different response types to the modeled response latency.

    1. Author response:

      The following is the response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Nitta et al. present a manuscript titled "Drosophila model to clarify the pathological significance of OPA1 in autosomal dominant optic atrophy." The novelty of this paper lies in its use of human OPA1 (hOPA1) to try to rescue the phenotype of an OPA1 +/- Drosophila DOA model (dOPA). The authors then use this model to investigate the differences between dominant-negative and haploinsufficient OPA1 variants. The value of this paper lies in the study of DN/HI variants rather than the establishment of the Drosophila model per se, as this has existed for some time and does have some significant disadvantages compared to existing models, particularly in the extra-ocular phenotype which is common with some OPA1 variants but not in humans. I judge the findings of this paper to be valuable with regards to significance and solid with regards to the strength of the evidence.

      Suggestions for improvements:

      (1) Stylistically the results section appears to have significant discussion/conclusion/inferences in section with reference to existing literature. I feel that this information would be better placed in the separate discussion section. E.g. lines 149-154.

      We appreciate the reviewer’s suggestion to relocate the discussion, conclusions, and inferences, particularly those that reference existing literature, to a separate discussion section. For lines 149–154, we placed them in the discussion section (lines 343–347) as follows. “Our established fly model is the first simple organism to allow observation of degeneration of the retinal axons. The mitochondria in the axons showed fragmentation of mitochondria. Former studies have observed mitochondrial fragmentation in S2 cells (McQuibban et al., 2006), muscle tissue (Deng et al., 2008), segmental nerves (Trevisan et al., 2018), and ommatidia (Yarosh et al., 2008) due to the LOF of dOPA1.”

      For lines 178–181, we also placed them in the discussion section (lines 347–351) as follows. “Our study presents compelling evidence that dOPA1 knockdown instigates neuronal degeneration, characterized by a sequential deterioration at the axonal terminals and extending to the cell bodies. This degenerative pattern, commencing from the distal axons and progressing proximally towards the cell soma, aligns with the paradigm of 'dying-back' neuropathy, a phenomenon extensively documented in various neurodegenerative disorders (Wang et al., 2012). ”

      For lines 213–217, 218–220, and 222–223, we also placed them in the discussion section (lines 363– 391) as follows. “To elucidate the pathophysiological implications of mutations in the OPA1 gene, we engineered and expressed several human OPA1 variants, including the 2708-2711del mutation, associated with DOA, and the I382M mutation, located in the GTPase domain and linked to DOA. We also investigated the D438V and R445H mutations in the GTPase domain and correlated with the more severe DOA plus phenotype. The 2708-2711del mutation exhibited limited detectability via HA-tag probing. Still, it was undetectable with a myc tag, likely due to a frameshift event leading to the mutation's characteristic truncated protein product, as delineated in prior studies (Zanna et al., 2008). Contrastingly, the I382M, D438V, and R445H mutations demonstrated expression levels comparable to the WT hOPA1. However, the expression of these mutants in retinal axons did not restore the dOPA1 deficiency to the same extent as the WT hOPA1, as evidenced in Figure 5E. This finding indicates a functional impairment imparted by these mutations, aligning with established understanding (Zanna et al., 2008). Notably, while the 2708-2711del and I382M mutations exhibited limited functional rescue, the D438V and R445H mutations did not show significant rescue activity. This differential rescue efficiency suggests that the former mutations, particularly the I382M, categorized as a hypomorph (Del Dotto et al., 2018), may retain partial functional capacity, indicative of a LOF effect but with residual activity. The I382M missense mutation within the GTPase domain of OPA1 has been described as a mild hypomorph or a disease modifier. Intriguingly, this mutation alone does not induce significant clinical outcomes, as evidenced by multiple studies (Schaaf et al., 2011; Bonneau et al., 2014; Bonifert et al., 2014; Carelli et al., 2015). 
      A significant reduction in protein levels has been observed in fibroblasts originating from patients harboring the I382M mutation. However, mitochondrial volume remains unaffected, and the fusion activity of mitochondria is only minimally influenced (Kane et al., 2017; Del Dotto et al., 2018). This observation is consistent with findings reported by de la Barca et al. in Human Molecular Genetics 2020, where a targeted metabolomics approach classified I382M as a mild hypomorph. In our current study, the I382M mutation preserves more OPA1 function compared to DN mutations, as depicted in Figures 5E and F. Considering the results from our Drosophila model and previous research, we hypothesize that the I382M mutation may constitute a mild hypomorphic variant. This might explain its failure to manifest a phenotype on its own, yet its contribution to increased severity when it occurs in compound heterozygosity.”

      (2) I do think further investigation is needed as to why a reduction of mitochondria was noticed in the knockdown. There are conflicting reports on this in the literature. My own experience of this is fairly uniform mitochondrial number in WT vs OPA1 variant lines but with an increased level of mitophagy presumably reflecting a greater turnover. There are a number of ways to quantify mitochondrial load e.g. mtDNA quantification, protein quantification for tom20/hsp60 or equivalent. I feel the reliance on ICC here is not enough to draw conclusions. Furthermore, mitophagy markers could be checked at the same time either at the transcript or protein level. I feel this is important as it helps validate the Drosophila model as we already have a lot of experimental data about the number and function of mitochondria in OPA1+/- human/mammalian cells.

      We thank the reviewer for the insightful comments and suggestions regarding our study on the impact of mitochondrial reduction in a knockdown model. We concur with the reviewer’s observation that our initial results did not definitively demonstrate a decrease in the number of mitochondria in retinal axons. Furthermore, we measured mitochondrial quantity by conducting western blotting using anti-COXII and found no reduction in mitochondrial content with the knockdown of dOPA1 (Figure S4A and B). Consequently, we have revised our manuscript to remove the statement “suggesting a decreased number of mitochondria in retinal axons. However, whether this decrease is due to degradation resulting from a decline in mitochondrial quality or axonal transport failure remains unclear.” Instead, we have refocused our conclusion to reflect our electron microscopy findings, which indicate reduced mitochondrial size and structural abnormalities. The reviewer’s observation of consistent mitochondrial numbers in WT versus mutant variant lines and elevated mitophagy levels prompted us to evaluate mitochondrial turnover as a significant factor in our study. Regarding verifying mitophagy markers, we incorporated the mito-QC marker in our experimental design. In our experiments, mito-QC was expressed in the retinal axons of Drosophila to assess mitophagy activity upon dOPA1 knockdown. We observed a notable increase in mCherry positive but GFP negative puncta signals one week after eclosion, indicating the activation of mitophagy (Figure 2D–H). This outcome strongly suggests that dOPA1 knockdown enhances mitophagy in our Drosophila model. The application of mito-QC as a quantitative marker for mitophagy, validated in previous studies, offers a robust approach to analyzing this process. Our findings elucidate the role of dOPA1 in mitochondrial dynamics and its implications for neuronal health.

      These results have been incorporated into Figure 2, with the corresponding text updated as follows (lines 159–167): “Given that an increase in mitophagy activity has been reported in mouse RGCs and nematode ADOA models (Zaninello et al., 2022; Zaninello et al., 2020), the mito-QC marker, an established indicator of mitophagy activity, was expressed in the photoreceptors of Drosophila. The mito-QC reporter consists of a tandem mCherry-GFP tag that localizes to the outer membrane of mitochondria (Lee et al., 2018). This construct allows the measurement of mitophagy by detecting an increase in the red-only mCherry signal when the GFP is degraded after mitochondria are transported to lysosomes. Post dOPA1 knockdown, we observed a significant elevation in mCherry positive and GFP negative puncta signals at one week, demonstrating an activation of mitophagy as a consequence of dOPA1 knockdown (Figure 2D–H).”
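
      The red-only readout described above can be sketched as a simple classification rule. The intensity thresholds, the `Punctum` data structure, and the example values below are hypothetical and serve only to illustrate counting mCherry-positive, GFP-negative puncta; they do not reproduce the image analysis used in the study.

```python
from dataclasses import dataclass


@dataclass
class Punctum:
    mcherry: float  # mean mCherry intensity, arbitrary units
    gfp: float      # mean GFP intensity, arbitrary units


def is_red_only(p: Punctum, mcherry_min: float = 100.0, gfp_max: float = 20.0) -> bool:
    """True when a punctum is mCherry positive and GFP negative (assumed thresholds)."""
    return p.mcherry >= mcherry_min and p.gfp <= gfp_max


def count_red_only(puncta: list[Punctum]) -> int:
    """Number of puncta scored as red-only, used here as a proxy for mitophagy."""
    return sum(is_red_only(p) for p in puncta)


# Invented example: one red-only punctum, one dual-positive, one below threshold.
example = [Punctum(150.0, 10.0), Punctum(150.0, 120.0), Punctum(30.0, 5.0)]
```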

      (3) Could the authors comment on the failure of the dOPA1 rescue to return their biomarker, axonal number, to control levels? In Figure 4D, is there significance between the control and rescue? Presumably so, as there is between the mutant and rescue and the difference looks smaller.

      As the reviewer correctly pointed out, there is a significant difference between the control and rescue groups, which we have now included in the figure. Additionally, we have incorporated the following comments in the discussion section (lines 329–342) regarding this significant difference: “In our study, expressing dOPA1 in the retinal axons of dOPA1 mutants resulted in significant rescue, but it did not return to control levels. There are three possible explanations for this result. The first concerns gene expression levels. The Gal4-line used for the rescue experiments may not replicate the expression levels or timing of endogenous dOPA1. Considering that the optimal functionality of dOPA1 may be contingent upon specific gene expression levels, attaining a wild-type-like state necessitates the precise regulation of these expression levels. The second is a non-autonomous issue. Although dOPA1 gene expression was induced in the retinal axons for the rescue experiments, many retinal axons were homozygous mutants, while other cell types were heterozygous for the dOPA1 mutation. If there is a non-autonomous effect of dOPA1 in cells other than retinal axons, it might not be possible to restore the wild-type-like state fully. The third potential issue is that only one isoform of dOPA1 was expressed. In mouse OPA1, to completely restore mitochondrial network shape, an appropriate balance of at least two different isoforms, l-OPA1 and s-OPA1, is required (Del Dotto et al., 2017). This requirement implies that multiple isoforms of dOPA1 are essential for the dynamic activities of mitochondria.”

      (4) The authors have chosen an interesting if complicated missense variant to study, namely the I382M with several studies showing this is insufficient to cause disease in isolation and appears in high frequency on gnomAD but appears to worsen the phenotype when it appears as a compound het. I think this is worth discussing in the context of the results, particularly with regard to the ability for this variant to partially rescue the dOPA1 model as shown in Figure 5.

      As the reviewer pointed out, the I382M mutation is known to act as a disease modifier. However, in our system, as suggested by Figure 5, I382M appears to retain more activity than DN mutations. Considering previous studies, we propose that I382M represents a mild hypomorph. Consequently, while I382M alone may not exhibit a phenotype, it could exacerbate severity in a compound heterozygous state. We have incorporated this perspective in our revised discussion (lines 375-391).

      “Notably, while the 2708-2711del and I382M mutations exhibited limited functional rescue, the D438V and R445H mutations did not show significant rescue activity. This differential rescue efficiency suggests that the former mutations, particularly the I382M, categorized as a hypomorph (Del Dotto et al., 2018), may retain partial functional capacity, indicative of a LOF effect but with residual activity. The I382M missense mutation within the GTPase domain of OPA1 has been described as a mild hypomorph or a disease modifier. Intriguingly, this mutation alone does not induce significant clinical outcomes, as evidenced by multiple studies (Schaaf et al., 2011; Bonneau et al., 2014; Bonifert et al., 2014; Carelli et al., 2015). A significant reduction in protein levels has been observed in fibroblasts originating from patients harboring the I382M mutation. However, mitochondrial volume remains unaffected, and the fusion activity of mitochondria is only minimally influenced (Kane et al., 2017; Del Dotto et al., 2018). This observation is consistent with findings reported by de la Barca et al. in Human Molecular Genetics 2020, where a targeted metabolomics approach classified I382M as a mild hypomorph. In our current study, the I382M mutation preserves more OPA1 function compared to DN mutations, as depicted in Figures 5E and F. Considering the results from our Drosophila model and previous research, we hypothesize that the I382M mutation may constitute a mild hypomorphic variant. This might explain its failure to manifest a phenotype on its own, yet its contribution to increased severity when it occurs in compound heterozygosity.”

      (5) I feel the main limitation of this paper is the reliance on axonal number as a biomarker for OPA1 function and ultimately rescue. I have concerns because a) this is not a well validated biomarker within the context of OPA1 variants b) we have little understanding of how this is affected by over/under expression and c) if it is a threshold effect e.g. once OPA1 levels reach <x% pathology develops but develops normally when opa1 expression is >x%. I think this is particularly relevant when the authors are using this model to make conclusions on dominant negativity/HI with the authors proposing that if expression of a hOPA1 transcript does not increase opa1 expression in a dOPA1 KO then this means that the variant is DN. The authors have used other biomarkers in parts of this manuscript e.g. ROS measurement and mito trafficking but I feel this would benefit from something else particularly in the latter experiments demonstrated in figure 5 and 6.

      The reviewer raised concerns regarding the adequacy of axonal count as a validated biomarker in the context of OPA1 mutants. In response, we corroborated its validity using markers such as MitoSOX, Atg8, and COXII. Experiments employing MitoSOX revealed that the augmented ROS signals resulting from dOPA1 knockdown were mitigated by expressing human OPA1. Conversely, the mutant variants 2708-2711del, D438V, and R445H did not ameliorate these effects, paralleling the phenotype of axonal degeneration observed. These findings are documented in Figure 5F, and we have incorporated the following text into section lines 248–254 of the results:

      “Furthermore, we assessed the potential for rescuing ROS signals. Similar to its effect on axonal degeneration, wild-type hOPA1 effectively mitigated the phenotype, whereas the 2708-2711del, D438V, and R445H mutants did not (Figure 5F). Importantly, the I382M variant also reduced ROS levels comparably to the wild type. These findings demonstrate that both axonal degeneration and the increase in ROS caused by dOPA1 downregulation can be effectively counteracted by hOPA1. Although I382M retains partial functionality, it acts as a relatively weak hypomorph in this experimental setup.”

      Moreover, utilizing mito-QC, we observed elevated mitophagy in our Drosophila model, with these results now included in Figure 2D–H. Given the complexity of the genetics involved and the challenges in establishing lines, autophagy activity was quantified by comparing the ratio of Atg8-1 to Atg8-2 via Western blot analysis. However, no significant alterations were detected across any of the genotypes. Additionally, mitochondrial protein levels derived from COXII confirmed consistent mitochondrial quantities, showing no considerable variance following knockdown. These insights affirm that retinal axon degeneration and mitophagy activation are present in the Drosophila DOA model, although the Western blot analysis revealed no significant changes in autophagy activation. Such findings necessitate caution as this model may not fully replicate the molecular pathology of the corresponding human disease. These Western blot findings are presented in Figure S4, with the following addition made to section lines 255–263 of the results:

      “We also conducted Western blot analyses using anti-COXII and anti-Atg8a antibodies to assess changes in mitochondrial quantity and autophagy activity following the knockdown of dOPA1. Mitochondrial protein levels, indicated by COXII quantification, were evaluated to verify mitochondrial content, and the ratio of Atg8a-1 to Atg8a-2 was used to measure autophagy activation. For these experiments, Tub-Gal4 was employed to systemically knockdown dOPA1. Considering the lethality of a whole-body dOPA1 knockdown, Tub-Gal80TS was utilized to repress the knockdown until eclosion by maintaining the flies at 20°C. After eclosion, we increased the temperature to 29°C for two weeks to induce the knockdown or expression of hOPA1 variants. The results revealed no significant differences across the genotypes tested (Figure S4A–D).”

      In assessing the effects of dominant negative mutations, measurements including ROS levels, the ratio of Atg8-1 to Atg8-2, and the quantity of COXII protein were conducted, yet no significant differences were observed (Figure S6). This limitation of the fly model is mentioned in the results, noting the observation of the axonal degeneration phenotype but not alterations in ROS signaling, autophagy activity, or mitochondrial quantity as follows (lines 287–290):

      “We investigated the impacts of dominant negative mutations on mitochondrial oxidation levels, mitochondrial quantity, and autophagy activation levels; however, none of these parameters showed statistical significance (Figure S6).”

      The reviewer also inquired about the effects of overexpressing and underexpressing OPA1 on axonal count and whether these effects are subject to a threshold. In response, we expressed both wild-type and variant forms of human OPA1 in Drosophila in vivo and assessed their protein levels using Western blot analysis. The results showed no significant differences in expression levels between the wild-type and variant forms in the OPA1 overexpression experiments, suggesting the absence of a variation threshold effect. These findings have been newly documented as quantitative data in Figure 5C. Furthermore, we have included a statement in the results section for Figure 6A, clarifying that overexpression of hOPA1 exhibited no discernible impact, as detailed on lines 274–276.

      “The results presented in Figure 5C indicate that there are no significant differences in the expression levels among the variants, suggesting that variations in expression levels do not influence the outcomes.”

      (6) Could the authors clarify which exons in Figure 5 are included in their transcript? My understanding is transcript NM_015560.3 contains exons 4 and 4b but not 5b. According to Song 2007 this transcript produces invariably s-OPA1 as it contains the exon 4b cleavage site. If this is true, this is a critical limitation in this study and in my opinion significantly undermines the likelihood of the proposed explanation of the findings presented in Figure 6. The primary functional location of OPA1 is at the IMM, and l-OPA1 is the primary, and probably the only, OPA1 isoform that localizes here, as the additional AA act as an IMM anchor. Given this is where the GTPase likely oligomerizes, the expression of s-OPA1 only is unlikely to interact in any way with native protein. I am not aware of any evidence s-OPA1 is involved in oligomerization. Therefore I don't think this method, and specifically expression of a hOPA1 transcript which only makes s-OPA1, is a reliable indicator of dominant negativity/interference with WT protein function. This could be checked by blotting UAS-hOPA1 protein with an OPA1 antibody specific to human OPA1 only and not to dOPA1. There are several available on the market and if the authors see only s-OPA1 then it confirms they are not expressing l-OPA1 with their hOPA1 construct.

      As suggested by the reviewer, we performed a Western blot using a human OPA1 antibody to determine if the expressed hOPA1 was producing the l-OPA1 isoform, as shown in band 2 of Figure 5D. The results confirmed the presence of both l-OPA1 and what appears to be s-OPA1 in bands 2 and 4, respectively. These findings are documented in the updated Figure 5D, with a detailed description provided in the manuscript at lines 224-226. Additionally, the NM_015560.3 refers to isoform 1, which includes only exons 4 and 5, excluding exons 4b and 5b. This isoform can express both l-OPA1 and s-OPA1 (refer to Figure 1 in Song et al., J Cell Biol. 2007). We have updated the schematic diagram in the figure to include these exons. The formation of s-OPA1 through cleavage occurs at the OMA1 target site located in exon 5 and the Yme1L target site in exon 5b of OPA1. Isoform 1 of OPA1 is prone to cleavage by OMA1, but a homologous gene for OMA1 does not exist in Drosophila. Although a homologous gene for Yme1L is present in Drosophila, exon 5b is missing in isoform 1 of OPA1, leaving the origin of the smaller band resembling s-OPA1 unclear at this point.

      Reviewer #2 (Public Review):

      The data presented support and extend some previously published data using Drosophila as a model to unravel the cellular and genetic basis of human autosomal dominant optic atrophy (DOA). In humans, mutations in OPA1, a mitochondrial dynamin-like GTPase (amongst others), are the most common cause of DOA. By using Drosophila loss-of-function mutations, RNAi-mediated knockdown and overexpression, the authors could recapitulate some aspects of the disease phenotype, which could be rescued by the wild-type version of the human gene. Their assays allowed them to distinguish between mutations causing human DOA, affecting the optic system and supposed to be loss-of-function mutations, and those mutations supposed to act as dominant negative, resulting in DOA plus, in which other tissues/organs are affected as well. Because of the lack of information in the Materials and Methods section and in several figure legends, it was not in all cases possible to follow the conclusions of the authors.

      We appreciate the reviewer's constructive feedback and the emphasis on enhancing clarity in our manuscript. We recognize the concerns raised about the lack of detailed information in the Materials and Methods section and several figure legends, which may have obscured our conclusions. In response, we have appended the detailed genotypes of the Drosophila strains used in each experiment to a supplementary table. Additionally, we realized that the description of 'immunohistochemistry and imaging' was too brief, previously referenced simply as “immunohistochemistry was performed as described previously (Sugie et al., 2017).” We have now expanded this section to include comprehensive methodological details. Furthermore, we have revised the figure legends to provide clearer and more thorough descriptions.

      Similarly, how the knowledge gained could help to "inform early treatment decisions in patients with mutations in hOPA1" (line 38) cannot be followed.

      To address the reviewer's comments, we have refined our explanation of the clinical relevance of our findings as follows. We believe this revision succinctly articulates the practical application of our research, directly responding to the reviewer’s concerns about linking the study's outcomes to treatment decisions for patients with hOPA1 mutations. By underscoring the model’s value in differential diagnosis and its influence on initiating treatment strategies, we have clarified this connection explicitly, within the constraints of the abstract’s word limit. The revised sentence now reads: "This fly model aids in distinguishing DOA from DOA plus and guides initial hOPA1 mutation treatment strategies."

      Reviewer #3 (Public Review):

      Nitta et al. establish a fly model of autosomal dominant optic atrophy, of which hundreds of different OPA1 mutations are the cause with wide phenotypic variance. It has long been hypothesized that missense OPA1 mutations affecting the GTPase domain, which are associated with more severe optic atrophy and extra-ophthalmic neurologic conditions such as sensorineural hearing loss (DOA plus), impart their effects through a dominant negative mechanism, but no clear direct evidence for this exists particularly in an animal model. The authors execute a well-designed study to establish their model, demonstrating a clear mitochondrial phenotype with multiple clinical analogs including optic atrophy measured as axonal degeneration. They then show that hOPA1 mitigates optic atrophy with the same efficacy as dOPA1, setting up the utility of their model to test disease-causing hOPA1 variants. Finally, they leverage this model to provide the first direct evidence for a dominant negative mechanism for 2 mutations causing DOA plus by expressing these variants in the background of a full hOPA1 complement.

      Strengths of the paper include well-motivated objectives and hypotheses, overall solid design and execution, and a generally clear and thorough interpretation of their results. The results technically support their primary conclusions with caveats. The first is that both dOPA1 and hOPA1 fail to fully restore optic axonal integrity, yet the authors fail to acknowledge that this only constitutes a partial rescue, nor do they discuss how this fact might influence our interpretation of their subsequent results.

      As the reviewer rightly points out, neither dOPA1 nor hOPA1 achieve a complete recovery. Therefore, we acknowledge that this represents only a partial rescue and have added the following explanations regarding this partial rescue in the results and discussion sections.

      Result:

      Significantly → partially (lines 207 and 228)

      Discussion (lines 329–342):

      In our study, expressing dOPA1 in the retinal axons of dOPA1 mutants resulted in significant rescue, but it did not return to control levels. There are three possible explanations for this result. The first concerns gene expression levels. The Gal4-line used for the rescue experiments may not replicate the expression levels or timing of endogenous dOPA1. Considering that the optimal functionality of dOPA1 may be contingent upon specific gene expression levels, attaining a wild-type-like state necessitates the precise regulation of these expression levels. The second is a non-autonomous issue. Although dOPA1 gene expression was induced in the retinal axons for the rescue experiments, many retinal axons were homozygous mutants, while other cell types were heterozygous for the dOPA1 mutation. If there is a non-autonomous effect of dOPA1 in cells other than retinal axons, it might not be possible to restore the wild-type-like state fully. The third potential issue is that only one isoform of dOPA1 was expressed. In mouse OPA1, to completely restore mitochondrial network shape, an appropriate balance of at least two different isoforms, l-OPA1 and s-OPA1, is required (Del Dotto et al., 2017). This requirement implies that multiple isoforms of dOPA1 are essential for the dynamic activities of mitochondria.

      The second caveat is that their effect sizes are small. Statistically, the results indeed support a dominant negative effect of DOA plus-associated variants, yet the data show a marginal impact on axonal degeneration for these variants. The authors might have considered exploring the impact of these variants on other mitochondrial outcome measures they established earlier on. They might also consider providing some functional context for this marginal difference in axonal optic nerve degeneration.

      In response to the reviewer’s comment regarding the modest effect sizes observed, we acknowledge that the magnitude of the reported changes is indeed small. To explore the impact of these variants on additional mitochondrial outcomes as suggested, we employed markers such as MitoSOX, Atg8, and COXII for validation. However, we could not detect any significant effects of the DOA plus-associated variants using these methods. We apologize for the redundancy, but to address Reviewer #1's fifth question, we present experimental results showing that while the increased ROS signals observed upon dOPA1 knockdown were rescued by expressing human OPA1, the mutant variants 2708-2711del, D438V, and R445H did not ameliorate this effect. This outcome mirrors the axonal degeneration phenotype and is documented in Figure 5F. The following text has been added to the results section lines 248–254:

      “Furthermore, we assessed the potential for rescuing ROS signals. Similar to its effect on axonal degeneration, wild-type hOPA1 effectively mitigated the phenotype, whereas the 2708-2711del, D438V, and R445H mutants did not (Figure 5F). Importantly, the I382M variant also reduced ROS levels comparably to the wild type. These findings demonstrate that both axonal degeneration and the increase in ROS caused by dOPA1 downregulation can be effectively counteracted by hOPA1. Although I382M retains partial functionality, it acts as a relatively weak hypomorph in this experimental setup.”

      Moreover, utilizing mito-QC, we observed elevated mitophagy in our Drosophila model, with these results now included in Figure 2D–H. Given the complexity of the genetics involved and the challenges in establishing lines, autophagy activity was quantified by comparing the ratio of Atg8a-1 to Atg8a-2 via Western blot analysis. However, no significant alterations were detected across any of the genotypes. Additionally, mitochondrial protein levels derived from COXII confirmed consistent mitochondrial quantities, showing no considerable variance following knockdown. These insights affirm that retinal axon degeneration and mitophagy activation are present in the Drosophila DOA model, although the Western blot analysis revealed no significant changes in autophagy activation. Such findings necessitate caution as this model may not fully replicate the molecular pathology of the corresponding human disease. These Western blot findings are presented in Figure S4, with the following addition made to lines 255–263 of the Results section:

      “We also conducted Western blot analyses using anti-COXII and anti-Atg8a antibodies to assess changes in mitochondrial quantity and autophagy activity following the knockdown of dOPA1. Mitochondrial protein levels, indicated by COXII quantification, were evaluated to verify mitochondrial content, and the ratio of Atg8a-1 to Atg8a-2 was used to measure autophagy activation. For these experiments, Tub-Gal4 was employed to systemically knockdown dOPA1. Considering the lethality of a whole-body dOPA1 knockdown, Tub-Gal80TS was utilized to repress the knockdown until eclosion by maintaining the flies at 20°C. After eclosion, we increased the temperature to 29°C for two weeks to induce the knockdown or expression of hOPA1 variants. The results revealed no significant differences across the genotypes tested (Figure S4A–D).”

      In assessing the effects of dominant negative mutations, we measured ROS levels, the ratio of Atg8a-1 to Atg8a-2, and the quantity of COXII protein, yet observed no significant differences (Figure S6). This limitation of the fly model is noted in the Results: the axonal degeneration phenotype was observed, but alterations in ROS signaling, autophagy activity, or mitochondrial quantity were not (lines 287–290):

      “We investigated the impacts of dominant negative mutations on mitochondrial oxidation levels, mitochondrial quantity, and autophagy activation levels; however, none of these parameters showed statistical significance (Figure S6).”

      Despite these caveats, the authors provide the first animal model of DOA that also allows for rapid assessment and mechanistic testing of suspected OPA1 variants. The impact of this work in providing the first direct evidence of a dominant negative mechanism is under-stated considering how important this question is in development of genetic treatments for DOA. The authors discuss important points regarding the potential utility of this model in clinical science. Comments on the potential use of this model to investigate variants of unknown significance in clinical diagnosis requires further discussion of whether there is indeed precedent for this in other genetic conditions (since the model is nevertheless so evolutionarily removed from humans).

      As suggested by the reviewer, we have expanded the discussion in our study to emphasize in greater detail the significance of the fruit fly model and the MeDUsA software we have developed, elaborating on the model's potential applications in clinical science and its precedents in other genetic disorders. Our text is as follows (lines 299–318):

      “We have previously utilized MeDUsA to quantify axonal degeneration, applying this methodology extensively to various neurological disorders. The robust adaptability of this experimental system is demonstrated by its application in exploring a wide spectrum of genetic mutations associated with neurological conditions, highlighting its broad utility in neurogenetic research. We identified a novel de novo variant in Spliceosome Associated Factor 1, Recruiter of U4/U6.U5 Tri-SnRNP (SART1). The patient, born at 37 weeks with a birth weight of 2934g, exhibited significant developmental delays, including an inability to support head movement at 7 months, reliance on tube feeding, unresponsiveness to visual stimuli, and development of infantile spasms with hypsarrhythmia, as evidenced by EEG findings. Profound hearing loss and brain atrophy were confirmed through MRI imaging. To assess the functional impact of this novel human gene variant, we engineered transgenic Drosophila lines expressing both wild type and mutant SART1 under the control of a UAS promoter.

      Our MeDUsA analysis suggested that the variant may confer a gain-of-toxic-function (Nitta et al.,  2023). Moreover, we identified heterozygous loss-of-function mutations in DHX9 as potentially causative for a newly characterized neurodevelopmental disorder. We further investigated the pathogenic potential of a novel heterozygous de novo missense mutation in DHX9 in a patient presenting with short stature, intellectual disability, and myocardial compaction. Our findings indicated a loss of function in the G414R and R1052Q variants of DHX9 (Yamada et al., 2023). This experimental framework has been instrumental in elucidating the impact of gene mutations, enhancing our ability to diagnose how novel variants influence gene function.”

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      Overall I enjoyed reading this paper. It is well presented and represents a significant amount of well-executed study. I feel it further characterizes a poorly understood model of OPA1 variants, one which displays significant differences from the human phenotype. However, I feel the use of this model with the authors' experiments is not enough to validate it as a screening tool for dominant negativity. I have therefore suggested the above experiments as a way both to further validate the mitochondrial dysfunction in this model and to ensure that the expressed transcript is able to affect oligomerization, as this is a prerequisite to the authors' conclusions.

      We assessed the extent to which our model reflects mitochondrial dysfunction using COXII, Atg8, and MitoSOX markers. Unfortunately, neither COXII levels nor the ratio of Atg8a-1 to Atg8a-2 showed significant variations across genotypes that would clarify the impact of dominant negative mutations. Nonetheless, MitoSOX and mito-QC results revealed that mitochondrial ROS levels and mitophagy are increased in Drosophila following intrinsic knockdown of dOPA1. These findings are documented in Figures 2, 5, and S6.

      Regarding oligomer formation, the specifics remain elusive in this study. However, the expression of dOPA1K273A, identified as a dominant negative variant in Drosophila, significantly disrupted retinal axon organization, as detailed in Figure S7. From these observations, we hypothesize that oligomerization of wild-type and dominant negative forms in Drosophila results in axonal degeneration. Conversely, co-expression of Drosophila wild-type with human dominant negative forms does not induce degeneration, suggesting that they likely do not interact.

      Reviewer #2 (Recommendations For The Authors):

      Materials and Methods:

      The authors used GMR-Gal4 to express OPA1-RNAi. I) GMR is expressed in most cells in the developing eye behind the morphogenetic furrow. So the defects observed can be due to knock-down in support cells rather than in photoreceptor cells.

      We have added the following sentences to the Results (lines 194–196): “The GMR-Gal4 driver does not exclusively target Gal4 expression to photoreceptor cells. Consequently, the observed retinal axonal degeneration could potentially be secondary to abnormalities in support cells external to the photoreceptors.”

      OPA1-RNAi: how complete is the knock-down? Have the authors tested more than one RNAi line?

      We conducted experiments with an additional RNAi line, and similarly observed degeneration in the retinal axons (Figure S2 A and B; lines 178–179).

      The loss-of-function allele, induced by a P-element insertion, gives several eye phenotypes when heterozygous (Yarosh et al., 2008). Does RNAi expression lead to the same phenotypes?

      A previous report indicated that the compound eyes of homozygous mutations of dOPA1 displayed a glossy eye phenotype (Yarosh et al., 2008). Upon knocking down dOPA1 using the GMR-Gal4 driver, we also observed a glossy eye-like rough eye phenotype in the compound eyes. These findings have been added to Figure S3 and lines 192–194.

      There is no description of the way the somatic clones were generated. How were mutant cells in clones distinguished from wild-type cells (e.g., in Fig. 4)?

      In the Methods section, we described the procedure for generating clones and their genotypes as follows (lines 502–505): "The dOPA1 clone analysis was performed by inducing flippase expression in the eyes using either ey-Gal4 with UAS-flp or ey3.5-flp, followed by recombination at the chromosomal location FRT42D to generate a mosaic of cells homozygous for dOPA1s3475." Furthermore, we have created a table detailing these genotypes. In these experiments, it was not possible to differentiate between the clone and WT cells. Accordingly, we have noted in the Results section (lines 201–203): "Note that the mutant clone analysis was conducted in a context where mutant and heterozygous cells coexist as a mosaic, and it was not possible to distinguish between them.”

      Why were flies kept at 29°C? This is rather unusual.

      Increased temperature was demonstrated to induce elevated expression of GAL4 (Kramer and Staveley, Genet. Mol. Res., 2003), which in turn led to an enhanced expression of the target genes. Therefore, experiments involving knockdown assays or Western blotting to detect human OPA1 protein were exclusively conducted at 29°C. However, all other experiments were performed at 25°C, as described in the methods sections: “Flies were maintained at 25°C on standard fly food. For knockdown experiments (Figures 1C–E, 1F–H, 2A–H, 3B–K, 5F, S1, S2 A and B, and S6A), flies were kept at 29°C in darkness.” Furthermore, “We regulated protein expression temporally across the whole body using the Tub-Gal4 and Tub-GAL80TS system. Flies harboring each hOPA1 variant were maintained at a permissive temperature of 20°C, and upon emergence, females were transferred to a restrictive temperature of 29°C for subsequent experiments.”

      Legends:

      It would be helpful to have a description of the genotypes of the flies used in the different experiments. This could also be included as a table.

      We have created a table detailing the genotypes. Additionally, in the legend, we have included a note to consult the supplementary table for genotypes.

      Results:

      Line 141: It is not clear what they mean by "degradation", is it axonal degeneration? And if so, what is the argument for this here?

      In the manuscript, we addressed the potential for mitochondrial degradation; however, recognizing that the expression was ambiguous, the following sentence has been omitted: "Nevertheless, the degradation resulting from mitochondrial fragmentation may have decreased the mitochondrial signal.”

      Fig. 2: Axons of which photoreceptors are shown?

      We have added "a set of the R7/8 retinal axons" to the legend of Figure 2.

      Line 167: The authors write that axonal degeneration is more severe after seven days than after eclosion. Is this effect light-dependent? The same question concerns the disappearance of the rhabdomere (Fig. 3G–J).

      We conducted the experiments in darkness, ensuring that the observed degeneration is not light-dependent. This condition has been added to the methods section to clarify the experimental conditions.

      Line 178/179: Based on what results do they conclude that there is degeneration of the "terminals" of the axons?

      Quantification via MeDUsA has enabled us to count the number of axonal terminals, and the decrease we observed led us to conclude that axonal terminals degenerate. We have published two papers on these findings. We have added the following description to the results section to clarify how we defined degeneration (lines 174–176): “We have assessed the extent of their reduction from the total axonal terminal count, thereby determining the degree of axonal terminal degeneration (Richard JNS 2022; Nitta HMG 2023).”

      Line 189: They write: "... we observed dOPA1 mutant axons...". How did they distinguish the mutant from the controls?

      Fig. 5 and Fig. 6: How did they distinguish genetically mutant cells from genetically control cells in the somatic clones?

      Mutant clone analysis was conducted in a context where mutant and heterozygous cells coexist as a mosaic, and it was not possible to distinguish between them. Accordingly, this point has been added to lines 201–203, “Note that the mutant clone analysis was conducted in a context where mutant and heterozygous cells coexist as a mosaic, and it was not possible to distinguish between them.” and the text in the results section has been modified as follows:

      Before: “To determine if dOPA1 is responsible for axon neurodegeneration, we observed the dOPA1 mutant axons by expressing full-length versions of dOPA1 in the photoreceptors at one day after eclosion and found that dOPA1 expression significantly rescued the axonal degeneration”

      After: “To determine if dOPA1 is responsible for axon neurodegeneration, we quantify the number of the axons in the dOPA1 eye clone fly with the expression of dOPA1 at one day after eclosion and found that dOPA1 expression partially rescued the axonal degeneration”

      Line 225/226: It is not clear to me how their approach "can quantitatively measure the degree of LOF".

      To address the reviewer's question and clarify how our approach quantitatively measures the degree of loss of function (LOF), we revised the statement (lines 238–247):

      "Our methodology distinctively facilitates the quantitative evaluation of LOF severity by comparing the rescue capabilities of various mutations. Notably, the 2708-2711del and I382M mutations demonstrated only partial rescue, indicative of a hypomorphic effect with residual activity. In contrast, the D438V and R445H mutations failed to show significant rescue, suggesting a more profound LOF. The correlation between the partial rescue by the 2708-2711del and I382M mutations and their classification as hypomorphic is significant. Moreover, the observed differences in rescue efficacy correspond to the clinical severities associated with these mutations, namely in DOA and DOA plus disorders. Thus, our results substantiate the model’s ability to quantitatively discriminate among mutations based on their impact on protein functionality, providing an insightful measure of LOF magnitude.”

      Discussion:

      Line 251, 252 and line 358: What is "the optic nerve" in the adult Drosophila?

      In humans, the axons of retinal ganglion cells (RGCs) are referred to as the optic nerve, and we posit that the retinal axons in flies are similar to this structure. In the introduction section, where it is described that the visual systems of flies and humans bear resemblance, we have appended the following definition (lines 107–108): “In this study, we defined the retinal axons of Drosophila as analogous to the human optic nerve.”

      Line 344: These bands appear only upon overexpression of the hOPA1 constructs, so this part is very speculative.

      We confirmed the bands using anti-hOPA1, demonstrating that the myc signal is specific. These results have been added to Figure 5D. Furthermore, the phrase “The upper band was expected as” has been revised to “From a size perspective, the upper band was inferred to represent the full-length hOPA1 including the mitochondria import sequence (MIS).” (lines 464–465)

      I was missing a discussion about the increase of ROS upon loss/reduction of dOPA1 observed by others and described here. Is there an increase of ROS upon expression of any of the constructs used?

      We demonstrated that not only axonal degeneration but also ROS can be suppressed by expressing human OPA1 in the genetic background of dOPA1 knockdown. Additionally, rescue was not possible with any variants except for I382M. Furthermore, we assessed whether there were changes in ROS in the evaluation of dominant negatives, but no significant differences were observed in this experimental system. These findings have been added to the discussion section as follows (lines 318–328): “Our research established that dOPA1 knockdown precipitates axonal degeneration and elevates ROS signals in retinal axons. Expression of human OPA1 within this context effectively mitigated both phenomena; it partially reversed axonal degeneration and nearly completely normalized ROS levels. These results imply that factors other than increased ROS may drive the axonal degeneration observed post-knockdown. Furthermore, while differences between the impacts of DN mutations and loss-of-function mutations were evident in axonal degeneration, they were less apparent when using ROS as a biomarker. The extensive use of transgenes in our experiments might have mitigated the knockdown effects. In a systemic dOPA1 knockdown, assessments of mitochondrial quantity and autophagy activity revealed no significant changes, suggesting that the cellular consequences of reduced OPA1 expression might vary across different cell types.”

      Reviewer #3 (Recommendations For The Authors):

      Consider being more explicit regarding literature that has or has failed to test a direct dominant negative effect by expressing a variant in question in the background of a full OPA1 complement. My understanding is that this is the first direct evidence of this widely held hypothesis. This lends to the main claim promoting the utility of fly as a model in general. The authors might also outline this in the introduction as a knowledge gap they fill through this study.

      In the introduction, we have incorporated a passage that highlights precedents capable of distinguishing between LOF and DN effects, and we note the absence of models capable of dissecting these distinctions within an in vivo organism. This study aims to address this gap, proposing a model that elucidates the differential impacts of LOF and DN within the context of a living model organism, thereby contributing to a deeper understanding of their roles in disease pathology. We added the following sentences in the introduction (lines 71–80).

      “In the quest to differentiate between LOF and DN effects within the context of genetic mutations, precedents exist in simpler systems such as yeast and human fibroblasts. These models have provided valuable insights into the conserved functions of OPA1 across species, as evidenced by studies in yeast models (Del Dotto et al., 2018) and fibroblasts derived from patients harboring OPA1 mutations (Kane et al., 2017). However, the ability to distinguish between LOF and DN effects in an in vivo model organism, particularly at the structural level of retinal axon degeneration, has remained elusive. This gap underscores the necessity for a more complex model that not only facilitates molecular analysis but also enables the examination of structural changes in axons and mitochondria, akin to those observed in the actual disease state.”

      The authors should clarify the language used in the abstract and introduction on the effect of hOPA1 DOA and DOA plus on the dOPA1- phenotype. It is currently written as "none of the previously reported mutations known to cause DOA or DOA plus were rescued, their functions seem to be impaired," but presumably the authors mean that these variants failed to rescue the dOPA1-deficient phenotype.

      We thank the reviewer for the constructive feedback. We acknowledge the need for clarity in our description of the effects of hOPA1 DOA and DOA plus mutations on the dOPA1- phenotype in both the abstract and the introduction. The current phrasing, “none of the previously reported mutations known to cause DOA or DOA plus were rescued, their functions seem to be impaired,” may indeed be confusing. To address this concern, we have revised the statement to reflect our findings more accurately: “Previously reported mutations failed to rescue the dOPA1 deficiency phenotype.” In the Abstract, we made the following change: “we could not rescue any previously reported mutations known to cause either DOA or DOA plus.” → “mutations previously identified did not ameliorate the dOPA1 deficiency phenotype.”

      DOA plus is associated with a multiple sclerosis-like illness; as written it suggests that the pathogenesis of sporadic multiple sclerosis and that associated with DOA plus share an underlying pathogenic mechanism. Please use the qualifier "-like illness."

      We have added the term “multiple sclerosis-like illness” wherever “multiple sclerosis” is mentioned.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      The authors aim to elucidate the mechanisms that regulate the immune response under physiological conditions during cortical development. To achieve this goal, they used a wide range of mutant mice to analyse the consequences of immune activation for the formation of cortical ectopia.

      Strengths:

      The authors demonstrated that Abeta monomers are anti-inflammatory and inhibit microglial activation. This is a novel result that demonstrates the physiological role of APP in cortical development.

      Weaknesses:

      -On the other hand, cortical ectopia has been already described in mouse models in which the amyloid signalling has been disrupted (Herms et al., 2004; Guenette et al., 2006), making the current study less novel.

      We agree these previous studies have implicated amyloid precursor protein in cortical ectopia. However, since these studies use whole-body knockouts, they have not implicated the functional roles of specific cell types.  Nor have they identified the specific mechanisms underlying the formation of this unique class of cortical ectopia. In contrast, our studies show that the disruption of a novel Abeta-regulated signaling pathway in microglia is the primary cause of ectopia formation in this class of ectopia mutants. This is the first time that microglia have been specifically implicated in the development of cortical ectopia. We further show that elevated MMP activity and resulting cortical basement membrane degradation is the underlying mechanism leading to ectopia formation.  This is also the first time that MMP activity and basement membrane degradation (instead of maintenance) have been implicated in cortical ectopia development. As such, our results have provided novel insights into the diverse mechanisms underlying cortical ectopia formation in developmental brain disorders.

      One of the molecules analysed is Ric8a, a GTPase activator involved in neuronal development. Authors used the conditional mutant mice Emx1-Ric8a to delete Ric8a from early progenitors and glutamatergic neurons in the pallium. Emx1-Ric8a mutant mice present cortical ectopias and authors attributed this malformation to the increase in inflammatory response due to Ric8a deletion in microglia. Several discordances do not fit this interpretation:

      -The role of Ric8a in cortical development and function has been already described in several papers, but none of them has been cited in the current manuscript (Kask et al., 2015, 2018; Ruisu et al., 2013; Tonissoo et al., 2006).

      We will include reference to these publications in revision.

      -Ectopia formation in the cortex has been already described in Nestin-Ric8a cKO mice (Kask et al., 2015). In the current manuscript, authors analyzed the same mutant mice (Nestin-Ric8a), but they did not detect any ectopia. Authors should discuss this discordance.

      The expression pattern of nestin-cre is known to vary depending on factors including transgene insertion site, genetic background, and sex. Early studies show, for example, that the nestin gene promoter drives cre expression in many non-neural tissues in another transgenic line in the FVB/N genetic background (Dubois et al., Genesis. 2006 Aug;44(8):355-60. doi: 10.1002/dvg.20226). The specific nestin-cre line used in Kask et al. 2015 has also been shown to be active in brain microglia and to lead to increased microglial pro-inflammatory activity upon breeding to a conditional allele of a cholesterol transporter gene (Karasinska et al., Neurobiol Dis. 2013 Jun;54:445-55; Karasinska et al., J Neurosci. 2009 Mar 18;29(11):3579–3589). The ectopia reported in Kask et al. 2015 are also significantly more subtle than what we have observed and are apparently not present in all mutant animals (we observe severe ectopia in every single emx1-cre mutant). We presume the ectopia reported in Kask et al. 2015 may result from a combined deletion of the ric8a gene from microglia and neural cells due to unique combinations of factors affecting nestin-cre expression in a subset of mutants.

      -Authors claim that microglia express Emx1, and therefore, Ric8a is deleted in microglia cells. However, the arguments for this assumption are very weak and the evidence suggests that this is not the case. This is an important point considering that authors want to emphasise the role of Ric8a in microglia activation, and therefore, additional experiments should demonstrate that Ric8a is deleted in microglia in Emx1-Ric8a mutant mice.

      We have observed altered mRNA expression of several genes in purified microglia cultured from the emx1-cre mutants (Supplemental Fig. 8), which indicates that ric8a is deleted from microglia and suggests a role of microglial ric8a deficiency in ectopia formation.  This interpretation is further strengthened by the observation that deletion of ric8a from microglia using a microglia-specific cx3cr1-cre results in similar ectopia (Fig. 2). We also have other data supporting this interpretation, including data showing induction of the expression of a cre reporter in brain microglia by emx1-cre and loss of ric8a gene expression in microglia cells isolated from emx1-cre mutants. We will include these data in revision.

      Reviewer #2 (Public Review):

      Kwon et al. used several conditional KO mice for the deletion of ric8a or app in different cell types. Some of them exhibited pial basement membrane breaches leading to neuronal ectopia in the neocortex.

      They first investigated ric8a, a Guanine Nucleotide Exchange Factor for Heterotrimeric G Proteins. They observed the above-mentioned phenotype when ric8a is deleted from microglia and neural cells (ric8a-emx1-cre or dual deletion with cre combination cx3cr1 (in microglia) and nestin (in neural cells)) but not in microglia alone or neural cells alone (whether it is in CR cells (ric8a-Wnt3a-cre), post-mitotic neurons (nex-cre or dlx5/6-cre), or in progenitors and their progeny (nestin-cre or foxg1-cre)). They also show that ric8a KO mutant microglia cells stimulated in vitro by LPS exhibit an increased TNFa, IL6 and IL1b secretion compared to controls (Fig 2). They therefore injected LPS in vivo and observed the neuronal ectopia phenotype in the ric8a-cx3cr1-cre (microglial deletion) cortices at P0 (Fig 2). They suggest that ric8a KO in neuronal cells mimics immune stimulation (but we have no clue how ric8a KO in neural cells would induce immune stimulation).

      We agree we do not currently know the precise mechanisms by which mutant microglia are activated in the mutant brain. However, this does not affect the conclusion that deficiency in the Abeta monomer-regulated APP/Ric8a pathway in microglia is the primary cause of cortical ectopia in these mutants, since we have shown that genetic disruption of this pathway in microglia alone by different means targeting different pathway components, using cell type specific cre, all results in similar cortical ectopia phenotypes. Regarding the source of the immunogens, there are several possibilities which we plan to investigate in future studies. For example, the clearance of apoptotic cells and associated cellular debris is an important physiological process and deficits in this process have been linked to inflammatory diseases throughout life (Doran et al., Nat Rev Immunol. 2020 Apr;20(4):254-267; Boada-Romero et al., Nat Rev Mol Cell Biol. 2020 Jul;21(7):398-414). In the embryonic cortex, studies have shown that extensive cell death takes place starting as early as E12 (Blaschke et al., Development. 1996 Apr;122(4):1165-74; Blaschke et al., J Comp Neurol. 1998 Jun 22;396(1):39-50). Studies have also shown that radial glia and neuronal progenitors play critical roles in the clearance of apoptotic cells and associated cellular debris in the brain (Lu et al., Nat Cell Biol. 2011 Jul 31;13(9):1076-83; Ginisty et al., Stem Cells. 2015 Feb;33(2):515-25; Amaya et al., J Comp Neurol. 2015 Feb 1;523(2):183-96). Moreover, Ric8a-dependent heterotrimeric G proteins have been found to specifically promote the phagocytic activity of both professional and non-professional phagocytic cells (Billings et al., Sci Signal. 2016 Feb 2;9(413):ra14; Preissler et al., Glia. 2015 Feb;63(2):206-15; Pan et al., Dev Cell. 2016 Feb 22;36(4):428-39; Flak et al., J Clin Invest. 2020 Jan 2;130(1):359-373; Zhang et al., Nat Commun. 2023 Sep 14;14(1):5706). Thus, it is likely that the failure of radial glia to promptly clear apoptotic cells and debris may play a role in triggering microglial activation in ric8a mutants. We have not included discussion of these possibilities since the precise mechanisms remain to be determined. Moreover, they do not impact the conclusion of the current study.

      The authors then turned their attention to APP. They observed neuronal ectopia into the marginal zone when APP is deleted in microglia (app-cx3cr1-cre) + intraperitoneal LPS injection (they did not show it, but we have to assume there would not be a phenotype without the injection of LPS) (Fig 3). (The phenotype is similar but not identical to ric8a-cx3cr1-cre + LPS. They suggest that the reason is because they had to inject 3 times less LPS due to enhanced immune sensitivity in this genetic background but it is only a hypothesis). After in vitro stimulation by LPS, app mutant microglia show a reduced secretion of TNFa and IL6 but not IL1b (this is the opposite to ric8a-cx3cr1-cre microglia cells) while peritoneal macrophages in culture show increased secretion of TNFa, IL1, IL6 and IL23 (Fig 3 and Suppl. Fig 9).

      We have data showing that app-cx3cr1-cre mutants without LPS injection do not show ectopia and will include them in revision. We employ LPS injection precisely because we do not see a phenotype without it. We agree, and have also stated in the text, that the phenotype of the app mutants is not as severe as that of the ric8a mutant. Besides the low LPS dosage used, we also suggest that other app family members may compensate, since the ectopia in the app family gene mutants reported previously were only observed in app/aplp1/2 triple knockouts, not even in any of the double knockouts (Herms et al., 2004). These potential causes are also not mutually exclusive. Nonetheless, the microglia-specific app mutants clearly show ectopia upon immune stimulation, implicating a role of microglial APP in cortical ectopia formation.

      The distinct response of ric8a and app microglia to LPS results from in vitro culturing of microglia. Indeed, we have shown that, when acutely isolated macrophages are used, these mutants show changes in the same direction (both increased cytokine secretion).  The microglia used for analysis in this study have all been cultured in vitro for two weeks before assay. They have thus been under chronic stimulation through exposure to dead cells and debris in the culture dish throughout this period.  Depending on the degree of perturbation to inflammation-regulating pathways, such exposures are known to significantly change microglial cytokine expression, sometimes in a direction opposite to that expected.  For example, under chronic immune stimulation, while the trem2+/- microglia, which are heterozygous mutant for the anti-inflammatory Trem2, show elevated pro-inflammatory cytokine expression as expected, trem2-/- (null) microglia under the same conditions not only fail to show increases but, for some pro-inflammatory cytokines, actually show decreases in expression (Sayed et al., Proc Natl Acad Sci U S A. 2018 Oct 2;115(40):10172-10177).  In several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  It is likely that in microglia Ric8a-dependent heterotrimeric G proteins also mediate only a subset of the signaling downstream of APP.  As such, app knockout in microglia may have more severe effects than ric8a knockout on microglial immune activation and lead to changes in the opposite direction compared to ric8a knockout, as has been observed for trem2 null mutation vs heterozygosity discussed above.
This may explain the subdued TNF and IL6 secretion by cultured app mutant microglia.

      Amyloid beta (Ab) being one of the molecules binding to APP, the authors showed that Ab40 monomers (they did not test Ab40 oligomers) partially inhibit cytokines (TNFa, IL6, IL1b, MCP-1, IL23a, IL10) secretion in vitro by microglia stimulated by LPS but does not affect secretion by microglia from app-cx3cr1-cre (tested for TNFa, IL6, IL1b, IL23a, IL10) (Fig 4, Suppl fig 10) (but still does it in aplp2-cx3cr1-cre) and does not affect secretion by ric8a-cx3cr1-cre microglia (tested for TNFa and IL6 but still suppress IL1b) (Therefore here is another difference between app and ric8a KO microglia).

      We have tested the effects of Abeta40 oligomers, which induce rather than suppress microglial cytokine secretion, and will include the data in revision.  As mentioned above, in several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  We assume that this is likely also true in microglia and that Ric8a-dependent heterotrimeric G proteins may mediate only a subset of the signaling downstream of APP.  This may explain the difference in the effects of APP and ric8a knockout mutation in abolishing the anti-inflammatory effects of Abeta monomers on IL-1b vs TNF/IL-6.  It also suggests that TNF/IL-6 and IL-1b secretion must be regulated by different mechanisms. Indeed, it is well established in immunology that the secretion of IL1b, but not of TNF or IL6, is regulated by inflammasome-dependent mechanisms (see, for example, Broz & Dixit. Nat Rev Immunol. 2016 Jul;16(7):407-20. doi: 10.1038/nri.2016.58).

      The authors injected inhibitors of Akt or Stat3 in the ric8a-emx1-cre cortex and found it suppressed neuronal ectopia (Fig 5, Suppl fig 11). It is not clear whether it suppresses immune stimulation from neuronal cells or immune reaction from microglia cells.

      We agree that, at present, the pharmacological approaches we have taken are not able to distinguish these possibilities.  However, whichever of these possibilities turns out to be the case would still implicate a role of excessive microglial activation in the formation of cortical ectopia and support the conclusion of the study.  Thus, while potentially worthy of further investigation, this question does not affect the conclusion of this study. Furthermore, as mentioned, we plan to determine the mechanisms of how ric8a mutation in neural cells induces immune activation in future studies. These results will likely enable us to adopt more specific approaches to address this question.

      Finally, the authors examined the activities of MMP2 and MMP9 in the developing cortex using gelatin gel zymography. The activity and protein levels of MMP9 but not MMP2 in the ric8a-emx1-cre cortex were claimed to be significantly increased (Fig 5, Suppl fig 12). Unfortunately, they did not show it in the app-cx3cr1-cre +LPS mouse. They make a connection between ric8a deletion and MMP9 but unfortunately do not make the connection between app deletion and MMP9, which is at the center of the pathway claimed to be important here. Then they injected BB94, a broad-spectrum inhibitor of MMPs, or an inhibitor specific for MMP9 and 13. They both significantly suppress the number and the size of the ectopia in ric8a mutants (Fig 5).

      For all the gelatin gel zymography analyses, we quantify protein concentrations in the cortical lysates using the Bio-Rad Bradford assay kit and load the same amounts of proteins per lane. The results across lanes are thus directly comparable. From the quantification, our results clearly show that MMP9, but not MMP2, levels are increased in the mutants (supplemental Figure 12).  The data on MMP2 also provide an internal control further supporting the observation of a specific change in MMP9.  For this analysis, we focus on the ric8a-emx1-cre mutants since the app-cx3cr1-cre +LPS animals show less severe, more localized ectopia and in most cases only in one of the hemispheres.  Any changes in MMP9 are therefore likely to be masked and the experiments unlikely to yield meaningful results.  On the other hand, we have clearly shown that the administration of different classes of MMP inhibitors significantly eliminates ectopia in ric8a-emx1-cre mutants. This strongly implicates a functional contribution of MMPs.

      After reading the manuscript, I still do not know how ric8a in neural cells is involved in the immune inhibition. Is it through the control of Ab monomers? In addition, the authors did not show in vivo data supporting that Ab monomers are the key players here. As the authors said, this is not the only APP interactor. Finally, I still do not know how ric8a is linked to APP in microglia in the model.

      As detailed above, there are several possibilities including potential deficits in the clearance of apoptotic cells and associated debris that may trigger microglial activation in ric8a-emx1-cre mutants. We will investigate these possibilities in future studies.  We have not included discussion since their roles remain to be determined.  As for the role of Abeta monomers, we have indicated that we currently do not have evidence that in the developing cortex Abeta monomers play a role in inhibiting microglia.  We have also indicated in the manuscript that our conclusion is that an Abeta monomer-activated microglial pathway regulates normal brain development, not that Abeta monomers themselves regulate brain development.  Regarding the link between Ric8a and APP, the reviewer has missed several major lines of supporting evidence. For example, we have shown that Abeta monomers activate a pathway in microglia that inhibits the secretion of several proinflammatory cytokines including TNF, IL-6, IL-10, and IL-23 (Figure 4 and Supplemental Figures 8-10).  This inhibition is abolished when either app or ric8a gene is deleted from microglia.  This indicates that app and ric8a act in the same pathway activated by Abeta monomers in microglia. We also show that this Abeta monomer-activated pathway inhibits the transcription of several cytokines in microglia.  This inhibition is also abolished when either app or ric8a gene is deleted from microglia.  This reinforces the conclusion that app and ric8a act in the same pathway in microglia.  Furthermore, cell type specific deletion of app or ric8a from microglia in vivo also results in similar phenotypes of cortical ectopia. Together, these results thus strongly support the conclusion that app and ric8a act in the same pathway activated by Abeta monomers in microglia.
This conclusion is also consistent with published findings that Ric8a dependent heterotrimeric G proteins bind to APP and mediate subsets of APP signaling across different species (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).

      While several of the findings presented in this manuscript are of potential interest, there are a number of shortcomings. Here are some suggestions that could improve the manuscript and help substantiate the conclusions:

      (1) As the title suggests it, the focus is on Ab and APP functions in microglia. However, the analysis is more focused on ric8a. The connection between ric8a and APP in this study is not investigated, besides the fact that their deletion induces somewhat similar but not identical phenotypes. Showing a similar phenotype is not enough to conclude that they are working on the same pathway. The authors should find a way to make that connection between ric8a and app in the cells investigated here.

      As discussed above, the reviewer misses several major lines of evidence showing that APP and Ric8a act in the same pathway in microglia.  For example, besides the similarity of the ectopia phenotypes, we have shown that Abeta monomers activate a pathway in microglia that inhibits the secretion of several proinflammatory cytokines including TNF, IL-6, IL-10, and IL-23 (Figure 4 and Supplemental Figures 8-10).  These inhibitory effects are completely abolished when either app or ric8a gene is deleted from microglia.  This indicates that app and ric8a act in the same pathway activated by Abeta monomers in microglia. We also show that this Abeta monomer-activated pathway inhibits the transcription of several cytokine genes in microglia.  These effects are again completely abolished when either app or ric8a gene is deleted from microglia.  This further reinforces the conclusion that app and ric8a act in the same pathway in microglia.  In addition, we show that the same results hold in macrophages.  Together, these results therefore strongly support the conclusion that app and ric8a act in the same pathway in microglia. This conclusion is also consistent with published findings that Ric8a dependent heterotrimeric G proteins bind to APP and mediate APP signaling across different species (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).

      (2) This would help to show the appearance of breaches in the pial basement membrane leading to neuronal ectopia; to investigate laminin debris, cell identity, Wnt pathway for app-cxcr3-cre + LPS injection as you did for ric8a-emx1-cre.

      We will provide further data on the breaches in the pial basement membrane.  We have not observed any changes in cell identity or Wnt pathway activity in ric8a-emx1-cre mutants. The ectopia phenotype in the app-cxcr3-cre + LPS animals is also less severe.  It is therefore likely of limited value to examine potential changes in these areas.

      (3) As a control, this would help to show that app-cxcr3-cre without the LPS injection does not display the phenotype.

      We have the data on app-cx3cr1-cre mutants without LPS injection, which show no ectopia, and will include the data in revision.

      (4) This would help to show the activity and protein levels of MMP9 and MMP2 and perform the rescue experiments with the inhibitors in the app-cx3cr1-cre cortex +LPS.

      As discussed above, we focus analysis on the ric8a-emx1-cre mutants since app-cx3cr1-cre +LPS animals show less severe, more localized ectopia and in most cases only in one of the hemispheres.  Determining potential changes in MMP9 levels and effects of MMP inhibitors is therefore not likely to yield useful data.  On the other hand, we have shown that MMP9 levels are increased and that administration of different classes of MMP inhibitors eliminates cortical ectopia in ric8a-emx1-cre mutants.  This strongly implicates a functional contribution of MMPs.

      (5) Is MMP9 secreted by microglia cells or neural cells?

      Our in situ hybridization data show MMP9 is most highly expressed in macrophage-like cells in the embryonic cortex, suggesting that microglia may be a major source of MMP9. We will incorporate these data in revision.

      (6) The in vitro evidence indicates that one of the multiple APP interactors, ie Ab40 monomers, is less effective in suppressing the expression of some cytokines by microglia cells mutants for ric8a (TNFa and IL6 but still suppress IL1b) or APP (TNFa, IL6, IL1b, IL23a, IL10) when compared to WT. But there are other interactors for APP. In order to support the claim, it seems crucial to have in vivo data to show that Ab40 monomers are the molecules involved in preventing the breach in the pial basement membrane.

      As addressed in detail above, we have indicated that our conclusion is that an Abeta monomer-activated microglial pathway regulates normal brain development, not that Abeta monomers themselves regulate brain development.  We currently do not have evidence that the Abeta monomers play a role in inhibiting microglia in the developing cortex.  There are candidate ligands for the pathway in the developing cortex, the functional study of which, however, is a major undertaking and beyond the scope of the current study.

      (7) In order to claim that this is specific to Ab40 monomers and not oligomers, it is necessary to show that the Ab40 oligomers do not have the same effect in vitro and in vivo. Also, an assay should be done to show that your Ab preparations are pure monomers or oligomers.

      We have tested the effects of Abeta40 oligomers, which induce rather than suppress microglial cytokine secretion, and will include the data in revision. The protocols we use in preparing the monomers and oligomers are standard protocols employed in the field of Alzheimer’s disease research and have been optimized and validated repeatedly over the past several decades.

      (8) Most of the cytokine secretion assays used microglia cells in culture. Two results draw my attention. Ric8a deletion increases TNFa and IL6 secretion after LPS stimulation in vitro on microglia cells while app deletion decreases their secretion. Then later, papers show that the decrease in IL1b induced by Ab on microglia cells is prevented by APP deletion but not ric8a deletion. Those two pieces of data suggest that ric8a and APP might not be in the same pathway. In addition, the phenotype from app-cxcr3-cre + LPS injection and ric8a-cxcr3-cre + LPS injection are not exactly the same. It could be due to the level of LPS as the author suggests or it might not be. More experiments are needed to prove they are in the same pathway.

      As discussed above, the reviewer misses several major lines of evidence, which strongly support the conclusion that APP and Ric8a act in the same pathway activated by Abeta monomers in microglia (see detailed discussion in point 1).  The differential response of app and ric8a mutant microglia likely results from chronic immune stimulation during in vitro culturing, which is known to alter microglia cytokine expression (see detailed discussion in point 9 below on how chronic immune stimulation changes microglial cytokine expression). We have demonstrated this by showing that, without culturing, acutely isolated app and ric8a mutant macrophages both display elevated cytokine secretion (Figure 4).  Regarding the distinct regulation of TNF/IL-6 and IL-1b by APP and Ric8a, as discussed above, in several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  It is likely that this is also the case in microglia and that Ric8a-dependent heterotrimeric G proteins mediate only a subset of the anti-inflammatory signaling activated by APP.  This may explain why app, but not ric8a, mutation abolishes the inhibitory effects of Abeta monomers on IL-1b.  This also suggests that the secretion of TNF/IL-6 and IL-1b must be regulated by different mechanisms. Indeed, it is well established in immunology that the secretion of IL1b, but not that of TNF or IL6, is regulated by inflammasome-dependent mechanisms (see, for example, Broz & Dixit. Nat Rev Immunol. 2016 Jul;16(7):407-20. doi: 10.1038/nri.2016.58).

      (9) How do the authors reconcile the reduced TNFa and IL6 secretion upon stimulation of app mutant microglia with the model where app is attenuating immune response in vivo? Line 213 says that microglia exhibit attenuated immune response following chronic stimulation but I don't know if 3 hours of LPS in vitro is a chronic stimulation.

      The reviewer has misunderstood.  The microglia used in this study have all been cultured in vitro for approximately two weeks before assay. They have thus been under chronic stimulation through exposure to dead cells and debris in the culture dish throughout this period.  Depending on the degree of perturbation to inflammation-regulating pathways, such exposures are known to significantly change microglial cytokine expression, sometimes in a direction opposite to that expected.  For example, under chronic immune stimulation, while the trem2+/- microglia, which are heterozygous mutant for the anti-inflammatory Trem2, show elevated pro-inflammatory cytokine expression as expected, trem2-/- (null) microglia under the same conditions not only fail to show increases but, for some pro-inflammatory cytokines, actually show decreases in expression (Sayed et al., Proc Natl Acad Sci U S A. 2018 Oct 2;115(40):10172-10177).  As mentioned, in several systems, Ric8a-dependent heterotrimeric G proteins have also been shown to bind to APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  It is likely that Ric8a-dependent heterotrimeric G proteins also mediate only a subset of the anti-inflammatory signaling activated by APP in microglia.  As such, app knockout in microglia may have more severe effects than ric8a knockout on microglial immune activation, similar to the relationship between trem2 null mutation vs heterozygosity discussed above. This likely explains why TNF and IL6 secretion by cultured app mutant microglia is subdued.  In contrast, we find that acutely isolated app mutant macrophages show increased cytokine secretion. This is likely more representative of the response of app mutant microglia in the absence of chronic stimulation.

      (10) Line 119: In their model, the authors suggest that there is a breach in pial basement membrane but that the phenotype is different from the retraction of the radial fibers due to reduced adhesion. So, could the author discuss to what substrate the radial fibers are attached to, in their model where the pial surface is destroyed?

      Radial glial endfeet normally bind to the basement membrane via cell surface receptors including the integrin and the dystroglycan protein complexes. We observe free radial glial endfeet at the breach sites, apparently without attachment to any basement membrane.  However, we cannot exclude the possibility that there may be residual basement membrane components not detected by the methodology employed.

      (11) The authors should show that the increased cytokine secretion observed in vitro is also happening in vivo in ric8a-emx1-cre compared to WT mice and compared to ric8a-nestin-cre mice. Or when app is deleted in microglia (app-cxcr3-cre) + LPS injection compared to WT mice +LPS.

      Unfortunately, this is not technically feasible since it is impossible to extract the extracellular (secreted) fractions of cytokines from an embryonic brain without causing cell lysis and the release of the intracellular pool.  This, however, does not affect our conclusion that the Abeta monomer-regulated microglia pathway plays a key role in regulating normal brain development since its genetic disruption, by different approaches, clearly results in brain malformation.

      (12) The authors injected inhibitors of Akt or Stat3 in the ric8a-emx1-cre cortex and found that it suppressed neuronal ectopia (Fig 5, Suppl fig 11). Does it suppress immune stimulation from neuronal cells or immune reaction from microglia cells?

      As discussed above, we agree that, at present, the pharmacological approaches we have taken are not able to distinguish these two possibilities.  However, no matter which possibility is true, it does not affect our conclusion.  Furthermore, we also plan to determine the mechanisms of how ric8a mutation in neural cells induces immune activation in future studies. These results will likely enable us to adopt specific approaches to address this question.

      (13) Fig 5 and Supplementary fig 12: Please show a tubulin loading control in Fig 5i as you did in suppl fig 12 d (gel zymography). Please provide a gel zymography showing side by side Control, mutant and mutant +DM/S3I treatment. The same request for the MMP9 staining. Please provide statistics for control vs mutant for suppl fig 12c and d.

      For all experiments of the gelatin gel zymography analysis, we quantify protein concentrations in the cortical lysates using the Bio-Rad Bradford assay kit and load the same amounts of proteins per lane. The results across lanes are thus all comparable.  These experiments were also performed several years ago before the pandemic and we unfortunately no longer have the samples.  We will, however, provide the protein quantification information in revision.  The MMP9 staining images for the controls and mutants have also all been taken with the same parameters on the microscope and can be directly compared.  The statistics will be provided as suggested.

      (14) Please provide the name and the source of the MMP9/13 inhibitor used in this study.

      This inhibitor is MMP-9/MMP-13 inhibitor I (CAS 204140-01-2), from Santa Cruz Biotechnology. This information will be included in revision.

      (15) The results show that deletion of ric8a in microglia and neural cells induced pia membrane breaches but no phenotype is apparent in ric8a deletion in microglia or neural cells alone. Then, the results showed that intraperitoneal injection of LPS induced the phenotype in ric8a-cxcr3-cre mutants. It would be beneficial as a control supporting the model to show that the insult induced by LPS injection does not induce the phenotype in the ric8a-foxg1-cre mice.

      We agree it may potentially be useful to show that LPS injection does not induce ectopia in ric8a-foxg1-cre mice.  Unfortunately, since the ric8a-foxg1-cre mutation shows no phenotype, we are no longer in possession of this line.

      Reviewer #1 (Recommendations For The Authors):

      -The information in the abstract and the introduction is only related to app. So, it is very abrupt how authors start the manuscript studying the role of Ric8a, with no information at all about this protein and why the authors want to investigate this role in microglial activation. Later in the manuscript, the authors tried to link Ric8a with app to study the role of app in the inflammatory response and ectopia formation. This link is quite weak as well.

      In the last paragraph of the Introduction, we explain the use of the ric8a mutant and how it leads to discovery of the Abeta monomer-regulated pathway. We will improve the writing in revision to make these points clearer.  We will also improve the writing on the potential link of Ric8a to APP by highlighting, especially, the fact that ric8a and app pathway mutants are among a unique group of only three mouse mutants (ric8a, app/aplp1/2, and apbb1/2) that show cortical ectopia exclusively in the lateral cortex, while all other cortical ectopia mutants show the most severe ectopia at the midline.

      -In order to validate the mouse model, double immunofluorescence or immunofluorescence+in situ hybridization should be performed to show that microglia express ric8a and that is eliminated in the Emx1-Ric8a mutant mice.

      As mentioned above, we have additional lines of evidence showing that ric8a is deleted from microglia in emx1-cre mutants. This includes data showing induction of the expression of a cre reporter in brain microglia by emx1-cre and loss of ric8a gene expression in microglia cells isolated from emx1-cre mutants.  We will include these data in revision.

      -In Supplemental Fig. 6, the authors claimed that cell proliferation is normal in Ric8a mutant mice without doing any quantification. They also quantified the angle of mitotic division of progenitors in the ventricular zone, but there are no images for the spindle orientation quantification, and no description of how they did it. In addition, this data is contrary to what has already been published in conditional Ric8a mutant mice (Kask et al., 2015). The Vimentin staining should be improved.

      We will provide quantification of cell proliferation in revision. We will also provide details on the quantification on mitotic spindle orientation.  We are not sure why the results are different from the other study. We were indeed anticipating deficits in mitotic spindle orientation and spent major efforts in the analysis.  However, based on the data, we could not draw the conclusion.

      -Analysis of the MMP9 expression should be done by western blot and not by immunofluorescence. In fact, the MMP9 expression shown in Figure 5g,h, does not correspond with RNA expression shown in gene expression atlas like genepaint or the allen atlas, doubting the specificity of the antibody. The expression of Mmp9 is quite low or absent in the cortex at E13.5-E14.5, making this protein very unlikely to be responsible for laminin degradation during development.

      We perform gelatin gel zymography on MMP2/9, which shows increased MMP9 activity levels in the mutant cortex. This is similar to Western blot analysis (all lanes are loaded with the same amounts of cortical lysates).  The immunofluorescence staining, a different type of analysis, was designed as a complementary approach.  Regarding RNA expression, please also note that MMP9 is a secreted protein and the protein expression pattern is expected to be different from that of RNA. We also have in situ data showing that, while MMP9 mRNA is indeed low, it is strongly expressed in macrophage-like cells most prominently in cortical blood vessels at E12-E13 (we will include these data in revision).  We suspect that these cells are microglial lineage cells populating the embryonic cortex at this stage (see, for example, Squarzoni et al., Cell Rep. 2014 Sep 11;8(5):1271-9. doi: 10.1016/j.celrep.2014.07.042.) and may be a major source of cortical MMP9.  As for functional contributions, we agree that we cannot rule out roles played by other MMPs.  However, based on the ectopia suppression data, our results clearly indicate a key functional contribution by MMP9/13.

      For MMP9 activity, authors should show the whole membrane with a minimum of three control and three mutant individual samples and with the quantification.

      -The graphs should be improved, including individual values and titles of the Y axes.

      We will include these data in revision (the quantification of MMP9 activity is provided in Supplemental Figure 12d) and improve the graphs as suggested.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review): 

      Summary: 

      Dr. Santamaria's group previously utilized antigen-specific nanomedicines to induce immune tolerance in treating autoimmune diseases. The success of this therapeutic strategy has been linked to expanded regulatory mechanisms, particularly the role of T-regulatory type-1 (TR1) cells. However, the differentiation program of TR1 cells remained largely unclear. Previous work from the authors suggested that TR1 cells originate from T follicular helper (TFH) cells. In the current study, the authors aimed to investigate the epigenetic mechanisms underlying the transdifferentiation of TFH cells into IL-10-producing TR1 cells. Specifically, they sought to determine whether this process involves extensive chromatin remodeling or is driven by preexisting epigenetic modifications. Their goal was to understand the transcriptional and epigenetic changes facilitating this transition and to explore the potential therapeutic implications of manipulating this pathway. 

      The authors successfully demonstrated that the TFH-to-TR1 transdifferentiation process is driven by pre-existing epigenetic modifications rather than extensive new chromatin remodeling. The comprehensive transcriptional and epigenetic analyses provide robust evidence supporting their conclusions. 

      Strengths: 

      (1) The study employs a broad range of bulk and single-cell transcriptional and epigenetic tools, including RNA-seq, ATAC-seq, ChIP-seq, and DNA methylation analysis. This comprehensive approach provides a detailed examination of the epigenetic landscape during the TFH-to-TR1 transition. 

      (2) The use of high-throughput sequencing technologies and sophisticated bioinformatics analyses strengthens the foundation for the conclusions drawn. 

      (3) The data generated can serve as a valuable resource for the scientific community, offering insights into the epigenetic regulation of T-cell plasticity. 

      (4) The findings have significant implications for developing new therapeutic strategies for autoimmune diseases, making the research highly relevant and impactful. 

      We thank the reviewer for providing constructive feedback on the manuscript.

      Weaknesses: 

      (1) While the scope of this study lies in transcriptional and epigenetic analyses, the conclusions need to be validated by future functional analyses. 

      We fully agree with the reviewer’s suggestion. The current study provides a foundational understanding of how the epigenetic landscape of TFH cells evolves as they transdifferentiate into TR1 progeny in response to chronic ligation of cognate TCRs using pMHCII-NPs. Functional validation is indeed the focus of our current studies, where we are carrying out extensive perturbation studies of the TFH-TR1 transdifferentiation pathway in conditional transcription factor gene knock-out mice. In these ongoing studies, genes coding for a series of transcription factors expressed along the TFH-TR1 pathway are selectively knocked out in T cells, to ascertain (i) the specific roles of key transcription factors in the various cell conversion events and transcriptional changes that take place along the TFH-TR1 cell axis; (ii) the roles that such transcription factors play in the chromatin re-modeling events that underpin the TFH-TR1 transdifferentiation process; and (iii) the effects of transcription factor gene deletion on phenotypic and functional readouts of TFH and regulatory T cell function.

      (2) This study successfully identified key transcription factors and epigenetic marks. How these factors mechanistically drive chromatin closure and gene expression changes during the TFH-to-TR1 transition requires further investigation. 

      Agreed. Please see our response to point #1 above.  

      (3) The study provides a snapshot of the epigenetic landscape. Future dynamic analysis may offer more insights into the progression and stability of the observed changes. 

      We have previously shown that the first event in the pMHCII-NP-induced TFH-TR1 transdifferentiation process involves proliferation of cognate TFH cells in the splenic germinal centers. This event is followed by immediate conversion of the proliferated TFH cells into transitional and terminally differentiated TR1 subsets. Although the snapshot provided by our single cell studies reported herein documents the simultaneous presence of the different subsets composing the TFH-TR1 cell pathway upon the termination of treatment, the transdifferentiation process itself is extremely fast, such that proliferated TFH cells already transdifferentiate into TR1 cells after a single pMHCII-NP dose (Sole et al., 2023a). This makes it extremely challenging to pursue dynamic experiments. Notwithstanding this caveat, ongoing studies of cognate T cells post treatment withdrawal, coupled to single cell studies of the TFH-TR1 pathway in transcription factor gene knockout mice exhibiting perturbed transdifferentiation processes, are likely to shed light on the progression and stability of the epigenetic changes reported herein. 

      We will revise the manuscript accordingly, to address the three concerns raised by the reviewer, in the context of the ongoing studies mentioned above. 

      Reviewer #2 (Public Review): 

      Summary: 

      This study, based on their previous findings that TFH cells can be converted into TR1 cells, conducted a highly detailed and comprehensive epigenetic investigation to answer whether TR1 differentiation from TFH is driven by epigenetic changes. Their evidence indicated that the downregulation of TFH-related genes during the TFH to TR1 transition depends on chromatin closure, while the upregulation of TR1-related genes does not depend on epigenetic changes. 

      Strengths: 

      (1) A significant advantage of their approach lies in its detailed and comprehensive assessment of epigenetics. Their analysis of epigenetics covers chromatin open regions, histone modifications, DNA methylation, and using both single-cell and bulk techniques to validate their findings. As for their results, observations from different epigenetic perspectives mutually supported each other, lending greater credibility to their conclusions. This study effectively demonstrates that (1) the TFH-to-TR1 differentiation process is associated with massive closure of OCRs, and (2) the TR1-poised epigenome of TFH cells is a key enabler of this transdifferentiation process. Considering the extensive changes in epigenetic patterns involved in other CD4+ T lineage commitment processes, the similarity between TFH and TR1 in their epigenetics is intriguing. 

      (2) They performed correlation analysis to answer the association between "pMHC-NP-induced epigenetic change" and "gene expression change in TR1". Also, they have made their raw data publicly available, providing a comprehensive epigenomic database of pMHC-NP-induced TR1 cells. This will serve as a valuable reference for future research. 

      We thank the reviewer for his/her constructive feedback and suggestions for improvement of the manuscript.

      Weaknesses: 

      (1) A major limitation is that this study heavily relies on a premise from the previous studies performed by the same group on pMHC-NP-induced T-cell responses. This significantly limits the relevance of their conclusion to a broader perspective. Specifically, differential OCRs between Tet+ and naïve T cells were limited to only 821, as compared to 10,919 differential OCRs between KLH-TFH and naïve T cells (Figure 2A), indicating that the precursors and T cell clonotypes that responded to pMHC-NP were extremely limited. This limitation should be clearly discussed in the Discussion section. 

      We agree that this study focuses on a very specific, previously unrecognized pathway discovered in mice treated with pMHCII-NPs. Despite this apparent narrow perspective, we now have evidence that this is a naturally occurring pathway that also develops in other contexts (i.e., in mice that have not been treated with pMHCII-NPs). Furthermore, this pathway affords a unique opportunity to further understand the transcriptional and epigenetic mechanisms underpinning T cell plasticity; the findings reported here can help guide/inform not only upcoming translational studies of pMHCII-NP therapy in humans, but also other research in this area. We will discuss the limitations and opportunities that this research provides more explicitly in a revised manuscript to provide a clearer context for the scope and applicability of our findings.

      We acknowledge that, in the bulk ATAC-seq studies, the differences in the number of OCRs found in tetramer+ cells or KLH-induced TFH cells vs. naïve T cells may be influenced by the intrinsic oligoclonality of the tetramer+ T cell pool arising in response to repeated pMHCII-NP challenge (Sole et al., 2023a). However, we note that scATAC-seq studies of the tetramer+ T cell pool found similar differences between the oligoclonal tetramer+ TFH subpool and its (also oligoclonal) tetramer+ TR1 counterparts (i.e., substantially higher number of OCRs in the former vs. the latter relative to naïve T cells). This will be clarified in a revised version of the manuscript.

      (2) This article uses peak calling to determine whether a region has histone modifications, claiming that the regions with histone modifications in TFH and TR1 are highly similar. However, they did not discuss the differences in histone modification intensities measured by ChIP-seq. For example, as shown in Figure 6C, IL10 H3K27ac modification in Tet+ cells showed significantly higher intensity than KLH-TFH, while in this article, it may be categorized as "possessing same histone modification region". This will strengthen their conclusions.

      We appreciate your suggestion to discuss differences in histone modification intensities as measured by ChIP-seq. However, we respectfully disagree with the reviewer’s interpretation of these data.

      Our study primarily focuses on the identification of epigenetic similarities and differences between pMHCII-NP-induced tetramer+ cells and KLH-induced TFH cells relative to naive T cells. The outcome of direct comparisons of histone deposition (ChIP-seq) between these cell types is summarized in the lower part of Figure 4B and detailed in Datasheet 5. Throughout this section, we report the number of differentially enriched regions, their overlap with OCRs shared between tetramer+ TFH and tetramer+ TR1 cells based on scATAC-seq data, and the associated genes. Clearly, most of the epigenetic modifications that TR1 cells inherit from TFH cells had already been acquired by TFH cells upon differentiation from naïve T cell precursors. 

      Regarding the specific point raised by the reviewer on differences in the intensity of the H3K27Ac peaks linked to Il10 in Figure 6C, we note that the genomic tracks shown are illustrative. However, thorough statistical analyses involving signal background for each condition and p-value adjustment did not support differential enrichment for H3K27Ac deposition around the Il10 gene between pMHCII-NP-induced tetramer+ T cells and KLH-induced TFH cells. 

      We acknowledge that peak calling alone does not account for intensity variations of histone modifications. However, our analysis includes both qualitative and quantitative assessments to ensure robust conclusions. We will edit the relevant sections of the manuscript to clarify these points and better communicate our methodology and findings to the readers.

      (3) Last, the key findings of this study are clear and convincing, but some results and figures are unnecessary and redundant. Some results are largely a mere confirmation of the relationship between histone marks and chromatin status. I propose to reduce the number of figures and text that are largely confirmatory. Overall, I feel this paper is too long for its current contents. 

      We understand this reviewer’s concern about the potential redundancy of some results and figures. The goal of including these analyses is to provide a comprehensive understanding of the intricate relationships between epigenetic features and transcriptomic differences. We believe that a detailed examination of these relationships is crucial for several reasons: (i) the breadth of the data allows for a thorough exploration of the relationships between histone marks, chromatin accessibility and transcriptional differences. This comprehensive approach helps ensure that our conclusions are robust and well-supported by the data; (ii) some of the results that may appear confirmatory are, in fact, important for validating and reinforcing the consistency of our findings across different contexts. These details intend to provide a nuanced understanding of the interactions between epigenetic features and gene expression; and (iii) by presenting a detailed analysis, we aim to offer a solid foundation for future research in this area. The extensive datasets that are presented in this paper will serve as a valuable resource for others in the field who may seek to build upon our findings.

      That said, we will carefully review the manuscript to identify and streamline any elements that may be overly redundant. We will consider consolidating figures and refining the text to ensure that the paper remains concise and focused while retaining the depth of analysis that we believe is essential.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study of human intelligence has been the focus of cognitive neuroscience research, and finding some objective behavioral or neural indicators of intelligence has been an ongoing problem for scientists for many years. Melnick et al, 2013 found for the first time that the phenomenon of spatial suppression in motion perception predicts an individual's IQ score. This is because IQ is likely associated with the ability to suppress irrelevant information. In this study, a high-resolution MRS approach was used to test this theory. In this paper, the phenomenon of spatial suppression in motion perception was found to be correlated with the visuo-spatial subtest of gF, while both variables were also correlated with the GABA concentration of MT+ in the human brain. In addition, there was no significant relationship with the excitatory transmitter Glu. At the same time, SI was also associated with MT+ and several frontal cortex FCs.

      Strengths:

      (1) 7T high-resolution MRS is used.

      (2) This study combines the behavioral tests, MRS, and fMRI.

      Weaknesses:

      (1) In the intro, it seems to me that the multiple-demand (MD) regions are the key in this study. However, I didn't see any results associated with the MD regions. Did I miss something?

      We thank the reviewer for pointing this out, and after careful consideration we agree. According to Melnick et al. (2013), motion surround suppression (SI) and the duration thresholds for small and large gratings, which reflect hMT+ function, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. This suggests that hMT+ has the potential to be a core of the MD system. However, because our results only address how “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated through the frontal cortex”, they are not yet sufficient to prove that hMT+ is a core node of the MD system. We have therefore adjusted the explanatory logic of the article. Briefly, we emphasize the de-redundancy role of hMT+ in visuo-spatial intelligence and the improvement of information processing efficiency, while weakening the significance of hMT+ in the MD system.

      (2) How was the sample size determined? Is it sufficient?

      We thank the reviewer for pointing this out. We used G*Power to determine the sample size. Melnick et al. (2013) reported a medium effect between SI and the Perceptual Reasoning index (r = 0.47). We used this r value as the expected correlation coefficient (ρ H1), with the power set at the commonly used threshold of 0.8 and the alpha error probability at 0.05. The required sample size was calculated to be 26, which gives our study reasonable power to yield valid statistical results. Furthermore, compared with earlier within-subject studies such as Schallmo et al. (2018), which used 22 datasets to examine GABA levels in MT+ and the early visual cortex (EVC), our study includes a sufficiently large dataset.
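
      As an illustrative cross-check of this kind of power calculation (a minimal sketch, not the G*Power output itself), the required sample size for detecting a correlation can be approximated with the Fisher z-transform. The function name `n_for_correlation` is hypothetical, and the response does not state whether a one- or two-tailed test was specified, so both are computed. Because G*Power uses an exact calculation for the bivariate normal model, its result can differ slightly from this approximation.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate N needed to detect a correlation r (Fisher z method)."""
    norm = NormalDist()
    # Critical value for the chosen alpha (assumption: tail choice is ours).
    z_alpha = norm.inv_cdf(1 - alpha / 2 if two_tailed else 1 - alpha)
    z_power = norm.inv_cdf(power)
    effect = atanh(r)  # Fisher z-transform of the expected correlation
    return ceil(((z_alpha + z_power) / effect) ** 2 + 3)


# r = 0.47, alpha = 0.05, power = 0.8, as quoted in the response above.
n_one_tailed = n_for_correlation(0.47, two_tailed=False)
n_two_tailed = n_for_correlation(0.47, two_tailed=True)
print(n_one_tailed, n_two_tailed)
```

      Under this approximation the one-tailed setting gives 27 participants and the two-tailed setting gives 34, so the tail choice matters when judging whether a given sample is adequate.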

      (3) In Schallmo et al. (eLife, 2018), there was no correlation between GABA concentration and SI. How can we justify the different results here?

      We thank the reviewer for pointing this out. Our study differs from Schallmo et al. (2018) in several respects:

      a. While the earlier study by Schallmo et al. (2018) employed 3T MRS, we utilize 7T MRS, enhancing our ability to detect and measure GABA with greater accuracy.

      b. Schallmo et al. (2018) used the bilateral hMT+ as the MRS measurement region, whereas we used the left hMT+. The reasons for focusing on the left hMT+ are described in our response to Reviewer #1, point (6). Briefly, the use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      c. The MRS sequence in Schallmo et al. (2018) used a 3 cm isotropic voxel, whereas we used a 2 cm isotropic voxel. This helps us locate hMT+ more precisely and exclude more white matter signal.
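
      To make the size difference in point c concrete, the short sketch below computes the volumes implied by the two isotropic edge lengths quoted above. The variable names are illustrative only.

```python
# Edge lengths (cm) of the isotropic MRS voxels quoted in point c.
edge_3cm = 3.0
edge_2cm = 2.0

# An isotropic voxel is a cube, so its volume is the edge length cubed.
volume_3cm = edge_3cm ** 3
volume_2cm = edge_2cm ** 3
ratio = volume_3cm / volume_2cm

print(f"{volume_3cm:.0f} cm^3 vs {volume_2cm:.0f} cm^3 (ratio {ratio:.3f})")
```

      The 2 cm voxel therefore samples 8 cm^3 rather than 27 cm^3, roughly a 3.4-fold reduction in volume.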

      (4) Basically this study contains the data of SI, BDT, GABA in MT+ and V1, Glu in MT+ and V1-all 6 measurements. There should be 6x5/2 = 15 pairwise correlations. However, not all of these results are included in Figure 1 and supplementary 1-3. I understand that it is not necessary to include all figures. But I suggest reporting all values in one Table.

      We thank the reviewer for the good suggestion. We have added a correlation matrix reporting all values (Figure supplement 9).
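
      The count of 15 pairwise correlations raised in point (4) can be checked by enumerating the unordered pairs of the six measurements. This is only a sketch: the labels below are shorthand chosen for illustration, not variable names from the study.

```python
from itertools import combinations

# Shorthand labels (hypothetical) for the six measurements in point (4).
measures = ["SI", "BDT", "GABA_MT", "GABA_V1", "Glu_MT", "Glu_V1"]

# Every unordered pair of distinct measures: n * (n - 1) / 2 of them.
pairs = list(combinations(measures, 2))

for first, second in pairs:
    print(f"{first} vs {second}")
print(len(pairs))
```

      With six measures this yields 6 x 5 / 2 = 15 pairs, which is the number of off-diagonal cells in one triangle of the correlation matrix.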

      (5) In Melnick (2013), the IQ scores were measured by the full set of WAIS-III, including all subtests. However, this study only used the visual spatial domain of gF. I wonder why only the visuo-spatial subtest was used not the full WAIS-III?

      We thank the reviewer for pointing this out. The decision was informed by Melnick et al.’s (2013) findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well established that hMT+ is a sensory cortical region involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activity. Given this context, the Perceptual Reasoning index was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.

      (6) In the functional connectivity part, there is no explanation as to why only the left MT+ was set to the seed region. What is the problem with the right MT+?

      We thank the reviewer for pointing this out. The main reason is that our MRS ROI is the left hMT+, and we wanted the ROI to be consistent across the different analyses. The use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      (7) In Melnick (2013), the authors also reported the correlation between IQ and absolute duration thresholds of small and large stimuli. Please include these analyses as well.

      We thank the reviewer for the good advice. Including these results helps researchers compare our findings with those of Melnick et al. (2013). We have added the corresponding figures in the revised version (Figure 3f, g).

      Reviewer #2 (Public Review):

      Summary:

      Recent studies have identified specific regions within the occipito-temporal cortex as part of a broader fronto-parietal, domain-general, or "multiple-demand" (MD) network that mediates fluid intelligence (gF). According to the abstract, the authors aim to explore the mechanistic roles of these occipito-temporal regions by examining GABA/glutamate concentrations. However, the introduction presents a different rationale: investigating whether area MT+ specifically, could be a core component of the MD network.

      Strengths:

      The authors provide evidence that GABA concentrations in MT+ and its functional connectivity with frontal areas significantly correlate with visuo-spatial intelligence performance. Additionally, serial mediation analysis suggests that inhibitory mechanisms in MT+ contribute to individual differences in a specific subtest of the Wechsler Adult Intelligence Scale, which assesses visuo-spatial aspects of gF.

      Weaknesses:

      (1) While the findings are compelling and the analyses robust, the study's rationale and interpretations need strengthening. For instance, Assem et al. (2020) have previously defined the core and extended MD networks, identifying the occipito-temporal regions as TE1m and TE1p, which are located more rostrally than MT+. Area MT+ might overlap with brain regions identified previously in Fedorenko et al., 2013; however, those authors attribute these activations to attentional enhancement of visual representations in the more difficult conditions of their tasks. For the aforementioned reasons, it is unclear why the authors chose MT+ as their focus. A stronger rationale for this selection, and for how it fits with the core/extended MD networks, is necessary.

      We appreciate the reviewer’s comments. We focus on hMT+ for the following reasons. According to Melnick et al. (2013), motion surround suppression (SI) and the duration thresholds for small and large gratings, which reflect hMT+ function, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with high correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. In addition, in Fedorenko et al. (2013), the averaged MD activity region appears to overlap with hMT+. Based on these findings, we assumed that hMT+ has the potential to be a core of the MD system.

      (2) Moreover, although the study links MT+ inhibitory mechanisms to a visuo-spatial component of gF, this evidence alone may not suffice to position MT+ as a new core of the MD network. The MD network's definition typically encompasses a range of cognitive domains, including working memory, mathematics, language, and relational reasoning. Therefore, the claim that MT+ represents a new core of MD needs to be supported by more comprehensive evidence.

      We thank the reviewer for pointing this out, and after careful consideration we agree. Because our results only address visuo-spatial intelligence, they are not yet sufficient to prove that hMT+ is a core node of the MD system. We will adjust the explanatory logic of the article, emphasizing the de-redundancy role of hMT+ in visuo-spatial intelligence and the improvement of information processing efficiency, while weakening the significance of hMT+ in the MD system.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript aims to understand the role of GABA-ergic inhibition in the human MT+ region in predicting visuo-spatial intelligence through a combination of behavioral measures, fMRI (for functional connectivity measurement), and MRS (for GABA/glutamate concentration measurement). While this is a commendable goal, it becomes apparent that the authors lack fundamental understanding of vision, intelligence, or the relevant literature. As a result, the execution of the research is less coherent, dampening the enthusiasm of the review.

      Strengths:

      (1) Comprehensive Approach: The study adopts a multi-level approach, i.e., neurochemical analysis of GABA levels, functional connectivity, and behavioral measures to provide a holistic understanding of the relationship between GABA-ergic inhibition and visuo-spatial intelligence.

      (2) Sophisticated Techniques: The use of ultra-high field magnetic resonance spectroscopy (MRS) technology for measuring GABA and glutamate concentrations in the MT+ region is a recent development.

      Weaknesses:

      Study Design and Hypothesis

      (1) The central hypothesis of the manuscript posits that "3D visuo-spatial intelligence (the performance of BDT) might be predicted by the inhibitory and/or excitation mechanisms in MT+ and the integrative functions connecting MT+ with the frontal cortex." However, several issues arise:

      (1.1) The Suppression Index depicted in Figure 1a, labeled as the "behavior circle," appears irrelevant to the central hypothesis.

      We thank the reviewer for pointing this out. In our study, the inhibitory mechanisms in hMT+ are conceptualized through two models: the neurotransmitter model and the behavioral model. The Suppression Index is essential for elucidating the local inhibitory mechanisms within the behavioral model. However, we acknowledge that our initial presentation in the introduction may not have clearly articulated our hypothesis, potentially leading to misunderstandings. We have revised the introduction to better clarify these connections and ensure the relevance of the Suppression Index is comprehensively understood.

      (1.2) The construct of 3D visuo-spatial intelligence, operationalized as the performance in the Block Design task, is inconsistently treated as another behavioral task throughout the manuscript, leading to confusion.

      We thank the reviewer for pointing this out. We acknowledge that our manuscript may have inconsistently presented this construct across different sections, causing confusion. To address this, we have ensured a consistent description of 3D visuo-spatial intelligence in both the introduction and the discussion sections. However, we retained “Block Design task score” in the results section to make clear to readers which subtest we used.

      (1.3) The schematics in Figure 1a and Figure 6 appear too high-level to be falsifiable. It is suggested that the authors formulate specific and testable hypotheses and preregister them before data collection.

      We thank the reviewer for pointing this out. We have revised Figure 1a to make it less abstract and more logical. Figure 6 is a schematic of our theoretical framework of how hMT+ contributes to 3D visuo-spatial intelligence. We believe the elements within this framework are grounded in related theories and supported by evidence discussed in our results and discussion sections, making them specific and testable.

      (2) Central to the hypothesis and design of the manuscript is a misinterpretation of a prior study by Melnick et al. (2013). While the original study identified a strong correlation between WAIS (IQ) and the Suppression Index (SI), the current manuscript erroneously asserts a specific relationship between the block design test (from WAIS) and SI. It should be noted that in the original paper, WAIS comprises Similarities, Vocabulary, Block design, and Matrix reasoning tests in Study 1, while the complete WAIS is used in Study 2. Did the authors conduct other WAIS subtests other than the block design task?

      Thank you for pointing this out. Reviewer #1 raised the same question, and we reproduce our answer here: “The decision was informed by Melnick’s findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well established that hMT+ is a sensory cortical region involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activity. Given this context, the Perceptual Reasoning index was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.”

      (3) Additionally, there are numerous misleading references and unsubstantiated claims throughout the manuscript. As an example of misleading reference, "the human MT ... a key region in the multiple representations of sensory flows (including optic, tactile, and auditory flows) (Bedny et al., 2010; Ricciardi et al., 2007); this ideally suits it to be a new MD core." The two references in this sentence are claims about plasticity in the congenitally blind with sensory deprivation from birth, which is not really relevant to the proposal that hMT+ is a new MD core in healthy volunteers.

      Thank you for pointing this out. We have carefully read the corresponding references, considered the corresponding theories, and agree with these comments. Because our results only address how “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated by reverberation with frontal cortex”, they are not yet sufficient to prove that hMT+ is a core node of the MD system. We will adjust the explanatory logic of the article, emphasizing the de-redundancy role of hMT+ in visuo-spatial intelligence and the improvement of information processing efficiency, while weakening the significance of hMT+ in the MD system. In addition, regarding the potential central role of hMT+ in the MD system, we agree with your view that research on hMT+ as a multisensory integration hub mainly focuses on developmental processes. Meanwhile, in adults, the MST region of hMT+ is considered a multisensory integration area for visual and vestibular inputs, which potentially supports a role for hMT+ in multitasking multisensory systems (Gu et al., J. Neurosci., 26(1), 73–85, 2006; Fetsch et al., Nat. Neurosci., 15, 146–154, 2012). Further research could explore how other intelligence sub-abilities, such as working memory and language comprehension, are facilitated by the features of hMT+.

      Another example of unsubstantiated claim: the rationale for selecting V1 as the control region is based on the assertion that "it mediates the 2D rather than 3D visual domain (Born & Bradley, 2005)". That's not the point made in the Born & Bradley (2005) paper on MT. It's crucial to note that V1 is where the initial binocular convergence occurs in cortex, i.e., inputs from both the right and left eyes to generate a perception of depth.

      Thank you for pointing this out. We acknowledge the inappropriate citation of "Born & Bradley, 2005," which focuses solely on the structure and function of the visual area MT. However, we believe that choosing hMT+ as the domain for 3D visual analysis and V1 as the control region is justified. Cumming and DeAngelis (Annu. Rev. Neurosci., 24, 203–238, 2001) state that binocular disparity provides the visual system with information about the three-dimensional layout of the environment, and that the link between perception and neuronal activity is stronger in the extrastriate cortex (especially MT) than in the primary visual cortex. This supports our choice and emphasizes the relevance of hMT+ in our study. We have corrected this citation in the revised version.

      Results & Discussion

      (1) The missing correlation between SI and BDT is crucial to the rest of the analysis. The authors should discuss whether they replicated the pattern of results from Melnick et al. (2013) despite using only one WAIS subtest.

      We thank the reviewer for the suggestion. We have placed this correlation in the main text (Figure 3e).

      (2) ROIs: can the authors clarify if the results are based on bilateral MT+/V1 or just those in the left hemisphere? Can the authors plot the MRS scan area in V1? I would be surprised if it's precise to V1 and doesn't spread to V2/3 (which is fine to report as early visual cortex).

      We thank the reviewer for the suggestion. We have drawn the V1 ROI MRS scanning area (Figure supplement 1). Using the template, we checked the coverage of V1, V2, and V3. Although the MRS voxel extends into V2 (3%) and V3 (32%), most of the MRS scanning area lies in V1, with 65% overlap across subjects.

      (3) Did the authors examine V1 FC with either the frontal regions and/or whole brain, as a control analysis? If not, can the author justify why V1 serves as the control region only in the MRS but not in FC (Figure 4) or the mediation analysis (Figure 5)? That seems a little odd given that control analyses are needed to establish the specificity of the claim to MT+

      We thank the reviewer for the suggestion. We have analyzed the relationship between V1 FC and behavior as a control analysis (Figure supplement 7). Only positive correlations in the frontal area were detected, suggesting that in the 3D visuo-spatial intelligence task, V1 plays a role in feedforward information processing. In contrast, hMT+, which showed specific negative correlations in the frontal cortex, is involved in the inhibition mechanism. These results further emphasize the de-redundancy function of hMT+ in 3D visuo-spatial intelligence.

      Regarding the mediation analysis, because GABA/Glu concentration in V1 is not correlated with BDT score, a mediation analysis for V1 is not warranted.

      (4) It is not clear how to interpret the similarity or difference between panels a and b in Figure 4.

      We thank the reviewer for pointing this out. We have further interpreted the difference between panels a and b in the revised version. Panel a shows the FC between hMT+ and other regions that correlates with BDT score, which clearly involves the frontal cortex, whereas panel b shows the FC between hMT+ and other regions that correlates with SI, which involves relatively fewer regions. The overlapping region is what we are interested in, as it explains how local inhibitory mechanisms work in 3D visuo-spatial intelligence. In addition, we have revised Figure 4 to mark the overlapping region.

      (5) SI is not relevant to the authors’ a priori hypothesis, but is included in several mediation analyses. Can the authors do model comparisons between the ones in Figure 5c, d, and Figure S6? In other words, is SI necessary in the mediation model? There seem to be discrepancies between the necessity of SI in Figures 5c/S6 vs. Figure 5d.

      We thank the reviewer for highlighting this point. The relationship between the Suppression Index (SI) and our a priori hypotheses is elaborated in the response to reviewer 3, section (1). SI plays a crucial role in explicating how local inhibitory mechanisms, on the psychological level, function within the context of the 3D visuo-spatial task. Additionally, Figure 5c illustrates the interaction between the frontal cortex and hMT+, showing how the effects from the frontal cortex (BA46) on the Block Design Task are fully mediated by SI. This further underscores the significance of SI in our model.

      (6) The sudden appearance of "efficient information" in Figure 6, referring to the neural efficiency hypothesis, raises concerns. Efficient visual information processing occurs throughout the visual cortex, starting from V1. Thus, it appears somewhat selective to apply the neural efficiency hypothesis to MT+ in this context.

      We thank the reviewer for highlighting this point. There is no doubt that V1 is involved in efficient visual information processing. However, in our results, V1 GABA has no significant correlation with BDT score, suggesting that efficient processing in V1 might not sufficiently account for individual differences in 3D visuo-spatial intelligence. Additionally, we will clarify our use of the neural efficiency hypothesis by incorporating it into the introduction of our paper to better frame our argument.

      Transparency Issues:

      (1) Don't think it's acceptable to make the claim that "All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary information". It is the results or visualizations of data analysis, rather than the raw data themselves, that are presented in the paper/supp info.

      We thank the reviewer for pointing this out. We realize that this statement could lead to confusion, and we have deleted it.

      (2) No GitHub link has been provided in the manuscript to access the source data, which limits the reproducibility and transparency of the study.

      We thank the reviewer for pointing this out. We have attached the GitHub link in the revised version.

      Minor:

      "Locates" should be replaced with "located" throughout the paper. For example: "To investigate this issue, this study selects the human MT complex (hMT+), a region located at the occipito-temporal border, which represents multiple sensory flows, as the target brain area."

      We thank the reviewer for pointing this out. We have revised it.

      Use "hMT+" instead of "MT+" to be consistent with the term in the literature.

      We thank the reviewer for pointing this out. We agree and now use hMT+, consistent with the literature.

      "Green circle" in Figure 1 should be corrected to match its actual color.

      We thank the reviewer for pointing this out. We have revised it.

      The abbreviation for the Wechsler Adult Intelligence Scale should be "WAIS," not "WASI."

      We thank the reviewer for pointing this out. We have revised it.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The figures and tables should be substantially improved.

      We thank the reviewer for pointing this out. We have improved some of the figures’ quality.

      (2) Please explain the sample size, and the difference between Schallmo eLife 2018, and Melnick, 2013.

      We thank the reviewer for pointing this out. These questions are answered in the public review. We reproduce those answers here.

      (2.1)  How was the sample size determined? Is it sufficient?

      Thank you to the reviewer for pointing this out. We used G*Power to determine our sample size. Melnick et al. (2013) reported a medium effect between SI and the Perceptual Reasoning sub-ability (r=0.47). We used this r value as the correlation coefficient (ρ H1), setting the power at the commonly used threshold of 0.8 and the alpha error probability at 0.05. The required sample size was calculated to be 26. This ensures that our study has adequate power to yield valid statistical results. Furthermore, compared to earlier within-subject studies such as Schallmo et al. (2018), which used 22 subjects to examine GABA levels in MT+ and the early visual cortex (EVC), our study includes a sufficiently large dataset.
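
      The power calculation described above can be approximated by hand. The following is a minimal sketch, not the G*Power routine itself: it assumes the textbook Fisher z approximation for a test of a correlation against zero, and the function name `required_n_for_correlation` is hypothetical. Because the response does not state whether the test was one- or two-tailed, both are computed. The approximation is not expected to match G*Power's exact bivariate-normal calculation to the subject.

```python
import math
from statistics import NormalDist


def required_n_for_correlation(r, alpha=0.05, power=0.80, tails=2):
    """Approximate sample size needed to detect a correlation r against zero.

    Uses the Fisher z approximation (an assumption of this sketch):
        n = ((z_alpha + z_beta) / atanh(r))**2 + 3
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)               # quantile for the target power
    fisher_z = math.atanh(abs(r))                      # effect size on the Fisher z scale
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)


# Inputs taken from the response: r = 0.47, power = 0.8, alpha = 0.05.
n_one_tailed = required_n_for_correlation(0.47, tails=1)
n_two_tailed = required_n_for_correlation(0.47, tails=2)
print("one-tailed:", n_one_tailed, "two-tailed:", n_two_tailed)
```

      Comparing the two outputs with the reported requirement shows how strongly the answer depends on the choice of tails, which is worth stating alongside any reported sample size.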

      (2.2)  In Schallmo et al. (eLife, 2018), there was no correlation between GABA concentration and SI. How can we justify the different results here?

      Thank you to the reviewer for pointing this out. There are several differences between the two studies, ours and theirs:

      a. While the earlier study by Schallmo et al. (2018) employed 3T MRS, we utilize 7T MRS, enhancing our ability to detect and measure GABA with greater accuracy.

      b. Schallmo et al. (2018) chose the bilateral hMT+ as the MRS measurement region, while we used the left hMT+. The reasons why we focused on the left hMT+ are described in the response to reviewer 1, point (6). Briefly, the use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      c. The MRS sequence in Schallmo et al. (2018) used a 3 cm isotropic voxel, while we used a 2 cm isotropic voxel. This helps us locate hMT+ more precisely and exclude more white matter signal.

      (3) Table 1 and Table Supplementary 1-3 contain many correlation results. But what are the main points of these values? Which values do the authors want to highlight? Why are only p-values shown with significance symbols in Table Supplementary 2?

      (3.1) what are the main points of these values?

      Thank you to the reviewer for pointing this out. These correlations represent the relationship between the behavioral tasks (SI/BDT) and resting-state functional connectivity. They indicate that left hMT+ is involved in the efficient information integration network when it comes to the BDT task. In addition, surround suppression in left hMT+ is related to several hMT+-frontal connections. Furthermore, the overlapping regions between the two tasks indicate a shared underlying mechanism.

      (3.2) Which values do the authors want to highlight?

      Table 1 and Table Supplementary 1-3 present the preliminary analysis results for Table 2 and Table Supplementary 4-6, so we report all values. In Table 2 and Table Supplementary 4-6, by contrast, we highlight in bold the significant correlations that survived multiple-comparison correction.

      (3.3) Why are only p-values shown with significance symbols in Table Supplementary 2?

      Thank you for pointing this out; it was a mistake. We have revised the table and deleted the significance symbols.

      (4) Line 27, it is unclear to me what is "the canonical theory".

      We thank the reviewer for pointing this out. We have revised “the canonical theory” to “the prevailing opinion”.

      (5) Throughout the paper, the authors use "MT+", I would suggest using "hMT+" to indicate the human MT complex, and to be consistent with the human fMRI literature.

      We thank the reviewer for pointing this out. We have revised them and used "hMT+" to be consistent with the human fMRI literature.

      (6) At the beginning of the results section, I suggest including the total number of subjects. It is confusing what "31/36 in MT+, and 28/36 in V1" means.

      We thank the reviewer for pointing this out. We have included the total number of subjects at the beginning of the Results section.

      (7) Line 138, "This finding supports the hypothesis that motion perception is associated with neural activity in MT+ area". This sentence is strange because it is a well-established finding in numerous human fMRI papers. I think the authors should be more specific about what this finding implies.

      We thank the reviewer for pointing this out. We have deleted the inappropriate sentence "This finding supports the hypothesis that motion perception is associated with neural activity in MT+ area".

      (8) There are no unit labels for all x- and y-axes in Figure 1. I only see the unit for Conc is mmol per kg wet weight.

      We thank the reviewer for pointing this out. Figure 1 is a schematic and workflow chart, so labels for x- and y-axes are not needed. We believe this comment might pertain to Figure 3. In Figures 3a and 3b, the MRS spectrum does not have a standard y-axis unit because it varies with the individual physical conditions of the scanner; it is widely accepted that no y-axis unit is used. The x-axis unit is ppm, which indicates the chemical shift of different metabolites. In Figure 3c, the BDT represents IQ scores, which do not have a standard unit. Similarly, in Figures 3d and 3e, the Suppression Index does not have a standard unit.

      (9) Although the correlations are not significant in Figure Supplement 2&3, please also include the correlation line, 95% confidence interval, and report the r values and p values (i.e., similar format as in Figure 1C).

      We thank the reviewer for pointing this out. We have revised them.

      (10) There is no need to separate different correlation figures into Figure Supplementary 1-4. They can be combined into the same figure.

      We thank the reviewer for the suggestion. However, each correlation figure in the supplementary figures has its own specific topic and conclusion. The correlation figures in Supplementary Figure 1 indicate that GABA in V1 does not show any correlation with BDT and SI, illustrating that inhibition in V1 is unrelated to both 3D visuo-spatial intelligence and motion suppression processing. The correlations in Supplementary Figure 2 indicate that the excitation mechanism, represented by Glutamate concentration, does not contribute to 3D visuo-spatial intelligence in either hMT+ or V1. Supplementary Figure 3 validates our MRS measurements. Supplementary Figure 4 addresses potential concerns regarding the impact of outliers on correlation significance. Even after excluding two “outliers” from Figures 3d and 3e, the correlation results remain stable.

      (11) Line 213, as far as I know, the study (Melnick et al., 2013) is a psychophysical study and did not provide evidence that the spatial suppression effect is associated with MT+.

      We thank the reviewer for pointing this out. It was a mistake to use this reference, and we have revised it accordingly.

      (12) At the beginning of the results, I suggest providing more details about the motion discrimination tasks and the measurement of the BDT.

      We thank the reviewer for pointing this out. We have included a brief description of the tasks at the beginning of the Results section.

      (13) Please include the absolute duration thresholds of the small and large sizes of all subjects in Figure 1.

      We thank the reviewer for the suggestion. We have included these results in Figure 3.

      (14) Figure 5 is too small. The items in plots a and b are barely visible.

      We thank the reviewer for pointing this out. We have increased the size and resolution of Figure 5.

      Reviewer #2 (Recommendations For The Authors):

      Recommendations for improving the writing and presentation.

      I highly recommend editing the manuscript for readability and the use of the English language. I had significant difficulties following the rationale of the research due to issues with the way language was used.

      We thank the reviewer for pointing this out. We apologize for any shortcomings in our initial presentation. We have invited a native English speaker to revise our manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:  

      Reviewer #1 (Public Review):  

      Summary:  

      Heer and Sheffield used 2 photon imaging to dissect the functional contributions of convergent dopamine and noradrenaline inputs to the dorsal hippocampus CA1 in head-restrained mice running down a virtual linear path. Mice were trained to collect water rewards at the end of the track and on test days, calcium activity was recorded from dopamine (DA) axons originating in the ventral tegmental area (VTA, n=7) and noradrenaline axons from the locus coeruleus (LC, n=87) under several conditions. When mice ran laps in a familiar environment, VTA DA axons exhibited ramping activity along the track that correlated with distance to reward and velocity to some extent, while LC input activity remained constant across the track, but correlated invariantly with velocity and time to motion onset. A subset of recordings taken when the reward was removed showed diminished ramping activity in VTA DA axons, but no changes in the LC axons, confirming that DA axon activity is locked to reward availability. When mice were subsequently introduced to a new environment, the ramping to reward activity in the DA axons disappeared, while LC axons showed a dramatic increase in activity lasting 90 s (6 laps) following the environment switch. In the final analysis, the authors sought to disentangle LC axon activity induced by novelty vs. behavioral changes induced by novelty by removing periods in which animals were immobile and established that the activity observed in the first 2 laps reflected novelty-induced signal in LC axons.  

      Strengths:  

      The results presented in this manuscript provide insights into the specific contributions of catecholaminergic input to the dorsal hippocampus CA1 during spatial navigation in a rewarded virtual environment, offering a detailed analysis of the resolution of single axons. The data analysis is thorough and possible confounding variables and data interpretation are carefully considered.  

      Weaknesses:  

      Aspects of the methodology, data analysis, and interpretation diminish the overall significance of the findings, as detailed below.  

      The LC axonal recordings are well-powered, but the DA axonal recordings are severely underpowered, with recordings taken from a mere 7 axons (compared to 87 LC axons).

      Additionally, 2 different calcium indicators with differential kinetics and sensitivity to calcium changes (GCaMP6S and GCaMP7b) were used (n=3, n=4 respectively) and the data pooled. This makes it very challenging to draw any valid conclusions from the data, particularly in the novelty experiment. The surprising lack of novelty-induced DA axon activity may be a false negative. Indeed, at least 1 axon (axon 2) appears to be showing a novelty-induced rise in activity in Figure 3C. Changes in activity in 4/7 axons are also referred to as a 'majority' occurrence in the manuscript, which again is not an accurate representation of the observed data.  

      We appreciate the reviewer's detailed feedback regarding the analysis of VTA axons in our dataset. The relatively low sample size for VTA axons is due to their sparsity in the dCA1 region of the hippocampus and the inherent difficulty in recording from these axons. VTA axons are challenging to capture due to their low baseline fluorescence and long-range axon segments, resulting in a typical yield of only a single axon per field of view (FOV) per animal. In contrast, LC axons are more abundant in dCA1.

      To address the disparity in sample sizes between LC and VTA axons, we down-sampled the LC axons to match the number of VTA axons, repeating this process 1000 times to create a distribution. However, we acknowledge the reviewer's concern that the relatively low sample size for VTA axons might result in insufficient sampling of this population. Increasing the baseline expression of GCaMP to record from VTA axons requires several months, limiting our ability to quickly expand the sample size.
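
      The down-sampling procedure described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' analysis code: the function names, the per-session grouping of axon values, and the choice of the mean as the summary statistic are all hypothetical. It draws at most one LC axon per imaging session, matches the VTA sample size, and repeats the draw 1000 times to build a reference distribution.

```python
import random
from statistics import mean


def downsample_means(lc_by_session, n_vta, n_iter=1000, seed=0):
    """Build a distribution of mean values from down-sampled LC axons.

    lc_by_session maps a session label to the per-axon values recorded in
    that session (hypothetical structure). Each iteration picks n_vta
    sessions without replacement and one axon from each chosen session.
    """
    rng = random.Random(seed)
    sessions = sorted(lc_by_session)
    if n_vta > len(sessions):
        raise ValueError("cannot draw more axons than there are sessions")
    means = []
    for _ in range(n_iter):
        chosen = rng.sample(sessions, n_vta)                      # sessions for this draw
        picks = [rng.choice(lc_by_session[s]) for s in chosen]    # one axon per session
        means.append(mean(picks))
    return means


def fraction_at_or_above(distribution, observed):
    """Fraction of down-sampled values that are at least as large as observed."""
    return sum(m >= observed for m in distribution) / len(distribution)
```

      Given an observed VTA summary value, `fraction_at_or_above` reports where it falls within the down-sampled LC distribution; any threshold applied to that fraction would be a further analysis choice not specified here.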

      In response to the reviewer's comments, we have added recordings from 2 additional VTA axons, increasing the sample size from 7 to 9. We re-analyzed all data from the familiar environment with n=9 VTA axons, comparing them to down-sampled LC axons as previously described. However, the additional axons were not recorded in the novel environment. We agree with the reviewer that the lack of novelty-induced DA axon activity may be a false negative. To address this, we have revised the description of our results to include the following sentence:

      “However, 1 VTA ROI showed an increase in activity immediately following exposure to novelty, indicating heterogeneity across VTA axons in CA1, and the lack of a novelty signal on average may be due to a small sample size.”

      Regarding the use of two different GCaMP constructs, we understand the reviewer's concern. We used GCaMP6s and GCaMP7b variants to determine if one would improve the success rate of recording from VTA axons. Given the long duration of these experiments and the low yield, we pooled the data from both GCaMP variants to increase statistical power. However, we recognize the importance of verifying that there are no differences in the signals recorded with these variants.

      With the addition of 2 VTA DA axons expressing GCaMP6s, we now have n=5 GCaMP6s and n=4 GCaMP7b VTA DA axons. This allowed us to compare the activity of the two sensors in the familiar environment. As shown in new Supplementary Figure 2, both sets of axons responded similarly to the variables measured: position in VR, time to motion onset, and animal velocity (although the GCaMP6s expressing axons showed stronger correlations). Since all LC axons recorded expressed GCaMP6s, we also specifically compared VTA GCaMP6s axons to LC GCaMP6s axons (Supp Fig. 3). Our conclusions remained consistent when comparing this subset of VTA axons to LC axons.

      Overall, our paper now includes comparisons of combined VTA axons (n=9) and separately the GCaMP6s-expressing VTA axons (n=5) with LC axons. Both datasets support our initial conclusions that VTA axons signal proximity to reward, while LC axons encode velocity and motion initiation in familiar environments.

      The authors conducted analysis on recording data exclusively from periods of running in the novelty experiment to isolate the effects of novelty from novelty-induced changes in behavior. However, if the goal is to distinguish between changes in locus coeruleus (LC) axon activity induced by novelty and those induced by motion, analyzing LC axon activity during periods of immobility would enhance the robustness of the results.  

      We appreciate the reviewer's insightful suggestion to analyze LC axon activity during periods of immobility to distinguish between changes induced by novelty and those induced by motion. This additional analysis would indeed strengthen our conclusions regarding the LC novelty signal.

      In response to this suggestion, we performed the same analysis as before, but focused on periods of immobility. Our findings indicate that following exposure to novelty, there was a significant increase in LC activity specifically during immobility. This supports the idea that LC axons produce a novelty signal that is independent of novelty-induced behavioral changes. The results of this analysis are now presented in new Supplementary Figure 5b.

      The authors attribute the ramping activity of the DA axons to the encoding of the animals' position relative to reward. However, given the extensive data implicating the dorsal CA1 in timing, and the remarkable periodicity of the behavior, the fact that DA axons could be signalling temporal information should be considered.  

      This is an insightful comment regarding the potential role of VTA DA axons in signaling temporal information. We agree that VTA DA axons could indeed be encoding temporal information, as previous work from our lab has shown that these axons exhibit ramping activity when averaged by time to reward (Krishnan et al., 2022).

      To address this, we have now examined DA axon activity relative to time to reward, as shown in new Supplementary Figure 4. Our analysis confirms that these axons ramp up in activity relative to time to reward. Given the periodicity of our mice's behavior in these experiments, as the reviewer correctly points out, we are unable to distinguish between spatial proximity to reward and time to reward. We have added a sentence to our paper highlighting this limitation and stating that further experiments are necessary to differentiate these two variables.

      Krishnan, L.S., Heer, C., Cherian, C., Sheffield, M.E. Reward expectation extinction restructures and degrades CA1 spatial maps through loss of a dopaminergic reward proximity signal. Nat Commun 13, 6662 (2022).

      The authors should explain and justify the use of a longer linear track (3m, as opposed to 2m in the DAT-cre mice) in the LC axon recording experiments.  

      We appreciate the reviewer's insightful comment regarding the use of a longer linear track (3m, as opposed to 2m in the DAT-cre mice) in the LC axon recording experiments. The choice of a 3m track for LC axon recordings was made to align with a previous experiment from our lab (Dong et al., 2021), in which mice were exposed to a novel 3m track while CA1 pyramidal cell populations were recorded. In that study, we detailed the time course of place field formation within the novel track. Our current hypothesis is that LC axons signal novelty, and we aimed to investigate whether the time course of LC axon activity aligns with the time course of place field formation. This hypothesis, and the potential role of LC axons in facilitating plasticity for new place field formation, is further discussed in the Discussion section of our paper.

      For the VTA axon recordings, we utilized a 2m track, consistent with another recent study from our lab (Krishnan et al., 2022), where reward expectation was manipulated, and CA1 pyramidal cell populations were recorded. By matching the track length to this prior study, we aimed to explore how VTA dopaminergic inputs to CA1 might influence CA1 population dynamics along the track under conditions of varying reward expectations.

      We acknowledge that using different track lengths for LC and VTA recordings introduces a variable that could potentially confound direct comparisons. To address this, we normalized the track lengths for our LC versus VTA comparison analysis. This normalization allowed us to directly compare patterns of activity across the two types of axons by adjusting the data to a common scale, thereby ensuring that any observed differences or similarities are attributable to the intrinsic properties of the axons rather than differences in track lengths. By doing so, we could assess relative changes in activity levels at matched spatial bins.
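
      The track-length normalization described above can be illustrated with a short sketch. It assumes that position is expressed as a fraction of the track length and that activity is averaged within a fixed number of equal-width bins; the function name, the bin count, and the example values are hypothetical and not taken from the paper.

```python
import math


def bin_by_normalized_position(positions_m, activity, track_length_m, n_bins=10):
    """Average activity in bins of normalized track position (0 to 1).

    positions_m and activity are equal-length sequences of samples. Dividing
    position by the track length puts 2 m and 3 m tracks on a common scale.
    Bins that receive no samples are returned as NaN.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for pos, act in zip(positions_m, activity):
        frac = min(max(pos / track_length_m, 0.0), 1.0)  # clamp to the track
        idx = min(int(frac * n_bins), n_bins - 1)         # the track end goes in the last bin
        sums[idx] += act
        counts[idx] += 1
    return [s / c if c else math.nan for s, c in zip(sums, counts)]


# Hypothetical samples taken at the same fractional positions on each track.
profile_2m = bin_by_normalized_position([0.1, 0.5, 1.0, 1.9], [1, 2, 3, 4], 2.0, n_bins=4)
profile_3m = bin_by_normalized_position([0.15, 0.75, 1.5, 2.85], [1, 2, 3, 4], 3.0, n_bins=4)
```

      After binning, the two profiles have the same length and can be compared bin by bin, although a normalized bin still spans a different physical distance on each track.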

      Although the experiences of the animals on the different track lengths are not identical, our observations suggest that LC and VTA axon signals are not majorly influenced by variations in track length. LC axons are associated with velocity and a pre-motion initiation signal, neither of which is affected by track length. VTA axons, which also correlate with velocity, can be compared to LC axon velocity signals because mice reach maximal velocity very quickly along the track, well before the end of the 2m track. The range of velocities is therefore captured on both track lengths. While VTA axons exhibit ramping activity as they approach the reward zone, a signal potentially modulated by track length, LC axons do not show such ramping-to-reward signals. Thus, a comparison across different track lengths is justified for this aspect of our analysis.

      To further enhance the rigor of our comparisons between axon dynamics recorded on 2m and 3m tracks, we conducted an additional analysis plotting axon activity by time to reward and actual (un-normalized) distance from reward (Supplementary Figure 4). This analysis revealed very similar signals between the two sets of axons, supporting our initial conclusions.

      We thank the reviewer for raising this important point and hope that our detailed explanation and additional analysis address their concern.

      Krishnan, L.S., Heer, C., Cherian, C., Sheffield, M.E. Reward expectation extinction restructures and degrades CA1 spatial maps through loss of a dopaminergic reward proximity signal. Nat Commun 13, 6662 (2022).

      Dong, C., Madar, A. D. & Sheffield, M.E. Distinct place cell dynamics in CA1 and CA3 encode experience in new environments. Nat Commun 12, 2977 (2021).

      Reviewer #2 (Public Review):  

      Summary:  

      The authors used 2-photon Ca2+-imaging to study the activity of ventral tegmental area (VTA) and locus coeruleus (LC) axons in the CA1 region of the dorsal hippocampus in head-fixed male mice moving on linear paths in virtual reality (VR) environments.  

      The main findings were as follows:  

      - In a familiar environment, the activity of both VTA axons and LC axons increased with the mice's running speed on the Styrofoam wheel, with which they could move along a linear track through a VR environment.  

      - VTA, but not LC, axons showed marked reward position-related activity, showing a ramping-up of activity when mice approached a learned reward position.  

      - In contrast, the activity of LC axons ramped up before the initiation of movement on the Styrofoam wheel.  

      - In addition, exposure to a novel VR environment increased LC axon activity, but not VTA axon activity.  

      Overall, the study shows that the activity of catecholaminergic axons from VTA and LC to dorsal hippocampal CA1 can partly reflect distinct environmental, behavioral, and cognitive factors. Whereas both VTA and LC activity reflected running speed, VTA, but not LC axon activity reflected the approach of a learned reward, and LC, but not VTA, axon activity reflected initiation of running and novelty of the VR environment.  

      I have no specific expertise with respect to 2-photon imaging, so cannot evaluate the validity of the specific methods used to collect and analyse 2-photon calcium imaging data of axonal activity.  

      Strengths:  

      (1) Using a state-of-the-art approach to record separately the activity of VTA and LC axons with high temporal resolution in awake mice moving through virtual environments, the authors provide convincing evidence that the activity of VTA and LC axons projecting to dorsal CA1 reflect partly distinct environmental, behavioral and cognitive factors.  

      (2) The study will help a) to interpret previous findings on how hippocampal dopamine and norepinephrine or selective manipulations of hippocampal LC or VTA inputs modulate behavior and b) to generate specific hypotheses on the impact of selective manipulations of hippocampal LC or VTA inputs on behavior.  

      Weaknesses:  

      (1) The findings are correlational and do not allow strong conclusions on how VTA or LC inputs to dorsal CA1 affect cognition and behavior. However, as indicated above under Strengths, the findings will aid the interpretation of previous findings and help to generate new hypotheses as to how VTA or LC inputs to dorsal CA1 affect distinct cognitive and behavioral functions.  

      (2) Some aspects of the methodology would benefit from clarification.  

      First, to help others to better scrutinize, evaluate, and potentially to reproduce the research, the authors may wish to check if their reporting follows the ARRIVE (Animal Research: Reporting of In Vivo Experiments) guidelines for the full and transparent reporting of research involving animals (https://arriveguidelines.org/). For example, I think it would be important to include a sample size justification (e.g., based on previous studies, considerations of statistical power, practical considerations, or a combination of these factors). The authors should also include the provenance of the mice. Moreover, although I am not an expert in 2-photon imaging, I think it would be useful to provide a clearer description of exclusion criteria for imaging data.

      We thank the reviewer for helping us formalize the scientific rigor of our study. There are ten ARRIVE Guidelines and we have addressed most of them in our study already. However, there is an opportunity to add detail. We have listed below all ten points and how we have addressed each one (and point out any new additions):

      (1) Experimental design - we go into great depth explaining the experimental set-up, how we used the autofluorescent blebs as imaging controls, how we controlled for different sample sizes between the two populations, and the statistical tests used for comparisons. We also carefully accounted for animal behavior when quantifying and describing axon dynamics both in the familiar and novel environments.

      (2) Sample size - we state both the number of ROIs and mice for each analysis. We have now also added the number of mice we observed specific types of activity in. 

      (3) Inclusion/exclusion criteria - The following has now been added to the Methods section: Out of the 36 NET-Cre mice injected, 15 were never recorded from, either because they failed to reach behavioral criteria or because of a lack of visible expression in axons. Out of the 54 DAT-Cre mice injected, imaging was never conducted in 36 of them for lack of expression or failing to reach behavioral criteria. Out of the remaining 21 NET-Cre mice, 5 were excluded for heat bubbles, z-drift, or bleaching, while 10 DAT-Cre mice were excluded for the same reasons. This was determined by visually assessing imaging sessions, followed by using the registration metrics output by suite2p. This registration metric conducted a PCA on the motion-corrected ROIs and plotted the first PC. If the PC drifted substantially, to the point where no activity was apparent, the video was excluded from analysis.

      (4) Randomization - Already included in the paper is a description of random downsampling of LC axons to make statistical comparisons with VTA axons. LC axons were selected pseudo-randomly (only one axon per imaging session) to match VTA sampling statistics. This randomization was repeated 1000 times and comparisons were made against this random distribution. 

      (5) Blinding-masking - no blinding/masking was conducted as no treatments were given that would require this. We will include this statement in the next version. 

      (6) Outcomes - We defined all outcomes measured, such as those related to animal behavior and axon signaling. 

      (7) Statistical methods - None of the reviewers had any issues regarding our description of statistical methods, which we described in great detail in this version of the paper. 

      (8) Experimental animals - We have now described that DAT-Cre mice were obtained through JAX labs, and NET-Cre mice were obtained from the Tonegawa lab (Wagatsuma et al. 2017). This was absent in the initial version of the paper.

      (9) Experimental procedure - Already listed in great detail in Methods section.

      (10) Results - Rigorously described in detail for behaviors and related axon dynamics.

      Wagatsuma, Akiko, Teruhiro Okuyama, Chen Sun, Lillian M. Smith, Kuniya Abe, and Susumu Tonegawa. “Locus Coeruleus Input to Hippocampal CA3 Drives Single-Trial Learning of a Novel Context.” Proceedings of the National Academy of Sciences 115, no. 2 (January 9, 2018): E310–16. https://doi.org/10.1073/pnas.1714082115.

      Second, why were different linear tracks used for studies of VTA and LC axon activity (from line 362)? Could this potentially contribute to the partly distinct activity correlates that were found for VTA and LC axons?  

      We thank the reviewer for pointing this out and giving us a chance to address it directly. A detailed response to this is written above for a similar comment from reviewer 1.

      Third, the authors seem to have used two different criteria for defining immobility. Immobility was defined as moving at <5 cm/s for the behavioral analysis in Figure 3a, but as <0.2 cm/s for the imaging data analysis in Figure 4 (see legends to these figures and also see Methods, from line 447, line 469, line 498)? I do not understand why, and it would be good if the authors explained this.  

      This is a typo leftover from before we converted velocity from rotational units of the treadmill to cm/s. This has now been corrected.

      (3) In the Results section (from line 182) the authors convincingly addressed the possibility that less time spent immobile in the novel environment may have contributed to the novelty-induced increase of LC axon activity in dorsal CA1 (Figure 4). In addition, initially (for the first 2-4 laps), the mice also ran more slowly in the novel environment (Figure 3aIII, top panel). Given that LC and VTA axon activity were both increasing with velocity (Figure 1F), reduced velocity in the novel environment may have reduced LC and VTA axon activity, but this possibility was not addressed. Reduced LC axon activity in the novel environment could have blunted the novelty-induced increase. More importantly, any potential novelty-induced increase in VTA axon activity could have been masked by decreases in VTA axon activity due to reduced velocity. The latter may help to explain the discrepancy between the present study and previous findings that VTA neuron firing was increased by novelty (see Discussion, from line 243). It may be useful for the authors to address these possibilities based on their data in the Results section, or to consider them in their Discussion.  

      We appreciate the reviewer's insightful comment regarding the potential impact of decreased velocity on novelty responses in LC and VTA axons. The decreased velocity in the novel environment could lead to a diminished novelty response in LC axons and could mask a subtle novelty signal in VTA axons. We have now included the following points in our discussion:

      “In addition, as noted above, on average we did observe a velocity associated signal in VTA axons. When mice were exposed to the novel environment their velocity initially decreased. This would be expected to reduce the average signal across the VTA axon population relative to the higher velocity in the familiar environment. It is possible that this decrease could somewhat mask a subtle novelty induced signal in VTA axons. Therefore, additional experiments should be conducted to investigate the heterogeneity of these axons and their activity under different experimental conditions during tightly controlled behavior.”

      “As discussed above, the slowing down of animal behavior in the novel environment could have decreased LC axon activity and reduced the magnitude of the novelty signal we detected during running. The novelty signal we report here may therefore be an underestimate of its magnitude under matched behavioral settings.”

      However, it is important to note that although VTA axons, on average, showed activity modulated by velocity in a familiar rewarded environment, this relationship was largely due to the activity of two VTA axons that were strongly modulated by velocity, indicating heterogeneity within the VTA axon population in dCA1. We have highlighted this point in the discussion. We also discuss that:

      “It is possible that some VTA DA inputs to dCA1 respond to novel environments, and the small number of axons recorded here are not representative of the whole population.”

      (4) Sensory properties of the water reward, which the mice may be able to detect, could account for reward-related activity of VTA axons (instead of an expectation of reward). Do the authors have evidence that this is not the case? Occasional probe trials, intermixed with rewarded trials, could be used to test for this possibility.  

      Mice receive their water reward through a water spout that is immobile and positioned directly in front of their mouth. Water delivery is triggered by a solenoid when the mice reach the end of the virtual track. Therefore, because the water spout is immobile and the water reward is not delivered until they reach the end of the track, there is nothing for the mice to detect during their run. We have added clarifications about the water spout to the Methods and Results sections, along with appropriate discussion points.

      Additionally, we note that the ramping activity of VTA axons is still present on the initial laps with no reward (Krishnan et al., 2022), indicating that this activity is not directly related to the presence or absence of water but is instead associated with the animal’s reward expectation.

      We thank the reviewer for raising this point and hope that these clarifications address their concern.

      Reviewer #3 (Public Review):  

      Summary:  

      Heer and Sheffield provide a well-written manuscript that clearly articulates the theoretical motivation to investigate specific catecholaminergic projections to dorsal CA1 of the hippocampus during a reward-based behavior. Using 2-photon calcium imaging in two groups of cre transgenic mice, the authors examine the activity of VTA-CA1 dopamine and LC-CA1 noradrenergic axons during reward seeking in a linear track virtual reality (VR) task. The authors provide a descriptive account of VTA and LC activities during walking, approach to reward, and environment change. Their results demonstrate LC-CA1 axons are activated by walking onset, modulated by walking velocity, and heighten their activity during environment change. In contrast, VTA-CA1 axons were most activated during the approach to reward locations. Together the authors provide a functional dissociation between these catecholamine projections to CA1. A major strength of their approach is the methodological rigor of 2-photon recording, data processing, and analysis approaches. These important systems neuroscience studies provide solid evidence that will contribute to the broader field of learning and memory. The conclusions of this manuscript are mostly well supported by the data, but some additional analysis and/or experiments may be required to fully support the authors' conclusions.  

      Weaknesses:  

      (1) During teleportation between familiar to novel environments the authors report a decrease in the freezing ratio when combining the mice in the two experimental groups (Figure 3aiii). A major conclusion from the manuscript is the difference in VTA and LC activity following environment change, given VTA and LC activity were recorded in separate groups of mice, did the authors observe a similar significant reduction in freezing ratio when analyzing the behavior in LC and VTA groups separately?  

      In response to the comment regarding the freezing ratios during teleportation between familiar and novel environments, we have analyzed the freezing ratios and lap velocities of DAT-Cre and NET-Cre mice separately (Fig. 3Aiii). Our analysis shows that the mean lap velocities of both groups overlap in the familiar environment and significantly decrease on the first lap of the novel environment (Fig. 3Aiii, top). For subsequent laps, the velocities in both groups are not statistically significantly different from the familiar environment lap velocities.

      Freezing ratios also show a statistically significant decrease on the first lap of the novel environment compared to the familiar environment in both groups (Fig. 3Aiii, bottom). In the NET-Cre mice, the freezing ratios remain statistically lower in subsequent laps, while in the DAT-Cre mice, the following laps show a similar trend but without statistical significance. This lack of statistical significance in the DAT-Cre mice is likely due to their already lower freezing ratios in the familiar environment. Overall, the data demonstrate similar behavioral responses in the two groups of mice during the switch from the familiar to the novel environment.

      (2) The authors satisfactorily apply control analyses to account for the unequal axon numbers recorded in the LC and VTA groups (e.g. Figure 1). However, given the heterogeneity of responses observed in Figures 3c, 4b and the relatively low number of VTA axons recorded (compared to LC), there are some possible limitations to the authors' conclusions. A conclusion that LC-CA1 axons, as a general principle, heighten their activity during novel environment presentation, would require this activity profile to be observed in some of the axons recorded in nearly all LC-CA1 mice.

      We agree with the reviewer’s point. To address this issue, when downsampling LC axons to compare to VTA axons, we matched the sampling statistics of the VTA axons/mice by only selecting one LC axon from each mouse to match the VTA dataset.
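
      The one-axon-per-mouse downsampling described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the `axon_to_mouse` mapping, and the `seed` parameter are all hypothetical, and it assumes each axon ROI is labeled with the mouse it was recorded from.

```python
import random
from collections import defaultdict


def sample_one_axon_per_mouse(axon_to_mouse, n_mice, seed=0):
    """Pick one axon at random from each of `n_mice` randomly chosen mice.

    axon_to_mouse: dict mapping an axon ID to the ID of the mouse it came from
                   (hypothetical data layout, assumed for illustration).
    n_mice:        number of mice to draw from, e.g. the number of mice in the
                   smaller (VTA) dataset.
    """
    rng = random.Random(seed)

    # Group axon IDs by mouse so that each mouse contributes at most one axon.
    by_mouse = defaultdict(list)
    for axon_id, mouse_id in axon_to_mouse.items():
        by_mouse[mouse_id].append(axon_id)

    if n_mice > len(by_mouse):
        raise ValueError("n_mice exceeds the number of mice with recorded axons")

    # Sort before sampling so the result depends only on the seed.
    chosen_mice = rng.sample(sorted(by_mouse), n_mice)
    return [rng.choice(sorted(by_mouse[m])) for m in chosen_mice]
```

      Repeating the draw with different seeds would give a distribution of downsampled LC results to set against the VTA data; whether and how often the draw was repeated is not stated in the response.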

      Additionally, we have now included the number of recording sessions and the number of mice in which we observed each type of activity. This information has been added to further clarify and support our conclusions.

      Additionally, if the general conclusion is that VTA-CA1 axons ramp activity during the approach to reward, it would be expected that this activity profile was recorded in the axons of nearly all VTA-CA1 mice. Can the authors include an analysis to demonstrate that each LC-CA1 mouse contained axons that were activated during novel environments and that each VTA-CA1 mouse contained axons that ramped during the approach to reward?  

      As above, we have now added the number of mice that had each activity type we report in the paper here.  

      (3) A primary claim is that LC axons projecting to CA1 become activated during novel VR environment presentation. However, the experimental design did not control for the presentation of a familiar environment. As I understand, the presentation order of environments was always familiar, then novel. For this reason, it is unknown whether LC axons are responding to novel environments or environmental change. Did the authors re-present the familiar environment after the novel environment while recording LC-CA1 activity?  

      While we did not vary the presentation order of familiar and novel environments, we recorded the activity of LC axons in some mice when exposed to a dark environment (no VR cues) prior to exposure to the familiar environment. Our analysis of this data demonstrates that LC axons are also active following abrupt exposure to the familiar environment.

      We have added a new figure showing this response (Supplementary Figure 5A) and expanded on our original discussion point that LC axon activity generally correlates with arousal, as this result also supports that interpretation.

      We thank the reviewer for highlighting this important consideration. It certainly helps with the interpretation regarding what LC axons generally encode.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):  

      In addition to what has been described in the public review, I have the following recommendations:  

      The sample size of DA axon recordings should be increased with the use of a single GCaMP for valid conclusions to be made about the lack of novelty-induced activity in these axons.  

      We have increased the n of VTA GCaMP6s axons in the familiar environment by including two axons that were recorded in the familiar rewarded condition. We have also conducted an analysis comparing GCaMP6s versus GCaMP7b, which is discussed in detail above.

      Regarding the concerns about valid conclusions of novelty-induced activity in VTA axons, we have added a comment in the discussion to tone down our conclusions regarding the lack of a novelty signal in the VTA axons. This valid concern is discussed in detail above.  

      The title is currently very generic, and non-informative. I recommend the use of more specific language in describing the type of behavior under investigation. It is not clear to the reviewer why 'learning' is included here.  

      Original title: “Distinct catecholaminergic pathways projecting to hippocampal CA1 transmit contrasting signals during behavior and learning”

      To make it more specific to the experiments conducted here, we have changed the title to this:

      New title: “Distinct catecholaminergic pathways projecting to hippocampal CA1 transmit contrasting signals during navigation in familiar and novel environments”

      Error noted in Figure 4C legend - remove reference to VTA ROIs.  

      The reference to VTA ROIs has been removed from the figure legend.

      Reviewer #2 (Recommendations For The Authors):  

      (1) The concluding sentence of the Abstract could be more specific: which distinct types of information are reflected/'signaled'/'encoded' by LC and VTA inputs to dorsal CA1?  

      The abstract has been adjusted accordingly. The new sentence is more specific: “These inputs encode unique information, with reward information in VTA inputs and novelty and kinematic information in LC inputs, likely contributing to differential modulation of hippocampal activity during behavior and learning.”

      (2) Line 46/47: The study by Mamad et al. (2017) did not quite show that VTA dopamine input to dorsal CA1 'drives place preference'. To my understanding, the study showed that suppression of VTA dopamine signaling in a specific place caused avoidance of this place and that VTA dopamine signaling modulated hippocampal place-related firing. So, please consider rephrasing.  

      Corrected, thanks for pointing this out.

      (3) Legend to Figure 3AIII: 'Each lap was compared to the first lap in F . . .' Could you clarify if 'F' refers to the 'familiar environment'?  

      The figure legend has been changed accordingly.

      (4) Line 176: '36 LC neurons' - should this not be '36 imaged axon terminals in dorsal CA1' or something along these lines?  

      This reference has been changed to “LC axon ROIs”.

      (5) Line 353: Why was water restriction started before the hippocampal window implant, if behavioral training to run for water reward only started after the implant? Please clarify.

      A sentence was added to the methods to explain that this was done to reduce bleeding and swelling during the hippocampal window implantation.  

      (6) Line 377: '. . . which took 10-14 days (although some mice never reached this threshold).' How many mice did not reach the criterion within 14 days? I think it is not accurate to say the mice 'never' reached the threshold, as they were only tested for a limited period of time.  

      We have added details of how many mice were excluded from each group and the reason why they were excluded.

      (7) Exclusion criteria for imaging data: The authors state (from line 402): 'Imaging sessions with large amounts of drift or bleaching were excluded from analysis (8 sessions for NET mice, 6 sessions for LC Mice).' What exactly were the quantitative exclusion criteria? Were these defined before the onset of the study or throughout the study?  

      Imaging sessions were first qualitatively assessed by looking for disappearance or movement of structures in the Z-plane throughout the imaging FOV. Additionally, following motion correction in suite2p, we used the registration metrics, which plot the first principal component (PC) of the motion-corrected images, to assess for drift, bleaching, or heat bubbles. If this variable increased or decreased greatly throughout a session, to the point where any apparent activity was not visible in the first PC, the dataset was excluded. We have added these exclusion criteria to the methods section.

      Reviewer #3 (Recommendations For The Authors):  

      Please provide a justification or rationale for having two different criteria for immobility (< 5cm/sec) and freezing (<0.2 cm/sec). If VTA and LC axon activities are different between these two velocities, please provide some commentary on this difference.  

      This is a typo leftover from before we converted velocity from rotational units to cm/s.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewing editor’s list of items remaining to be addressed followed by our responses/actions:

      (1) The order and organization of supplemental figures and tables is almost impossible to navigate. Please put them in order. 

      All the sections from the previous Supplementary files have been divided into individual Supplementary files so that each can be referenced without confusion from the text. All of the references in the body of the text and the author responses have been updated to reflect this change.

      (2) The question of sample sizes was partially addressed, with authors stating that cell culture work in iPSCs and PGCLCs was done in replicates of 3. Sertoli and granulosa cells were generated from pooled preps - how many individuals, were they littermates? 

      Sertoli and granulosa primary cultures were generated from littermates and each prep used 5 animals (males for Sertoli cells and females for granulosa cells). These changes have been added to the body of the text on pages 39 and 40.

      (3) Authors need to discuss the limitations of doing work in triplicates. Their PCA (Supplement Figure 9) reveals that in several cases samples from the same treatment were not discriminated by PC1 and/or PC2. This is especially true in e and f, the variance of which was explained by PC1 for cell type, but for which treatments showed poor discrimination by PC2. Some discussion of the limitations of sample size should be provided.

      Additional text has been added to what is now Supplementary file 15 to acknowledge this limitation imposed by the limited number of replicates (three) and the ability to resolve the differences in treatments by PCA in subplots e and f. However, we also note that the differences were sufficient to identify significant DMCs/DMRs/DEGs.

      Reviewer 2 also noted a potential weakness that “exposures are more complicated in a whole organism than in an isolated cell line.”

      We note that in our revised manuscript we included wording noting that despite the advantages of using an in vitro approach to deduce underlying molecular mechanisms, results of such in vitro studies “ultimately warrant validation of results discerned from studies of in vitro models to ensure they also reflect functions ongoing in the more complex and heterogeneous environment of the intact animal in vivo.” Thus we have endeavored to acknowledge the reviewer’s point.

      Reviewer #1 (Public Review): 

      Critiques/Comments: 

      (1) A problem with in vitro work is that homogeneous cell lines/cultures are, by nature, absent from the rest of the microenvironment. The authors need to discuss this. 

      [Addressed on pages: 24-25] – We have added two sentences to the second paragraph of the Discussion section in which we now acknowledge this concern, but also point out that in vitro models of this sort also provide an experimental advantage in that they facilitate a deconvolution of the extensive complexity resident within the intact animal. Nevertheless, we acknowledge that this deconvolution requires ultimate validation of findings obtained within an in vitro model system to ensure they accurately recapitulate functions that occur in the intact animal in vivo.

      In response to Reviewer 2’s stated weakness of our study that “The weakness includes the fact that exposures are more complicated in a whole organism than in an isolated cell line,” please note that this added text includes the statement that despite the advantages of using an in vitro approach to deduce underlying molecular mechanisms, results of such in vitro studies “ultimately warrant validation of results discerned from studies of in vitro models to ensure they also reflect functions ongoing in the more complex and heterogeneous environment of the intact animal in vivo.” Thus we have endeavored to acknowledge the reviewer’s point.

      (2) What are n's/replicates for each study? Were the same or different samples used to generate the data for RNA sequencing, methylation beadchip analysis, and EM-seq? This clarification is important because if the same cultures were used, this would allow comparisons and correlations within samples.  

      Addressed on pages: 39-45 and in new Supplementary file 15 – Additional text has been added in the Methods section to indicate that all samples involving cell culture models which include iPSCs and PGCLCs came from a single XY iPS cell line aliquoted into replicates and all primary cultures which included Sertoli and granulosa cells were generated from pooled tissue preps from mice and then aliquoted into replicates. Finally, all experiments in the study were performed on three replicates. Because this experimental design did indeed allow for comparisons among samples, we have added a new Supplementary file 15 which displays PCA plots showing clustering among control and treatment datasets, respectively, as well as distinctions between each cluster representing each experimental condition.

      (3) In Figure 1, it is interesting that the 50 uM BPS dose mainly resulted in hypermethylation whereas 100 uM appears to be mainly hypomethylation. (This is based on the subjective appearance of graphs). The authors should discuss and/or present these data more quantitatively. For example, what percentage of changes were hypo/hypermethylation for each treatment? How many DMRs did each dose induce? For the RNA-seq results, again, what were the number of up/down-regulated genes for each dose?  

      Addressed on pages: 6-7 and in new Supplementary files 1-3  – The experiment shown in Figure 1 was designed to 1) serve as proof of principle that cells maintained in culture could be susceptible to EDC-induced epimutagenesis at all, 2) determine if any response observed would be dose-dependent, and 3) identify a minimally effective dose of BPS to be used for the remaining experiments in this study (which we identified as 1 μM). We agree that it is interesting that the 50 µM dose of BPS induced predominantly hypermethylation changes whereas the 1 µM and 100 µM doses induced predominantly hypomethylation changes, but are not in a position to offer a mechanistic explanation for this outcome at this time. As the results shown satisfied our primary objectives of demonstrating that exposure of cells in culture to BPS could indeed induce DNA methylation epimutations, that this occurs in a dose-dependent manner, and that a dose of as low as 1 µM of BPS was sufficient to induce epimutagenesis, the data obtained satisfied all of the initial objectives of this experiment. That said, in response to the reviewer’s request we have now added text on pages 6-7 alluding to new Supplementary files 1-3 indicating the total number of DMCs and DMRs, as well as the number of DEGs, detected in response to exposure to each dose of BPS shown in Figure 1, as well as stratifying those results to indicate the numbers of hyper- and hypomethylation epimutations and up- and down-regulated DEGs induced in response to each dose of BPS. While, as noted above, investigating the mechanistic basis for the difference in responses induced by the 50 µM versus 1 and 100 µM doses of BPS was beyond the scope of the study presented in this manuscript, we do find this result reminiscent of the “U-shaped” response curves often observed in toxicology studies. Importantly, this result does demonstrate the elevated resolution and specificity of analysis facilitated by our in vitro cell culture model system.

      (4) Also in Figure 1, were there DMRs or genes in common across the doses? How did DMRs relate to gene expression results? This would be informative in verifying or refuting expectations that greater methylation is often associated with decreased gene expression.  

      Addressed on pages: 6-7 and new Supplementary files 1-6 – In general, we observed a coincidence between changes in DNA methylation and changes in gene expression (Supplementary files 1-3). Pertaining directly to the reviewer’s question about the extent to which we observed common DMRs and DEGs across all doses, while we only found 3 overlapping DMRs conserved across all doses tested, we did find an average of 51.25% overlap in DMCs and an average of 80.45% overlap in DEGs across iPSCs exposed to the different doses of BPS shown in Figure 1. In addition, within each dose of BPS tested in iPSCs, we also found that there was an overlap between DMCs and the promoters or gene bodies of many DEGs (Supplementary file 5). Specifically within gene promoters, we observed a correlation between hypermethylated DMCs and decreased gene expression and hypomethylated DMCs and increased gene expression, respectively (Supplementary file 6).

      (5) In Figure 2, was there an overlap in the hypo- and/or hyper-methylated DMCs? Please also add more description of the data in 2b to the legend including what the dot sizes/colors mean, etc. Some readers (including me) may not be familiar with this type of data presentation. Some of this comes up in Figure 4, so perhaps allude to this earlier on, or show these data earlier.  

      Addressed on pages: 8-9 and new Supplementary file 4 – Although we observed an average of 11.05% overlapping DMCs between different pairs of cell types, we did not observe any DMCs that were shared among all four cell types. Indeed, this limited overlap of DMCs among different cell types exposed to BPS was the primary motivation for the analysis described in Figure 2. Thus, rather than focusing solely on direct overlap between specific DMCs, we examined similarities among the different cell types tested in the occurrence of epimutations within different annotated genomic regions. To better describe this, we have now added additional text to page 9. We have also added more detail to the legend for Figure 2 on page 8 to more clearly explain the significance of the dot sizes and colors, explaining that the dot sizes are indicative of the relative number of differentially methylated probes that were detected within each specific annotated genomic region, and that the dot colors are indicative of the calculated enrichment score reflecting the relative abundance of epimutations occurring within a specific annotated genomic region. The relative score is calculated by iterating down the list of DMCs and increasing a running-sum statistic when encountering a DMC within the specific annotated genomic region of interest and decreasing the sum when the epimutation is not in that annotated region. The magnitude of the increment depends upon the relative occurrence of DMCs within a specific annotated genomic region.
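
      The running-sum enrichment score described above can be illustrated with a short sketch. It is not the authors' implementation: the function and argument names are hypothetical, and it assumes the simplest unweighted normalization (steps of 1/hits up and 1/misses down, as in a Kolmogorov-Smirnov-style statistic), whereas the response states only that the increment depends on the relative occurrence of DMCs within the region.

```python
def enrichment_score(ranked_dmcs, region_members):
    """Running-sum enrichment of one annotated region over a ranked DMC list.

    ranked_dmcs:    DMC identifiers in ranked order (hypothetical input).
    region_members: set of DMC identifiers that fall within the annotated
                    genomic region of interest.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hits = [dmc in region_members for dmc in ranked_dmcs]
    n_hits = sum(hits)
    n_misses = len(hits) - n_hits
    if n_hits == 0 or n_misses == 0:
        return 0.0  # no contrast between region and background

    # Assumed normalization: step sizes scale inversely with how many DMCs
    # fall inside versus outside the region, so the walk ends at zero.
    step_up = 1.0 / n_hits
    step_down = 1.0 / n_misses

    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += step_up if is_hit else -step_down
        if abs(running) > abs(best):
            best = running
    return best
```

      Under these assumptions, a positive score means DMCs in the region cluster toward the top of the ranked list and a negative score means they cluster toward the bottom.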

      (6) iPSCs were derived from male mice MEFs, and subsequently used to differentiate into PGCLCs. The only cell type from an XX female is the granulosa cells. This might be important, and should be mentioned and its potential significance discussed (briefly).  

      Addressed on page: 29 – We have added a new paragraph just before the final paragraph of the Discussion section in which we acknowledge that most of the cell types analyzed during our study were XY-bearing “male” cells and that the manner in which XX-bearing “female” cells might respond to similar exposures could differ from the responses we observed in XY cells. However, we also noted that our assessment of XX-bearing granulosa cells yielded results very similar to those seen in XY Sertoli cells suggesting that, at least for differentiated somatic cell types, there does not appear to be a significant sex-specific difference in response to exposure to a similar dose of the same EDC. That said, we also acknowledged that in cell types in which dosage compensation based on X-chromosome inactivation is not in place, differences between XY- and XX-bearing cells could accrue.

      (7) EREs are only one type of hormone response element. The authors make the point that other mechanisms of BPS action are independent of canonical endocrine signaling. Would authors please briefly speculate on the possibility that other endocrine pathways including those utilizing AREs or other HREs may play a role? In other words, it may not be endocrine signaling independent. The statement that the differences between PGCLCs and other cells are largely due to the absence of ERs is overly simplistic.  

      Addressed on page: 11 and in a new Supplementary file 8 – Previous reports have indicated that BPS does not have the capacity to bind with the androgen receptor (Pelch et al., 2019; Yang et al., 2024). However, there have been reports indicating that BPS can interact with other endocrine receptors including PPARγ and RXRα, which play a role in lipid accumulation and have the potential to be linked to obesity phenotypes (Gao et al., 2020; Sharma et al., 2018). To address the reviewer’s comment we assessed the expression of a panel of hormone receptors including PPARγ, RXRα, and AR in each of the cell types examined in our study and these results are now shown in a new Supplementary file 8. We show that in addition to not expressing either estrogen receptor (ERα or ERβ), germ cells also do not express any of the other endocrine receptors we tested including AR, PPARγ, and RXRα. Thus we now note that these results support our suggestion that the induction of epimutations we observed in germ cells in response to exposure to BPS appears to reflect disruption of non-canonical endocrine signaling. We also note that non-canonical endocrine signaling is well established (Brenker et al., 2018; Ozgyin et al., 2015; Song et al., 2011; Thomas and Dong, 2006). Thus we feel the suggestion that the effects of BPS exposure could conceivably reflect either disruption of canonical or non-canonical signaling in any cell type is well justified and that our data suggest that both of these effects appear to have accrued in the cells examined in our study as suggested in the text of our manuscript.

      (8) Interpretation of data from the GO analysis is similarly overly simplistic. The pathways identified and discussed (e.g. PI3K/AKT and ubiquitin-like protease pathways) are involved in numerous functions, both endocrine and non-endocrine. Also, are the data shown in Figure 6a from all 4 cell types? I am confused by the heatmap in 6c, which genes were significantly affected by treatment in which cell types?  

      Addressed on pages: 19-21 – Per the reviewer’s request, we have added text to indicate that Figure 6a is indeed data from all four cell types examined. We have also modified the text to further clarify that Figure 6c displays the expression of other G protein-coupled receptors which are expressed at similar, if not higher, levels than either ER in all cell types examined, and that these have been shown to have the potential to bind to either 17β-estradiol or BPA in rat models. As alluded to by the reviewer, this is indicative of a wide variety of distinct pathways and/or functions that can potentially be impacted by exposure to an EDC such as BPS. Thus, we have attempted to acknowledge the reviewer’s primary point that BPS may interact with a variety of receptors or other factors involved with a wide variety of different pathways and functions. Importantly, this illustrates the strength of our model system in that it can be used to identify potential impacted target pathways that can then be subsequently pursued further as deemed appropriate.

      (9) In Figure 7, what were the 138 genes? Any commonalities among them? 

      Addressed on page: 22 and in new Supplementary files 13 and 14 – We have now added a new supplemental Excel file (Supplementary file 13) that lists the 138 overlapping conserved DEGs that did not become reprogrammed/corrected during the transition from iPSCs to PGCLCs. In addition, we have added new text on page 22 and a new Supplementary file 14 which displays KEGG analysis of pathways associated with these 138 retained DEGs. We find that these genes are primarily involved with cell cycle and apoptosis pathways which, interestingly, have the potential to be linked to cancer development which is often linked to disruptions in chromatin architecture.

      (10) The Introduction is very long. The last paragraph, beginning line 105, is a long summary of results and interpretations that better fit in a Discussion section.

      Addressed on page: 6 – We have now significantly reduced the length and scope of the final paragraph of the Introduction per the reviewer’s recommendation.

      (11) Provide some details on husbandry: e.g. were they bred on-site? What food was given, and how was water treated? These questions are to get at efforts to minimize exposure to other chemicals.  

      Addressed on page: 37 – We have added text detailing that all mice used in the project were bred onsite, that water was non-autoclaved conventional RO water, and that the mice were fed 5V5R extruded feed, which is highly controlled for the presence of isoflavones and has been certified for use in estrogen-sensitive animal protocols.

      Reviewer #2 (Public Review): 

      Summary: 

      This manuscript uses cell lines representative of germ line cells, somatic cells, and pluripotent cells to address the question of how the endocrine-disrupting compound BPS affects these various cells with respect to gene expression and DNA methylation. They find a relationship between the presence of estrogen receptor gene expression and the number of DNA methylation and gene expression changes. Notably, PGCLCs do not express estrogen receptors and although they do have fewer changes, changes are nevertheless detected, suggesting a noncanonical pathway for BPS-induced perturbations. Additionally, there was a significant increase in the occurrence of BPS-induced epimutations near EREs in somatic and pluripotent cell types compared to germ cells. Epimutations in the somatic and pluripotent cell types were predominantly in enhancer regions whereas those in the germ cell type were predominantly in gene promoters. 

      Strengths: 

      The strengths of the paper include the use of various cell types to address the sensitivity of the lineages to BPS as well as the observed relationship between the presence of estrogen receptors and changes in gene expression and DNA methylation. 

      Weaknesses: 

      The weaknesses include the lack of reporting of replicates, superficial bioinformatic analysis, and the fact that exposures are more complicated in a whole organism than in an isolated cell line. 

      Recommendations for the authors: please note that you control which revisions to undertake from the public reviews and recommendations for the authors. 

      Reviewer #2 (Recommendations For The Authors): 

      Overall, this is an intriguing paper but more transparency in the replicates and methods and a more rigorous bioinformatic treatment of the data are required. 

      Specific comments: 

      (1) End of abstract "These results suggest a unique mechanism by which an EDC-induced epimutated state may be propagated transgenerationally following a single exposure to the causative EDC." This is overly speculative for an abstract. There is only epigenetic inheritance following mitosis or differentiation presented in this study. There is no meiosis and therefore no ability to assess multi- or transgenerational inheritance. 

      Addressed on page: 2 – We have modified the text at the end of the abstract to more precisely reflect our intended conclusions based on our data. In our view, the ability of induced epimutations to transcend meiosis per se is not as relevant to the mechanism of transgenerational inheritance as their ability to transcend major waves of epigenetic reprogramming that normally occur during development of the germ line. In this regard the transition from pluripotent iPSCs to germline PGCLCs has been shown to recapitulate at least the first portion of normal germline reprogramming, and now our data provide novel insight into the fate of induced epimutations during this process. Specifically, we show that a prevalence of epimutations was conserved during the iPSC → germ cell transition but that very few (< 5%) of the specific epimutations present in the BPS-exposed iPSCs were retained when those cells were induced to form PGCLCs. Rather, we observed apparent correction of a large majority of the initially induced epimutations during this transition, but this was accompanied by the apparent de novo generation of novel epimutations in the PGCLCs. We suggest, based on other recent reports in the literature, that this is a result of the BPS exposure inducing changes in the chromatin architecture in the exposed iPSCs such that when the normal germline reprogramming mechanism is imposed on this disrupted chromatin template there is both correction of many existing epimutations and the genesis of many novel epimutations. This observation has the potential to explain the long-standing question of why the prevalence of epimutations persists across multiple generations despite the occurrence of epigenetic reprogramming during each generation. Nevertheless, as noted above, we have modified the text at the end of the abstract to temper this interpretation given that it is still somewhat speculative at this point.

      (2) Doses used in the experiments. One needs to be careful when stating that the dose used is "below FDA's suggested safe environmental level established for BPA" because a different bisphenol is being used here (BPA vs BPS) and the safe level is that which the entire organism experiences. It is likely that cell lines experience a higher effective dose.  

      Addressed on pages: 3, 5, and 26 – We have now made a point of noting that our reference to an EPA-recommended “safe dose” of BPA was for humans and/or intact animals. Changes to this effect have been made in the second and sixth paragraphs of the Introduction section. In addition, we have added text at the end of the fourth paragraph of the Discussion section acknowledging that, as the reviewer suggests, the same dose of an EDC could exert greater effects on cells in a homogeneous culture than on the same cell type within an intact animal given the potential for mitigating metabolic effects in the latter. However, we also note that the ability we demonstrated to quantify the effects of such exposures on the basis of numbers of epimutations (DMCs or DMRs) induced could potentially be used in future studies to study this question by assessing the effects of a specific dose of a specific EDC on a specific cell type when exposed either within a homogeneous culture or within an intact animal.

      (3) Figure 1: In the dose response, what was the overlap in DMCs and DEGs among the 3 doses? Are the responses additive, synergistic, or completely non-overlapping? This is an important point that should be addressed. 

      Addressed on page: 6-7 and in Supplementary files 1-5 – Please see our response to Reviewer 1 critique #4 above where we address similar concerns. While we do find overlap among different cell types with respect to the DMCs, DMRs, and DEGs displayed in Figure 1, we found the effect to be only partially additive as opposed to synergistic in any apparent manner. The fold increase in DMCs, DMRs, and DEGs resulting from exposure to a dose of 50 μM versus 1 μM ranged from 2.5x to 4.4x, which was well below the 50x increase that would have been expected from a strictly additive effect, and the effect increased even less, if at all, in response to exposure to a dose of 100 μM versus 50 μM BPS. Finally, as now noted in the Discussion section on page 25, our conclusion is that these results display a limited dose-dependent effect that was partially additive but also plateaued at the highest doses tested.
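
The comparison between the observed and the expected fold increase can be sketched as a short calculation. This is an illustration only: the function names are hypothetical, "strictly additive" is read here as strict proportionality to dose, and the only values taken from the response are the two doses (1 μM and 50 μM) and the reported fold range (2.5x to 4.4x).

```python
def expected_fold_if_proportional(low_dose_um, high_dose_um):
    """Fold increase expected if the response scaled linearly with dose (assumption)."""
    return high_dose_um / low_dose_um


def fraction_of_proportional(observed_fold, low_dose_um, high_dose_um):
    """Observed fold increase as a fraction of the strictly proportional expectation."""
    return observed_fold / expected_fold_if_proportional(low_dose_um, high_dose_um)


if __name__ == "__main__":
    expected = expected_fold_if_proportional(1.0, 50.0)
    # 2.5x and 4.4x are the ends of the fold range quoted in the response.
    for observed in (2.5, 4.4):
        share = fraction_of_proportional(observed, 1.0, 50.0)
        print(f"observed {observed}x vs expected {expected:.0f}x "
              f"-> {share:.1%} of a proportional response")
```

Under this reading, even the largest reported fold increase is a small fraction of what a proportional response would give, which is consistent with the description of a limited, plateauing dose dependence.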

      (4) Methods: How many times was each exposure performed on a given cell type? This information should be in the figure legends and methods. In the case of multiple exposures for a given line, do the biological replicates agree? 

      Addressed on pages: 39-45 and in new Supplementary file 15 –  Please see our response to Reviewer 1 critique #2 where we address similar concerns with newly added text and analysis. We now note repeatedly on pages 39-45 that each analysis was conducted on three replicate samples, and we display the similarity among those replicates graphically in a new Supplementary file 15.

      (5) DNA methylation analyses. Very little analysis is presented on the BeadChip array other than hypermethylated/hypomethylated and genomic regions of DMCs. What is the range of methylation changes? Does it vary between hypo vs. hyper DMCs? How many array experiments were performed (biological replicates) and what stats were used to determine the DMCs? Are there DMCs in common among the various cell types? As an example of more meaningful analysis, one can plot the %5mC over a given array for comparisons between control and treated cell types. For more granularity, the %5mC can be presented according to the element type (enhancers vs promoters).

      Addressed on pages: 10 and 39-45 and in new Supplementary files 1-5, 15 – Please see our response to Reviewer 1 critique #2 above where we address similar concerns regarding the number of biological replicates used in this study. DMCs on the Infinium array are identified using mixed linear models. This general supervised learning framework identifies CpG loci at which differential methylation is associated with known control vs. treated covariates. CpG probes on the array were defined as differentially methylated if they met both the p-value and FDR (≤ 0.05) significance thresholds between treatment and control samples for each cell type analyzed. The range of medians across all samples was 0.0278 to 0.0059 for hypermethylated beta values and -0.0179 to -0.0033 for hypomethylated beta values. As noted above, we did observe an overlap in DMCs between cell types. Specifically, we observed an average of 11.05% overlapping DMCs between two or more cell types but we did not observe any DMCs shared between all four cell types. We have added additional text on page 9 and new Supplementary files 1-5 to now more clearly describe that this limited similarity in direct overlap of DMCs was the underlying motivation for the analysis described in Figure 2. Finally, the enrichment dot plots shown in Figure 2 provide the information the reviewer requested regarding the %5mC observed at different annotated genomic element types.
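
The dual threshold described above (raw p-value and FDR both ≤ 0.05) can be sketched as follows. The response does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, the per-probe p-values are made up for illustration, and the function names are hypothetical. The mixed linear model that would produce the p-values is not reproduced.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original probe order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_dmcs(pvalues, alpha=0.05):
    """Indices of probes passing both the raw p-value and the FDR threshold."""
    qvalues = benjamini_hochberg(pvalues)
    return [i for i, (p, q) in enumerate(zip(pvalues, qvalues))
            if p <= alpha and q <= alpha]


if __name__ == "__main__":
    # Hypothetical per-probe p-values from a treated-vs-control comparison.
    example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
    print(call_dmcs(example_p))
```

In this toy input, two probes have raw p-values below 0.05 but fail the FDR threshold, which is why requiring both criteria gives a shorter list than the raw p-value alone.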

      (6) The investigators correlate the number of DMCs in a given cell type with the presence of estrogen receptors. Does the correlation extend to the methylation difference (delta beta) at the statistically different probes?

      Addressed in a new Supplementary file 7 – We have added a new Supplementary file 7 in which we provide data addressing this question. In brief, we find that the delta betas of probes enriched at enhancer regions and associated with relative proximity to ERE elements in Sertoli cells, granulosa cells, and iPSCs appear very similar to those associated with DMCs not located within these enriched regions. However, when we compared the similarity of the two data sets with goodness-of-fit tests, we found these relatively small differences were, in fact, statistically significant based on a two-sample Kolmogorov-Smirnov test. These observed significant differences appear to indicate that there is higher variability among the delta betas associated with hypomethylation, but not hypermethylation, changes occurring at DMCs associated with enhancers, potentially suggesting a greater tendency for exposure to BPS to induce hypomethylation rather than hypermethylation changes, at least in these specific regions.
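
For readers unfamiliar with the two-sample Kolmogorov-Smirnov comparison mentioned above, a minimal sketch of the test statistic follows. It computes only the statistic D (the largest gap between the two empirical distribution functions), not the p-value. The delta-beta values are invented for illustration and the function name is hypothetical; this is not the analysis code used for Supplementary file 7.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs."""
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    d_max = 0.0
    # The supremum of |F_a - F_b| is attained at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max


if __name__ == "__main__":
    # Hypothetical delta betas: DMCs in ERE-proximal enhancers vs. all other DMCs.
    enhancer_delta_beta = [-0.021, -0.015, -0.012, -0.006, -0.004]
    other_delta_beta = [-0.011, -0.010, -0.009, -0.008, -0.007]
    print(ks_two_sample_statistic(enhancer_delta_beta, other_delta_beta))
```

Because D responds to differences in spread as well as in location, two sets of delta betas with similar medians can still differ significantly, which matches the interpretation offered above in terms of variability.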

      (7) Methylation changes relative to EREs are presented in multiple figures. Are other sequences enriched in the DMCs? 

      Addressed in a new Supplementary file 11. We profiled the genomic sequence within 500 bp of cell type-specific enriched DMCs that were either associated with enhancer regions in Sertoli, granulosa, or iPS cells or transcription factor binding sites in PGCLCs for the identification of higher abundance motif sequences. We then compared any motifs identified with the JASPAR database to potentially find transcription factors that could be binding to these regions. Interestingly we found that the two most common motifs across all cell types were associated with either the chromatin remodeling transcription factor HMG1A or the pluripotency factor KLF4.

      (8) Please present a correlation plot between the methylation differences and the adjacent DEGs. Again, the absence of consideration of the absolute changes in methylation and gene expression minimizes the impact of the data. 

      Addressed on pages 6, 7, and 17 and in a new Supplementary file 6 – We analyzed the relationship between DMCs at DEG promoter regions and the corresponding change in expression of each DEG. Our data support a relationship in which up-regulated genes show decreased methylation at promoter regions and down-regulated genes show increased methylation at promoter regions, although there were some exceptions to this relationship.

      (9) EM-Seq is mentioned in Figure 7 and in the material and methods. Where is it used in this study? 

      Addressed on page 22 – We now note in the text on page 22 that EM-seq was used during experiments assessing the propagation of BPS-induced epimutations during the iPSC → EpiLC → PGCLC cell state transitions to gather higher resolution data of changes to DNA methylation differences at the whole-epigenome level.

      References

      Brenker C, Rehfeld A, Schiffer C, Kierzek M, Kaupp UB, Skakkebæk NE, Strünker T. 2018. Synergistic activation of CatSper Ca2+ channels in human sperm by oviductal ligands and endocrine disrupting chemicals. Hum Reprod 33:1915–1923. doi:10.1093/humrep/dey275

      Gao P, Wang L, Yang N, Wen J, Zhao M, Su G, Zhang J, Weng D. 2020. Peroxisome proliferator-activated receptor gamma (PPARγ) activation and metabolism disturbance induced by bisphenol A and its replacement analog bisphenol S using in vitro macrophages and in vivo mouse models. Environ Int 134. doi:10.1016/J.ENVINT.2019.105328

      Ozgyin L, Erdos E, Bojcsuk D, Balint BL. 2015. Nuclear receptors in transgenerational epigenetic inheritance. Prog Biophys Mol Biol. doi:10.1016/j.pbiomolbio.2015.02.012

      Pelch KE, Li Y, Perera L, Thayer KA, Korach KS. 2019. Characterization of Estrogenic and Androgenic Activities for Bisphenol A-like Chemicals (BPs): In Vitro Estrogen and Androgen Receptors Transcriptional Activation, Gene Regulation, and Binding Profiles. Toxicol Sci 172:23–37. doi:10.1093/TOXSCI/KFZ173

      Sharma S, Ahmad S, Khan MF, Parvez S, Raisuddin S. 2018. In silico molecular interaction of bisphenol analogues with human nuclear receptors reveals their stronger affinity vs. classical bisphenol A. Toxicol Mech Methods 28:660–669. doi:10.1080/15376516.2018.1491663

      Song K-H, Lee K, Choi H-S. 2011. Endocrine Disrupter Bisphenol A Induces Orphan Nuclear Receptor Nur77 Gene Expression and Steroidogenesis in Mouse Testicular Leydig Cells. Endocrinology 143:2208–2215. doi:10.1210/endo.143.6.8847

      Thomas P, Dong J. 2006. Binding and activation of the seven-transmembrane estrogen receptor GPR30 by environmental estrogens: A potential novel mechanism of endocrine disruption. J Steroid Biochem Mol Biol 102:175–179. doi:10.1016/j.jsbmb.2006.09.017

      Yang Z, Wang L, Yang Y, Pang X, Sun Y, Liang Y, Cao H. 2024. Screening of the Antagonistic Activity of Potential Bisphenol A Alternatives toward the Androgen Receptor Using Machine Learning and Molecular Dynamics Simulation. Environ Sci Technol 58:2817–2829. doi:10.1021/ACS.EST.3C09779

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      [...] Strengths:

      The authors have generated a novel transgenic mouse line to specifically label mature differentiated oligodendrocytes, which is very useful for tracing the final destiny of mature myelinating oligodendrocytes. Also, the authors carefully compared the distribution of three progenitor cre mouse lines and suggested that Gsh-cre also labeled dorsal OLs, contrary to the previous suggestion that it only marks LGE-derived OPCs. In addition, the author also analyzed the relative contributions of OLs derived from three distinct progenitor domains in other forebrain regions (e.g. Pir, ac). Finally, the new transgenic mouse lines and established multiple combinatorial genetic models will facilitate future investigations of the developmental origins of distinct OL populations and their functional and molecular heterogeneity.

      Weaknesses:

      Since OpalinP2A-Flpo-T2A-tTA2 only labels mature oligodendrocytes but not OPCs, the authors can not suggest that the lack of LGE/CGE-derived-OLs in the neocortex is less likely caused by competitive postnatal elimination, but more likely due to limited production and/or allocation (line 118-9). It remains possible that LGE/CGE-derived OPCs migrate into the cortex but are later eliminated.

      We are glad that the reviewer appreciates our work and are grateful for the positive comments and the constructive suggestion. We agree with the reviewer that our methodology by itself cannot determine whether the lack of LGE/CGE-derived OLs in the neocortex is caused by competitive postnatal elimination. That is why we cited a parallel work by Li et al. (ref [17] in the original manuscript; ref [19] in the revised manuscript), in which in utero electroporation (IUE) failed to label LGE-derived OL lineage cells in both embryonic and early postnatal brains. Although they did not directly explore CGE using IUE, their fate mapping results using Emx1-Cre; Nkx2.1-Cre; H2B-GFP at P0 and P10 revealed a very low percentage of LGE/CGE-derived OL lineage cells. The lack of adult labeling in our study together with the lack of developmental labeling in the other study prompted us to hypothesize that the lack of LGE/CGE-derived OLs in the neocortex is less likely caused by competitive postnatal elimination, but more likely due to limited production and/or allocation. In the revised manuscript, we have expanded the discussion to explain this point more clearly.

      Reviewer #2 (Public Review):

      [...] Strengths:

      The strength and novelty of the manuscript lies in the elegant tools generated and used and which have the potential to elegantly and accurately resolve the issue of the contribution of different progenitor zones to telencephalic regions.

      We are glad that the reviewer appreciates our work and are grateful for the overall positive comments.

      Weaknesses:

      (1) Throughout the manuscript (with one exception, lines 76-78), the authors quantified OL densities instead of contributions to the total OL population (as a % of ASPA for example). This means that the reader is left with only a rough estimation of the different contributions.

      We thank the reviewer for this constructive suggestion. We have replaced the density quantification (Figure 2F and 3D in the original manuscript) with contributions to the total OL population (% of ASPA) (Figure 2J and 2N in the revised manuscript).

      (2) All images and quantifications have been confined to one level of the cortex and the potential of the MGE and the LGE/CGE to produce oligodendrocytes for more anterior and more posterior cortical regions remains unexplored.

      The quantifications were not confined to one level of the cortex but were performed in brain sections ranging from Bregma +1.94 to -2.80 mm, as shown in Supplementary Figure 2A-B in the original manuscript. We apologize for not having stated and presented this information clearly enough, and for the confusion it may have caused. In the revised manuscript, we have added relevant descriptions in the “Material and Methods” section (line 199-200*) and schematics along with representative images of more anterior and more posterior cortical regions (Supplementary Figure 2A-D).

      (3) Hence, the statement that "In summary, our findings significantly revised the canonical model of forebrain OL origins (Figure 4A) and provided a new and more comprehensive view (Figure 4B )." (lines 111, 112) is not really accurate as the findings are neither new nor comprehensive. Published manuscripts have already shown that (a) cortical OLs are mostly generated from the cortex [Tripathi et al 2011 (https://doi.org/10.1523/JNEUROSCI.6474-10.2011), Winker et al 2018 (https://doi.org/10.1523/JNEUROSCI.3392-17.2018) and Li et al (https://doi.org/10.1101/2023.12.01.569674)] and (b) MGE-derived OLs persist in the cortex [Orduz et al 2019 (https://doi.org/10.1038/s41467-019-11904-4) and Li et al 2024 (https://doi.org/10.1101/2023.12.01.569674)]. Extending the current study to different rostro-caudal regions of the cortex would greatly improve the manuscript.

      As explained in the response to comment (2), our original quantifications included different rostro-caudal regions of the cortex. In the revised manuscript, we have added more schematics and representative images in the Supplementary Figure 2 for better illustration to resolve the concern of comprehensiveness.

      We thank the reviewer for listing and summarizing highly relevant published studies along with the parallel study by Li et al. submitted to eLife. We apologize for the omission of the first two references in our original manuscript and have cited them in appropriate places (ref [10] and ref [11] in the revised manuscript). However, we believe these works do not compromise the novelty and significance of our work for the following reasons:

      (1) Tripathi et al. 2011 (ref [10] in the revised manuscript) analyzed OL lineage cells in the corpus callosum and the spinal cord, but not in the cortex and anterior commissure. Their analysis was performed in juvenile mice (P12/13), not in adulthood. Most importantly, their analysis of ventrally derived OL lineage cells relied on lineage tracing using Gsh2Cre, which in fact also labels OLs derived from Gsh2+ dorsal progenitors. In contrast, we analyzed mature OLs in the cortex, corpus callosum and anterior commissure in 2-month-old adult mice. We used intersectional and subtractive strategies to label OLs derived from dorsal, LGE/CGE and MGE/POA origins. Our strategy differentiated the two different ventral lineages (LGE/CGE vs. MGE/POA) and avoided mixed labeling of OLs from ventral and dorsal Gsh2+ progenitors.

      (2) Winkler et al. 2018 (ref [11] in the revised manuscript) analyzed OLs derived from dorsal progenitors but only quantified those in the gray matter and the white matter of somatosensory cortex. Their quantification relied on co-staining with Olig2/Sox10, and thereby included both oligodendrocyte precursors (OPCs) and OLs. In contrast, we analyzed mature OLs from three origins and quantified not only neocortical regions (Mo and SS) but also an archicortical region (Pir). Our analysis revealed that although dorsally derived OLs dominate neocortex, ventrally derived OLs, especially the LGE/CGE-derived ones, dominate piriform cortex.

      (3) Orduz et al. 2019 (ref [7] in the original manuscript and the revised manuscript) mainly focused on POA-derived OLs in the somatosensory cortex. Although they performed limited analysis on MGE/POA-derived OPCs at postnatal day 10 and 19, no quantification of MGE/POA-derived OLs was performed in terms of their density, contribution to the total OL population and spatial distribution in the cortex. In contrast, we performed systematic quantification on these aspects to demonstrate that MGE/POA-derived OLs make small but sustained contribution to cortex with a distribution pattern distinctive from those derived from the dorsal origin.

      (4) Li et al. 2024 (ref [17] in the original manuscript and [19] in the revised manuscript) is a parallel study submitted to eLife. Their independent discoveries and ours nicely complement each other. Using different sets of techniques and experiments but some shared genetic mouse models, we both found that LGE/CGE makes a minimal contribution to neocortical OLs. Their analysis in the prenatal and early postnatal stages together with our analysis in the adult brain paints a more comprehensive picture of cortical oligodendrogenesis. The uniqueness of our work is that we performed systematic quantification of all three origins and uncovered the differential contributions to neocortex, piriform cortex, corpus callosum and anterior commissure.

      In summary, our work developed novel strategies to faithfully trace OLs from the three different origins and performed systematic analysis in the adult brain. Our data uncovered their differential contributions to neocortex, piriform cortex and the two commissural white matter tracts, which significantly differ not only from the canonical view but also from other previous studies in aspects discussed above. We believe our discoveries did significantly revise the canonical model of forebrain OL origins and provided a new and more comprehensive view.

      Reviewer #3 (Public Review):

      [...] Intriguingly, by using an indirect subtraction approach, they hypothesize that both Emx1-negative and Nkx2.1-negative cells represent the progenitors from lateral/caudal ganglionic eminences (LC), and conclude that neocortical OLs are not derived from the LC region. The authors claim that Gsh2 is not exclusive to progenitor cells in the LC region (PMID: 32234482). However, Gsh2 exhibits high enrichment in the LC during early embryonic development. The presence of a small population of Gsh2-positive cells in the late embryonic cortex could originate/migrate from Gsh2-positive cells in the LC at earlier stages (PMID: 32234482). Consequently, the possibility that cortical OLs derived from Gsh2+ progenitors in LC could not be conclusively ruled out. Notably, a population of OLs migrating from the ventral to the dorsal cortical region was detected after eliminating dorsal progenitor-derived OLs (PMID: 16436615).

      The indirect subtraction data for LC progenitors drawn from the OpalinFlp-tdTOM reporter in Emx1-negative and Nkx2.1-negative cells in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line present some caveats that could influence their conclusion. The extent of activity from the two Cre lines in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mice remains uncertain. The OpalinFlp-tdTOM expression could occur in the presence of either Emx1Cre or Nkx2.1Cre, raising questions about the contribution of the individual Cre lines. To clarify, the authors should compare the tdTOM expression from each individual Cre line, OpalinFlp::Emx1Cre::RC::FLTG or OpalinFlp::Nkx2.1Cre::RC::FLTG, with the combined OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line. This comparison is crucial as the results from the combined Cre lines could appear similar to only one Cre line active.

      Overall, the authors provided intriguing findings regarding the origin and fate of oligodendrocytes from different progenitor cells in embryonic brain regions. However, further analysis is necessary to substantiate their conclusion about the fate of LC-derived OLs convincingly.

      We thank the reviewer for these thoughtful comments. We agree with the reviewer that the presence of Gsh2-positive cells in the late embryonic cortex by itself could not rule out the possibility that they originate/migrate from Gsh2-positive cells in the LC at earlier stages. Staining dorsal-lineage intermediate progenitors with Gsh2, or performing intersectional lineage tracing using Gsh2Cre along with a dorsal-specific Flp driver, would provide more direct evidence on this issue. Nonetheless, as our lineage tracing of LGE/CGE-derived OLs did not employ Gsh2Cre, the doubt on the identity of Gsh2+ cortical progenitors should not affect the interpretation of our data.

      Regarding the subtractional LCOL labeling strategy used in our study, we wonder if there was any misunderstanding by the reviewer. As stated in our manuscript (line 59-61) and reiterated by the reviewer, OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG labels OLs derived from progenitors that express neither Emx1Cre nor Nkx2.1Cre. As these two progenitor pools do not overlap with each other, there is a purely additive effect of their actions. If there is any concern about efficiency and specificity, it would be that inadequate Cre-mediated recombination leads to mislabeling of dOLs or MPOLs as LCOLs (i.e., OLs derived from Emx1- or Nkx2.1-expressing progenitors were not successfully “subtracted” and thereby “wrongly” retained RFP expression). Therefore, the bona fide LGE/CGE-derived OLs could only be fewer, not more, than the RFP+ LCOLs labeled by our subtractional strategy, even if any of the Cre lines did not work efficiently enough. In any case, this would not affect our conclusion that LGE/CGE-derived OLs make a minimal contribution to neocortex, as the “ground truth” contribution by LGE/CGE could only be less but not more than what we have observed using the current strategy.
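
The upper-bound argument above can be made concrete with a small sketch. It assumes the reporter logic implied in this response (Flp alone leaves the cell RFP/tdTomato-positive, while Flp plus Cre-mediated recombination switches it to GFP) and uses invented cell counts and Cre efficiencies. The function names and the single shared efficiency parameter are hypothetical simplifications.

```python
def reporter_color(flp_active, cre_recombined):
    """Assumed readout: no Flp -> unlabeled; Flp only -> tdTomato; Flp + Cre -> GFP."""
    if not flp_active:
        return None
    return "GFP" if cre_recombined else "tdTomato"


def observed_rfp_count(true_lc, true_dorsal, true_mpoa, cre_efficiency):
    """RFP+ cells counted when Cre recombination is incomplete.

    Dorsal- or MGE/POA-derived OLs that escape recombination keep RFP and are
    counted together with the genuinely LGE/CGE-derived OLs.
    """
    escaped = (true_dorsal + true_mpoa) * (1.0 - cre_efficiency)
    return true_lc + escaped


if __name__ == "__main__":
    # Hypothetical numbers of mature OLs per origin in one counted region.
    true_lc, true_dorsal, true_mpoa = 10, 80, 10
    for efficiency in (1.0, 0.95, 0.8):
        rfp = observed_rfp_count(true_lc, true_dorsal, true_mpoa, efficiency)
        print(f"Cre efficiency {efficiency:.2f}: observed RFP+ = {rfp:.1f}, "
              f"true LGE/CGE-derived = {true_lc}")
```

In every case the observed RFP+ count is at least the true LGE/CGE-derived count, so incomplete recombination can only overestimate, never underestimate, the LGE/CGE contribution.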

      In support of our conclusion, a parallel study by Li et al. 2024 (ref [17] in the original manuscript; ref [19] in the revised manuscript) also provided independent experimental evidence that “any contribution of oligodendrocyte precursors to the developing cortex from the lateral ganglionic eminence is minimal in scope (quoted from its eLife assessment).” In addition, in their revision, they performed Gsh2 immunostaining in P0 Emx1Cre::HG-loxP mouse and found nearly all Gsh2+ cells in the cortical SVZ were derived from the Emx1+ lineage. We are glad that this additional piece of evidence further clarified the case, but still want to emphasize that the subtractional strategy we took was designed purposefully to avoid the potential uncertainty of Gsh2Cre and to more faithfully label LGE/CGE-derived OLs. Therefore, the validity of our conclusion about the fate of LC-derived OLs should be independent from the question on the identity of Gsh2+ cortical progenitors and stands well by itself.

      We hope that these explanations have adequately addressed the reviewer’s concerns. 

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      In Figures 2C, 2D, 2E and 3D, the authors should provide counts of labelled cells as a % of ASPA+ cells. This will give an accurate picture of the contribution of the different progenitor regions to OLs.

      The graphs in Figure 2F are unnecessary since they are simply repeats of C-E but re-arranged.

      We thank the reviewer for the valuable suggestions. These two recommendations are related, so we made the following changes. We replaced the density quantification in Figure 2F and 3D with % of ASPA (Figure 2J and 2N in the revised manuscript) to give an accurate picture of the contribution of the different progenitor regions to OLs, as suggested by the reviewer. We still retained the density counts in Figure 2C-E (Figure 2G-I in the revised manuscript). Together with quantifications of rostral-caudal and laminar distributions presented in Supplementary Figure 2, these data demonstrate that OLs from different origins display distinct spatial distribution patterns.

      At what ages were the quantifications performed in all the figures?

      We apologize for the omission of this information in the original manuscript. All quantifications were performed in 2-month-old adult mice. We have added this information in the “Material and Methods” section of the revised manuscript.

      In 2D, and 3B the GFP should have been activated but the authors do not show it or quantify it presumably because GFP would flood the sections in the presence of Emx1Cre. Nevertheless, since eGFP is shown in the diagram in 2B, the authors should mention why they chose not to show it.

      We thank the reviewer for the helpful comment and the suggestion. We have modified the schematic in Figure 2B and added explanation in the figure legend (line 308-313). We also added a schematic in Supplementary Figure 1A along with images of GFP channel in Supplementary Figure 1D (line 338-350).

      All the main figures and supplementary figures are too small to see properly.

      We are sorry that there was severe compression of images in the combined manuscript file at the conversion step during the initial submission. We apologize for the compromised image quality and re-uploaded full-size figures as individual files on BioRxiv soon after receiving the reviews. For the revised manuscript, we have also taken care to upload full-size figures at high resolution as individual files to ensure their quality of presentation.

      Supplementary Figure 2E is unnecessary and perhaps misleading the reader that cortical-derived OLs have a preference for the lower layers whereas the distribution may simply reflect the distribution of OLs in the cortex.

      We thank the reviewer for the helpful comment and the suggestion. We have removed this panel and replaced it with quantifications of relative laminar distributions of the total (ASPA+) OLs along with those from the three different origins (Supplementary Figure 2G in the revised manuscript). Indeed, the preference for the lower layers of dorsally-derived OLs mirrored the distribution of total OLs in the cortex, while the MGE/POA-derived OLs deviate significantly from others and exhibit higher preference towards layer 4.

      Quantification of labelled cells as a % of ASPA should also be performed in Supplementary Figure 3.

      We thank the reviewer for this suggestion. In the revised manuscript, we have included quantifications of labelled cells as % of ASPA for both OpalinFlp::Emx1Cre::Ai65 and OpalinFlp::Nkx2.1Cre::Ai65 (Figure 2J and N). The sum of these two data sets will be equivalent to those of OpalinFlp::Emx1Cre::Nkx2.1Cre::Ai65 shown in Supplementary Figure 3, and therefore we did not perform additional quantifications, to avoid redundant effort.

      Imaging and quantification should be extended to more posterior regions of the cortex to find out whether the contribution is different from the areas already examined.

      We thank the reviewer for the suggestion on imaging and apologize for the confusion about the range of quantification. As explained in the response to comment (2) of weakness, the quantifications were not confined to one level of the cortex but were performed in brain sections ranging from Bregma +1.94 to -2.80 mm, as shown in Supplementary Figure 2A-B in the original manuscript. In the revised manuscript, we have added relevant descriptions in the “Material and Methods” section (line 199-200) and schematics along with representative images of more anterior and more posterior cortical regions (Supplementary Figure 2A-D).

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors should provide Opalin reporter expression data across various brain regions at different developmental stages to clarify the expression pattern of the reporter.

      We appreciate the reviewer’s comment. We chose to perform all quantifications in adult mice because Opalin is a well-established marker for differentiated OLs and the recombinase-dependent reporter expression is cumulative and irreversible. If there were any non-specific labeling at any earlier developmental stage, it would be retained and manifested at the timepoint we examined as well. In other words, the fact that we did not detect any non-specific labeling in the current dataset, but only labeling confined to mature OLs, ensures that no non-OL labeling was present at earlier timepoints. As shown in Figure 1D-F, reporter expression activated by the Opalin driver shows high OL specificity in all analyzed brain regions. This is further corroborated by results from combinatorially labeled samples (Figure 2 and Supplementary Figure 2), in which only OLs and no other cell types were labeled in all analyzed brain regions. Following the reviewers’ suggestions, we have added representative images of more rostral and more caudal cortical regions (Supplementary Figure 2B-D), which also showed highly specific OL labeling.

      (2) In Figure 1D, please specify the developmental stage of the mice used for staining.

      We apologize for the omission of this information in the original manuscript. All quantifications were performed in 2-month-old adult mice. We have added this information in the “Material and Methods” section (line 199-200) of the revised manuscript.

      (3) The authors should clarify whether the Opalin reporter is expressed in OPCs and astrocytes at developmental stages of mice, such as P0, P7, and P30.

      We appreciate the reviewer’s comment, but as explained in response to comment (1), Opalin is a well-established marker for differentiated OLs which is not expressed in OPCs or astrocytes. As shown in Figure 1D-E, reporter expression is confined to CC1+ differentiated OLs with no colocalization with Sox9 (astrocyte marker). In support of this observation, only ASPA+ differentiated OLs but no OPCs or astrocytes were labeled in any of the combinatorial lineage tracing samples generated using this line combined with progenitor-Cre lines. In addition to marker staining, we also did not observe any RFP+ cells with OPC or astrocyte morphology. As the recombinase-dependent reporter expression is cumulative and irreversible, the fact that no non-specific labeling was observed in the adult brain retrospectively demonstrates the specificity of Opalin-Flp at earlier developmental stages.

      (4) In Figure 1E, authors should address why the efficiency of the tdTomato line is notably lower compared to that of H2B-GFP and whether the stability of reporters could impact the conclusions drawn.

      The difference in reporting efficiency is mainly caused by differences inherent to the two reporting systems. The TRE-RFP reporter is derived from Ai62, composed of a Tet response element and tdTomato inserted into the T1 TIGRE locus. The tdTomato expression is driven by tTA-TRE transcriptional activation. The HG-loxP reporter is derived from HG-Dual, composed of a CAG promoter, a frt-flanked STOP cassette, and H2B-GFP inserted into the Rosa26 locus. The H2B-GFP expression is driven by CAG promoter after Flp-mediated removal of the STOP cassette. A Flp-dependent tdTomato reporter designed in the same way as the HG-FRT reporter would have similar efficiency. In fact, the RC::FLTG reporter can be viewed as such a reporter in the absence of Cre, which did show similarly high efficiency as HG-FRT and supported efficient subtractive labeling of LGE/CGE-derived OLs. We apologize for a typo in the title of the Y-axis of the right panel in the original Figure 1F which may have caused potential misunderstanding. The “RFP+CC1+/CC1” should be “XFP+CC1/CC1”. We have corrected this mistake and revised the figure legend for clearer description of the data (Line 293-302 in the revised manuscript).

      (5) In Figure 2, please clarify the developmental stage of the mice used for staining. Authors should present the eGFP image in addition to tdTOM.

      We apologize for the omission of the age information in the original manuscript. All quantifications were performed in 2-month-old adult mice. We have added this information in the “Material and Methods” section (line 199-200) of the revised manuscript. We thank the reviewer for the suggestion on the eGFP image and have presented it in Supplementary Figure 1 in the revised manuscript.

      (6) In Figure 2D, authors should display the eGFP image alongside the tdTomato image. It is difficult to assess the efficiency of Emx-Cre and Nkx2.1-Cre.

      We thank the reviewer for the suggestion on the eGFP image and have presented it in Supplementary Figure 1D in the revised manuscript. There are two reasons why we chose to present it in the supplementary figure instead of the main figure. First, we added ASPA staining in the green channel along with quantifications of RFP cells as % of ASPA in Figure 2 in the revised manuscript, following reviewer #2’s suggestion. Second, as pointed out by reviewer #2, GFP would flood the sections in the presence of Emx1Cre and could be quite distracting if it were shown together with RFP.

      We were not entirely sure what exactly the reviewer means by “assess the efficiency of Emx-Cre and Nkx2.1-Cre”, but we believe that the quantifications of RFP cells as % of ASPA clarified the contribution of each origin to the total OLs (Figure 2J and 2N in the revised manuscript).

      (7) Figure 3 depicts the entire brain, replicating the image presented in Figure 2. It would be beneficial to consolidate Figures 2 and 3, as they showcase identical brain scans of different regions.

      We thank the reviewer for the constructive suggestion and have consolidated Figures 2 and 3 in the original manuscript into Figure 2 in the revised manuscript.

    1. Author response

      Reviewer #1 (Public Review):

      […] Weaknesses:

      This work explores an interesting question on regulating myoD+ progenitors and the defects of this process in skeletal muscle differentiation by SRSF2 but spreads out in many directions rather than focusing on the key defects. A number of approaches are used, but they lack the robust mechanistic analysis of the defects that result in muscle differentiation. Specifically, the role of SRSF2 on splicing appears to be a misfit here and does not explain the primary defects in the migration of myoD+ progenitors. There are concerns about the scRNA-seq and many transcripts in muscle biology that are not expressed in muscle cells. Focusing on main defects and additional experimental evidence to clear the fusion vs. precocious differentiation vs. reduced differentiation will strengthen this work.

      (1) The analysis of RNA-seq data (Figure 2) is limited, and it is unclear how it relates to the work presented in this MS. The GO enrichment analysis is combined for both up and down-regulated DEG, thus making it difficult to understand the impact differently in both directions. Stac2 is a predominant neuronal isoform (while Stac3 is the muscle), and the Symm gene is not found in the HGNC or other databases. Could the authors provide the approved name for this gene? The premise of this work is based on defects in ECM processes resulting in the mis-targeting of the muscle progenitors to the nonmuscle regions. Which ECM proteins are differentially expressed?

      The GO enrichment analysis (Figure 2B) indicates that genes involved in skeletal muscle construction and function were significantly dysregulated, with both up-regulated and down-regulated genes observed, consistent with the phenotype analysis presented in Figure 1.

      We agree with the reviewer’s comments that Stac3 is the predominant muscle isoform with high expression in skeletal muscle tissues, while Stac2 is expressed at low levels in these tissues. Therefore, we decided to delete the Stac2 data from Figure 2C and will modify the text accordingly. We apologize for our errors.

      In response to the reviewer's comment regarding the Symm gene not being found in the HGNC or other databases, we carefully re-examined the genes presented in Figure 2C. We discovered that one of the genes is actually Synm, which encodes synemin, an intermediate filament protein. We will correct this in the manuscript.

      scRNA-seq analysis revealed defects in ECM processes in SRSF2-deficient myoblasts, which we believe likely resulted in the mis-targeting of muscle progenitors to non-muscle regions. However, comparing RNA-seq results from whole muscle tissues with scRNA-seq results is challenging.

      (2) Could authors quantify the muscle progenitors dispersed in nonmuscle regions before their differentiation? Which nonmuscle tissues MyoD+ progenitors are seen? Most of the tDT staining in the enlarged sections appears to be punctate without any nuclear staining seen in these cells (Figure 3 B, D E-F). Could authors provide high-resolution images? Also, in the diaphragm cross-sections in mutants, tdT labeling appears to be missing in some areas within the myofibers defined as cavities by the authors (marked by white arrows, Figure 3H). Could this polarized localization of tDT be contributing to specific defects?

      tdT staining revealed a substantial presence of MyoD-derived cells distributed beyond the muscle regions, as shown in Figure 3B. Quantifying the number of MyoD+ progenitors dispersed in non-muscle regions is not meaningful.

      tdT+ cells also include those that previously expressed MyoD but have since differentiated into myotubes and myofibers, which is why much of the tdT+ staining is not nuclear.

      MyoD+ cells deficient in SRSF2 either undergo apoptosis or premature differentiation. Consequently, tdT staining in SRSF2-KO muscles showed many irregularities in the muscle fibers.

      (3) Is there a difference in the levels of tDT in the myoD+ muscle progenitors that are mis-targeted vs the others that are present in the muscle tissues?

      tdT+ cells include those that previously expressed MyoD but have since differentiated into myotubes and myofibers, which are no longer MyoD+ cells. Additionally, tdT+ cells also include those currently expressing MyoD, which are MyoD+ cells.

      The fiber differences between WT and SRSF2-KO mice are easily discernible through tdT staining (Figure 2D and 3D); however, comparing the levels of tdT staining between the two groups is not meaningful.

      (4) scRNA is unsuitable for myotubes and myofibers due to their size exclusion from microfluidics. Could authors explain the basis for scRNA-seq vs SnRNA-seq in this work? How are SKM defined in scRNA-data in Figure 4? As the myofibers are small in KO, could the increased level of late differentiation markers be due to the enrichment of these small myotubes/myofibers in scRNA? A different approach, such as ISH/IF with the myogenic markers at E9.5-10.5, may be able to resolve if these markers are prematurely induced.

      SRSF2 is highly expressed in proliferative myoblasts, but its levels decline once differentiation begins. In our study, we used Myod1-Cre to delete the SRSF2 gene and performed the scRNA-seq analysis to examine the effects of SRSF2 deletion on the proliferation and differentiation of MyoD cells. Our analysis revealed that SRSF2 deletion caused proliferation defects and premature differentiation of MyoD cells (Figure 5G), leading to myofiber abnormalities.

      We determined that snRNA-seq analysis is not suitable for our study.

      Additionally, skeletal muscle cells (SKM) were defined based on the expression of skeletal muscle markers, as shown in Figure 4C.

      (5) TNC is a marker for tenocytes and is absent in skeletal muscle cells. The authors mentioned a downregulation of TNC in the KO SKM derived clusters. This suggests a contamination of the tenocytes in the control cells. In spite of the downregulation of multiple ECM genes showed by scRNA-seq data, the ECM staining by laminin in KO in Figure 3 appears to be similar to controls.

      Tenascin-C (Tnc) is also part of the extracellular matrix (ECM) family. scRNA-seq analysis revealed that multiple ECM genes were downregulated in SRSF2-KO myoblasts; however, this did not indicate that laminin was downregulated in the SRSF2-KO muscles.

      (6) The expression of many fusion genes, such as myomaker and myomerger, is reduced in KO, suggesting a primary fusion defect vs a primary differentiation defect. Many mature myofiber proteins exhibit an increased expression in disease states, suggesting them as a compensatory mechanism. Authors need to provide additional experimental evidence supporting precocious differentiation as the primary defect.

      Our analysis revealed that the deletion of SRSF2 caused premature differentiation of MyoD cells (Figure 5G), leading to abnormalities of myofiber formation. SRSF2 is highly expressed in proliferative myoblasts, but its expression declines quickly in myotubes. Therefore, it is unlikely that the low expression of SRSF2 in myotubes caused the primary fusion defect.

      (7) The fusion defects in KO are also evident in siRNA knockdown for SRSF2 and Aurka in C2C12, which mostly exhibits mononucleated myocytes in knockdowns. Also, a fusion index needs to be provided.

      SRSF2 knockdown and Aurka knockdown caused differentiation defects, including fusion defects. We quantified the percentages of both MyoG+ and MHC+ cells in the differentiation assay.

      (8) The last section of the role of SRSF2 on splicing appears to be a misfit in this study. Authors describe the Bin1 isoforms in centronuclear myopathy, but exon17 is not involved in myopathy. Is exon17 exclusion seen in other diseases/ splicing studies?

      Our study is the first to report that exon 17 inclusion of Bin1 is regulated by SRSF2. Specifically, the knockdown of Bin1 exon 17 caused severe differentiation defects in C2C12 myoblasts. The involvement of Bin1 exon 17 in myopathy requires further validation using clinical samples.

      Reviewer #2 (Public Review):

      […] Weaknesses: Although unbiased sequencing methods were used, their findings that SRSF2 serves as a transcriptional regulator and functions in alternative splicing events are not novel. The introduction and discussion are not clearly written. The authors did not raise clear scientific questions in the introduction part. The last paragraph is only a copy-paste of the abstract. The discussion part is mainly a repeat of their results without clear discussion.

      While the role of SRSF2 as a transcriptional regulator involved in alternative splicing events is not novel, the specific SRSF2-regulated alternative splicing events and targeted genes in skeletal muscle have not been reported in other publications. We believe our interpretation of the data and comparison with related published studies are well presented in the Discussion section.

    1. Author response:

      Answers to Reviewer #1 (Public Review):

      (1) Tonic and phasic components in Figure 1 are not clear.

      We will reformulate Figure 1A to show how the tonic and phasic components were measured. As this point was also raised by Reviewer #2 (Comment 3), we will explicitly clarify this in the Methods section. We will modify the color scheme to improve clarity.

      (2) Labeling of traces in Figure 4.

      We will add labels to traces informing which sensory pathways were stimulated to produce each response.

      (3) Optic tectum instead of optical tectum.

      We apologize for the error. We will replace “optical tectum” with “optic tectum” as also suggested by Reviewer #2.

      Answers to Reviewer #2 (Public Review):

      (1) Complexity of tectum upstream circuitry (Comments 1 and 2).

      Processing of visual information is certainly a major role of the tectum, but it is true that it also receives sensory inputs from other structures including sensory pathways. We will acknowledge this complexity in our revised manuscript along with suggestions for heading titles.

      (2) Figure 1 and associated text. 

      As mentioned in the provisional answer point 1 to Reviewer #1, we will reformulate Figure 1A and clarify how tonic and phasic responses were calculated.

      (3) Figure 3 and associated text.

      We will perform the analysis suggested by the reviewer and move calculations to the Methods section as requested.

      (4) Figure 5C and lines 398-410.

      We will consider omitting Figure 5C or clearly stating its value in the context of the rest of the data and our previous behavioral experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors explore mechanisms through which T-regs attenuate acute pain using a heat sensitivity paradigm. Analysis of available transcriptomic data revealed expression on the proenkephalin (Penk) gene in T-regs. The authors explore the contribution of T-reg Penk in the resolution of heat sensitivity.

      Strengths:

      Investigating the potential role of T-reg Penk in the resolution of acute pain is a strength.

      Weaknesses:

      The overall experimental design is superficial and lacks sufficient rigor to draw any meaningful conclusions.

      We hope that the reviewer will reconsider this severe criticism after examining the updated manuscript and results.

      For instance:

      (1) There were no TAM controls. What is the evidence that TAM does not alter heat-sensitive receptors?

      The impact of TMX on heat perception is not the object of this study. Nevertheless, it appears that heat sensitivity in WT controls (blue dots) is slightly diminished after TMX administration (Figure 5A), suggesting that heat-sensitive receptors are moderately altered by TMX per se. This reduction is much more pronounced for LOX mice. Thus, although it is possible that TMX plays a marginal role in heat sensitivity by itself, the results show a much more pronounced effect of TMX in LOX than in WT, in favor of a role for Penk Treg in heat sensitivity.

      (2) There are no controls demonstrating that recombination actually occurred. How do the authors know a single dose of TAM is sufficient?

      These results are now presented in Figure S4. A 70% reduction in Penk mRNA is observed in Treg after a single administration of TMX.

      (3) Why was only heat sensitivity assessed? The behavioral tests are inadequate to derive any meaningful conclusions. Further, why wasn't the behavioral data plotted longitudinally?

      The longitudinal data are presented in Figure S5A. New behavioral tests have been performed and the results are now shown in Figure S5E-H. Importantly, heat sensitivity was observed in two independent laboratories with two different tests.

      Reviewer #2 (Public Review):

      Summary:

      The present study addresses the role of enkephalins, which are specifically expressed by regulatory T cells (Treg), in sensory perception in mice. The authors used a combination of transcriptomic databases available online to characterize the molecular signature of Treg. The proenkephalin gene Penk is among the most enriched transcripts, suggesting that Treg plays an analgesic role through the release of endogenous opioids. In addition, in silico analysis suggests that Penk is regulated by the TNFR superfamily; this being experimentally confirmed. Using flow cytometry analysis, the authors then show that Penk is mostly expressed in Treg of the skin and colon, compared to other immune cells. Finally, genetic conditional excision of Penk, selectively in Treg, results in heat hypersensitivity, as assessed by behavior analysis.

      Strengths:

      The manuscript is clear and reveals a previously unappreciated role of enkephalins, as released by immune cells, in sensory perception. The rationale in this manuscript is easy to follow, and conclusions are well supported by data.

      Weaknesses:

      The sensory deficit of Penk cKO appears to be quite limited compared to control littermates.

      Reviewer #3 (Public Review):

      Summary:

      Aubert et al investigated the role of PENK in regulatory T cells. Through the mining of publicly available transcriptome data, the authors confirmed that PENK expression is selectively enriched in regulatory but not conventional T cells. Further data mining suggested that OX40, 4-1BB as well as BATF, can regulate PENK expression in Tregs. The authors generated fate-mapping mice to confirm selective PENK expression in Tregs and activated effector T cells in the colon and spleen. Interestingly, transgenic mice with conditional deletion of PENK in Tregs resulted in hypersensitivity to heat, which the authors attributed to heat hyperalgesia.

      Strengths:

      The generation of transgenic mice with conditional deletion of PENK in foxp3 and PENK fate-mapping is novel and can potentially yield significant findings. The identification of upstream signals that regulate PENK is interesting but unlikely to be the main reason why PENK is predominantly expressed in Tregs as both BATF and TNFR are expressed in effector T cells.

      Weaknesses:

      There is a lack of direct evidence and detailed analysis of Tregs in the control and transgenic mice to support the authors' hypothesis. PENK was previously reported to be expressed in skin Tregs and play a significant role in regulating skin homeostasis: this should be considered as an alternative mechanism that may explain the changed sensitivity to heat observed in the paper.

      We now provide a detailed analysis of Treg with or without Penk, from their immunosuppressive functions to their colocalization with sensory neurons in the skin, supporting their function as natural analgesics. The alternate hypothesis relative to skin homeostasis is now clearly presented and discussed.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Most of my comments should be addressable in a revised manuscript but will require additional analysis.

      Major:

      - According to flow cytometry analysis, Penk is expressed mostly in Treg of the skin and colon. What may account for such restricted expression? Where could Treg-released enkephalins act?

      We have now rephrased the paper to emphasize the known role of Batf in tissue Treg differentiation. We believe the Batf dependency of Penk expression is the reason why tissue Treg are more enriched in Penk than Treg from lymphoid organs. This is now clearly discussed.

      We also provide a new figure (Figure S1) showing that Batf and its cofactors AP1 and IRF4 were reported to bind to Penk regulatory regions. Altogether, the role of Batf in tissue Treg differentiation would explain why tissue Treg, such as those of the colon and skin, are particularly enriched in Penk. This is now clearly stated in the revised manuscript.

      As to where Treg-released enkephalins act, we performed immunostainings in the skin and observed that Treg could colocalize with sensory neurons (shown in a new Figure 5, panel D). This observation raises the hypothesis that Treg-released enkephalins could act on sensory neurons locally.

      - Which mechanism can underlie heat hypersensitivity in Penk cKO mice? Which sensory neurons are involved? Are other sensory modalities affected, such as mechanical sensitivity?

      As stated above, we show that Treg can be in close contact with thermal sensor neurons producing CGRP. These data are shown in Figure 5D. We have also tested many other nociceptive stimuli (innocuous and noxious) and did not detect significant differences. These data are presented in supplementary Figure S5. Whether enkephalins produced by Treg can change the stimulation threshold of various nerve fibers is currently being investigated by electrophysiology.

      - No control is provided to ensure that Penk is selectively excised in Treg cells in cKO mice.

      We have performed additional experiments with fluorescent probes to document Penk mRNA expression in cKO mice. The results on the specific expression of Penk mRNA in various subsets post-TMX are shown in a supplementary figure S4.

      - The authors acknowledge that Penk from Treg was previously studied in an animal model of inflammatory pain. However, which role these endogenous opioids play is unclear, especially since the authors discovered that enkephalins are likely continuously released at steady state. This is not sufficiently discussed in the narrative, which surprisingly does not separate the results from the discussion.

      The results and discussion are now separated in two sections.

      Minors:

      - Replace "Fox3 1" with "Fox31" (line 31), "functions 15" with "functions15" (line 43), "BATF 19" with "BATF19" (line 85).

      - Text mentions Figure S4 (line 125), which is most likely S3.

      Reviewer #3 (Recommendations For The Authors):

      Given the most significant finding of this paper is based on the heat-induced pain model, there is surprisingly little analysis of Tregs in this context. The authors analyzed spleen and colon Tregs at steady state, it is unclear whether any of these Tregs are involved in pain sensitivity directly. Skin Tregs or other relevant Tregs to this model should be analyzed in control and Lox mice. This is particularly relevant as PENK expression was previously reported in skin Tregs and plays a significant role in skin homeostasis (Yamazaki et al 2020 PNAS). Does PENK conditional deletion alter Treg frequencies, numbers, and immune suppressive function? Not even spleen or colon Treg were analyzed comparing control and lox mice.

      We now provide evidence showing unaltered immunosuppressive functions of Treg in the absence of Penk (Figure 4), and more importantly unaffected proportions of skin Treg in mice lacking Penk in Treg, at the very site of heat stimulation (Figure 5B-C). We also observed unaffected representation of Treg in the spleen and lymph nodes, but we do not feel that these data are necessary to interpret the results.

      Given the role of PENK in skin Tregs, could the observed effect in Figure 4 be due to altered skin homeostasis rather than sensitivity to pain?

      The reviewer is referring to a paper in which Penk in skin Treg plays a role in UV-damaged keratinocytes in vivo (Shime et al., 2020, PNAS). To our knowledge, a role for Penk produced by skin Treg in keratinocyte homeostasis at steady state is currently unknown. Nevertheless, this hypothesis is now clearly stated and discussed in the manuscript.

      The authors stated that only after 7 days post tamoxifen treatment was heat hyperalgesia observed: deletion of PENK in Treg but not Tconv should be confirmed: is deletion only complete after 7 days or is the effect observed due to indirect effects of altered "normal" Treg function?

      We have performed a kinetic analysis to document Penk deletion at D3, D7 and D30 post-TMX. Results show a specific deletion of Penk in Treg at all time points, so we combined all the time points for the representation of the results (Figure S4). As for the indirect effects of “altered” normal function, we now provide the reader with a new figure (Figure 4), showing that Penk-deficient Treg are not impaired in their suppressive function in vitro and in vivo.

      Actual protein/peptide production of enkephalins by Tregs should be confirmed. It is also unclear which peptide(s) can be secreted and presumably responsible for the changes in heat sensitivity.

      This is a very interesting question that we addressed with a MENK ELISA but without success at reproducing the results. An ongoing project will use mass spectrometry to fully characterize the peptides produced by Treg and activated Tconv.

      The analysis of PENK regulation by Tregs is interesting despite being entirely based on data mining. BATF is a pioneering factor expressed by all activated effector T cells. While the connection between BATF and PENK may explain why the authors observed PENK expression chiefly in activated effectors and Tregs, BATF cannot be the reason why PENK is "predominantly" expressed by Tregs. Similarly, 4-1BB and OX40 can be induced on effector T cells. Is PENK under the control of Foxp3? There are lots of publicly available datasets on Foxp3/IL-2 dependent Treg signatures through which this can be addressed.

      We now provide a supplementary figure (Figure S1) showing a compilation of ChIP-Seq studies for various transcription factors in various T cell subsets. We provide the reader with a list of all the TFs that have been reported to bind in the regulatory regions of Penk. In agreement with our hypothesis, BATF, FOXP3, IRF4 and several others are present in that list. Further work is needed to decipher the exact contribution of each of those TFs to the regulation of Penk in Treg vs activated Tconv, which is beyond the scope of this report.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work identified new NMD inhibitors and tested them for cancer treatment, based on the hypothesis that inhibiting NMD could lead to the production of cancer neoantigens from the stabilized mutant mRNAs, thereby enhancing the immune system's ability to recognize and kill cancer cells. Key points of the study include:

      • Development of an RNA-seq based method for NMD analysis using mixed isogenic cells that express WT or mutant transcripts of STAG2 and TP53 with engineered truncation mutations.

      • Application of this method for a drug screen and identified several potential NMD inhibitors.

      • Demonstration that one of the identified compounds, LY3023414, inhibits NMD by targeting the SMG1 protein kinase in the NMD pathway in cultured cells and mouse xenografts.

      • Due to the in vivo toxicity observed for LY3023414, the authors developed 11 new SMG1 inhibitors (KVS0001-KVS0011) based on the structures of the known SMG1 inhibitor SMG1i-11 and the SMG1 protein itself.

      • Among these, KVS0001 stood out for its high potency, excellent bioavailability, and low toxicity in mice. Treatment with KVS0001 caused NMD inhibition and increased presentation of neoantigens on MHC-I molecules, resulting in the clearance of cancer cells in vitro by co-cultured T cells and cancer xenografts in mice by the immune system.

      These findings support the strategy of targeting the NMD pathway for cancer treatment and provide new research tools and potential lead compounds for further exploration.

      Strengths:

      The RNA-seq-based NMD analysis, using isogenic cell lines with specific NMD-inducing mutations, represents a novel approach for the high-throughput identification of potential NMD modulators or genetic regulators. The effectiveness of this method is exemplified by the identification of a new activity of AKT1/mTOR inhibitor LY3023414 in inhibiting NMD.

      The properties of KVS0001 described in the manuscript as a novel SMG1 inhibitor suggest its potential as a lead compound for further testing the NMD-targeting strategies in cancer treatment. Additionally, this compound may serve as a useful research tool.

      The results of the in vitro cell killing assay and in vivo xenograft experiments in both immuno-proficient and immune-deficient mice indicate that inhibiting NMD could be a viable therapeutic strategy for certain cancers.

      Weaknesses:

      The authors did not address the potential effects of NMD/SMG1 inhibitors on RNA splicing. Given that the transcripts of many RNA-binding proteins are natural targets of NMD, inhibiting NMD could significantly alter splicing patterns. This, in turn, might influence the outcomes of the RNA-seq-based method for NMD analysis and result interpretation.

      This is a very important comment that highlights a key aspect of NMD and potentially exciting downstream studies. We did not systematically assess RNA splicing in our work, as we are not sure whether inhibition of NMD would induce cancer-specific splicing that would allow for tumor targeting. It is well established that NMD can impact splicing, including modulating cryptic exon expression, but finding targetable tumor-specific antigens and assessing their antigenicity constitutes a study in and of itself. Our own data in Figure 4C-F support this, as a point mutation near a splice site in TP53 strongly induced NMD, which was subsequently blocked by KVS0001 treatment. We feel that a systematic assessment of this effect is outside the scope of this manuscript. We’ve incorporated a comment into our discussion highlighting this deficiency, but we certainly find the idea of mining RNA-splicing changes an exciting next endeavor.

      While the RNA-seq-based approach offers several advantages for analyzing NMD, the effects of NMD/SMG1 inhibitors observed through this method should be confirmed using established NMD reporters. This step is crucial to rule out the possibility that mutations in STAG2 or TP53 affect NMD in cells, as well as to address potential clonal variations between different engineered cell lines.

      This is possible, but we want to highlight that all hits from the screen were confirmed in a separate cell line with different clones. While this will not rule out effects on NMD due to STAG2 and TP53 knockdown, the final lead compound was also tested on different endogenous transcripts, both indel-containing transcripts and normal transcripts controlled by NMD (i.e., ATF4), in multiple species (human and mouse). Importantly, many of these assays employed the non-mutated transcripts from heterozygous mutant cells to ensure that cis-acting NMD was being measured and to control for any trans-acting splicing or other unanticipated biochemical effects.

      The results from the SMG1/UPF1 knockdown and SMG1i-11 experiments presented in Figure 3 correlate with the effects seen for LY3023414, but they do not conclusively establish SMG1 as the direct target of LY3023414 in NMD inhibition. An epistatic analysis with LY3023414 and SMG1-knockdown is needed.

      This is a great comment, and it is in line with the recent push to confirm drug targets using chemical probes or a knockout, followed by showing that the drug in question has no further effect. We attempted to knock out SMG1 in multiple cell lines used in this study, including RPE1, MCF10A, NCI-H358 and LS180, and were unable to obtain clones with biallelic out-of-frame indels. We were able to obtain multiple clones with in-frame indels. Based on our results and those in the publicly available database DepMap, we suspect this gene is likely essential, making a simple knockout unfeasible. While this uncertainty is important to keep in mind, we feel it does not detract from the reporting of a novel NMD screen that is mechanistically agnostic and of a novel in vivo active NMD inhibitor.

      Reviewer #2 (Public Review):

      Summary:

      Several publications during the past years provided evidence that NMD protects tumor cells from being recognized by the immune system by suppressing the display of neoantigens, and hence NMD inhibition is emerging as a promising anti-cancer approach. However, the lack of an efficacious and specific small-molecule NMD inhibitor with suitable pharmacological properties is currently a major bottleneck in the development of therapies that rely on NMD inhibition. In this manuscript, the authors describe their screen for identifying NMD inhibitors, which is based on isogenic cell lines that either express wild-type or NMD-sensitive transcript isoforms of p53 and STAG2. Using this setup, they screened a library of 2658 FDA-approved or late-phase clinical trial drugs and had 8 hits. Among them they further characterized LY3023414, showing that it inhibits NMD in cultured cells and in a mouse xenograft model, where it, however, was very toxic. Because LY3023414 was originally developed as a PI3K inhibitor, the authors claim that it inhibits NMD by inhibiting SMG1. While this is most likely true, the authors do not provide experimental evidence for this claim. Instead, they use this statement to switch their attention to another previously developed SMG1 inhibitor (SMG1i-11), of which they design and test several derivatives. Of these derivatives, KVS0001 showed the best pharmacological behavior. It upregulated NMD-sensitive transcripts in cultured cells and the xenograft mouse model and two predicted neoantigens could indeed be detected by mass spectrometry when the respective cells were treated with KVS0001. A bispecific antibody targeting T cells to a specific antigen-HLA complex led to increased IFN-gamma release and killing of cancer cells expressing this antigen-HLA complex when they were treated with KVS0001. 
      Finally, the authors show that renal (RENCA) or lung cancer cells (LLC) were significantly inhibited in tumor growth in immunocompetent mice treated with KVS0001. Overall, this establishes KVS0001 as a novel and promising anti-cancer drug that, by inhibiting SMG1 (and therewith NMD), increases the neoantigen production in the cancer cells and reveals them to the body's immune system as "foreign".

      Strengths:

      The novelty and significance of this work consists in the development of a novel and - judging from the presented data - very promising NMD inhibiting drug that is suitable for applications in animals. This is an important advance for the field, as previous NMD inhibitors were not specific, lacked efficacy, or were very toxic and hence not suitable for animal application. It will be still a long way with many challenges ahead towards an efficacious NMD inhibitor that is safe for use in humans, but KVS0001 appears to be a molecule that bears promise for follow-up studies. In addition, while the idea of inhibiting NMD to trigger neoantigen production in cancer cells and so reveal them to the immune system has been around for quite some time, this work provides ample and compelling support for the feasibility of this approach, at least for tumors with a high mutational burden.

      Main weaknesses:

      There is a disconnect between the screen and the KVS0001 compound that they describe and test in the second part of the manuscript, since KVS0001 is a derivative of the SMG1 inhibitors developed by Gopalsamy et al. in 2012 and not of the lead compound identified in the screen (LY3023414). Because of high toxicity in the mouse xenograft experiments, the authors did not follow up LY3023414 but instead switched to the published SMG1i-11 drug of Gopalsamy and colleagues, a molecule that is widely used among NMD researchers for NMD inhibition in cultured cells. Therefore, in my view, the description of the screen is obsolete, and the paper could just start with the optimization of the pharmacological properties of SMG1i-11 and the characterization of KVS0001. Even though the screen is based on an elegant setup and was executed successfully, it was ultimately a failure as it didn't reveal a useful lead compound that could be further optimized.

      This is a helpful observation from an outside perspective. From our point of view, we were only alerted to targeting SMG1 by the previously reported off-target effects of LY3023414 on SMG1 and the lack of a plausible explanation for how PIK3CA inhibition would efficiently inhibit NMD. We do feel that the screen is worth including for two reasons. First, it offers an unbiased approach for querying the entire NMD pathway for vulnerabilities useful to target. The library chosen was quite small, so the screen itself could be useful to others with larger libraries to test. Second, it did help identify SMG1 as the ideal target for NMD disruption. While targeting SMG1 is not novel, we felt it highlighted why we chose to develop KVS0001. To address this reviewer’s comment, we’ve included a couple of sentences in the results and discussion strengthening the point that the screen provided an unbiased approach to finding the best target in the pathway to disrupt NMD and elaborating on the transition from LY3023414 and the screen to the development of KVS0001.

      Additional points:

      - Compared to SMG1i-11, KVS0001 seems less potent in inhibiting SMG1 (higher IC50). It would therefore be important to also compare the specificity of both drugs for SMG1 over other kinases at the applied concentrations (1 uM for SMG1i-11, 5 uM for KVS0001). The Kinativ Assay (Fig. S13) was performed with 100 nM KVS0001, which is 50-fold less than the concentration used for functional assays and hence not really meaningful. In addition, more information on the pharmacokinetic properties and toxicology of KVS0001 would allow a better judgment of the potential of this molecule as a future therapeutic agent.

      We agree that the Kinativ assay may have poorly represented the activity of KVS0001 at the bioactive concentration. We have now added 1 uM Kinativ data, the highest concentration we were able to run, to Figure S13.

      - On many figures, the concentrations of the used drugs are missing. Please ensure that for every experiment that includes drugs, the drug concentration is indicated.

      We apologize for this oversight and have added all drug concentrations on the appropriate plots.

      - Do the authors have an explanation for why LY3023414 has a much stronger effect on the p53 than on the STAG2 nonsense allele (Figure 1B, S8), whereas emetine upregulates the STAG2 nonsense alleles more than the p53 nonsense allele (Figure S5)? I find this curious, but the authors do not comment on it.

      This is an interesting observation. The short answer is we’re not sure. The speculative answer is that it is related to the distinctly different mechanisms of actions of the two inhibitors (see comments from reviewing editor below).

      - While it is a strength of the study that the NMD inhibitors were validated on many different truncation mutations in different cell lines, it would help readers if a table or graphic illustration was included that gives an overview of all mutant alleles tested in this study (which gene, type of mutation, in which cell type). In the current version, this information is scattered throughout the manuscript.

      This is an excellent suggestion. We’ve included a new table S1 which incorporates the details of each cell line and the genes used in each for this study.

      - Lines 194 and 302: That SMG1i-11 was highly insoluble in the hands of the authors is surprising. It is unclear why they used variant 11j, since variant 11e of this inhibitor is widely used among NMD researchers and readily dissolves in DMSO.

      As this referee notes, SMG1i-11 is soluble in DMSO in our hands as well, which enabled us to use it for our in vitro work. Unfortunately, the concentrations of DMSO required to dissolve the compound to suitable concentrations for in vivo work were too high to safely use in mice with our animal protocols. We also attempted to use ethanol, which did dissolve SMG1i-11 but led to a significant amount of toxicity in both the drug and vehicle control arms.

      - Line 296: The authors claim that they were able to show that LY3023414 inhibited the SMG1 kinase, which is not true. To show this, they would have for example to show that LY3023414 prevents SMG1-mediated UPF1 phosphorylation, as they did for KVS0001 and SMG1i-11 in Fig. 3F. Unless the authors provide this data, the statement should be deleted or modified.

      We’ve modified this statement as requested by the referee, now saying we suspected SMG1 was the target based on previously published work.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      Your paper has been assessed by two reviewers with expertise in the NMD field. They both find the identification and characterization of a new potent and selective inhibitor of the SMG1 NMD kinase with in vivo activity to represent a significant advance in the field, and one that could ultimately be of value as the basis for a novel cancer therapy. However, as you will see both reviewers have concerns about whether the SMG1 inhibitor screen you developed belongs in the paper because it was not used to identify the KVS0001 inhibitor, which instead was generated based on a previously published set of SMG1 inhibitors, and because the NMD inhibitor that did emerge from your screen, LY3023414, was not shown to be a direct inhibitor of SMG1 kinase activity. While it is an elegant screen, during the revision of the paper you could consider streamlining the manuscript by emphasizing how the screening assay was used to validate KVS0001, and bolstering the characterization of the new KVS0001 NMD inhibitor by conducting the proposed additional experiments.

      Each of the reviewers raises additional points that should be addressed in a revised version.

      The reviewing editor has two additional points:

      (1) While emetine inhibits NMD, it is not really a direct NMD inhibitor, as implied, but rather a potent protein synthesis elongation inhibitor that acts by binding to the E-site of the 40S ribosomal subunit, and is therefore, like anisomycin, another protein synthesis inhibitor, working indirectly to inhibit NMD. This should be acknowledged in the section where emetine is first used as an "NMD inhibitor".

      This has been included in the indicated section at the referee’s request.  

      (2) To establish that the observed phenotypic effects of KVS0001 are due to on-target inhibition of SMG1, the authors could generate and express an SMG1 point mutant that is resistant to KVS0001 inhibition, which could be based on the SMG1 catalytic domain structure that the authors used originally to design KVS0001. Inhibitor-resistant kinase mutants are the gold standard for demonstrating that the biological consequences of a novel protein kinase inhibitor are due to on-target effects. Admittedly, because SMG1 is such a huge protein, this may be technically challenging and is likely beyond the scope of the present paper.

      We agree with the reviewing editor on all counts: this would be an ideal experiment to run, but it is also beyond the scope of the present paper. As indicated in our discussion above with reviewer 1, SMG1 knockout was not possible in our hands, and we suspect it may be due to the gene being essential. Creating an inhibitor-resistant mutant could overcome this issue and create an ideal model to test the target of KVS0001. Unfortunately, finding such a mutant would likely require a significant amount of trial and error to create a resistant mutant that did not lose SMG1 function. SMG1 is also a very large protein, which creates technical difficulties for such experiments. Due to the anticipated amount of work, we believe this would be better accomplished in future studies.

      Reviewer #1 (Recommendations For The Authors):

      (1) The authors did not mention a new SMG1 inhibitor and its effects described in Cheruiyot et al, Cancer Res 2019 (PMID: 34215620).

      A comment regarding this discovery and its implications for our work was added to the discussion.

      (2) There is an inconsistency between the manuscript text and methods sections regarding the time of drug treatment (16 hours vs 14 hours) in the HTS screen.

      This has been double-checked in our notebook and fixed to reflect 16 hours as the correct incubation time. Thank you for identifying that clerical oversight.

      Reviewer #2 (Recommendations For The Authors):

      (1) Line 61: The references to NMD reviews are very old (refs 20 and 21). I suggest citing more recent, up-to-date reviews instead.

      Two additional references, one from 2016 and another from 2023, have been added to increase support for this statement in the introduction.

      (2) Figure S1: Shouldn't the caption of the right panel (TP53 data) say "clone 221" rather than "clone 22"?

      This has been fixed.

      (3) Figure S18: Please indicate on the y-axis that you are displaying RPKM for p53.

      This has been fixed.

      (4) Figures 4D and S19: Please indicate concentrations used for all drugs.

      This has been fixed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      The authors investigate pleiotropy in the genetic loci previously associated with a range of neuropsychiatric disorders: Alzheimer's disease, amyotrophic lateral sclerosis (ALS), frontotemporal dementia, Parkinson's disease, and schizophrenia. The local statistical fine-mapping and variant colocalisation approaches they use have the potential to uncover not only shared loci but also shared causal variants between these disorders. There is existing literature describing the pleiotropy between ALS and these other disorders, but here the authors apply state-of-the-art local genetic correlation approaches to further refine any relationships.

      Complex disease and GWAS is not my area of expertise but the authors managed to present their methods and results in a clear, easy to follow manner. Their results statistically support several correlations between the disorders and, for ALS and AD, a shared variant in the vicinity of the lead SNP from the original ALS GWAS. Such findings could have important implications for our understanding of the mechanisms of such disorders and eventually the possibility of managing and treating them. 

      The authors have built a useful pipeline that plugs together all the gold-standard, existing software to perform this analysis and made it openly available which is commendable. However, there is little discussion of what software is available to perform global and local correlation analysis and, if there are multiple tools available, why they consider the ones they selected to be the gold-standard. 

      There is some mention of previous findings of genetic pleiotropy between ALS and these other disorders in the introduction, and discussion of their improved ALS-AD evidence relative to previous work. However, detailed comparisons of their other correlations to what was described before for the same pairs of disorders (if any) is missing. Adding this would strengthen the impact of this paper. 

      Finally, being new to this approach, I found the abstract a little confusing. Initially, the shared causal variant between ALS and AD is mentioned but immediately in the following sentence they describe how their study "suggested that disease-implicated variants in these loci often differ between traits". After reading the whole paper I understood that the ALS-AD shared variant was the exception but it may be best to restructure this part of the abstract. Additionally, in the abstract the authors state that different variants "suggests the role of distinct mechanisms across diseases despite shared loci". Is it not possible that different variants in the same regulatory region or protein-coding parts of a gene could be having the same effect and mechanism? Or does the methodology to establish that different variants are involved automatically mean that the variants are too distant for this to be possible?

      We thank reviewer one for their considered review of this manuscript and for highlighting points that would benefit from further exploration. Itemised responses are provided below.

      (1) The reviewer noted that we did not adequately explain our choice of software for global and local genetic correlation analysis, and why we consider the techniques chosen as gold standard. We agree that the paper would benefit from clarification around this aspect of the study.

      Briefly, we firstly selected LAVA for the local genetic correlation analysis because it offers several advantages above competing software and was developed by a reputable team previously known for developing MAGMA, which is well-established in the statistical genetics field. In the manuscript (page 8), we added the following clarification: “LAVA was the most appropriate local genetic correlation approach for this study for several reasons. First, unlike SUPERGNOVA and rho-HESS, LAVA makes specific accommodations for analysis of binary traits. Second, other tools focus on bivariate correlation between traits whilst LAVA offers this alongside multivariate tests such as multiple regression and partial correlation, enabling rigorous testing of pleiotropic effects. Lastly, LAVA is shown to provide results which are less biased than those from other tools.”

      LDSC was selected for the global genetic correlation analysis because the software is well-established and likely the most widely adopted global genetic correlation tool. Reflecting its prevalence, the software is also compatible with LAVA, which adjusts for sample overlap based on the bivariate intercept estimate returned by LDSC. Since global genetic correlations were not the primary focus of this study, having been tested across several previous investigations (see response 2), we did not prioritise comparison of correlation estimates from LDSC against other available software. In the manuscript (pages 7-8) we now include the following statement: “[LDSC] was also applied to derive ‘global’ (i.e., genome-wide) genetic correlation estimates between trait pairs and estimate sample overlap from the bivariate intercept. The latter of these outputs was taken forward as an input for the local genetic correlation analysis using LAVA (see 2.2.2.2). Since global genetic correlation analysis across the traits studied here is not novel and associations reported in past studies are congruent across different tools, the compatibility between LDSC and LAVA motivated our use of LDSC for this analysis”.

      (2) The second comment was that the paper would be strengthened by contextualising our study with detail around what is previously known about associations between the studied traits. Accordingly, we have added clarifying text at the end of the introduction, stating: “although previous studies have performed global genetic correlation analyses between various combinations of these traits {references}, this is the first to compare them at a genome-wide scale using a local genetic correlation approach”. In the discussion, we link back to these studies, stating that “Through genetic correlation analysis, we replicated genome-wide correlations previously described between the studied traits {references}”.

      (3) The reviewer highlighted that the abstract as originally written may mislead or confuse the reader and we agree that clarity could be improved with some restructuring. This has now been revised and should read more logically.

      (4) They also enquired about our reasons for suggesting that the implication of distinct variants for each trait from a colocalisation analysis suggests a distinct causal mechanism. We thank them for this question as it encouraged us to reconsider how best to present the results of this analysis. To answer their question:

      It is certainly true that nearby but distinct variants can confer the same effect. In a scenario where multiple distinct variants result in the same effect and thus increase susceptibility towards two or more related phenotypes, you would expect to find evidence of association to each relevant variant in GWAS across these related traits (even if the magnitude of the associations differ). Where biological mechanisms are shared, post-GWAS finemapping analysis would be expected to yield credible sets overlapping across the traits, and likewise, colocalisation analysis should converge on a set of credible SNPs that are candidates for the shared effect. Where multiple distinct variants confer the same effect, you would expect to see separate fine-mapping credible sets for these distinct variants that colocalise pairwise between the jointly-affected traits. Generally, therefore, evidence supporting the two distinct variants hypothesis would suggest the role of two distinct mechanisms except when certain credible sets identified through fine-mapping converge on a colocalised effect.
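      The overlap check described above can be sketched in a few lines. This is a minimal illustration only, assuming one causal variant per credible set and per-variant posterior probabilities that sum to 1 within the locus; the variant names, probabilities, and the `credible_set` helper are hypothetical and do not come from the study.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest set of variants whose posterior
    probabilities sum to at least `coverage`."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = set(), 0.0
    for variant, prob in ranked:
        chosen.add(variant)
        total += prob
        if total >= coverage:
            break
    return chosen


# Hypothetical posterior probabilities for two traits at the same locus.
trait_a = {"rs1": 0.60, "rs2": 0.36, "rs3": 0.03, "rs4": 0.01}
trait_b = {"rs1": 0.02, "rs2": 0.01, "rs3": 0.27, "rs4": 0.70}

set_a = credible_set(trait_a)
set_b = credible_set(trait_b)

# An empty intersection is consistent with distinct variants per trait;
# a non-empty one leaves a shared candidate variant on the table.
shared = set_a & set_b
print(sorted(set_a), sorted(set_b), sorted(shared))
```

      In this toy locus the two credible sets do not intersect, which is the pattern that would point towards distinct variants rather than a shared one.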

      There is a further caveat which we also explored in response to Reviewer two: if a region includes long-spanning LD (and hence a larger number of variants are considered in the analysis), then the colocalisation analysis is more likely to favour the two distinct variants hypothesis since the probability of the variants implicated in both traits being shared decreases. It is likely that support for the two independent variants hypothesis is correct in most of the comparisons from this study that favour this conclusion. This is because, generally, the fine-mapping credible sets do not overlap across trait pairs (Figure S4) and consequently the colocalisation analysis does not find any support for the shared variant hypothesis. An exception is the analysis of PD and schizophrenia at the MAPT locus on chromosome 17. We have accordingly added the following clarification to the manuscript (page 18): “However, the colocalisation analysis will increasingly favour the two independent variants hypothesis as the number of analysed variants increases. Hence, the wide-spanning LD of this region may have obstructed identification of variants and mechanisms shared between the traits.”

      Reviewer #2 (Public Review): 

      Summary: 

      Spargo and colleagues present an analysis of the shared genetic architectures of schizophrenia and several late-onset neurological disorders. In contrast to many polygenic traits for which global genetic correlation estimates are substantial, global genetic correlation estimates for neurological conditions are relatively small, likely for several reasons. One is that assortative mating, which will spuriously inflate genetic correlation estimates, is likely to be less salient for late-onset conditions. Another, which the authors explore in the current manuscript, is that some loci affecting two or more conditions (i.e., pleiotropic loci) may have effects in opposite directions, or shared loci are sparse, such that the global genetic correlation signal washes out.

      The authors apply a local genetic correlation approach that assesses the presence and direction of pleiotropy in much smaller spatial windows across the genome. Then, within regions evidencing local genetic correlations for a given trait pair, they apply fine-mapping and colocalization methods to attempt to differentiate between two scenarios: that the two traits share the same causal variant in the region or that distinct loci within the region influence the traits. Interestingly, the authors only discover one instance of the former: an SNP in the HLA region appearing to confer risk for both AD and ALS. This is in contrast to six regions with distinct causal loci, and twenty regions with no clear shared loci. 

      Finally, the authors have published their analysis pipeline such that other researchers might easily apply the same techniques to other collections of traits. 

      Strengths: 

      - All such analysis pipelines involve many decision points where there is often no clear correct option. Nonetheless, the authors clearly present their reasoning behind each such decision.

      - The authors have published their analytic pipeline such that future researchers might easily replicate and extend their findings.

      Weaknesses:

      - The majority of regions display no clear candidate causal variants for the traits, whether shared or distinct. Further, despite the potential of local genetic correlation analysis to identify regions with effects in opposing directions, all of the regions in which causal variants were identified for both traits evidenced positive correlations. The reasons for this aren't clear and the authors would do well to explore this in greater detail.

      - The authors very briefly discuss how their findings differ from previous analyses because of their strict inclusion for "high-quality" variants. This might be the case, but the authors do not attempt to demonstrate this via simulation or otherwise, making it difficult to evaluate their explanation. 

      We thank Reviewer two for their appraisal of this manuscript and kind comments regarding its strengths. We will now aim to address the identified weaknesses.

      (1) The reviewer comments that we did not adequately investigate why loci with causal variants identified in both traits all had positive local genetic correlations. We agree that it would be helpful to better understand the underlying reasons. To address this issue, we have added a new supplementary figure to compare the positive and negative local genetic correlation results (see Figure S2). In the main text we add the following clarification: “Although both positive and negative local genetic correlations passed the FDR-adjusted significance threshold, we observed only positive local genetic correlations in loci where fine-mapping credible sets were identified for both traits in the pair. This reflects that the correlation coefficients and variant associations from the analysed GWAS studies were generally stronger in the positively correlated loci (see Figure S2).”

      (2) The reviewer rightly suggests that the manuscript would benefit from an improved explanation of the somewhat inconsistent results for the colocalisation analysis of ALS and AD at the locus around the rs9275477 SNP from this work and a previous study. We have now further investigated this and believe that the discrepancy results partly from an inherent empirical characteristic of the colocalisation analysis. We have explained this in the manuscript (page 22) as follows: “The previous study analysed a 200kb window of over 2,000 SNPs around the lead genome-wide significant SNP from the ALS GWAS, rs9275477, and found ~0.50 posterior probability for each of the shared and two independent variant(s) hypotheses. The current analysis used 475 SNPs occurring within a semi-independent LD block of ~50kb in this locus. Since the posterior probability of the two independent variants hypothesis (H3) increases exponentially with the number of variants in the region whilst the shared variant hypothesis (H4) scales linearly, it is expected that our analysis would give stronger support for the latter. Given that the previous study defined regions for analysis based on an arbitrary window of ±100kb around each lead genome-wide significant SNP from the ALS GWAS and we defined each analysis region based on patterns of LD in European ancestry populations, it is reasonable to favour the current finding.”
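      The dependence on region size can be illustrated with the prior weights that a standard Bayesian colocalisation framework assigns to each hypothesis. The sketch below is not the analysis from the manuscript: it assumes per-SNP priors of 1e-4 for association with each trait and 1e-5 for association with both (values commonly used as defaults, assumed here rather than taken from the study), and the function name is hypothetical. Under these assumptions the prior weight on H3 grows with the number of variant pairs, while the weight on H4 grows with the number of variants.

```python
def h3_to_h4_prior_ratio(n_snps, p1=1e-4, p2=1e-4, p12=1e-5):
    """Ratio of prior weight on two distinct causal variants (H3)
    to prior weight on one shared causal variant (H4).

    p1, p2, p12 are assumed per-SNP priors, not values from the study.
    """
    h3 = n_snps * (n_snps - 1) * p1 * p2  # ordered pairs of distinct variants
    h4 = n_snps * p12                     # a single shared variant
    return h3 / h4


# Region sizes quoted in the response: 475 SNPs (current analysis)
# versus roughly 2,000 SNPs (previous study).
for n in (475, 2000):
    print(n, round(h3_to_h4_prior_ratio(n), 3))
```

      With these assumed priors the smaller region starts out favouring H4 over H3 a priori, whereas the larger region starts out favouring H3, before any association data are considered.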

    1. Author response:

      The following is the authors’ response to the original reviews.

      Responses to recommendations

      Reviewer #1 (Recommendations For The Authors):

      Describe more precisely how gene expression graphs are built (tissues, reads counts). For example, how were read counts normalized? Were they from DESeq2 data, which only works by comparing two samples? If so, all samples should be independently compared to a reference and the normalized expression value of the reference will change from sample to sample... thus introducing a pure technical artifact.

      We have added additional information about the normalisation method to the Material and Methods section (Lines 597-598: “Lastly, expression levels shown in figures 2-5 are normalised gene counts produced by DESeq2.”) and figure legends (lines 247, 286, 372, 404: “Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.”) to address this recommendation.

      DESeq2 provides a reference-independent normalisation through a median of ratios method (a good explanation can be found here: https://hbctraining.github.io/DGE_workshop/lessons/02_DGE_count_normalization.html). The normalised expression values are independent of any reference, and therefore will not change from sample to sample as suggested in this comment. In contrast, pairwise comparisons are made when analysing significantly differentially expressed genes between two treatments using a Wald test, which is done against a reference and generates log2 fold change information and p-values; however, this is different to the normalisation we described above.
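
      The median of ratios idea described above can be sketched in a few lines. This is a minimal illustration written for this response, not the DESeq2 source code: the function name `size_factors` and the toy count matrix are assumptions, and genes with a zero count in any sample are simply left out of the reference, as in the standard description of the method. No sample is singled out as a reference; the pseudo-reference is the per-gene geometric mean across all samples.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix (sketch)."""
    counts = np.asarray(counts, dtype=float)
    # Keep only genes observed in every sample, so the geometric mean is defined.
    expressed = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[expressed])
    # Pseudo-reference: per-gene geometric mean across all samples (in log space).
    log_ref = log_counts.mean(axis=1, keepdims=True)
    # Size factor of a sample: median ratio of its counts to the pseudo-reference.
    return np.exp(np.median(log_counts - log_ref, axis=0))

# Hypothetical toy data: sample 2 was sequenced twice as deeply as sample 1.
counts = np.array([[10, 20],
                   [100, 200],
                   [0, 5],      # zero in one sample: excluded from the reference
                   [50, 100],
                   [7, 14]])
sf = size_factors(counts)
normalised = counts / sf  # each column is divided by its own size factor
```

      In this toy case the genes that enter the reference end up with identical normalised values in both samples, because the only difference between the two columns is sequencing depth.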

      Provide bioinformatics workflows and, if possible, the set of parameters used, the computing resources, etc. Were some assembly finishing steps carried out (by long-range PCR?) and experimental validations (especially for allele-specific transcripts, by conventional RT-PCR based on diagnostic mutations)?

      We have added additional information on the bioinformatics workflows where required, including parameters used (Lines 530, 536, 549-551, and 574-583.). No finishing steps other than HiC scaffolding were performed. No allele-specific analysis was done as part of this manuscript.

      To further improve transparency, we have also uploaded all the scripts used for this study to https://github.com/R-Huerlimann/Malabar_grouper_genome and the gene models and functional annotation to https://figshare.com/projects/Malabar_grouper_Epinephelus_malabaricus_genome_annotation/199909. This information has been added to the manuscript in lines 600-601 and 609-611.

      Reviewer #3 (Recommendations For The Authors):

      General author response:

      All the recommendations of this reviewer are very relevant and would certainly provide a lot of information, but they constitute a full project in themselves, as they would imply establishing this grouper species as an experimental model in our lab. Currently we only have access to the larval and juvenile stages via a collaboration with the Okinawa Prefectural Sea Farming Center, which is an hour's drive from our lab, and this access is limited to the grouper spawning season. To do everything that is suggested, we would need regular and easy access to the fish. This would require establishing this model in our marine station, which is not possible due to space and time issues. These groupers grow to a very large size (1-2 m in length, and up to 150 kg in weight) and only mature into males after > 6 years.

      First and foremost, I would advise the authors to extend their TH and cortisol levels measurements to the entire developmental time considered in their analysis.

      For the reasons stated above we could not perform these experiments. We must emphasize that the data regarding TH are available for a closely related species (e.g., Epinephelus coioides, de Jesus et al. 1998) and there is no reason to think that the situation will be drastically different in E. malabaricus. In addition, given that we have now studied several coral reef fish species in the same context (clownfish, surgeonfish, damselfish, gobies) we observed that the transcriptomic data are more robust, more sensitive, and more precise than hormone measurements. 

      Consider carrying out in situ hybridisation of TSH with putative CRH receptors to determine if thyrotrophin could be competent to respond to HPA axis signals.

      We agree that studying the interplay between corticoids and thyroid hormones at the neuroendocrine level would be desirable, and we fully agree with the experiment suggested by the reviewer, but this is impossible in our current situation. We are not working with an established animal model like zebrafish or Xenopus, but with a large, long-lived marine fish that reproduces in spawning aggregations and whose husbandry is notoriously difficult.

      Consider conducting cortisol treatment experiments to functionally determine if indeed cortisol is involved in grouper metamorphosis.

      We tried to do TH and cortisol treatments specifically on the early larval stages corresponding to the early TH peak to see how this would impact the development of the fin spines, but our trials were unsuccessful. The larvae at that stage are extremely fragile and even putting them into small volumes of treatment drugs induced massive mortalities. Again, this would mean establishing this grouper species as a model organism and would require a massive effort to improve larval rearing as discussed above. We feel that our data stands on its own in the meantime and adds valuable information to the existing literature by studying a rarely investigated species.

      Responses to comments

      Reviewer #1 (Public Review):

      Weaknesses:

      The manuscript needs proper editing and is not complete. Some wordings lack precision and make it difficult to follow (e.g. line 98 "we assembled a chromosome-scale genome of ..." should read instead "we assembled a chromosome-scale genome sequence of ..."). Also, panel Figure 2E is missing.

      We made the suggested change of adding “sequence” in lines 32 and 121. Concerning additional changes, we have carefully edited our manuscript and looked for any incomplete sections. Unfortunately, it is difficult to see what other issues are being raised here without any further information. 

      As for panel E of figure 2, it is not missing. The panel is located to the right, just below “Target Cells”.

      The shortcomings of the manuscripts are not limited to the writing style, and important technical and technological information is missing or not clear enough, thereby preventing a proper evaluation of the resolution of the genomic resources provided:

      Several RNASeq libraries from different tissues have been built to help annotate the genome and identify transcribed regions. This is fine. But all along the manuscript, gene expression changes are summarized into a single panel where it is not clear at all which tissue this comes from (whole embryo or a specific tissue ?), or whether it is a cumulative expression level computed across several tissues (and how it was computed) etc. This is essential information needed for data interpretation.

      No fertilised eggs or embryos have been sequenced. The individual tissues derived from juvenile fish were used for the genome annotation only, using ISOseq. The whole larval fish were used for the developmental analysis using RNAseq, as well as the genome annotation. We have added additional information in the figures and text that the results shown are from whole larvae, and added more detail to the material and methods section about which type of sample was analysed in which way.

      Specifically, we have added “Lastly, expression levels shown in figures 2-5 are normalised gene counts produced by DESeq2.” to lines 597-598 in the Material and Methods section, “Gene expression data was generated from whole larvae.” to line 191, and “Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.” to the figure legends in lines 247, 286, 372, 404). Additionally, we have added clarifications in lines 489, 497, 530, and 536. 

      The bioinformatic processing, especially of the assembly and annotation, is very poorly described. This is also a sensitive topic, as illustrated by the numerous "assemblathon" and "annotathon" initiatives to evaluate tools and workflows. Importantly, providing configuration files and in-depth description of workflows and parameter settings is highly recommended. This can be made available through data store services and documents even benefit from DOIs. This provides others with more information to evaluate the resolution of this work. No doubt that it is well done, but especially in the field of genome assembly and annotation, high resolution is VERY cost and time-intensive. Not surprisingly, most projects are conditioned by trade-offs between cost, time, and labor. The authors should provide others with the information needed to evaluate this.

      We have added additional information on parameters used in the genome assembly, annotation and transcriptome analysis in lines 549-551, 577, 579, 580, and 582. Additionally, we have uploaded all scripts to github as outlined in the Code and Data Availability section (lines 599-614).

      The genome assembly did not use a specific workflow (e.g., nextflow), but was done with a simple command and standard parameters in IPA. Scaffolding was carried out by Phase Genomics using their standardised proprietary workflow, of which a detailed description provided by Phase Genomics can be found in the supplementary material.

      Quantifications of T3 and T4 levels look fairly low and not so convincing. The work would clearly benefit from a discussion about why the signal is so low and what are the current technological limitations of these quantifications. This would really help (general) readers.

      The T3/T4 levels are consistent with other published work in fish. In the present manuscript for grouper we have a peak level of 1.2 ng/g (1,200 pg/g) of T4 and 0.06 ng/g (60 pg/g) of T3. This is a higher level of T4 and comparable level of T3 to what was found in convict tang (Holzer et al. 2017; Figure 2) with 30 pg/g of T4 and 100 pg/g of T3. Of course, there are also examples with higher levels, such as clownfish (Roux et al. 2023; Figure 1), with 10 ng/g (10,000 pg/g) of T4 and 2 ng/g (2,000 pg/g) of T3.

      The differences could be due to different structure of fish tissues and therefore different hormone extraction efficiency, different hormone measurement protocols, different fish physiology, or different fish size (e.g., the weighing of tiny grouper larvae is difficult and less precise than in convict tang). What is important is not the absolute level but the relative level, which shows the change across larval stages of a species measured with identical extraction and measurement protocols. This means our data is internally consistent and coherent with what the grouper literature says.

      Holzer, Guillaume, et al. "Fish larval recruitment to reefs is a thyroid hormone-mediated metamorphosis sensitive to the pesticide chlorpyrifos." Elife 6 (2017): e27595.

      Roux, Natacha, et al. "The multi-level regulation of clownfish metamorphosis by thyroid hormones." Cell Reports 42.7 (2023).

      Differential analysis highlights up to ~ 15,000 differentially expressed genes (DEG), out of a predicted 26k genes. This corresponds to more than half of all genes. ANOVA-based differential analysis relies on the simple fact that only a minority of genes are DEG. Having >50% DEG is well beyond the validity of the method. This should be addressed, or at least discussed.

      The large number of differentially expressed genes is due to the fact that this is coming from a larval developmental transcriptome going from one day old larva to fully metamorphosed juveniles at around day 60. 

      While DESeq2 indeed works on an assumption that most genes are not differentially expressed, this affects normalisation but not hypothesis testing (Wald test, LRT or ANOVA). Moreover, normalisation in DESeq2 is fairly robust to violations of this assumption. According to the author of DESeq2, Michael Love, DESeq2 uses the median ratio for normalisation, and as long as the numbers of up- and down-regulated genes are relatively even, DESeq2 will be able to handle the data. As part of our general quality control for this project we consulted the MA plots, which do not show any overrepresented up or down expression patterns. Additionally, see Michael Love's comment on comparing different tissues, which is also applicable here when comparing vastly different larval stages (https://support.bioconductor.org/p/63630/):

      “For experiments where all genes increase in expression across conditions, the median ratio method will not be able to capture this difference, but this is typically not the case for a tissue comparison, as there are many "housekeeping" genes with relatively similar expression pattern across tissues.”
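
      The robustness argument above can be illustrated with a small toy calculation. This is a hedged sketch under stated assumptions, not an analysis of our data: the function `depth_ratio`, the nine hypothetical genes and the fold changes are all invented for illustration. In both scenarios six of nine genes (more than half) are differentially expressed and the true sequencing-depth ratio is 2; only the balance between up- and down-regulation differs.

```python
import numpy as np

def depth_ratio(sample_a, sample_b):
    """Depth of B relative to A, estimated from median-of-ratios size factors."""
    counts = np.column_stack([sample_a, sample_b]).astype(float)
    log_counts = np.log(counts)  # toy counts are all positive
    # Pseudo-reference: per-gene geometric mean of the two samples (log space).
    log_ref = log_counts.mean(axis=1, keepdims=True)
    sf = np.exp(np.median(log_counts - log_ref, axis=0))
    return sf[1] / sf[0]

base = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=float)
true_depth = 2.0  # sample B is sequenced twice as deeply as sample A

# Balanced case: three genes up 4x, three genes down 4x, three unchanged.
balanced_fc = np.array([4, 4, 4, 0.25, 0.25, 0.25, 1, 1, 1])
# One-sided case: six genes up 4x, three unchanged.
one_sided_fc = np.array([4, 4, 4, 4, 4, 4, 1, 1, 1])

balanced = depth_ratio(base, base * true_depth * balanced_fc)
one_sided = depth_ratio(base, base * true_depth * one_sided_fc)
```

      With balanced changes the estimated depth ratio equals the true value of 2 even though most genes are differentially expressed, whereas the one-sided case absorbs the biological fold change into the size factors and yields 8. This is why we checked the MA plots for any skew towards up- or down-regulation.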

      Reviewer #3 (Public Review):

      Weaknesses:

      However, the authors make substantial considerations that are not proven by experimental or functional data. In fact, this is a descriptive study that does not provide any functional evidence to support the claims made.

      We agree with the reviewer that our paper lacks functional experiments but despite that, the transcriptomic data clearly show the activation of TH and corticoid pathways during two distinct periods: an early activation between D1 and D10, and a second one between D32 and juvenile stage. These data are interesting as they call for further examination of 1) the existence of an early larval developmental step also involving TH and corticosteroids and 2) the possible interaction of corticoids and TH during metamorphosis. This is a question that is certainly not settled yet in teleost fishes and which is of great interest.

      Point 1) is of particular interest and importance, since this early activation (unique to our knowledge in any teleost fish studied so far) raises a lot of new questions and will certainly be scrutinised by other groups in the years to come, therefore ensuring a good citation impact of this study. We hope that the reviewer, while disagreeing with some of our statements, will recognize that our study will be stimulating at that level and that this is what scientific studies should do.

      We acknowledge the descriptive nature of the data and the lack of functional experiments in the Discussion in lines 443 to 445: “This may suggest that in some aspect, cortisol synthesis could work in concert with TH, as has been shown in several different contexts in amphibians, but functional experiments need to be conducted to confirm this hypothesis.” As stated above, doing such functional experiments would require establishing the grouper as an experimental model in our husbandry, which currently is not possible due to the large size of the adult fish.

      The consideration that cortisol is involved in metamorphosis in teleosts has never been shown, and the only example cited by the authors (REF 20) clearly states that cortisol alone does not induce flatfish metamorphosis. In that work, the authors clearly state that in vivo cortisol treatment had no synergistic effect with TH in inducing metamorphosis. Moreover, in Senegalensis, the sole pre-otic CRH neuron number decreases during metamorphosis, further arguing that, at least in flatfish, cortisol is not involved in flatfish metamorphosis (PMID: 25575457).  

      We will do our best to improve the clarity of the revised manuscript to avoid any misunderstanding about our claims. However, we would like to point out the semantic shift in the reviewer's first sentence: indeed, “being involved” is not the same as “cortisol alone does not induce”. In ref 20 the authors explicitly wrote that “Cortisol further enhanced the effects of both T4 and T3, but was ineffective in the absence of thyroid hormones” and in our view this indeed corresponds to “being involved in metamorphosis”.

      We are not claiming that cortisol alone is involved in metamorphosis as the reviewer suggests, but simply that there is a possible involvement of cortisol together with TH in metamorphosis. We stand by this claim, as we indeed observed an activation of corticoid pathway genes around D32, which is sufficient to say it is involved. We do agree that functional experiments will be needed to properly demonstrate the involvement of corticoids in grouper metamorphosis, but this was not possible in the current study, as it would require setting up a full grouper life cycle in lab conditions, which is beyond the scope of this manuscript.

      We also mentioned in the discussion that the role of corticoids in fish larval development is still debated, and we agree that this remains a contentious issue. We have clarified the Discussion on this point (lines 375-376, lines 439-464).

      We wrote that “There is contrasting evidence of communication between these two pathways during teleost fish larval development with some data suggesting a synergic and other an antagonistic relationship. In terms of synergy, an increase in cortisol level concomitantly with an increase in TH levels has been observed in flatfish [26], golden sea bream [64] and silver sea bream [65]. Cortisol was also shown to enhance in vitro the action of TH on fin ray resorption (phenomenon occurring during flatfish metamorphosis) in flounder[27]. It has also been shown that cortisol regulates local T3 bioavailability in the juvenile sole via regulation of deiodinase 2 in an organ-specific manner [66]. On the antagonistic side, it has been shown that experimentally induced hyperthyroidism in common carp decreases cortisol levels[67], whereas cortisol exposure decreases TH levels in European eel [68]. Given this scattered evidence, the existence of a crosstalk active during teleost larval development and metamorphosis has never been formally demonstrated. The results we obtained in grouper are clearly indicating that HPI axis is activated during both early development and metamorphosis and that cortisol synthesis is activated during early development. This may suggest that in some aspect, cortisol synthesis could work in concert with TH, as has been shown in several different contexts in amphibians [25], but functional experiments need to be conducted to confirm this hypothesis.” In the revised manuscript, we have also added the interesting case of the Senegal sole mentioned by the reviewer.

      In the last revision, we had also added that our results “brought a first insight into the potential role of corticoids in the metamorphosis of E. malabaricus and call for functional experiments directly testing a possible synergy” meaning that we clearly acknowledge that we are only revealing a hypothesis that remains to be tested. We later follow up with a discussion about the most novel observation and focus of our study, the increase in THs and cortisol during early development, which was unexpected and very intriguing. Again, these results suggest that there might be a link between the two, as has been shown in amphibians. This is typically the kind of results that should encourage more investigations into other fish species. Indeed, this has been pointed out by other authors and in particular by Bob Denver (probably the foremost expert on this topic) in Crespi and Denver 2012: “Elevation in HPA/I axis activity has been described prior to Metamorphosis in amphibians and fish, birth in mammals (reviewed in Crespi & Denver 2005a; Wada 2008)”. B. Denver also adds that: “Experiments in which GCs were elevated prior to metamorphosis or prior to hatching or birth (e.g. Weiss, Johnston & Moore 2007) or inhibited by treatments with GC synthesis blockers (e.g. metyrapone) or receptor antagonists (e.g. RU486, Glennemeir & Denver 2002) demonstrate that GCs play a causal role in precipitating these life-history transitions (also reviewed in Crespi & Denver 2005a; Wada 2008).” We believe the reviewer will be convinced by these elements coming from a colleague unanimously respected in the field. 

      Furthermore, the authors need to recognise that the transcriptomic analysis is whole-body and that HPA axis genes are upregulated, which does not mean they are involved in regulating the HPT axis. The authors do not show that in thyrotrophs, any CRH receptor is expressed or in any other HPT axis-relevant cells and that changes in these genes correlate with changes in TSH expression. An in-situ hybridisation experiment showing co-expression on thyrotrophs of HPA genes and TSH could be a good start. However, the best scenario would be conducting cortisol treatment experiments to see if this hormone affects grouper metamorphosis.

      We agree that functional experiments are needed to validate our hypothesis. As the early peaks of expression levels observed for many genes were very intriguing for us, we did carry out thyroid hormone and goitrogen treatments on young grouper larvae to test their effect on the morphological changes. Unfortunately, such experiments, already tricky on metamorphosing larvae, are even more risky on such tiny individuals just after hatching, and we encountered high mortality rates. We must add that because we cannot establish a full grouper life cycle under lab conditions, we have done these experiments in the context of a commercial husbandry system in Japan, which, while excellent, limits the scope of possible experiments. We were thus not able to provide functional validation of our hypothesis. Such experiments will be a full project in themselves, requiring a rearing system suitable for both larval survival and the economic constraints related to drug treatments. We were further limited by the spawning times of the grouper in the operational aquaculture farm, which are restricted to a short period each year. So even if we strongly agree with the necessity of conducting such experiments, we think that this is not in the scope of the present paper, but something future research can explore.

      High TSH and Tg levels usually parallel whole-body TH levels during teleost metamorphosis. However, in this study, high Tg expression levels are only achieved at the juvenile stage, whereas high TSH is achieved at D32, and at the juvenile stage, they are already at their lowest levels.

      This is exactly our point. We observe two peaks in TSH expression, one at D3 and one at D32. The peak at D3 coincides with high thyroid hormone levels on the same day, and while we have not measured TH at D32, existing literature shows that there is a peak in TH during that time (e.g., de Jesus et al., 1998). Similarly, there is a small peak of Tg at D3. Our manuscript focused more on the upregulation of these genes at D3, which has not been reported before in the literature and raised the question of the role of TH so early in the larval development, outside of the metamorphosis period. 

      Regarding the respective levels of TSH and Tg, we first would like to add that their respective order of appearance before metamorphosis (TSH at D32, Tg after) is consistent with what we would expect. We agree however that the strong increase of Tg and TPO expression is later than expected. Therefore, we have added the following sentence in lines 212 to 216: “The respective order of appearance of TSH and Tg (TSH at D32, Tg after) is consistent with what we would expect but a bit later than expected given the morphological transformation. It would be interesting to revisit this in a future series of experiments, with tighter temporal sampling to study how gene expression and morphological transformation aligned.”

      It is very difficult to conclude anything with the TH and cortisol levels measurements. The authors only measured up until D10, whereas they argue that metamorphosis occurs at D32. In this way, these measurements could be more helpful if they focus on the correct developmental time. The data is irrelevant to their hypothesis.

      We respectfully disagree with the reviewer, considering that 1) TH levels have already been investigated in groupers coinciding with pigmentation changes and fin rays resorption (Figure 4 in de Jesus et al, 1998), 2) there is also evidence in numerous fish species that TH level increase is concomitant with increase of TH related genes, and 3) we observed in our data an increase in the expression of TH related genes as well as pigmentation changes and fin rays resorption. Based on our experience in fish metamorphosis and the literature we can say confidently that those observations indicate that metamorphosis is occurring between D32 and the juvenile stage. This clearly shows that our inference is correct. Additionally, we would like to reemphasize that from our experience in several fish species transcriptomic data are more robust and precise than hormone measurements.

      However, as we were surprised by the activation of TH and corticoid pathway genes very early in larval development (at D3), which is clearly outside of the metamorphosis period, we decided to measure TH and cortisol levels during this period to determine whether this surprising early activation indeed corresponded to an increase in both TH and cortisol. As such an observation has never been made in other teleost species (to our knowledge), and as we were wondering whether gene activation was accompanied by a hormonal increase, the measurements we made for TH and cortisol between D1 and D10 are relevant. In order to clarify our message further, we have changed some of the mentions of “metamorphosis” to “larval development” throughout the manuscript and added other improvements to avoid any confusion between the two periods we are studying: early larval development (between D1 and D10) and metamorphosis (between D32 and juvenile stage).

      Moreover, as stated in the previous review, a classical sign of teleost metamorphosis is the upregulation of TSHb and Tg, which does not occur at D32 therefore, it is very hard for me to accept that this is the metamorphic stage. With the lack of TH measurements, I cannot agree with the authors. I think this has to be toned down and made clear in the manuscript that D32 might be a putative metamorphic climax but that several aspects of biology work against it. Moreover, in D10, the authors show the highest cortisol level and lowest T4 and T3 levels. These observations are irreconcilable, with cortisol enhancing or participating in TH-driven metamorphosis.

      We thank the reviewer for this comment, but we think that there might be a misunderstanding here. 

      (1) We clearly observed an increase of TSHb (that occurs between D18 and juvenile stage) and an increase of tg from D32, which coincides with the activation of other genes involved in the TH pathway (dio2, dio3, and also a strong increase of TRb). All this, put in the context of what we know from previous grouper studies, clearly supports our conclusion that TH-regulated metamorphosis starts at around D32 in grouper. We also observed morphological changes such as fin ray resorption and pigmentation changes between D32 and juvenile stage. Such morphological changes have already been associated with metamorphosis in groupers (De Jesus et al 1998), as they occur during the TH level increase, and they also happen to be under the control of TH in grouper (De Jesus et al 1998). Based on this study, but also on studies (conducted on many other teleost species) showing that the increase of TH levels is always associated with an activation of TH pathway genes and with morphological and pigmentation changes, we concluded that metamorphosis of E. malabaricus occurs between D32 and juvenile stage. We have improved the clarity of the manuscript in several places to make sure that our conclusion is based on our transcriptomic and morphological data plus the available literature.

      (2) We clearly observed another activation of TH-related genes earlier in development (between D1 and D10, with a surge of trhrs, tg and tpo at D3). As this activation was very unexpected for us, we decided to focus the analysis of TH levels between D1 and D10, and very interestingly we observed a high level of T4 at D3, indicating that THs are instrumental very precociously in the larval development of the Malabar grouper, which has never been shown before. We stated in lines 224-225 that our “data reinforce the existence of two distinct periods of TH signalling activity, one early on at D3 and one late corresponding to classic metamorphosis at D32”. However, we agree that we could have been clearer and explained that this early activation was very intriguing for us and that we wanted to investigate hormonal levels around that period. Nevertheless, we never claimed anywhere in the manuscript that this early developmental period corresponds to metamorphosis. Something else is occurring and both TH and cortisol seem to be involved, but further experiments need to be conducted to understand their role and their possible interaction. We have added corresponding statements in the abstract (lines 39-43) and discussion (lines 447 to 449).

      (3) Finally, regarding the comment about cortisol enhancing or participating in TH-driven metamorphosis, our data clearly showed an activation of the corticoid pathway genes around metamorphosis (between D32 and juvenile stage), suggesting a potential implication of corticoids in metamorphosis, but we agree with the reviewer that further experiments are needed to test that. We never claimed that cortisol was enhancing or participating in metamorphosis; on the contrary, we are “suggesting a possible interaction between TH and corticoid pathway during metamorphosis”. We also say that our “results brought a first insight into the potential role of corticoids in the metamorphosis of E. malabaricus and call for functional experiments directly testing a possible synergy.” Nonetheless, we agree that some parts of our manuscript can be confusing with regard to cortisol synthesis during metamorphosis, as we did not measure cortisol levels between D32 and juvenile stage. We have therefore made changes throughout the Introduction and Discussion to make this clearer.

      Given this, the authors should quantify whole-body TH levels throughout the entire developmental window considered to determine where the peak is observed and how it correlates with the other hormonal genes/systems in the analysis.

      We did not measure TH levels at later stages as it has already been measured during Epinephelus coioides metamorphosis and the morphological changes observed in this species around the TH peak corresponds to what we observed in Epinephelus malabaricus around the peak of expression of TH pathway genes (see De Jesus et al., 1998 General and Comparative Endocrinology, 112:10-16). The main focus of this manuscript is the novel observation of the existence of an early activation period observed at D3, and for which we needed TH levels to determine if they were involved in another early developmental process (not related to metamorphosis). Our hypothesis is that this early activation might be related to the growth of fin rays necessary to enhance floatability during the oceanic larval dispersal. As we may have arrived at the explanation of this hypothesis too rapidly without setting up the context well enough, we have made changes to the introduction and discussion.

      Even though this is a solid technical paper and the data obtained is excellent, the conclusions drawn by the authors are not supported by their data, and at least hormonal levels should be present in parallel to the transcriptomic data. Furthermore, toning down some affirmations or even considering the different hypotheses available that are different from the ones suggested would be very positive.

      We thank the reviewer for acknowledging the solidity of the method of our paper and the quality of the results. We agree that there were several parts where our message was unclear. We have addressed these points in the revised version of the manuscript to make sure there is no more confusion between the two distinct periods we studied in this paper (early larval development and metamorphosis). We also made sure that our claims about TH/corticoid interaction during both periods remain hypothetical, as we cannot yet, despite trials, support them with functional experiments.

    1. Author response:

      eLife assessment

      This study offers a useful treatment of how the population of excitatory and inhibitory neurons integrates principles of energy efficiency in their coding strategies. The analysis provides a comprehensive characterisation of the model, highlighting the structured connectivity between excitatory and inhibitory neurons. However, the manuscript provides an incomplete motivation for parameter choices. Furthermore, the work is insufficiently contextualized within the literature, and some of the findings appear overlapping and incremental given previous work.

      We thank the Reviewers and the Reviewing Editor for taking the time to provide extremely valuable suggestions and comments, which will help us to substantially improve our paper. In what follows, we summarize our current plan to improve the paper by taking up their suggestions.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: Koren et al. derive and analyse a spiking network model optimised to represent external signals using the minimum number of spikes. Unlike most prior work using a similar setup, the network includes separate populations of excitatory and inhibitory neurons. The authors show that the optimised connectivity has a like-to-like structure, leading to the experimentally observed phenomenon of feature competition. They also characterise the impact of various (hyper)parameters, such as adaptation timescale, ratio of excitatory to inhibitory cells, regularisation strength, and background current. These results add useful biological realism to a particular model of efficient coding. However, not all claims seem fully supported by the evidence. Specifically, several biological features, such as the ratio of excitatory to inhibitory neurons, which the authors claim to explain through efficient coding, might be contingent on arbitrary modelling choices. In addition, earlier work has already established the importance of structured connectivity for feature competition. A clearer presentation of modelling choices, limitations, and prior work could improve the manuscript.

      Thanks for these insights and for this summary of our work.

      Major comments:

      (1) Much is made of the 4:1 ratio between excitatory and inhibitory neurons, which the authors claim to explain through efficient coding. I see two issues with this conclusion: (i) The 4:1 ratio is specific to rodents; humans have an approximate 2:1 ratio (see Fang & Xia et al., Science 2022 and references therein); (ii) the optimal ratio in the model depends on a seemingly arbitrary choice of hyperparameters, particularly the weighting of encoding error versus metabolic cost. This second concern applies to several other results, including the strength of inhibitory versus excitatory synapses. While the model can, therefore, be made consistent with biological data, this requires auxiliary assumptions.

      We will better describe the ratio of numbers of E and I neurons found in real data, as suggested. The first submission already contained an analysis of how this ratio of neuron numbers depends on the weighting of the loss of E and I neurons and on the relative weighting of the encoding error vs the metabolic cost in the loss function (see Fig 6E). We will make sure that these results are suitably expanded and better emphasized in revision. We will also include a new analysis of the dependence of optimal parameters on the relative weighting of encoding error vs metabolic cost in the loss function when studying other parameters (namely: noise intensity, metabolic constant, ratio of mean I-I to E-I connectivity, time constants of single E and I neurons).

      (2) A growing body of evidence supports the importance of structured E-I and I-E connectivity for feature selectivity and response to perturbations. For example, this is a major conclusion from the Oldenburg paper (reference 62 in the manuscript), which includes extensive modelling work. Similar conclusions can be found in work from Znamenskiy and colleagues (experiments and spiking network model; bioRxiv 2018, Neuron 2023 (ref. 82)), Sadeh & Clopath (rate network; eLife, 2020), and Mackwood et al. (rate network with plasticity; eLife, 2021). The current manuscript adds to this evidence by showing that (a particular implementation of) efficient coding in spiking networks leads to structured connectivity. The fact that this structured connectivity then explains perturbation responses is, in the light of earlier findings, not new.

      We agree that the main contribution of our manuscript in this respect is to show how efficient coding in spiking networks can lead to structured connectivity similar to that proposed in the above papers. We apologize if this was not clear enough in the previous version. We will make it clearer in revision. We nevertheless think it useful to report the effects of perturbations within this network because the structure derived in our network is not identical to those studied in the above papers, and because these results give information about how lateral inhibition works in this network. Thus, we will keep presenting it in the revised version, although we will de-emphasize and simplify its presentation to give more emphasis to the novelty of the derivation of this connectivity rule from the principles of efficient coding.

      (3) The model's limitations are hard to discern, being relegated to the manuscript's last and rather equivocal paragraph. For instance, the lack of recurrent excitation, crucial in neural dynamics and computation, likely influences the results: neuronal time constants must be as large as the target readout (Figure 4), presumably because the network cannot integrate the signal without recurrent excitation. However, this and other results are not presented in tandem with relevant caveats.

      We will improve the Limitations paragraph in Discussion, and also anticipate caveats in tandem with results when needed, as suggested.

      (4) On repeated occasions, results from the model are referred to as predictions claimed to match the data. A prediction is a statement about what will happen in the future - but most of the "predictions" from the model are actually findings that broadly match earlier experimental results, making them "postdictions".

      This distinction is important: compared to postdictions, predictions are a much stronger test because they are falsifiable. This is especially relevant given (my impression) that key parameters of the model were tweaked to match the data.

      We will better distinguish between pre- and post-dictions in revision.

      Reviewer #2 (Public Review):

      Summary: In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral-inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength. It thus argues that some of these observations may come as a direct consequence of efficient coding.

      Strengths:

      While many network implementations of efficient coding have been developed, such normative models are often abstract and lacking sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models.

      In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some long-standing puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important.

      Though several of the observations have been reported and studied before (see below), this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.

      Thanks for these insights and for the kind words of appreciation of the strengths of our work.

      Weaknesses:

      Though the text of the paper may suggest otherwise, many of the modeling choices and observations found in the paper have been introduced in previous work on efficient spiking models, thereby making this work somewhat repetitive and incremental at times. This includes the derivation of the network into separate excitatory and inhibitory populations, discussion of physical units, comparison of voltage versus spike-timing correlations, and instantaneous E/I balance, all of which can be found in one of the first efficient spiking network papers (Boerlin et al. 2013), as well as in subsequent papers. Metabolic cost and slow adaptation currents were also presented in a previous study (Gutierrez & Deneve 2019). Though it is perfectly fine and reasonable to build upon these previous studies, the language of the text gives them insufficient credit.

      We will improve the text to make sure that credit to previous studies is more precisely and more clearly given.

      Furthermore, the paper makes several claims of optimality that are not convincing enough, as they are only verified by a limited parameter sweep of single parameters at a time, are unintuitive and may be in conflict with previous findings of efficient spiking networks. This includes the following. Coding error (RMSE) has a minimum at intermediate metabolic cost (Figure 5B), despite the fact that intuitively, zero metabolic cost would indicate that the network is solely minimizing coding error and that previous work has suggested that additional costs bias the output. Coding error also appears to have a minimum at intermediate values of the ratio of E to I neurons (effectively the number of I neurons) and the number of encoded variables (Figures 6D, 7B). These both have to do with the redundancy in the network (number of neurons for each encoded variable), and previous work suggests that networks can code for arbitrary numbers of variables provided the redundancy is high enough (e.g., Calaim et al. 2022). Lastly, the performance of the E-I variant of the network is shown to be better than that of a single cell type (1CT: Figure 7C, D). Given that the E-I network is performing a similar computation as to the 1CT model but with more neurons (i.e., instead of an E neuron directly providing lateral inhibition to its neighbor, it goes through an interneuron), this is unintuitive and again not supported by previous work. These may be valid emergent properties of the E-I spiking network derived here, but their presentation and description are not sufficient to determine this.

      We are addressing this issue in two ways. First, we will present results of joint sweeps of pairs of parameters whose joint variations are expected to influence optimality in a way that cannot be understood by varying one parameter at a time. Namely, we plan to vary jointly the noise intensity and the metabolic constant, as well as the ratio of E to I neuron numbers and the ratio of mean I-I to E-I connectivity. Second, we will identify a reasonable/realistic range of possible variations of each individual parameter, perform a Monte Carlo search for the optimal point within this range, and compare the results so obtained with those expected from the understanding gained by varying one or two parameters at a time. We will also add the suggested citation to Calaim et al. 2022 in regard to the points discussed above.
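
      To illustrate the two planned analyses, the following is a minimal sketch only, not the code used for the paper. It assumes a hypothetical two-parameter function `toy_loss` standing in for the network loss (the real loss would come from simulating the spiking network), and the parameter names, ranges, and sample counts are placeholders. The sketch contrasts a joint grid sweep over a pair of parameters with a Monte Carlo (uniform random) search within bounded ranges.

```python
import itertools
import random


def toy_loss(noise, metabolic):
    # Hypothetical stand-in for the network loss; NOT the model from the paper.
    # The cross term couples the two parameters, so the best value of one
    # depends on the other and cannot be found by sweeping one at a time.
    return (noise - 2.0) ** 2 + (metabolic - 1.0) ** 2 + 0.5 * noise * metabolic


def grid_sweep(noise_values, metabolic_values):
    """Joint sweep: evaluate every (noise, metabolic) pair, return the best pair."""
    pairs = itertools.product(noise_values, metabolic_values)
    return min(pairs, key=lambda pair: toy_loss(*pair))


def monte_carlo_search(bounds, n_samples=5000, seed=0):
    """Draw uniform random points inside the bounds and keep the lowest loss."""
    rng = random.Random(seed)
    best_point, best_loss = None, float("inf")
    for _ in range(n_samples):
        point = tuple(rng.uniform(low, high) for low, high in bounds)
        loss = toy_loss(*point)
        if loss < best_loss:
            best_point, best_loss = point, loss
    return best_point, best_loss


# Assumed, purely illustrative ranges: both parameters between 0 and 4.
grid = [i / 10 for i in range(41)]
grid_best = grid_sweep(grid, grid)
mc_best, mc_loss = monte_carlo_search([(0.0, 4.0), (0.0, 4.0)])
```

      If the two searches agree on the location of the minimum, this supports the picture obtained from lower-dimensional sweeps; a disagreement would indicate interactions among parameters that the pairwise sweeps miss.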

      We will improve the comparison between the Excitatory-Inhibitory and the 1-Cell-Type model (see reply to the suggestions of Referee 3 for more details).

      Alternatively, the methodology of the model suggests that ad hoc modeling choices may be playing a role. For example, an arbitrary weighting of coding error and metabolic cost of 0.7 to 0.3, respectively, is chosen without mention of how this affects the results. Furthermore, the scaling of synaptic weights appears to be controlled separately for each connection type in the network (Table 1), despite the fact that some of these quantities are likely linked in the optimal network derivation. Finally, the optimal threshold and metabolic constants are an order of magnitude larger than the synaptic weights (Table 1). All of these considerations suggest one of the following two possibilities. One, the model has a substantial number of unconstrained parameters to tune, in which case more parameter sweeps would be necessary to definitively make claims of optimality. Or two, parameters are being decoupled from those constrained by the optimal derivation, and the optima simply corresponds to the values that should come out of the derivation.

      In the previously submitted manuscript we presented both the encoding error and the metabolic cost separately as a function of the parameters, so that readers could get an understanding of how stable optimal parameters would be to the change of the relative weighting of encoding error and metabolic cost. We will improve this work by adding the suggested calculations to provide quantitative measures of the dependence of the optimal network parameters and configurations on this relative weighting.
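
      As a minimal sketch of such a calculation, under stated assumptions: the loss is taken to be a convex combination of encoding error and metabolic cost, with the 0.7 to 0.3 weighting mentioned by the reviewer as the default. The function names and the numerical values below are hypothetical and chosen only to show how the optimal setting can shift with the weighting; they are not results from the model.

```python
def weighted_loss(encoding_error, metabolic_cost, w_error=0.7):
    """Convex combination of encoding error and metabolic cost."""
    return w_error * encoding_error + (1.0 - w_error) * metabolic_cost


def optimal_index(errors, costs, w_error):
    """Index of the parameter setting that minimises the weighted loss."""
    losses = [weighted_loss(e, c, w_error) for e, c in zip(errors, costs)]
    return min(range(len(losses)), key=losses.__getitem__)


# Hypothetical values for four settings of one network parameter:
# the encoding error falls while the metabolic cost rises.
errors = [4.0, 2.0, 1.0, 0.8]
costs = [0.5, 1.0, 3.0, 6.0]

# Optimal setting for several relative weightings of the encoding error.
optimum_by_weight = {w: optimal_index(errors, costs, w) for w in (0.3, 0.7, 0.95)}
```

      Reporting the optimum for each weighting, rather than for a single fixed weighting, is one way to quantify how stable the optimal parameters are.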

      Reviewer #3 (Public Review):

      Summary: In their paper the authors tackle three things at once in a theoretical model: how can spiking neural networks perform efficient coding, how can such networks limit the energy use at the same time, and how can this be done in a more biologically realistic way than previous work?

      They start by working from a long-running theory on how networks operating in a precisely balanced state can perform efficient coding. First, they assume split networks of excitatory (E) and inhibitory (I) neurons. The E neurons have the task to represent some lower dimensional input signal, and the I neurons have the task to represent the signal represented by the E neurons. Additionally, the E and I populations should minimize an energy cost represented by the sum of all spikes. All this results in two loss functions for the E and I populations, and the networks are then derived by assuming E and I neurons should only spike if this improves their respective loss. This results in networks of spiking neurons that live in a balanced state, and can accurately represent the network inputs.

      They then investigate in-depth different aspects of the resulting networks, such as responses to perturbations, the effect of following Dale's law, spiking statistics, the excitation (E)/inhibition (I) balance, optimal E/I cell ratios, and others. Overall, they expand on previous work by taking a more biological angle on the theory and showing the networks can operate in a biologically realistic regime.

      Strengths:

      (1) The authors take a much more biological angle on the efficient spiking networks theory than previous work, which is an essential contribution to the field.

      (2) They make a very extensive investigation of many aspects of the network in this context, and do so thoroughly.

      (3) They put sensible constraints on their networks, while still maintaining the good properties these networks should have.

      Thanks for this summary and for these kind words of appreciation of the strengths of our work.

      Weaknesses:

      (1) The paper has somewhat overstated the significance of their theoretical contributions, and should make much clearer what aspects of the derivations are novel. Large parts were done in very similar ways in previous papers. Specifically: the split into E and I neurons was also done in Boerlin et al (2008) and in Barrett et al (2016). Defining the networks in terms of realistic units was already done by Boerlin et al (2008). It would also be worth it to discuss Barrett et al (2016) specifically more, as there they also use split E/I networks and perform biologically relevant experiments.

      We will improve the text to make sure that credit to previous studies is more precisely and more clearly given.

      (2) It is not clear from an optimization perspective why the split into E and I neurons and following Dale's law would be beneficial. While the constraints of Dale's law are sensible (splitting the population in E and I neurons, and removing any non-Dalian connection), they are imposed from biology and not from any coding principles. A discussion of how this could be done would be much appreciated, and in the main text, this should be made clear.

      We indeed removed non-Dalian connections because having only connections respecting Dale’s law is a major constraint for biological plausibility. Our logic was to consider efficient coding within the space of networks that satisfy this (and other) biological plausibility constraints. We did not intend to claim that removing the non-Dalian connections was the result of an analytical optimization. However, to get better insights into how Dale’s Law constrains or influences the design of efficient networks, we added a comparison of the coding properties of networks that either do or do not satisfy Dale’s law. We apologize if this was not sufficiently clear in the previous version and we will clarify this in revision. 

      (3) Related to the previous point, the claim that the network with split E and I neurons has a lower average loss than a 1 cell-type (1-CT) network seems incorrect to me. Only the E population coding error should be compared to the 1-CT network loss, or the sum of the E and I populations (not their average). In my author recommendations, I go more in-depth on this point.

      We will perform the suggested detailed comparisons between the network loss in the 1CT-model and E-I model and then revise or refine conclusions if and as needed, according to the results we will obtain.

      (4) While the paper is supposed to bring the balanced spiking networks they consider in a more experimentally relevant context, for experimental audiences I don't think it is easy to follow how the model works, and I recommend reworking both the main text and methods to improve on that aspect.

      We will try to make the presentation of the model more accessible to a non-computational audience.

      Assessment and context: Overall, although much of the underlying theory is not necessarily new, the work provides an important addition to the field. The authors succeeded well in their goal of making the networks more biologically realistic, and incorporating aspects of energy efficiency. For computational neuroscientists, this paper is a good example of how to build models that link well to experimental knowledge and constraints, while still being computationally and mathematically tractable. For experimental readers, the model provides a clearer link between efficient coding spiking networks to known experimental constraints and provides a few predictions.

      Thanks for these kind words. We will make sure that these points emerge more clearly and in a more accessible way from the revised paper.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The endocannabinoid system (ECS) components are dysregulated within the lesion microenvironment and systemic circulation of endometriosis patients. Using endometriosis mouse models and genetic loss of function approaches, Lingegowda et al. report that canonical ECS receptors, CNR1 and CNR2, are required for disease initiation, progression, and T-cell dysfunction.

      Strengths:

      The approach uses genetic approaches to establish in vivo causal relationships between dysregulated ECS and endometriosis pathogenesis. The experimental design incorporates both bulk and single-cell RNAseq approaches, as well as imaging mass spectrometry to characterize the mouse lesions. The identification of immune-related and T-cell-specific changes in the lesion microenvironment of CNR1 and CNR2 knockout (KO) mice represents a significant advance.

      Weaknesses:

      Although the mouse phenotypic analyses involve a detailed molecular characterization of the lesion microenvironment using genomic approaches, detailed measurements of lesion size/burden and histopathology would provide a better understanding of how CNR1 or CNR2 loss contributes to endometriosis initiation and progression. The cell or tissue-specific effects of the CNR1 and CNR2 are not incorporated into the experimental design of the studies. Although this aspect of the approach is recognized as a major limitation, global CNR1 and CNR2 KO may affect normal female reproductive tract function, ovarian steroid hormone levels, decidualization response, or lead to preexisting alterations in host or donor tissues, which could affect lesion establishment and development in the surgically induced, syngeneic mouse model of endometriosis.

      We appreciate the reviewer's thoughtful and constructive feedback. We agree that the additional measurements of lesion size/burden and histopathology would provide valuable insights into the specific contributions of CNR1 and CNR2 to endometriosis progression. However, the focus of this study was on assessing the alterations in the complex immune microenvironment due to the absence of CNR1 and CNR2, given their close relation in regulating immune cell populations. We plan to incorporate these measurements in future studies to further strengthen the understanding of the disease pathogenesis. Regarding the potential effects of global knockout, the reviewer raises a valid concern. To address this, we will explore cell- and/or tissue-specific knockout models in future experiments to better isolate the direct effects of CNR1 and CNR2 on the disease process, while minimizing potential confounding factors from systemic alterations.

      Reviewer #2 (Public Review):

      Summary:

      The endocannabinoid system (ECS) regulates many critical functions, including reproductive function. Recent evidence indicates that dysregulated ECS contributes to endometriosis pathophysiology and the microenvironment. Therefore, the authors further examined the dysregulated ECS and its mechanisms in endometriosis lesion establishment and progression using two different endometrial sources of mouse models of endometriosis with CNR1 and CNR2 knockout mice. The authors presented differential gene expressions and altered pathways, especially those related to the adaptive immune response in CNR1 and CNR2 ko lesions. Interestingly, the T-cell population was dramatically reduced in the peritoneal cavity lacking CNR2, and the loss of proliferative activity of CD4+ T helper cells. Imaging mass cytometry analysis provided spatial profiling of cell populations and potential relationships among immune cells and other cell types. This study provided fundamental knowledge of the endocannabinoid system in endometriosis pathophysiology.

      Strengths:

      Dysregulated ECS and its mechanisms in endometriosis pathogenesis were assessed using two different endometrial sources of mouse models of endometriosis with CNR1 and CNR2 knockout mice. Not only endometriotic lesions, but also peritoneal exudate (and splenic) cells were analyzed to understand the specific local disease environment under the dysregulated ECS.

      The transcriptional profiles and pathways, immune cell profiles, and spatial profiles of cell populations support altered immune cell populations and their disrupted functions in endometriosis pathogenesis via dysregulation of the ECS.

      In line 386: Role of CNR2 in T cells. The finding that CD3+ T cells are nearly absent in the peritoneal cavity of CNR2 ko mice is intriguing.

      The interpretation of the results is well-described in the Discussion.

      Weaknesses:

      The study was terminated and characterized 7 days after EM induction surgery without the details for selecting the time point to perform the experiments.

      The authors also mentioned that altered eutopic endometrium contributes to the establishment and progression of endometriosis. This reviewer agrees with lines 324-325. If so, DEGs are likely identified between eutopic endometrium (with/without endometriosis lesion induction) and ectopic lesions. It would be nice to see the data (even though using publicly available data sets).

      Figure 7 CDEF. The results of the statistical analyses and analyzed sample numbers should be added. Lines 444-450 cannot be reviewed without them.

      This reviewer agrees with lines 498-500. In contrast, retrograded menstrual debris is not decidualized. The section could be modified to avoid misunderstanding.

      We would like to thank the reviewer for insightful comments, suggestions and acknowledging the importance of the work presented in this manuscript.

      Regarding the 7-day time point, we provided a rationale in lines 479-481, but we agree that it is not sufficient; we have therefore provided additional details on the selection of the 7-day time point in the Methods section (Mouse model of EM). We have also noted the suggestion to compare differentially expressed genes in the eutopic endometrium vs ectopic lesions. Since there are publications comparing eutopic vs ectopic gene expression patterns (PMIDs: 33868805 and 18818281), including a study exploring the ECS genes in the endometrium throughout different menstrual cycles (PMID: 35672435), we believe additional analysis using the same datasets may not yield new information. However, we see the value in the reviewer’s comment, and we will look at gene expression patterns in the uterus vs endometriosis-like lesions in our future studies with tissue- or cell-specific CNR1 and CNR2 knockout models to understand the functional relevance of the ECS in endometriosis initiation.

      Since the IMC study was exploratory for proof of concept, we did not have enough biological replicates for meaningful statistical validation (n = 2-3). We have clarified this information in the methods, results, and figure legends for appropriately representing the limitations of the current setup.

      Finally, we appreciate the feedback on the section discussing retrograded menstrual debris. Even though menstrual debris may not be decidualized, some endometriotic lesions are able to decidualize in response to cyclic estrogen and progesterone (PMID: 26450609), similar to the endometrium in the uterine cavity. We have clarified this in the revised MS.

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      The mechanism of how alterations in ECS contribute to the observed cellular and molecular changes is unclear. Connecting CNR1 or CNR2 function to a specific cell type or cellular process would provide a more detailed understanding of how dysregulated ECS contributes to endometriosis pathogenesis.

      We agree that integrating the functions of CNR1 or CNR2 to specific cell types or cellular processes would strengthen the mechanistic insights presented in our study. This would help elucidate specific pathways by which dysregulated ECS leads to the alterations in immune cell populations, gene expression profiles, and other key aspects of endometriosis development and progression. This is a rapidly evolving field and at this stage, we do not have published information to reflect on this aspect in the revised manuscript.

      (1) As mentioned in the text, the ECS components being studied are widely expressed and may affect multiple aspects of endometriosis pathogenesis and symptomatology. However, the cell or tissue-specific effects of the CNR1 and CNR2 are not incorporated into the experimental design of the studies. Although these limitations are mentioned in the discussion, it is important to know if global CNR1 and CNR2 KO affect normal female reproductive tract function, ovarian steroid hormone levels, decidualization response, or if preexisting alterations in host or donor tissues affect lesion development in the surgically induced, syngeneic mouse model of endometriosis. This would also be the case in studies on immune system dysfunction or lesion microenvironment, as it is possible preexisting immune system dysfunction following CNR1 or CNR2 loss could alter the disease trajectory and lead to a misinterpretation of the findings. Some of these potential confounders could be addressed using crossover approaches in Figure 1A experimental design, but the donor tissues are reported to be matched to the recipients based on genotype.

      The reviewer raised an excellent point that the widespread expression of the ECS components studied in our manuscript may affect multiple aspects of endometriosis pathogenesis and symptomatology. Indeed, the cell- or tissue-specific effects of CNR1 and CNR2 knockout are not fully incorporated into our experimental design, which could lead to potential confounding factors that may affect the interpretation of some of our findings. However, as outlined in our previous comments, in future studies we will incorporate tissue/cell-specific knockouts, as well as the crossover approaches, to elucidate whether the loss of CNR1 and CNR2 function is lesion-driven. We agree that it is important to understand the impact of global CNR1 and CNR2 knockout on normal female reproductive tract function, ovarian steroid hormone levels, decidualization response, and other potential preexisting alterations in the host or donor tissues that could influence lesion development in the syngeneic mouse model of endometriosis. As outlined in the MS (lines 59-62), there are studies highlighting pregnancy-specific impacts, including on implantation and impaired primary decidual zone formation. We did not find any baseline alterations in the systemic immune profiles between the CNR1 and CNR2 knockout mice and the WT mice without EM induction. However, the uterine environment has not been assessed to understand the baseline immune profile between the knockout mice and WT mice. We agree with the reviewer that preexisting immune system dysfunction following CNR1 or CNR2 loss could alter the disease trajectory related to immune system dysfunction or the lesion microenvironment. We have highlighted this in the limitations section.

      (2) The phenotypic characterization of the endometriosis mouse model with or without CNR1 or CNR2 KO is very limited. To better understand how the observed cellular and molecular alterations correlate with endometriosis pathogenesis and severity in CNR1 and CNR2 KO mice, a detailed characterization of lesion size differences and histopathology should be made. Importantly, the histopathological characterization of the lesions would complement the imaging mass spectrometry findings.

      We agree that more detailed characterization of the endometriosis lesions in our CNR1 and CNR2 knockout mouse models is required. As is evident from several of our previous publications, we have focused on detailed histopathological characterization of endometriotic lesions in our syngeneic mouse model of endometriosis, including a multiple time course study (Symons et al, 2020, FASEB). In the present investigation, we focused on cataloging spatial and transcriptomic changes, as we do not currently have any information on the global influence of CNR1 and CNR2 knockout on the endometriosis lesion microenvironment. Because we prioritized this aspect, we were not able to provide detailed histological assessment of lesions. However, the IMC analysis provides a detailed, spatially resolved profile of the cellular composition and interactions within the endometriotic lesions, which we believe offers valuable insights into the mechanisms by which the dysregulated ECS may contribute to endometriosis pathogenesis. This quantitative, high-dimensional approach complements the transcriptional profiling and other analyses we have performed.

      (3) Given the effect sizes and variance observed with the ECS ligand measurements, an N = 4-5 biological samples for mouse phenotypic studies seems too low.

      The reviewer raises a valid point about low sample size. As elaborated earlier, this was a proof of principle study to capture biologically significant alterations within the lesion and surrounding peritoneal microenvironment in the absence of CNR1 and CNR2 receptors. This information is crucial for establishing the potential mechanisms by which the dysregulated ECS may contribute to the pathogenesis of endometriosis. Now that we have established the framework and baseline understanding of immune-inflammatory alterations, we will refine our future experimental approaches and include more samples if it becomes necessary.

      Reviewer #2 (Recommendations For The Authors):

      It is hard to read the labeling of figures. Please increase the font size of each figure.

      We have increased the font size of the labels where necessary to improve the readability.

      Supplementary Data 1, Table 1 seems like Supplementary Table 1. Please use the same labeling of the Supplementary tables and figures to avoid confusion.

      We have updated the labeling accordingly and ensured that all supplementary tables and figures are consistently labeled.

      This reviewer suggests depositing RNA-seq and IMC data to NCBI etc. and listing the accession number in the MS.

      Thank you for your recommendation to deposit the RNA-seq and imaging mass cytometry (IMC) data from our study in public repositories such as NCBI. We appreciate your suggestion, as data sharing is an important aspect of scientific transparency and reproducibility. Bulk mRNA sequencing data has been attached as a supplementary file and IMC data has been deposited on Mendeley Data (DOI: 10.17632/2ptns5yhzh.1).

      Please clarify L363.

      We have clarified this in the revised MS. The revised text now reads: “However, we did not find the same differences (T cell-related genes) in the UnD lesions of CNR2 k/o mice. Moreover, UnD lesions of CNR2 k/o mice showed significantly low number of DEGs (11 compared to 65 in the DD lesions from CNR2 k/o mice) suggesting a decidualization dependent response (Supplementary Data 3).”

      Figure 7B: It is hard to see/understand the results in L438-440. It might be helpful if % is added to the figure.

      We have added more tick marks to the y-axis of Figure 7B to make it easier for the reader to interpret the percentages of the different cell types.

      Figure 7 legend: 2nd D should be G.

      We have revised the legend accordingly.

      Supplementary Figure 6: It seems immune cells are clustered in CN1, which is different from Figure 7. To easily understand Suppl Fig 6AB, please add some details in the legend.

      We have revised the legend as suggested.

      The revised legend now reads: “A, B Representative image of 8 distinct cell types from CN analysis of DD and UnD lesions from WT, CNR1 k/o, and CNR2 k/o mice, respectively. C Heatmap representation of CN analysis shows distinct clustering patterns observed in the UnD lesions among the different genotypes. The clustering reveals distinct spatial patterns of immune cell populations within the UnD lesions, which appear to differ from the observations in Figure 7G. This suggests potential spatial heterogeneity in the immune landscape of EM like lesions under conditions of decidualization.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study reports on a previously unrecognized function of ATG6 in plant immunity. The work is valuable because it proposes a direct interaction between ATG6 and a well-studied salicylic acid receptor protein, NPR1, which may interest researchers investigating plant immunity regulation. While the data presented are compelling, more information regarding the specificity of ATG6's role would improve the overall impact of the study, especially with an eye towards consistency with prior work.

      We also genuinely thank the editor and reviewers for the constructive and helpful suggestions and comments. These comments have greatly improved the quality and thoroughness of our manuscript. We have carefully studied these comments and have made the appropriate changes as far as possible. Additionally, some minor errors were also corrected during the revision process. New text is shown in blue in the revised manuscript. Our responses to the reviewer's comments are provided below each respective comment.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors showed that autophagy-related genes are involved in plant immunity by regulating the protein level of the salicylic acid receptor, NPR1.

      Strengths:

      The experiments are carefully designed and the data is convincing. The authors did a good job of understanding the relationship between ATG6 and NPR1.

      Thank you very much for recognizing our research.

      Weaknesses:

      - The authors can do a few additional experiments to test the role of ATG6 in plant immunity. I recommend the authors to test the interaction between ATGs and other NPR1 homologs (such as NPR2).

      Thank you for your valuable feedback. The Arabidopsis NPR family comprises six members: NPR1, NPR2, NPR3, NPR4, NPR5/PETIOLE 1 (BOP1), and NPR6/BOP2. NPR3/4 function in tandem as negative regulators to modulate SA signaling and plant immune responses (Ding et al., 2018). Similar to NPR1, NPR2 acts as a positive regulator of SA signaling (Castello et al., 2018). NPR5/BOP1 and NPR6/BOP2 primarily participate in the regulation of plant growth and development (McKim et al., 2008). This study specifically investigates the correlation between ATG6 and NPRs in plant resistance to pathogenic bacteria. Consequently, we experimentally confirmed the interaction between ATG6 and NPR1, NPR3, and NPR4 (Fig. 1 and Fig. S1 in the revised manuscript). It would be intriguing to further explore the interactions between ATG6 and other NPRs in the context of regulating plant growth and development in future research endeavors.

      -The concentration of SA used in the experiment (0.5-1 mM) seems pretty high. Does a lower concentration of SA induce ATG6 accumulation in the nucleus?

      Thank you for pointing this out. The NPR1 protein is known to be unstable and prone to degradation through the 26S proteasome pathway (Spoel et al., 2009; Saleh et al., 2015). Consequently, to investigate the function of NPR1, many scientists and research groups typically employ higher concentrations of SA (e.g., 0.5 mM, 1 mM, or even 5 mM) to elucidate its role (Spoel et al., 2009; Fu et al., 2012; Lee et al., 2015; Saleh et al., 2015; Skelly et al., 2019; Zavaliev et al., 2020; Chen et al., 2021a). In our study, we observed an interaction between ATG6 and NPR1. To enhance the detection of the NPR1 protein, we standardized the SA concentration (Arabidopsis was treated with 0.5 mM SA; Tobacco was treated with 1 mM SA) used in our experiments. Subsequently, we analyzed the nuclear accumulation of ATG6 or NPR1 using a relatively high SA concentration (Arabidopsis was treated with 0.5 mM SA; Tobacco was treated with 1 mM SA), consistent with concentrations used in previous studies (Spoel et al., 2009; Lee et al., 2015; Saleh et al., 2015; Skelly et al., 2019; Zavaliev et al., 2020; Chen et al., 2021a).

      -Does the silencing of ATG6 affect the cell death (or HR) triggered by AvrRPS4?

      Thank you for pointing this out. In this study, we examined changes in Pst DC3000/avrRps4-induced cell death in Col, amiRNAATG6 # 1, amiRNAATG6 # 2, npr1, NPR1-GFP, ATG6-mCherry and ATG6-mCherry × NPR1-GFP plants. The results of trypan blue staining showed that Pst DC3000/avrRps4-induced cell death in npr1, amiRNAATG6 # 1 and amiRNAATG6 # 2 was significantly higher compared to Col (Fig. S15 in the revised manuscript). Conversely, Pst DC3000/avrRps4-induced cell death in ATG6-mCherry, NPR1-GFP and ATG6-mCherry × NPR1-GFP was significantly lower compared to Col. Notably, Pst DC3000/avrRps4-induced cell death in ATG6-mCherry × NPR1-GFP was significantly lower compared to ATG6-mCherry and NPR1-GFP (Fig. S15 in the revised manuscript). These results suggest that ATG6 and NPR1 cooperatively inhibit Pst DC3000/avrRps4-induced cell death. The relevant description can be found in lines 394-404 of the revised manuscript.

      -SA and NPR1 are also required for immunity and are activated by other NLRs (such as RPS2 and RPM1). Is ATG6 also involved in immunity activated by these NLRs?

      Thank you for your valuable comments. The most notable event in the NLR-mediated ETI immune response is the induction of hypersensitive response-programmed cell death (HR-PCD) (Jones and Dangl, 2006; Yuan et al., 2021). SA plays a dual role in the ETI response. On one hand, the accumulation of SA during the R gene-mediated ETI defense response is directly linked to the onset of HR-PCD (Nawrath and Metraux, 1999). SA and NPR1 can enhance the ETI response by regulating the expression of downstream target genes (Falk et al., 1999; Feys et al., 2001; Ding et al., 2018; Liu et al., 2020). On the other hand, the activation of SA signaling can have a negative regulatory effect on HR-PCD during the ETI response. High levels of SA have been shown to significantly inhibit HR-PCD triggered by the avrRpt2 effector (Rate and Greenberg, 2001; Devadas and Raina, 2002; Jurkowski et al., 2004). Rate et al. discovered that the inhibition of HR-PCD by SA relies on NPR1 (Rate and Greenberg, 2001).

      Arabidopsis AtATG6 or its homologs in other species (such as NbBECLIN1, TaATG6s, etc.) have been identified as positive regulators in plant immunity, playing a crucial role in inhibiting cell death and preventing invasion by pathogenic microorganisms (Liu et al., 2005; Patel and Dinesh-Kumar, 2008; Yue et al., 2015). Patel et al. demonstrated that, akin to autophagy-deficient mutants previously documented, AtATG6 antisense (AtATG6-AS) plants treated with Pst DC3000/avrRpm1 exhibited diffuse cell death, indicating the necessity of ATG6 in restricting cell death (Patel and Dinesh-Kumar, 2008). In tobacco, deficiencies in BECLIN 1 result in the onset of diffuse HR-PCD, underscoring the essential role of BECLIN 1 in limiting HR-PCD (Liu et al., 2005). Despite the genetic evidence supporting the critical function of ATG6 in plant immunity, the precise molecular mechanisms through which ATG6 impedes the invasion of pathogenic microorganisms remain elusive.

      In our study, we uncovered that ATG6 interacts with NPR1 to hinder pathogen invasion and inhibit the initiation of cell death. In animals, members of the NLR family have been observed to interact with the autophagy-related protein LC3 to inhibit the survival of pathogens (Zhang et al., 2019). Similar mechanisms may exist in plants. However, whether NLRs directly activate ATG6 through interaction, and how NPR1-ATG6 interactions relate to NLR-mediated plant immunity, remain to be explored and require further investigation.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Zhang et al. explores the effect of autophagy regulator ATG6 on NPR1-mediated immunity. The authors propose that ATG6 directly interacts with NPR1 in the nucleus to increase its stability and promote NPR1-dependent immune gene expression and pathogen resistance. This novel role of ATG6 is proposed to be independent of its role in autophagy in the cytoplasm. The authors demonstrate through biochemical analysis that ATG6 interacts with NPR1 in yeast and very weakly in vitro. They further demonstrate using overexpression transgenic plants that in the presence of ATG6-mcherry the stability of NPR1-GFP and its nuclear pool is increased.

      However, the overall conclusions of the study are not well supported experimentally. The significance of the findings is low because of their mostly correlational nature, and lack of consistency with earlier reports on the same protein.

      Thank you for your valuable and constructive suggestions. In this article, we unveil a novel relationship in which ATG6 positively regulates NPR1 in plant immunity (Fig. 8 in the revised manuscript). ATG6 interacts with NPR1 to synergistically enhance plant resistance by regulating NPR1 protein levels, stability, nuclear accumulation, and formation of SINCs-like condensates. This may be of interest to researchers studying the regulation of plant immunity. While there may be minor flaws in our current study, the significance of these findings cannot be overstated, as they have the potential to redirect scientific attention towards uncovering novel functions for autophagy genes.

      Based on the integrity and quality of the data as well as the depth of analysis, it is not yet clear if ATG6 is a specific regulator of NPR1 or if it is affecting NPR1's stability indirectly, through inducing an elevation of SA levels in plants. As such, the current study demonstrates a correlation between overexpression of ATG6, SA accumulation, and NPR1 stability, however, whether and how these components work together is not yet demonstrated.

      Thank you for your valuable feedback. Although, as the reviewer noted, our current data may have some flaws, scientific research is an ongoing process and we are confident that future studies will be even better. At a minimum, the current results show that this study reports a previously undiscovered function of ATG6 in plant immunity. We propose a direct interaction between ATG6 and a well-studied salicylic acid receptor protein, NPR1. We unveil a novel relationship in which ATG6 positively regulates NPR1 in plant immunity (Fig. 8 in the revised manuscript). ATG6 interacts with NPR1 to synergistically enhance plant resistance by regulating NPR1 protein levels, stability, nuclear accumulation, and formation of SINCs-like condensates. This may be of interest to researchers studying the regulation of plant immunity.

      Based on the provided biochemical data, it is not yet clear if the ATG6 functions specifically through NPR1 or through its paralogs NPR3 and NPR4, which are negative regulators of immunity. It is quite possible that interaction with NPR1 (or any NPR) is not the major regulatory step in the activity of ATG6 in plant immunity. The effect of ATG6 on NPR1 could well be indirect, through a change in the SA level and redox environment of the cell during the immune response. Both SA level and redox state of the cell were reported to induce accumulation of NPR1 in the nucleus and increase in stability.

      Thank you for your valuable feedback. In this study, we validated the interaction between ATG6 and NPR1 through various approaches and identified the key regions mediating their interaction. Our findings indicate that ATG6 interacts with NPR1 to synergistically enhance plant resistance by regulating NPR1 protein levels, stability, nuclear accumulation, and the formation of SINC-like condensates. These results clearly demonstrate the involvement of ATG6 in the regulation of NPR1. Furthermore, we also found that ATG6 interacts with NPR3/4 (Fig. S1 in the revised manuscript). This is particularly relevant given that NPR3 and NPR4 have been shown to act as adaptors for the ubiquitin E3 ligase Cullin 3 (CUL3) to regulate the degradation of NPR1. Therefore, whether ATG6 regulates NPR1 through its interactions with NPR3/4 is an intriguing question worth exploring in future studies. We appreciate the reviewer's concerns and are committed to addressing them in our future research to further elucidate the complex regulatory mechanisms involving ATG6, NPR1, and other key players in plant immunity.

      Another major issue is the poor quality of the subcellular analyses. In contradiction to previous studies, ATG6 in this study is not localized to autophagosome puncta, which suggests that the soluble localization pattern presented here does not reflect the true localization of ATG6. Even if the authors propose a novel, non-canonical nuclear localization for ATG6, they still should have detected the canonical autophagy-like localization of this protein.

      Thank you for your valuable feedback. We conducted predictions at NLS Mapper (https://nls-mapper.iab.keio.ac.jp/cgi-bin/NLS_Mapper_form.cgi) and identified two bipartite NLSs in ATG6, with the sequences "MRKEEIPDKSRTIPIDPNLPKWVCQNCHHS" and "DPNLPKWVCQNCHHS LTIVGVDSYAGKFFNDP". To further elucidate the nuclear localization of ATG6, we introduced Agrobacterium tumefaciens carrying ATG6-GFP into nls-mCherry tobacco leaves through transient transformation. Subsequently, we observed the localization of ATG6-GFP, along with the canonical autophagy-like patterns. Our findings revealed fluorescence signals of ATG6-GFP in both the cytoplasm and nuclei (Figure 2b). The nuclear-localized ATG6-GFP overlapped with the nuclear-localized marker, nls-mCherry (indicated by white arrows). Additionally, we observed punctate patterns indicative of canonical autophagy-like localization of ATG6-GFP fluorescence signals (indicated by red circles). Based on these results, we are more confident about the authenticity of ATG6's nuclear localization. The revised manuscript includes clearer images to support our observations.

      Recommendations for the Authors:

      Reviewer #2 (Recommendations For The Authors):

      The duration and concentration of SA treatments are quite variable between experiments which makes comparisons difficult.

      Thank you for pointing this out. The NPR1 protein is known to be unstable and prone to degradation through the 26S proteasome pathway (Spoel et al., 2009; Saleh et al., 2015). Consequently, to investigate the function of NPR1, many scientists and research groups typically employ higher concentrations of SA (e.g., 0.5 mM, 1 mM, or even 5 mM) to elucidate its role (Spoel et al., 2009; Fu et al., 2012; Lee et al., 2015; Saleh et al., 2015; Skelly et al., 2019; Zavaliev et al., 2020; Chen et al., 2021a). In our study, we observed an interaction between ATG6 and NPR1. To enhance the detection of the NPR1 protein, we standardized the SA concentration used in our experiments. In this study, for the treatment of Arabidopsis, we followed the protocols outlined in Saleh et al. and Spoel et al., utilizing 0.5 mM SA (Spoel et al., 2009; Saleh et al., 2015). For tobacco treatment, we adopted the methodology described in the study by Zavaliev et al., administering 1 mM SA (Zavaliev et al., 2020).

      The methods section does not explain some of the essential experimental conditions and reagents used in the study.

      Thank you for pointing this out. Due to word limitations we have placed the detailed experimental methods and reagents in Supplemental Data 1. In Supplemental Data 1, we provide a comprehensive overview of the experimental flow and conditions employed in our study.

      Lines 62-63: the C-terminal domain of all NPRs has a name (already defined as SA-binding domain (SBD)). Also, it would be worth referring to the structure of NPR1 (Kumar et al 2022, Nat) as the source of information about its domains.

      Thank you for pointing this out, we have changed this description in the revised manuscript (lines 62-63).

      Lines 66-69: NPR1 doesn't form monomers. A recent study showed that the basic functional unit of NPR1 is a dimer (Kumar et al 2022, Nat).

      Thank you for pointing this out. In the revised manuscript (line 67), “monomers” has been changed to “dimer”.

      Lines 89-95 and elsewhere: the term "invasion" has a very specific meaning and it doesn't necessarily refer to disease. A pathogen can invade the plant but cause no disease (e.g. ETI). Most plant genetic immune mechanisms act after pathogen invasion, not before it. Those cited works reported the disease resistance, not the invasion resistance.

      Thank you for pointing this out. We've changed the incorrect description in the revised manuscript (line 91).

      Lines 113-119: the truncation at the aa328 includes half of the ANK domain (repeats 1 and 2), not just BTB. The C-terminal truncation variant contains the other half (repeats 3 and 4) of the ANK domain, not the entire ANK domain. It also contains the SBD, not just the NLS. So, this kind of analysis cannot determine the role of ANK domain in the interaction, nor it can conclusively determine if the interaction is through SBD. The interaction should be tested with the SBD domain only in order to make this conclusion.

      Thank you for pointing this out, we have removed the inappropriate description and made the appropriate changes in the revised manuscript (lines 114 and 115).

      In Figure S1, the equally strong interaction of atg6 is found for NPR3/NPR4. Does that mean that atg6 functions also through these other NPRs? What's the significance of these data compared to NPR1-ATG6 interaction? This is especially important, because both NPR3 and NPR4 are predominantly nuclear proteins, and they are unlikely to significantly overlap with autophagy components in the cytoplasm.

      NPR1 and its paralogues NPR3/NPR4 frequently interact with other proteins to regulate plant immune responses (Backer et al., 2019; Chen et al., 2019). To identify ATGs that interact with NPRs, we performed yeast two-hybrid (Y2H) screens using NPRs as bait. Interestingly, ATG6 interacted with NPR1, NPR3 and NPR4, respectively, and different concentrations of SA treatment did not significantly affect their interaction (Fig. S1a). NPR1 is an important positive regulator of the plant immune response (Chen et al., 2021b). In Arabidopsis and N. benthamiana, ATG6 or its homologues were reported to act as positive regulators that enhance plant disease resistance to P. syringae pv. tomato (Pst) DC3000 and Pst DC3000/avrRpm1 bacteria (Patel and Dinesh-Kumar, 2008) and to tobacco mosaic virus (TMV) (Liu et al., 2005). Therefore, in this study we focused on investigating the biological significance of the interaction between ATG6 and NPR1. Whether the interaction between ATG6 and NPR3/4 also has an effect on plant immunity is a question that remains to be explored in future studies.

      In Figure 1c and elsewhere: why not use the anti-mCherry antibody to detect atg6-mcherry? Are we seeing the correct protein band of atg6-mcherry? Also, it is not clear what antibodies they used throughout the study: the sources and specificities of antibodies are not provided.

      Thank you for pointing this out. We initially synthesized the ATG6 antibody (anti-ATG6, 1:200, peptide, C-KEKKKIEEEERK, Abmart) in order to detect the endogenous ATG6 protein, and we also tested the specificity and potency of the ATG6 antibody (results are shown in Fig. S17). Additionally, in order to determine the location of the ATG6-mCherry bands, we also detected ATG6-mCherry in ATG6-mCherry Arabidopsis using the ATG6 antibody, with Col as a control (results are shown in Fig. S4). These results show that our synthesized ATG6 antibody can effectively and clearly detect both ATG6 and ATG6-mCherry. Therefore, in this study, we used the ATG6 antibody to analyze both ATG6-mCherry and endogenous ATG6. Detailed antibody information is presented in Supplementary Data 1, table S4.

      In Figures 1d, 2a, and 2b, the subcellular localization pattern of atg6 contradicts what was published before (Fujiki et al 2007, Plant Phys; Liu et al 2018, FPlS; Xu et al 2017, Autophagy; Li et al 2018, Nat. Comm.). As an autophagy protein, atg6 was shown to localize to cytoplasmic puncta (autophagosomes), like atg8. No nuclear localization was found in those studies. The lack of puncta and the strong nuclear accumulation are signs that the localization of atg6 reported here has to be interpreted with caution. With the data provided, I am not convinced yet that we are looking at the correct ATG6 subcellular localization. Even if the authors propose a novel, non-canonical localization for atg6, they still should have detected the canonical autophagy-like localization of this protein.

      Thank you for your valuable feedback. To further elucidate the nuclear localization of ATG6, we introduced Agrobacterium tumefaciens carrying ATG6-GFP into nls-mCherry tobacco leaves through transient transformation. Subsequently, we observed the localization of ATG6-GFP, along with the canonical autophagy-like patterns. Our findings revealed fluorescence signals of ATG6-GFP in both the cytoplasm and nuclei (Figure 2b). The nuclear-localized ATG6-GFP overlapped with the nuclear-localized marker, nls-mCherry (indicated by white arrows). Additionally, we observed punctate patterns indicative of canonical autophagy-like localization of ATG6-GFP fluorescence signals (indicated by red circles). Based on these results, we are more confident about the authenticity of ATG6's nuclear localization. The revised manuscript includes clearer images to support our observations.

      It would make more sense to include the BiFC data (fig. S2) in the main figure, instead of the co-localization (fig. 1d) which cannot serve as evidence for interaction.

      Thank you for the feedback. We accept your suggestion. In Fig. 1, we have replaced the co-localization image with a BiFC (Bimolecular Fluorescence Complementation) image to better illustrate the interaction.

      In Figure S2, the bifc signals have to be quantified to qualify as evidence for interaction. also, a subcellular marker has to be used (e.g. nuclear mcherry). From the current poor-quality images, one cannot determine where in the cell the presumed interaction takes place, nucleus or cytoplasm, or both. Also, no puncta are seen in these images.

      Thank you for pointing this out. Despite the lack of clarity in the images we provided, our BiFC results unequivocally demonstrate the interaction between ATG6 and NPR1 in both the cytoplasm and nucleus. Notably, as the reviewer pointed out, punctate signals were not observed in our images. This lack of punctate signals is consistent with previous studies (Figure 2) that have also shown BiFC results between autophagy-associated proteins ATG8s and their interacting partners. For instance, Fig 1G (Marshall et al. 2019, Cell), Fig 2F (Marshall et al. 2019, Cell), Fig 4B (Macharia et al. 2019, BMC Plant Biology), and Fig 3 (Zhou et al. 2018, Autophagy) all did not exhibit punctate signals, aligning closely with our findings.

      In Figure S3a, the nuclear localization is shown for stomata. It is known that stomata are especially strong expressors of the transgenes, and localization there could be an artefact of overaccumulation of the fusion protein. Also, why do they present the localization of atg6-gfp, if the analysis and the cross were made with atg6-mcherry?

      Thank you for pointing this out. In our previous experiments, we observed the localization of ATG6 in the nucleus of Arabidopsis thaliana plants overexpressing ATG6-GFP (Fig. S3a). To clearly visualize the location of the nucleus, we used the nuclear stain DAPI, which readily stained the nuclei of the stomatal guard cells. This allowed us to easily identify the nuclear regions for our observations. Additionally, in Fig. 2a and Fig. S3b, we detected the fluorescence signal of ATG6-mCherry within the nucleus, further confirming the nuclear localization of ATG6. Moreover, the nuclear and cytoplasmic fractions were separated. Under SA treatment, ATG6-mCherry and ATG6-GFP were detected in the cytoplasmic and nuclear fractions in N. benthamiana (Fig. 2c and d). Similarly, ATG6 was also detected in the nuclear fraction of UBQ10::ATG6-GFP and UBQ10::ATG6-mCherry overexpressing plants (Fig. 2e and f).

      In Figure S3b, the images are low resolution and of poor quality. Why atg6-mcherry is expressed in a single cell if these are transgenic plants? The nuclear co-localization with npr1-gfp has to be shown more clearly with high res. images and also be quantified, because the expression of atg6-mcherry is not as uniform as npr1-gfp.

      Thank you for pointing this out. Contrary to the reviewer's assertion, the ATG6-mCherry fluorescence signal depicted in Figure S3b was not exclusive to a single cell. In fact, this fluorescence was also evident in other cells, albeit with relatively weaker intensity. This disparity in fluorescence intensity may be attributed to the irregularities in leaf structure at the time of image capture using the microscope. To bolster our conclusion, we further examined the fluorescence signals in the cells of the root elongation zone in ATG6-mCherry x NPR1-GFP, as depicted in the figure below. Our observations revealed that the fluorescence signals of ATG6-mCherry exhibited uniform distribution, with detection in both the cytoplasm and nucleus. We have replaced the original unclear image with a high-quality image.

      Lines 138-143: In fig. S3d, it would make more sense to show the WB on the hybrid npr1-gfp/atg6-mcherry plants with both anti-gfp and anti-mcherry antibodies to detect the free mcherry/gfp. Since the analysis of the level of free FP is done, then why didn't they test the free mcherry levels in Figure S4a? This would be more important than testing the free GFP in ATG6-GFP plants, because the imaging of atg6-mcherry was done in the hybrid plants (fig. S3b).

      Thank you for pointing this out. We initially synthesized the ATG6 antibody (anti-ATG6, 1:200, peptide, C-KEKKKIEEEERK, Abmart) in order to detect the endogenous ATG6 protein, and we also tested the specificity and potency of the ATG6 antibody (results are shown in Fig. S17). Additionally, in order to determine the location of the ATG6-mCherry bands, we also detected ATG6-mCherry in ATG6-mCherry Arabidopsis using the ATG6 antibody, with Col as a control (results are shown in Fig. S4). These results show that our synthesized ATG6 antibody can effectively and clearly detect both ATG6 and ATG6-mCherry. Therefore, in this study, we used the ATG6 antibody to analyze both ATG6-mCherry and endogenous ATG6. Detailed antibody information is presented in Supplementary Data 1, table S4. In the previous experiments, we procured the mCherry antibody (mCherry-Tag Monoclonal Antibody (6B3), BD-PM2113, China) to immunolabel ATG6-mCherry. However, we encountered challenges with the potency of this mCherry antibody. Considering our budget constraints and the availability of our self-synthesized ATG6 antibody, we chose not to purchase another antibody from a different company to continue the Western blot experiment.

      In Figure 2c, there's no atg6-mcherry detected at time 0, in either cytoplasm or nucleus, yet the microscope images in panel a show strong accumulation in both compartments.

      Thank you for pointing this out. Previous studies have shown that ATG6 can also be degraded via the 26S proteasome pathway (Qi et al., 2017). We speculate that this phenomenon might be attributed to the rapid turnover of ATG6 at time 0.

      Lines 156-160: this statement is unsupported by the data. In fig. S5, the bands for native atg6 in the nuclear fraction are extremely weak, and they do not show the reverse pattern of change along the time points compared to the cytoplasmic fraction, which would indicate that the nuclear fraction is complementary to the cytoplasmic pool of the protein. The result more likely suggests that the majority of the ATG6 is in the cytoplasm, and that the weak bands detected in the nucleus are either background signal, or a contamination from the cytoplasmic pool. At this low protein level or poor immuno-detection the background signal is inevitable due to overexposure. Even though the actin marker is not detected in the nuclear fraction, it doesn't necessarily mean that there's no contamination from the cytoplasm in the nuclear fraction. The actin is just too abundant and can be detected at lower exposure.

      Thank you for pointing this out. In Fig. S5, we detected the subcellular localization of endogenous ATG6, although the image quality was somewhat low. Nevertheless, the cytosolic and nuclear localization of ATG6 could be clearly observed. In addition, we verified the cytosolic and nuclear localization of ATG6 in Arabidopsis using confocal fluorescence microscopy and nuclear-cytoplasmic separation experiments, with Actin and H3 as the cytoplasmic and nuclear internal references, respectively (Fig. 2e and f). Furthermore, we observed the cytosolic and nuclear localization of ATG6 when we expressed ATG6-GFP or ATG6-mCherry in tobacco leaves through cis-transfection experiments (Fig. 2a-d). These results are consistent with the prediction of the subcellular location of ATG6 in the Arabidopsis subcellular database (https://suba.live/) (Fig. S3c). The reviewer's feedback has been valuable in helping us present these findings more clearly. We acknowledge the limitations in the image quality for the endogenous ATG6 localization, but we believe the combination of multiple experimental approaches, including the use of fluorescent protein fusions, provides robust evidence for the cytosolic localization of ATG6 in plant cells. Moving forward, we will continue to investigate the significance of ATG6's subcellular distribution and its potential dual roles in both the nucleus and the cytosol, particularly in the context of its interaction with the key immune regulator NPR1. We appreciate the reviewer's constructive comments, as they will help us strengthen the presentation and interpretation of our findings.

      In Figure 3a the images are of too low resolution to see the co-localization. The focal planes of the top and bottom panels are quite different: the top is focused on stomata, the bottom - on pavement cells. So, the number of the NPR1-GFP nuclei between these two focal planes is dramatically different. Also, it looks like the atg6-mcherry in these plants are predominantly in the cytoplasm, not the nucleus as the authors claim. A higher resolution and higher quality of images are required to determine this.

      Thank you for pointing this out. To ensure the clarity and accuracy of our confocal images, we have supplied a clearer image as supplementary evidence. The bright-field images distinctly show that both sets of images are in the same plane of focus. Furthermore, in the figure (third one in the fourth column), the nuclear localization of ATG6-mCherry is clearly visible, and ATG6-mCherry is co-localized with NPR1-GFP in the nucleus, as indicated by the white arrow.

      In Figure 3b, it is not indicated what exactly was measured and in what condition, mock or SA. If these are numbers of nuclei, then it should be indicated what size of the area was sampled, not just "section", and both mock and SA should be included in the measurements. Also, how many independent images have been sampled? what does the error bar represent? What does "normal" mean? Shouldn't this be a mock treatment?

      Thank you for pointing this out. The term "Normal" in this context refers to mock treatment, and we have revised the description for clarity. In Figure 3b, the graph shows the number of nuclei with NPR1-GFP localization in ATG6-mCherry × NPR1-GFP and NPR1-GFP Arabidopsis plants following SA treatment. Statistical data were obtained from three independent experiments, each comprising five individual images, resulting in a total of 15 images analyzed for this comparison. Detailed descriptions were also added to the revised manuscript (Lines 568-570, 800-804).

      Lines 167-168: the proposed increase of NPR1-GFP in the nucleus could be simply due to a higher accumulation of SA in the hybrid plants, not because of the direct interaction of atg6.

      Thank you for pointing this out. Our results confirmed that ATG6 overexpression significantly increased nuclear accumulation of NPR1 (Fig. 3). Notably, the ratio (nucleus NPR1/total NPR1) in ATG6-mCherry × NPR1-GFP was not significantly different from that in NPR1-GFP, and a similar phenomenon was observed in N. benthamiana (Fig. 3c-f). These results suggest that the increased nuclear accumulation of NPR1 by ATG6 might result from higher levels and greater stability of NPR1, rather than from enhanced nuclear translocation of NPR1 facilitated by ATG6. Furthermore, we found that under SA treatment, the protein levels of NPR1 were significantly higher in the ATG6-mCherry × NPR1-GFP line compared to the NPR1-GFP line (Fig. 5a). Notably, even in the absence of differences in SA levels between the two lines, we observed that ATG6 could delay the degradation of NPR1 under normal conditions (Fig. 6). These findings suggest that ATG6 employs both SA-dependent and SA-independent mechanisms to maintain the stability of the key immune regulator NPR1. In summary, we suggest that the increased nuclear accumulation of NPR1 is a dual effect of SA and ATG6.

      Lines 202-204: "Increased nuclear accumulation" implies increased translocation. However, they found that the ratio of NPR1-GFP does not change (Figure 3), so the reason for higher nuclear accumulation is not translocation, but abundance.

      Thank you for pointing this out. Our results confirmed that ATG6 overexpression significantly increased nuclear accumulation of NPR1 (Fig. 3). ATG6 also increases NPR1 protein levels and improves NPR1 stability (Fig. 5 and 6). Therefore, we consider that the increased nuclear accumulation of NPR1 in ATG6-mCherry × NPR1-GFP plants might result from higher levels and greater stability of NPR1 rather than from enhanced nuclear translocation of NPR1 facilitated by ATG6. To verify this possibility, we determined the ratio of nuclear NPR1-GFP to total NPR1-GFP. Notably, the ratio (nucleus NPR1/total NPR1) in ATG6-mCherry × NPR1-GFP was not significantly different from that in NPR1-GFP, and a similar phenomenon was observed in N. benthamiana (Fig. 3c-f). We then analyzed whether ATG6 affects NPR1 protein levels and protein stability. Our results show that ATG6 increases NPR1 protein levels under SA treatment and maintains the protein stability of NPR1 (Fig. 5 and 6). Together, these results suggest that the increased nuclear accumulation of NPR1 by ATG6 results from higher levels and greater stability of NPR1, rather than from enhanced nuclear translocation. The corresponding description is given in the revised manuscript (lines 338-352).

      Lines 204-205: the co-localization in Figure 1d cannot be interpreted as interaction.

      Thank you for the feedback. We have replaced the co-localization image with a BiFC (Bimolecular Fluorescence Complementation) image to better illustrate the interaction in Fig 1d.

      What age of plants were used for the analysis in Figures 4 and S7? The age of the plant might significantly affect the free SA levels under control conditions.

      Thank you for the feedback. In Figures 4 and S7, 3-week-old plants were used to determine salicylic acid (SA) levels and the expression of target genes. The legends of Figures 4 and S7 now provide detailed descriptions (lines 818-819).

      In Figure 5a they treat with SA, but the analysis in Figure S10 is done with the pathogen, so how can these data be correlated?

      Thank you for pointing this out. Previous studies have demonstrated that pathogen infection rapidly increases the salicylic acid (SA) content in plants, and the elevated SA then activates plant immune responses. Therefore, both pathogen treatment and direct SA treatment can activate SA-dependent plant immune responses. The NPR1 protein is known for its instability. In Figure 5a, we used a 0.5 mM SA treatment to assess the changes in NPR1 protein levels, as the impact of SA treatment is more immediate and pronounced.

      Lines 241-242: In Figure 5b, it is not clear why there's no detection of NPR1-GFP and atg6-mcherry at time 0?? The levels of proteins in the transient assay are sufficiently high for detection by WB.

      Thank you for pointing this out. The NPR1 protein is known to be unstable and prone to degradation through the 26S proteasome pathway (Spoel et al., 2009; Saleh et al., 2015). In addition, previous studies have shown that ATG6 can also be degraded via the 26S proteasome pathway (Qi et al., 2017). We speculate that this phenomenon might be attributed to the rapid turnover of NPR1 and ATG6 at time 0.

      In Figures 5c-d, the quality of these images is very poor, and they do not clearly show the signs. What structure was exactly measured in these images? There are so many fluorescent bodies there, that it is not clear what are we looking at. Also, it is not clear why they did not show the mcherry channel? It would be important to see if the bodies in SA-treated plants show co-localization with atg6-mcherry autophagosomes (if these exist at all).

      Thank you for pointing this out. Interestingly, similar to previous reports (Zavaliev et al., 2020), SA promoted the translocation of NPR1 into the nucleus, but a significant amount of NPR1 was still present in the cytoplasm (Fig. 3c and e). Previous studies have shown that SA increased NPR1 protein levels and facilitated the formation of SINCs in the cytoplasm, which are known to promote cell survival (Zavaliev et al., 2020). We therefore observed the fluorescence signal of SINCs-like condensates in the cytoplasm of tobacco leaves. After 1 mM SA treatment, more fluorescent SINCs-like condensates were observed in N. benthamiana co-transformed with ATG6-mCherry + NPR1-GFP than in N. benthamiana co-transformed with mCherry + NPR1-GFP (Fig. 5c-d and Supplemental movies 1-2). This is shown more clearly in Supplemental movies 1-2. Additionally, we observed that the signals of SINCs-like condensates partially co-localized with certain ATG6-mCherry autophagosome fluorescence signals.

      Lines 245-247: so, is it atg6 or SA that increases the NPR1 levels? If this is due to SA, then the whole study doesn't have novelty, because we already know from previous works that SA increases the stability of npr1.

      Thank you for pointing this out. Indeed, previous studies have shown that salicylic acid (SA) increases NPR1 levels and protein stability (Spoel et al., 2009; Saleh et al., 2015). In our experiments, we found that under SA treatment, the protein levels of NPR1 were significantly higher in the ATG6-mCherry × NPR1-GFP line compared to the NPR1-GFP line (Fig. 5a). Additionally, free SA levels were also significantly elevated in the ATG6-mCherry × NPR1-GFP line under pathogen challenge (Pst DC3000/avrRps4), but not under normal conditions (Fig. 4a). Furthermore, even in the absence of differences in SA levels between the two lines, we observed that ATG6 could delay the degradation of NPR1 under normal conditions (Fig. 6). These findings represent one of our new discoveries. These findings suggest that ATG6 employs both SA-dependent and SA-independent mechanisms to maintain the stability of the key immune regulator NPR1.

      Lines 313-316: npr1 and atg6 can function independently from each other, so the term "jointly" is misleading. Based on the overall data provided in this manuscript it cannot be concluded that the two proteins work in one complex to control plant immunity.

      Thank you for pointing this out. In the revised manuscript "jointly" has been changed to “cooperatively”.

      Lines 369-374: this speculation is beyond the main hypothesis claiming that atg6 functions through npr1. If atg6 can activate the transcription alone, then what is the significance of its activation of npr1? How can one distinguish between the two?

      Thank you for pointing this out. Transcription activation by transcription factors typically requires at least two conserved structural domains: a transcription activation domain and a DNA-binding domain. However, ATG6 does not possess these two conserved structural domains found in canonical transcription factors. Given this structural context, it is unlikely that ATG6 can directly activate transcription on its own. Previous studies have shown that acidic activation domains (AADs) in transcriptional activators (such as Gal4, Gcn4 and VP16) play important roles in activating downstream target genes. Acidic amino acids and hydrophobic residues are the key structural elements of AADs (Pennica et al., 1984; Cress and Triezenberg, 1991; Van Hoy et al., 1993). Chen et al. found that EDS1 contains two AAD domains and confirmed that EDS1 is a transcriptional activator with AADs (Chen et al., 2021a). Here, we obtained similar results: ATG6 overexpression significantly enhanced the expression of PR1 and PR5 (Fig. 4b-c and S9), and an AAD domain containing acidic and hydrophobic amino acids is also found in ATG6 (148-295 AA) (Fig. S14). We speculate that ATG6 might act as a transcriptional coactivator to activate PRs expression synergistically with NPR1.

      Lines 389-400: the cell death due to AvrRPS4 in Col-0 ecotype is extremely weak as there's no complete receptor complex for this effector. So, one has to use a very high dose to induce cell death in Col-0, certainly higher than the one used for bacterial growth. The authors used the same dose in both assays, so it is likely that what we see as "cell death" is not an effector-triggered response, but rather symptom-associated for the virulent pathogen.

      Thank you for pointing this out. Indeed, as the reviewer pointed out, most cell death assays use higher concentrations of Pst DC3000/avrRps4 or Pst DC3000/avrRpt2, but they typically treat Arabidopsis for a relatively short period, usually less than 1 day (Hofius et al., 2009; Zavaliev et al., 2020). In this study, although we injected a relatively low concentration of Pst DC3000/avrRps4 (0.001), we assessed cell death after a relatively long period of Pst DC3000/avrRps4 infection (3 days). Pst DC3000/avrRps4 multiplies significantly in host plants, and we therefore assumed that the pathogen population reached after 3 days of incubation would be sufficient to induce intense cell death. Consequently, we chose this concentration of Pst DC3000/avrRps4 for the experiment.

      Lines 407-416: why do you expect "delay of degradation" with autophagy inhibitor? Shouldn't it be the opposite? In Figure S14, if we compare the bands between 120min and 120min+ConA+WM, the effect of autophagy inhibitors is actually quite strong (0.47 vs 0.22), with about 50% more degradation of NPR1 in their presence. So, the conclusion that the degradation of NPR1 is autophagy-independent is wrong according to this result.

      Thank you for pointing this out. We have revised the inaccurate description, as outlined in the revised manuscript (lines 413-425).

      References

      Backer R, Naidoo S, van den Berg N. 2019. The NONEXPRESSOR OF PATHOGENESIS-RELATED GENES 1 (NPR1) and Related Family: Mechanistic Insights in Plant Disease Resistance. Front Plant Sci 10, 102.

      Castello MJ, Medina-Puche L, Lamilla J, et al. 2018. NPR1 paralogs of Arabidopsis and their role in salicylic acid perception. PLoS One 13, e0209835.

      Chen H, Li M, Qi G, et al. 2021a. Two interacting transcriptional coactivators cooperatively control plant immune responses. Sci Adv 7, eabl7173.

      Chen J, Mohan R, Zhang Y, et al. 2019. NPR1 Promotes Its Own and Target Gene Expression in Plant Defense by Recruiting CDK8. Plant Physiol 181, 289-304.

      Chen J, Zhang J, Kong M, et al. 2021b. More stories to tell: NONEXPRESSOR OF PATHOGENESIS-RELATED GENES1, a salicylic acid receptor. Plant Cell Environ.

      Cress WD, Triezenberg SJ. 1991. Critical structural elements of the VP16 transcriptional activation domain. Science 251, 87-90.

      Devadas SK, Raina R. 2002. Preexisting systemic acquired resistance suppresses hypersensitive response-associated cell death in Arabidopsis hrl1 mutant. Plant Physiol 128, 1234-1244.

      Ding Y, Sun T, Ao K, et al. 2018. Opposite Roles of Salicylic Acid Receptors NPR1 and NPR3/NPR4 in Transcriptional Regulation of Plant Immunity. Cell 173, 1454-1467 e1415.

      Falk A, Feys BJ, Frost LN, et al. 1999. EDS1, an essential component of R gene-mediated disease resistance in Arabidopsis has homology to eukaryotic lipases. Proc Natl Acad Sci U S A 96, 3292-3297.

      Feys BJ, Moisan LJ, Newman MA, et al. 2001. Direct interaction between the Arabidopsis disease resistance signaling proteins, EDS1 and PAD4. EMBO J 20, 5400-5411.

      Fu ZQ, Yan S, Saleh A, et al. 2012. NPR3 and NPR4 are receptors for the immune signal salicylic acid in plants. Nature 486, 228-232.

      Hofius D, Schultz-Larsen T, Joensen J, et al. 2009. Autophagic components contribute to hypersensitive cell death in Arabidopsis. Cell 137, 773-783.

      Jones JD, Dangl JL. 2006. The plant immune system. Nature 444, 323-329.

      Jurkowski GI, Smith RK, Jr., Yu IC, et al. 2004. Arabidopsis DND2, a second cyclic nucleotide-gated ion channel gene for which mutation causes the "defense, no death" phenotype. Mol Plant Microbe Interact 17, 511-520.

      Lee HJ, Park YJ, Seo PJ, et al. 2015. Systemic Immunity Requires SnRK2.8-Mediated Nuclear Import of NPR1 in Arabidopsis. Plant Cell 27, 3425-3438.

      Liu Y, Schiff M, Czymmek K, et al. 2005. Autophagy regulates programmed cell death during the plant innate immune response. Cell 121, 567-577.

      Liu Y, Sun T, Sun Y, et al. 2020. Diverse Roles of the Salicylic Acid Receptors NPR1 and NPR3/NPR4 in Plant Immunity. Plant Cell 32, 4002-4016.

      McKim SM, Stenvik GE, Butenko MA, et al. 2008. The BLADE-ON-PETIOLE genes are essential for abscission zone formation in Arabidopsis. Development 135, 1537-1546.

      Nawrath C, Metraux JP. 1999. Salicylic acid induction-deficient mutants of Arabidopsis express PR-2 and PR-5 and accumulate high levels of camalexin after pathogen inoculation. Plant Cell 11, 1393-1404.

      Patel S, Dinesh-Kumar SP. 2008. Arabidopsis ATG6 is required to limit the pathogen-associated cell death response. Autophagy 4, 20-27.

      Pennica D, Goeddel DV, Hayflick JS, et al. 1984. The amino acid sequence of murine p53 determined from a c-DNA clone. Virology 134, 477-482.

      Qi H, Xia FN, Xie LJ, et al. 2017. TRAF Family Proteins Regulate Autophagy Dynamics by Modulating AUTOPHAGY PROTEIN6 Stability in Arabidopsis. Plant Cell 29, 890-911.

      Rate DN, Greenberg JT. 2001. The Arabidopsis aberrant growth and death2 mutant shows resistance to Pseudomonas syringae and reveals a role for NPR1 in suppressing hypersensitive cell death. Plant J 27, 203-211.

      Saleh A, Withers J, Mohan R, et al. 2015. Posttranslational Modifications of the Master Transcriptional Regulator NPR1 Enable Dynamic but Tight Control of Plant Immune Responses. Cell Host Microbe 18, 169-182.

      Skelly MJ, Furniss JJ, Grey H, et al. 2019. Dynamic ubiquitination determines transcriptional activity of the plant immune coactivator NPR1. Elife 8.

      Spoel SH, Mou Z, Tada Y, et al. 2009. Proteasome-mediated turnover of the transcription coactivator NPR1 plays dual roles in regulating plant immunity. Cell 137, 860-872.

      Van Hoy M, Leuther KK, Kodadek T, et al. 1993. The acidic activation domains of the GCN4 and GAL4 proteins are not alpha helical but form beta sheets. Cell 72, 587-594.

      Yuan M, Ngou BPM, Ding P, et al. 2021. PTI-ETI crosstalk: an integrative view of plant immunity. Curr Opin Plant Biol 62, 102030.

      Yue J, Sun H, Zhang W, et al. 2015. Wheat homologs of yeast ATG6 function in autophagy and are implicated in powdery mildew immunity. BMC Plant Biol 15, 95.

      Zavaliev R, Mohan R, Chen T, et al. 2020. Formation of NPR1 Condensates Promotes Cell Survival during the Plant Immune Response. Cell 182, 1093-1108 e1018.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors): 

      The authors should possibly discuss more the other cases when LTPs of the same type of ORP9 and ORP10 have been found to dimerise. They should definitely cite and discuss the evidence reported in February this year in CMLS (see https://link.springer.com/article/10.1007/s00018-023-04728-5). In this paper, authors reported very similar findings as those the authors have in Figures 3, 4, S6, S7, and S8. Specifically, in this CMLS paper the authors find that ORP9 and ORP10 (not ORP11) interact through a central helical region and that ORP9 localises ORP10 to the ER-Golgi MCSs by providing ORP10 with a binding site for VAPs, where the heterodimer mediates the exchange of PtdIns(4)P for PtdSer. 

      We thank the reviewer for their recommendations. The paper mentioned had escaped our notice and is now cited in the revised manuscript. Various other papers reporting on LTP dimerization are already cited in our manuscript: ORP9-ORP10 dimerization (Kawasaki et al. 2022), ORP9-ORP11 dimerization (Zhou et al. 2010), and ORP9-ORP10/11 dimerization (Tan and Finkel 2022). The revised manuscript now discusses the dimerization of CERT and OSBP, citing Gehin et al. 2023, Ridgway et al. 1992 and de la Mora et al. 2021.

      Reviewer #2 (Recommendations For The Authors): 

      Model and Discussion: 

      Give an idea about the aspect of SMS1 function that is being affected. Even if no further experiments were carried out, the authors could discuss possibilities. One might speculate what the PS is being used for. For example, is it a co-factor for integral membrane proteins, such as flippases? Is it a co-factor for peripheral membrane proteins, such as yet more LTPs? The model could include the work of Peretti et al (2008), which linked Nir2 activity exchanging PI:PA (Yadav et al, 2015) to the eventual function of CERT. Could the PS have a role in removing/reducing DAG produced by CERT? 

      We thank the reviewer for their recommendations. The same recommendations were also raised in the public review, and we believe we have answered them sufficiently there.

      Other, Minor: 

      Make clear that there is no sterol readout (Fig 1C) 

      We would like to point out that Figure 1C has a sterol readout as CE refers to cholesterol esters.

      PH domains of ORP9 and ORP11 localized only partially to the Golgi, unlike the PH domains of OSBP and CERT" (line 154). Say here where the non-Golgi ORP9 and ORP11 PH domain pool is - presumably in the cytoplasm.  

      We thank the reviewer for their suggestion and have rephrased the sentence accordingly.

      Fig 7H-J: histograms not lines as these are separate unlinked categories

      We thank the reviewer for their suggestion. However, we think the original figure represents our findings in the best possible way. Our analysis of individual lipid species is also included in Supplementary figure 10.

      Reviewer #3 (Recommendations For The Authors): 

      (1) At the end of the intro, in summarizing their findings, the authors state (p3. lines 48-49) "These findings highlight how phospholipid and sphingolipid gradients along the secretory pathway are linked at ER-Golgi membrane contact sites." This should instead read "These findings highlight THAT phospholipid and sphingolipid gradients along the secretory pathway are linked at ER-Golgi membrane contact sites." 

      We thank the reviewer for their suggestion and change the sentence accordingly.

      (2) As noted in the public section, to show that ORP9/11 do indeed exchange lipids, an in vitro experiment demonstrating that ORP11 can transfer PI4P is essential. Ideally, it would be best to examine PS AND PI4P transfer by ORP9 AND 11 separately AND then by the ORP9/11 heterodimer. This could lend insights as to the function of the heterodimer. The He et al et Yu paper should provide guidelines for this. Why have the heterodimers? 

      We believe we addressed this point by showing the lipid transfer ability of the ORP9-ORP11 dimer. These findings are now part of the revised manuscript.

      (3) It would be interesting to discuss the roles of ORP9/ORP11 versus ORP9/ORP10... they seem so analogous, although this is at the discretion of the authors. 

      We thank the reviewer for their suggestion. Since the difference between ORP9-ORP10 and ORP9-ORP11 dimers was also raised by other reviewers, we decided to include this discussion in the manuscript. A section based on our answer to Reviewer #2 in the Public Review is now part of the Discussion.

      (4) The authors used a melanoma cell line in their screens (p3, line 59). Could they explain why they used this cell line versus others? 

      We chose MelJuSo cells for various reasons. Mainly, MelJuSo cells are diploid, which makes it easier to generate knockouts in a screening setup than in polyploid cancer cell lines (e.g. HeLa). Furthermore, our CRISPR/Cas9 screening protocols are optimized for this cell line.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This work provides significant insight into freshwater cable bacteria (CB) and is an important contribution to the emerging CB literature. In this manuscript, Yang et al. describe current-voltage measurements on CB collected from two freshwater sources in Southern California. The studies use electrostatic and conductive atomic force microscopies, as well as four-probe measurements. These measurements are consistent with back-of-the-envelope calculations on conductivities needed to sustain CB function. The data shows that freshwater CB have a similar structure and function to the more studied marine cable bacteria.

      Strengths:

      Excellent measurements on a new class of cable bacteria.

      Weaknesses:

      The paper would benefit from additional analysis of the data.

      Reviewer #1 (Recommendations for The Authors):

      This work provides significant insight into freshwater cable bacteria (CB) and is an important contribution to the emerging CB literature. In this manuscript, Yang et al. describe current-voltage measurements on CB collected from two freshwater sources in Southern California. The studies use electrostatic and conductive atomic force microscopies, as well as four-probe measurements. These measurements are consistent with back-of-the-envelope calculations on conductivities needed to sustain CB function. The data shows that freshwater CB have a similar structure and function to the more studied marine cable bacteria. Minor comments follow.

      We are grateful to the reviewer for the encouraging feedback and for appreciating the central message of the preprint. Below we address the reviewer’s constructive comments.

      Additional information could be provided regarding the degraded cells where an 'empty cage' remains, as well as the polyphosphate granules, which were previously observed in marine CB (refs. 11 and 18). 

      We have edited the manuscript to note that the appearance of empty cages and polyphosphate granules in freshwater cable bacteria is indeed consistent with these features as previously reported in marine CB. The size of polyphosphate granules in freshwater CB is comparable to or slightly smaller than that in marine CB (Sulu-Gambari et al., 2015). In the case of empty cages, these cells were previously described as ‘ghost filaments’ which had lost all cell membrane and cytoplasmic material (Cornelissen et al., 2018).

      Manuscript edits: a sentence regarding polyphosphate granules has been added to the manuscript (lines 307-308). “The size of polyphosphate granules in freshwater CB (70 nm – 400 nm) is comparable or slightly smaller than in marine CB (35)”.

      A sentence regarding the empty cages has been added into the manuscript (lines 303-305). “These empty cages were previously described as ‘ghost filaments’ which had lost all cell membrane and cytoplasm material (20).”

      The authors also state that the 'phase difference between the elevated ridges and interridge regions is proportional to the tip voltage squared,' and refer to Fig. 4D. This figure has only three data points with large error bars. The authors may wish to explain this finding and justify their analysis in greater detail.

      We thank the reviewer for pointing out that we presented this result but did not adequately describe its origin or significance. In general, the probe phase response of electrostatic force microscopy (EFM) can originate not only from the electrostatic interaction with the sample (i.e. the electrical properties of interest) but also from shorter range van der Waals forces (which are more reflective of probe-sample distance i.e. topography). To ensure that EFM is reporting electrical interactions, we performed these measurements using a two-pass technique, with the second pass retracing the topography measured during the first pass, but at a fixed height above the surface where the interactions are long range (electrostatic) rather than short range (vdW) or resulting from topography cross-talk. The purpose of the voltage change measurement (Fig. 4D) is to simply assess whether this procedure is successful, since electrostatic forces are proportional to the square of the voltage at a fixed height ($F = \frac{1}{2}\,\frac{\partial C}{\partial z}\,V^{2}$). While the error bar of that measurement is high, due to the intrinsic noise in the dynamic (high frequency) EFM phase response measurement, we note that the purpose of this measurement is simply to assess that the interaction is due to the electrical interaction with the sample, before proceeding to actual conductance measurements (Figs. 5-8).
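
      The check described above can be sketched in a few lines: if the measured signal is electrostatic in origin, the phase difference plotted against the square of the tip voltage should fall on a straight line. The following is a minimal illustration only, not the analysis used in the study. The function name `fit_against_voltage_squared` and the numerical values are hypothetical, and the data are synthetic values generated from an exact quadratic.

```python
def fit_against_voltage_squared(voltages, phase_differences):
    """Ordinary least-squares fit of phase = slope * V**2 + offset.

    Returns (slope, offset, r_squared). A high r_squared supports a
    quadratic (electrostatic) dependence on tip voltage.
    """
    xs = [v ** 2 for v in voltages]
    ys = list(phase_differences)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + offset)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, offset, 1.0 - ss_res / ss_tot


# Synthetic, illustrative values only (NOT data from the study):
# three tip voltages and a phase difference generated as -0.5 * V**2.
tip_voltages = [1.0, 2.0, 3.0]
synthetic_phase = [-0.5 * v ** 2 for v in tip_voltages]

slope, offset, r_squared = fit_against_voltage_squared(tip_voltages, synthetic_phase)
```

      With only three voltage points, as in Fig. 4D, such a fit can confirm consistency with a quadratic dependence but cannot discriminate strongly between alternative functional forms.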

      Manuscript edits: we previously simply cited a reference where the reader can delve deeper into the origin of the square voltage signal. To put this into better context, we now include additional information (lines 461-475), noting the origin and purpose of the result as described above.

      It is interesting that the freshwater CB appear to be more resilient to air compared to marine CB (or at least some freshwater filaments, as the authors note that the level of resilience is filament-dependent). The authors indicate that salt affects oxygen solubility and there is a larger oxygen content in freshwater. Do the authors have thoughts on whether or not the differences between marine and freshwater CB could fit, or not fit, with the hypothesis that conductivity in air is lowered due to oxidation of the Ni/S species (ref. 25 in manuscript)? Could the freshwater CB have greater protection against oxidation?

      We thank the reviewer for highlighting this point. Indeed, our manuscript mentions the current hypothesis that conductivity of cable bacteria may be diminished upon oxidation of the Ni/S groups (lines 101 - 105 and 498 - 504). It remains unclear how this idea may lead to variability between marine and freshwater cables. Interestingly, however, a recent comparative bioRxiv preprint (Digel et al., 2023) noted significant differences in the morphology, number, and cross-sectional area of nanofibers between a freshwater and a marine CB strain. These differences may lead to a different resilience against oxidative degradation upon exposure to air. Specifically, even though the marine CB strain was characterized by a larger cross-sectional area per nanofiber, it had significantly fewer nanofibers, leading to a 40% smaller total area than its freshwater counterpart. We have edited the manuscript to highlight these possible differences (at least in size) between freshwater and marine cables.

      Manuscript edits (lines 506 – 514) “For example, a recent comparative study (21) hints at significant differences in the morphology, number, and size of nanofibers when comparing a marine CB strain to a freshwater CB strain. Specifically, while the marine CB was characterized by a 50% larger cross-sectional area per nanofiber, the total nanofibers’ area was 40% smaller than the freshwater strain due to a smaller number of nanofibers per CB filament. Given the proposed central role of nanofibers in mediating electron transport along CB, it is possible that such differences may also lead to different degrees of tolerance against oxidative degradation upon exposure to air.”
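
      The two percentages quoted above imply a ratio of nanofiber counts, which can be checked with a short calculation. This is a sketch that assumes total area equals fiber count times area per fiber; the resulting ratio is derived here from the quoted percentages and is not a number reported in the response.

```python
# Ratios are marine relative to freshwater, taken from the percentages quoted above.
area_per_fiber_ratio = 1.5  # marine cross-sectional area per nanofiber is 50% larger
total_area_ratio = 0.6      # marine total nanofiber area is 40% smaller

# Assumption: total area = (number of fibers) x (area per fiber),
# so the count ratio is the total-area ratio divided by the per-fiber ratio.
fiber_count_ratio = total_area_ratio / area_per_fiber_ratio

print(f"marine/freshwater nanofiber count ratio: {fiber_count_ratio:.2f}")
```

      Under this assumption the marine strain would carry roughly 40% as many nanofibers per filament as the freshwater strain.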

      Figure 6D shows current-voltage measurements from three representative cables; there is a large variation, most notably between Cable 1 and Cables 2 and 3. Is this variation typical for different cables? Can the authors comment on the range of values observed and how many cables fit into different ranges? Any thoughts on the reasons behind the range?

      Figure 6 B and C (red and blue) are representative of most of the cable conductances measured using the point IV CAFM technique, with the Figure 6 A (green) IV curve being an example of the upper limit, which was less frequently observed. In total we measured ten cables using the point IV CAFM technique. These variations may stem from actual differences in the conductivity of separate CB filaments, the environment of the measurement, or limitations in the conductive AFM measurement techniques. These limitations include a large contact resistance due to the interaction of the small probe with the sample, which may lead to large variability depending on the contact point. For this reason, we rely on 4-probe measurements (Fig. 8) for quantitative conductivity analyses, rather than conductive AFM. It is important to note, however, that the conductive AFM measurements (Fig. 6 and Fig. 7) provide other complementary information, including the demonstration of both transverse and longitudinal transport (lines 389-393) in Fig. 6 and the visualization of the current-carrying nanofibers in Fig. 7.

      Manuscript edits: we have edited the manuscript (lines 413 - 418) to make it clear that the quantitative estimate of conductivity was made only using 4 probe measurements due to the limitations of CAFM or two-probe techniques.

      Can the authors comment on how the number of fibers per CB in their samples compares with the number of fibers in marine CB? Marine CB are known to have pinwheel junctions where the fibers come together before branching out again. This pinwheel design could play a role in the function of the CB or in its survival (see Adv. Biosys. 2020, 4, 2000006). Were pinwheel structures observed in freshwater CB? If so, how do they compare?

      In previous studies, estimates of the number of fibers in marine CB varied significantly, from 15 or 17 (Pfeffer et al., 2012) to 58 – 61 (Cornelissen et al., 2018). In our freshwater CB, we estimated the number of fibers at ~35 per CB (line 423), which is comparable to the count of 34 per freshwater CB recently reported by Digel et al., bioRxiv 2023. We cannot specifically comment on the pinwheel structure as we did not perform the transverse thin section TEM imaging necessary to observe the cell-cell junctions in this particular study.

      On lines 95-96, the authors discuss the fact that marine cable bacteria have a wide variance in their measured conductivities. While one may ask if the larger marine conductivities (near 80 S/cm) are representative, a conductivity of 0.1 S/cm is 2 orders of magnitude lower than this value, which the field generally refers to as a high conductivity. The authors should mention whether or not any of their specimens display the high conductivities seen in select marine cable bacteria specimens.

      It is indeed important to note that the ~80 S/cm figure refers to an upper end previously observed (ref. 22) for marine CB conductivity. In our manuscript (lines 525 - 526), we highlight that the previously observed range (including in that same study) is 10⁻²–10¹ S/cm and we were careful to qualify the previously reported upper end with ‘reaching as high as’ (line 97). Note that this places our measurement of 0.1 S/cm within the previously reported range. We have not observed freshwater CB conductivity near the upper end of the previously reported range, and generally propose that these types of measurements are better analyzed in the context of the biological function rather than ‘high vs. low’. Towards that end, the manuscript (lines 527-537) makes the argument that the 10⁻¹ S/cm figure may be sufficient to support the electrical currents mediated by CB in sediments. We have edited the manuscript to highlight that we did not observe single CB nanofiber conductivity near the upper limit previously observed in marine CB (lines 522-525).

      Reviewer #2 (Public Review):

      Summary:

      In this work, Mohamed Y. El-Naggar and co-workers present a detailed electronic characterization of cable bacteria from Southern California freshwater sediments. The cable bacteria could be reliably enriched in laboratory incubations, and subsequent TEM characterization and 16S rRNA gene phylogeny demonstrated their belonging to the genus Candidatus Electronema. Atomic force microscopy and two-point probe resistance measurements were then used to map out the characteristics of the conductive nature, followed by microelectrode four-probe measurements to quantify the conductivity.

      Interestingly, the authors observe that some freshwater cable bacteria filaments displayed a higher degree of robustness upon oxygen exposure than what was previously reported for marine cable bacteria. Finally, a single nanofiber conductivity on the order of 0.1 S/cm is calculated, which matches the expected electron current densities linking electrogenic sulphur oxidation to oxygen reduction in sediment. This is consistent with hopping transport.

      Strengths and weaknesses:

      A comprehensive study is applied to characterize the conductive properties of the sampled freshwater cable bacteria. Electrostatic force microscopy and conductive atomic force microscopy provide direct evidence of the location of conductive structures. Four-probe microelectrode devices are used to quantify the filament resistance, which presents a significant advantage over commonly used two-probe measurements that include contributions from contact resistances. While the methodology is convincing, I find that some of the conclusions seem to be drawn on very limited sample sizes, which display widely different behavior. In particular:

      The authors observe that the conductivity of freshwater filaments may be less sensitive to oxygen exposure than previously observed for marine filaments. This is indeed the case for an interdigitated array microelectrode experiment (presented in Figure 5) and for a conductive atomic force microscopy experiment (described in line 391), but the opposite is observed in another experiment (Figure S1). It is therefore difficult to assess the validity of the conclusion until sufficient experimental replications are presented.

      We indeed acknowledge both in the abstract (lines 23-26) and section 2.2 (lines 374-377) the variable nature of the sensitivity and filament-dependent response to air exposure. Our discussion (lines 498-506) considers the possible reasons for this variability:

      ‘While these observations showed a high degree of variability and therefore require a more detailed investigation, it is interesting to consider the possibility that the oxidative decline (or other damaging processes), thought to be a consequence of oxidation of Ni cofactors involved in electron transport (25), may not affect all sections of the cm long CB filaments simultaneously; under these conditions, IDA measurements, which probe multiple micrometer-scale electrode-crossing CB regions (e.g. 372 crossings in Figure 5 inset) may offer an advantage over techniques addressing entire CBs or specific CB regions. It is also interesting to consider an alternative possibility that the conductive properties of freshwater CB may be intrinsically more oxygen-resistant than marine CB’.

      To summarize, the manuscript points to the likelihood that the IDA technique used here may offer an advantage for detecting currents under damaging conditions since it interrogates multiple sections simultaneously. Furthermore, in a recent preprint from Digel et al. (2023), the conductivity of the only freshwater strain investigated in that study was among the highest compared to other marine CB strains. Therefore, the freshwater CB being more resistant is one possibility to be investigated based on these observations and results. We therefore present the latter as a possibility in the discussion.

      The calculation of a single nanofiber conductivity is based on experiment and calculation with significant uncertainty. E.g. for the number of nanofibers in a single filament that varies depending on the filament size (Frontiers in microbiology, 2018, 9: 3044.), and the measured CB resistance, which does not scale well with inner probe separation (Figure 5). A more rigorous consideration of these uncertainties is required.

      The reviewer raises an important point. For these calculations, we made sure to determine the representative number of fibers per cable and thickness of the nanofibers (~50 nm) from our own samples. We indeed assessed the possible variability across our different cable filaments and found the fiber numbers varied from 30 – 44 (with 35 used as a representative figure in the paper). For the scaling of resistance with inner probe separation, our 4P results estimated that the CB resistances are 47 MΩ and 240 MΩ for the 20 µm and 200 µm lengths, respectively, rather than the tenfold difference expected if the cable had a uniform conductivity along the entire filament. This result suggests nonuniform conductivity in different sections of the CB filament. Since accounting for nonuniform conduction (and variability in fiber morphology/density) is clearly difficult, we were careful to limit our conclusion to an order of magnitude estimate (e.g. lines 522-525). Given the previously reported range of cable bacteria conductivity (10⁻²–10¹ S/cm), this places our estimate within this range. We have further edited the manuscript to note that our reported single nanofiber conductivity cannot be constrained further than the order of 0.1 S/cm due to the uncertainty in our estimates of nanofiber diameter and number per cable, as well as the possibility of nonuniform conductivity along the CB length (lines 522-525).
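
      The order-of-magnitude argument above can be reproduced with a rough calculation. The sketch below is not the authors' calculation: it assumes that each nanofiber has a circular cross-section with a 50 nm diameter, that the 35 fibers conduct in parallel, and that each measured resistance is spread uniformly over its probe separation, so that σ = L / (R · A).

```python
import math

FIBERS_PER_CABLE = 35     # representative fiber count quoted in the response
FIBER_DIAMETER_M = 50e-9  # ~50 nm thickness; circular cross-section is an assumption

def conductivity_s_per_cm(resistance_ohm, length_m):
    """Conductivity sigma = L / (R * A) for parallel fibers, converted to S/cm."""
    area_m2 = FIBERS_PER_CABLE * math.pi * (FIBER_DIAMETER_M / 2) ** 2
    sigma_s_per_m = length_m / (resistance_ohm * area_m2)
    return sigma_s_per_m / 100.0  # 1 S/m = 0.01 S/cm

# (inner probe separation in m, four-probe resistance in ohm) from the response
measurements = [(20e-6, 47e6), (200e-6, 240e6)]

sigmas = [conductivity_s_per_cm(r, length) for length, r in measurements]
resistance_ratio = measurements[1][1] / measurements[0][1]

for (length, _), sigma in zip(measurements, sigmas):
    print(f"L = {length * 1e6:.0f} um: sigma ~ {sigma:.2f} S/cm")
print(f"resistance ratio for a tenfold longer segment: {resistance_ratio:.1f}")
```

      Under these assumptions both segments give values within a factor of about two of 0.1 S/cm, while the resistance rises only about fivefold for a tenfold increase in length, which is the departure from uniform scaling discussed above.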

      Reviewer #2 (Recommendations for The Authors):

      Figure 4A: Please add scale- and color bar.

      Done - new Fig. 4 included with color bars for topography and phase. The inset of Fig. 4A denotes a 200 nm scale bar (and that scale is now mentioned in the figure caption).

      Figure 5: A time series graph might be more instructive.

      Done - we indeed appreciate this suggestion and find that it improved the clarity of Figure 5. An inset has been included in Figure 5 plotting the resistance R change over time under different conditions. This inset demonstrates that the resistance of the cable on the IDA was slowly decreasing in the N2/H2 anaerobic chamber, only to start increasing upon exposure to ambient air.

      After putting the cable back into the chamber, the resistance again decreased over time.

    1. Author Response:

      We thank the reviewers for their insightful feedback. In our revised version of the manuscript, we will address all points raised.

      Regarding the preprocessing (Reviewer 1), we agree that the StandardRat pipeline is optimal for newly acquired datasets. However, since this study involves reanalyzing an already published dataset (Ionescu et al., JNM, 2023), which was preprocessed, analyzed, and published before the StandardRat paper, we aimed to maintain the same preprocessing. This approach allows for consistent interpretation of the readout regarding functional and molecular connectivity in the context of our previously published findings. Nonetheless, we agree that providing full access to the data will enable other researchers to reproduce our results using the StandardRat preprocessing pipeline and perform additional analyses on this rich dataset. Therefore, we will provide full access to the data via an open repository, as the reviewer suggested.

      Regarding anesthesia, we acknowledge that this is a limitation of our study, as more recent studies have indicated superior protocols. However, we and others have shown that, while not ideal, isoflurane at the used dose maintains stable physiology and does not cause burst suppression in rats. We will amend our discussion to reflect these points.

      Regarding the other points, we will amend the manuscript to provide more detail on the experimental design, including the tracer application as suggested by Reviewer 2, and clarify parts of the analysis that are unclear in the current version. Additionally, we agree with Reviewer 2 that our current terminology may cause confusion, and we will amend it accordingly. We will also discuss the other points raised by the reviewers, such as the reduced sample size for the pharmacological cohort, as limitations in our discussion.

      Thank you for your understanding and the opportunity to improve our manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Editors’ recommendations for the authors

      The reviewers recommend the following: 

      (a) Digging deeper into the discussion of the density-dependent dispersal. 

      (b) Clarifying the microfluidic setup.  

      (c) Clarifying the description and interpretation of the transcriptomic evidence. 

      (d) Toning down carbon cycle connections (some reviewers felt the evidence did not fully support the claims). 

      We would like to thank the editors for their thoughtful evaluation of our manuscript and their clear suggestions. We have revised the manuscript in the light of these comments, as we outline below and address in detail in the point-by-point response to the reviewers’ comments that follows. 

      (a) We have expanded the discussion of density-dependent dispersal and revised Figure 2C to improve clarity. 

      (b) We have also added further information concerning the microfluidic setup in the results section and provide an illustration of the setup in a new figure panel, Figure 1A.

      (c) Addressing the reviewers’ comments on the transcriptomic analysis, we have added more information in the description and interpretation of the results. 

      (d) We have rephrased the text describing the role of degradation-dispersal cycles for carbon cycling to highlight it as the motivation of this study and emphasize the link to literature on foraging, without creating expectations of direct measurements of global carbon cycling.

      Public Reviews:

      Reviewer #1 (Public Review):

      [...]

      Weaknesses: 

      Much of the genetic analysis, as it stands, is quite speculative and descriptive. I found myself confused about many of the genes (e.g., quorum sensing) that pop up enriched during dispersal quite in contrast to my expectations. While the authors do mention some of this in the text as worth following up on, I think the analysis as it stands adds little insight into the behaviors studied. However, I acknowledge that it might have the potential to generate hypotheses and thus aid future studies. Further, I found the connections to the carbon cycle and marine environments in the abstract weak --- the microfluidics setup by the authors is nice, but it provides limited insight into naturalistic environments where the spatial distribution and dimensionality of resources are expected to be qualitatively different. 

      We thank the reviewer for their suggestions to improve our manuscript. We agree that the original manuscript would have benefitted from more detailed interpretation of the observed changes in gene expression. We have revised the manuscript to elaborate on the interpretation of the changes in expression of quorum sensing genes (see response to reviewer 1, comment 3), motility genes (see response to reviewer 1, comment 6), alginate lyase genes (see response to reviewer 1, comment 7 and reviewer 2, comment 2), and ribosomal and transporter genes (see response to reviewer 2, comment 2).

      In general, we think that the gene expression study not only supports the phenotypic observations that we made in the microfluidic device, such as the increased swimming motility when exposed to digested alginate medium, but also adds further insights. Our reasoning for studying the transcriptomes in well-mixed batch cultures was the inability to study gene expression dynamics to support the phenotypic observations about differential motility and chemotaxis in our microfluidics setup. The transcriptomic data clearly show that even in well-mixed environments, growth on digested alginate instead of alginate is sufficient to increase the expression of motility and chemotaxis genes. In addition, the finding that expression of alginate lyases and metabolic genes is increased during growth on digested alginate was revealed through the analysis of transcriptomes, something which would not have been possible in the microfluidic setup. We agree with the reviewer that our analyses implicate further, perhaps unexpected, mechanisms like quorum sensing in the cellular response to breakdown products, and that this represents an interesting avenue for further studies.

      Finally, we also agree with the reviewer that it would be good to be more explicit in the text that our microfluidic system cannot fully capture the complex dynamics of natural environments. Our approach does, however, allow the characterization of cellular behaviors at spatial and temporal scales that are relevant to the interactions of bacteria, and thus provides a better understanding of colonization and dispersal of marine bacteria in a manner that is not possible through in situ experiments. We have edited our manuscript to highlight this and modified our statements regarding carbon cycling towards emphasizing the role of degradation-dispersal cycles in remineralization of polysaccharides (see response to reviewer 1, comment 2).

      Reviewer #2 (Public Review):

      [...]

      Weaknesses: 

      The explanation of the microfluidics measurements is somewhat confusing but I think this could be easily remedied. The quantitative interpretation of the dispersal data could also be improved and I'm not clear if the data support the claim made. 

      We thank the reviewer for their comments and helpful suggestions. We have revised the manuscript with these suggestions in mind and believe that the manuscript is improved by a more detailed explanation of the microfluidic setup. We have added more information in the text (detailed in response to reviewer 2, comments 1 and 2) and have added a depiction of the microfluidic setup (Fig. 1A). We have also modified the presentation and discussion of the dispersal data (Fig. 2C), as described in detail below in response to reviewer 2, comment 4, and argue that they clearly show density-dependent dispersal. We believe that this modification of how the results are presented provides a more convincing case for our main conclusion, namely that the presence of degradation products controls bacterial dispersal in a density-dependent manner.  

      Reviewer #3 (Public Review):

      [...]

      Weaknesses: 

      I find this paper very descriptive and speculative. The results of the genetic analyses are quite counterintuitive; therefore, I understand the difficulty of connecting them to the observations coming from experiments in the microfluidic device. However, they could be better placed in the literature of foraging - dispersal cycles, beyond bacteria. In addition, the interpretation of the results is sometimes confusing. 

      We thank the reviewer for their suggestions to improve the manuscript. We have edited the manuscript to interpret the results of this study more clearly, in particular with regard to the fact that breakdown products of alginate cause cell dispersal (see response to reviewer 2, comment 1), gene expression changes of ribosomal proteins and transporters (see response to reviewer 2, comment 2), as well as genes relating to alginate catabolism (see response to reviewer 2, comment 3).

      To provide more context for the interpretation of our results we now also embed our findings in more detail in the previous work on foraging strategies and dispersal tradeoffs.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The authors should clarify in more detail what they mean by density dependence in Figure 2. Usually density dependence refers to a per capita dependence, but here it seems that the per capita rate of dispersal might be roughly independent of density (Figure 2c; if you double the number of cells it doubles the number of cells leaving). Rather it seems the dispersal is such that the density of remaining cells falls below a threshold (~300 cells). 

      We thank the reviewer for raising this important point. To analyze the data more explicitly in terms of per capita dependence, and so make the density dependence in the dispersal from the microfluidic chambers clearer, we have modified Figure 2C and edited the text.

      In the modified Figure 2C, we computed the fraction of dispersed cells for each chamber (i.e., the change in cell number divided by the cell number at the time of the nutrient switch). This quantity directly reveals the per-capita dependence, as mentioned by reviewer 1, and is now represented on the y-axis of Figure 2C instead of the absolute change in cell number.

      These data demonstrate that the fraction of dispersed cells increases with increasing numbers of cells present in the chamber at the time of switching, with more highly populated chambers showing a higher fraction of dispersed cells. These findings indicate that there is a strong density dependence in the dispersal process.

      As pointed out by reviewer 1, another interesting aspect of the data is the transition at low cell number. The fraction of dispersed cells is negative in the case of the chamber with approximately 70 cells, consistent with no dispersal at this low density, and a moderate density increase as a function of continued growth.  
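
      The per-capita quantity described above can be sketched as follows. The cell counts are hypothetical placeholders invented for illustration, not data from the study, and the sign convention (positive when cells leave, negative when the population grows) is an assumption chosen to match the description of the low-density chamber.

```python
def dispersed_fraction(n_at_switch, n_after):
    """Fraction of cells lost after the nutrient switch, relative to the count at the switch.

    Positive values indicate net dispersal; negative values indicate net growth.
    """
    if n_at_switch <= 0:
        raise ValueError("cell count at the time of the switch must be positive")
    return (n_at_switch - n_after) / n_at_switch

# Hypothetical (count at switch, count afterwards) pairs for four chambers.
chambers = [(70, 80), (300, 270), (600, 330), (1200, 340)]

fractions = [dispersed_fraction(before, after) for before, after in chambers]
for (before, _), fraction in zip(chambers, fractions):
    print(f"{before:5d} cells at switch -> dispersed fraction {fraction:+.2f}")
```

      A per-capita rate that was independent of density would give the same fraction in every chamber; a fraction that rises with the initial count, as in these illustrative numbers, is what the response refers to as density-dependent dispersal.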

      In addition to the new analysis presented in Figure 2C, we have modified the paragraph that discusses this result as follows (line 208):

      “We indeed found that the nutrient switch caused a few or no cells to disperse from small cell groups (Fig. 2B), whereas a large fraction of cells from large cell groups dispersed (Fig. 2C). In fact, the fraction of cells that dispersed upon imposition of the nutrient switch showed a strong positive relationship with the number of cells present, meaning that cells in chambers with many cells were more likely to disperse than cells in chambers with fewer cells (Fig. 2C).”

      (2) The authors should tone down their claims about the carbon cycle in the abstract. I do not believe the results as they stand could be used to understand degradation-dispersal cycles in marine environments relevant to the carbon cycle, since these behaviors have been studied in microfluidic environments which in my understanding are quite different. As such, statements such as "degradation-dispersal cycles are an integral part in the global carbon cycle, we know little about how cells alternate between degradation and motility" and "Overall, our findings reveal the cellular mechanisms underlying bacterial degradation-dispersal cycles that drive remineralization in natural environments" are overstated in the abstract. 

      We appreciate the reviewer’s comments regarding the connections of our work with the carbon cycle. We have now rephrased these statements in our manuscript to describe a potential connection between our work and the marine carbon cycle. The colonization of polysaccharide particles by bacteria and subsequent degradation has been widely acknowledged to play a significant role in controlling the carbon flow in marine ecosystems (Fenchel, 2002; Preheim et al., 2011; Yawata et al., 2014, 2020). We still refer to carbon flow in the revised manuscript, though cautiously, as microbial remineralization of biomass, which is recognized as an important factor in the marine biological carbon pump (e.g., Chisholm, 2000; Jiao et al., 2024). As stated in the previous version of the manuscript, the main motivation of our work was to study the growth behaviors of marine heterotrophic bacteria during polysaccharide degradation, especially to understand when bacteria depart already colonized and degraded particles and find novel patches to grow and degrade, a process that is poorly understood. Therefore, it is conceivable that degradation-dispersal cycles do play a role in the flow of carbon in marine ecosystems. However, we acknowledge that the carbon cycle is influenced by a multitude of biological and chemical processes, and the bacterial degradation-dispersal cycle might not be the sole mechanism at play.

      We also appreciate the reviewer’s comments highlighting that the complexity of natural environments is not fully captured in our microfluidics system. However, our microfluidics setup does allow us to quantify responses and behaviors of microbial groups at high spatial and temporal resolution, especially in the context of environmental fluctuations. Microbes in nature interact at small spatial scales and have to respond to changes in the environment, and the microfluidics setup enables the quantification of these responses. Moreover, dispersal of the bacterium V. cyclitrophicus that we use in our study has been previously observed even during growth on particulate alginate (Alcolombri et al., 2021), but the cues and regulation controlling dispersal behaviors have been unclear. Microfluidic experiments have now allowed us to study this process in a highly quantitative manner, and the results align well with observations from experiments in more nature-like settings. These quantitative experiments on bacterial strains isolated from marine particles are expected to constrain quantitative models of carbon degradation in the ocean (Nguyen et al., 2022).

      We have now adjusted our statements throughout our manuscript to reflect the knowledge gaps in understanding the triggers of degradation-dispersal cycles and their links with carbon flow in marine ecosystems. The revised manuscript, especially, contains the following statements (line 47 and line 60):

      “Even though many studies indicate that these degradation-dispersal cycles contribute to the carbon flow in marine systems, we know little about how cells alternate between polysaccharide degradation and motility, and which environmental factors trigger this behavioral switch.”

      “Overall, our findings reveal cellular mechanisms that might also underlie bacterial degradation-dispersal cycles, which influence the remineralization of biomass in marine environments.”

      (3) The authors should clarify why they think quorum-sensing genes are increased in expression on digested alginate. The authors currently mention that QS could be used to trigger dispersal, but given the timescales of dispersal in Figure 2 (~half an hour), I find it hard to believe that these genes are expressed and have the suggested effect on those timescales. As such I would have expected the other way round - for QS genes to be expressed highly during alginate growth, so that density could be sensed and responded to. Please clarify. 

      We have now clarified this point in the revised manuscript. While the triggering of dispersal by quorum-sensing genes may indeed appear counterintuitive, and the response is rapid (we see dispersal of cells within 30-40 minutes), both observations are in line with previous studies in another model organism, Vibrio cholerae. The dispersal time is similar to the dispersal time of V. cholerae cells from biofilms, as described by Singh and colleagues (Figure 1E of Ref. Singh et al., 2017). In that case, induction of the quorum sensing dispersal regulator HapR was observed during biofilm dispersal within one hour after the switch of condition (Fig. 2, middle panel of Ref. Singh et al., 2017). Even though the specific quorum sensing signaling molecules are probably different in our strain (there is no annotated homolog of the hapR gene in V. cyclitrophicus), we observed that the full set of quorum sensing genes was enriched in cells growing on digested alginate (as reported in line 314 and Fig. 4A).

      We have added this information in the manuscript (line 317): 

      “The set of quorum sensing genes was also positively enriched in cells growing on digested alginate (Fig. 4A and S4F, Table S13). This role in dispersal is in agreement with a previous study that showed induction of the quorum sensing master regulator in V. cholerae cells during dispersal from biofilms on a similar time scale as here (less than an hour) [28].”

      Reviewer #2 (Recommendations For The Authors):

      (1) Around line 144 - I don't really understand how you flow alginate through the microfluidic platform. It seems if the particles are transiently going through the microfluidic chamber then the flow rate and hence residence time of the alginate particles will matter a lot by controlling the time the cells have to colonize and excrete enzymes for alginate breakdown. Or perhaps the alginate is not particulate but is instead a large but soluble polymer? I think maybe a schematic of the microfluidic device would help -- there is an implicit assumption that we are familiar with the Dal Co et al device, but I don't recall its details and maybe a graphic added to Figure 1 would help. 

      a. In reviewing the Dal Co paper I see that cells are trapped and the medium flows through channels and the plane where the cells are held. I am still a little confused about the size of the polymeric alginate -- large scale (>1um) particles or very small polymers? 

      We have now provided a detailed description of our microfluidic experimental system. At the start of the experiments, cells are in fact not trapped within the microfluidic device, but grow and can move freely within a chamber designed with dimensions (sub-micron heights) so that growth occurs only as a monolayer. Cells were exposed to nutrients, either alginate or alginate digestion products, both in soluble form (not particles). These compounds were flowed into the device through a main channel, but entered the flow-free growth chambers by diffusion. To make these aspects of our experiments clearer, we have added further information on this in the Materials & Methods section (line 556), in the abstract (line 51), and in the results (line 123).

      To make our microfluidic setup clearer, we have followed this advice and added a schematic as Figure 1A and have added more information on the setup to the main text (line 153):

      “In brief, the microfluidic chips are made of an inert polymer (polydimethylsiloxane) bound to a glass coverslip. The PDMS layer contains flow channels through which the culture medium is pumped continuously. Each channel is connected to several growth chambers that are laterally positioned. The dimensions of these growth chambers (height: 0.85 µm, length: 60 µm, width: 90-120 µm) allow cells to freely move and grow as monolayers. The culture medium, containing either alginate or digested alginate in their soluble form, is constantly pumped through the flow channel and enters the growth chambers primarily through diffusion [15,16,4,17,8]. Therefore, the number of cells and their positioning within microfluidic chambers is determined by the cellular growth rate as well as by cell movement [4]. This setup combined with time-lapse microscopy allowed us to follow the development of cell communities over time.”

      (2) What makes this confusing is the difference between Figure 1C and Figure S2A -- the authors state that the difference in Figure 1C is due to dispersal, but is there flow through the microfluidic device? So what role does that flow through the device have in dispersal? Is the adhesion of the cell groups driven at all by a physical interaction with high molecular weight polymers in the microfluidic devices or is this purely a biological effect? Could this also be explained by different real concentrations of nutrients in the two cases? 

      We realize from this comment that the role of flow of the medium in the microfluidic setup was not clearly addressed in our manuscript. In fact, cells were not exposed to flow, and nutrients were provided to the growth chambers by diffusion. We have added a clearer explanation of this point on line 158:

      “The culture medium, containing either alginate or digested alginate in their soluble form, is constantly pumped through the flow channel and enters the growth chambers primarily through diffusion [15,16,4,17,8]. Therefore, the number of cells and their positioning within microfluidic chambers is determined by the cellular growth rate as well as by cell movement [4].”

      One purely physical effect that we anticipate is that a high viscosity of the medium could immobilize cells. To address this point, we measured the viscosity of both alginate and digested alginate and conclude that the increase in viscosity is not strong enough to immobilize cells. We added a statement in the text (line 170):

      “To test the role of increased viscosity of polymeric alginate in causing the increased aggregation of cells, we measured the viscosity of 0.1% (w/v) alginate or digested alginate dissolved in TR media. For alginate, the viscosity was 1.03±0.01 mPa·s (mean and standard deviation of three technical replicates) whereas the viscosity of digested alginate in TR media was found to be 0.74±0.01 mPa·s. Both these values are relatively close to the viscosity of water at this temperature (0.89 mPa·s [18]) and, while they may affect swimming behavior [19], they are insufficient to physically restrain cell movement [20].”

      as well as a section in the Materials and Methods (line 594):

      “Viscosity of the alginate and digested alginate solution

      We measured the viscosity of alginate solutions using shear rheology measurements. We used a 40 mm cone-plate geometry (4° cone) in a Netzsch Kinexus Pro+ rheometer. 1200 µL of sample was placed on the bottom plate, the gap was set at 150 µm and the sample trimmed. We used a solvent trap to avoid sample evaporation during measurement. The temperature was set to 25 °C using a Peltier element. We measured the dynamic viscosity over a range of shear rates γ̇ = 0.1–100 s⁻¹. We report the viscosity of each solution as the average viscosity measured over the shear rates 10–100 s⁻¹, where the shear-dependence of the viscosity was low.

      We measured the viscosity of 0.1% (w/v) alginate dissolved in TR media, which was 1.03 ± 0.01 mPa·s (reporting the mean and standard deviation of three technical replicates). The viscosity of 0.1% digested alginate in TR media was found to be 0.74 ± 0.01 mPa·s. This means that the viscosity of alginate in our microfluidic experiments is 36% higher than that of digested alginate, but the viscosities are close to those expected of water (0.89 mPa·s at 25 °C according to Berstad and colleagues [18]).”
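      The averaging step described in the Methods text above can be sketched as follows. This is only a minimal illustration, not the analysis script used for the manuscript: the function name `mean_viscosity` and the example sweep are hypothetical placeholders, and only the 10–100 s⁻¹ averaging window is taken from the text.

```python
from statistics import mean


def mean_viscosity(shear_rates, viscosities, lo=10.0, hi=100.0):
    """Average the viscosity over all points whose shear rate lies in [lo, hi] (1/s)."""
    window = [eta for rate, eta in zip(shear_rates, viscosities) if lo <= rate <= hi]
    if not window:
        raise ValueError("no measurements inside the shear-rate window")
    return mean(window)


# Hypothetical shear-rate sweep (NOT measured data): rates in 1/s, viscosities in mPa·s.
example_rates = [0.1, 1.0, 10.0, 31.6, 100.0]
example_viscosities = [2.0, 1.5, 1.2, 1.1, 1.0]

# Only the three points at 10, 31.6 and 100 1/s enter the average.
example_mean = mean_viscosity(example_rates, example_viscosities)
print(f"mean viscosity over 10-100 1/s: {example_mean:.2f} mPa·s")
```

      Restricting the average to the upper decade of shear rates discards the low-shear points, where the viscosity in this made-up sweep still depends strongly on the shear rate.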

      While our microfluidic setup allows us to track the position and movement of cells in a spatially structured setting, these observations do not allow us to distinguish directly whether the differences in dispersal are a result of purely physical effects of polymers on cells or are a result of them triggering a biological response in cells that causes them to become sessile. It is known that bacterial appendages like pili interact with polysaccharide residues (Li et al., 2003). Therefore, it is quite plausible that cross-linking by polysaccharides can contribute to growth behaviors on alginate. However, our analysis of gene expression demonstrates that flagellum-driven motility is decreased in the presence of alginate compared to digested alginate, alongside other major changes in gene expression. In addition, our measures of dispersal show that dispersal of cells when exposed to digested alginate is density dependent. Both observations suggest that the patterns in dispersal are governed by decision-making processes by cells resulting in changes in cell motility, rather than being a product of purely physical interactions with the polymer.

      The finding that the viscosities of both alginate and digested alginate are similar to that of water suggests that diffusion of nutrients in the growth chambers should be similar. Therefore, we think that differences in the real concentrations of nutrients are likely not contributing to the observed differences in behavior.

      (3) Why is Figure S1 arbitrary units? Does this have to do with the calibration of LC-MS? It would be better, it seems, to know the concentrations in real units of the monomer at least. 

      We agree with the reviewer that it would have been better to have absolute concentrations for these compounds. However, to calibrate the mass spectrometer signals (ion counts) to absolute concentrations for the different alginate compounds, we would need an analytical standard of known concentration. We are not aware of such a standard and thus report only relative concentrations. We agree that the y-axis label of Figure S1 should not contain ‘arbitrary’ units, as it shows a ratio (of measurements in the same arbitrary units). We have edited the labels of Figure S1 accordingly and the figure legend in line 26 of the Supplemental Material (“Relative concentrations…”).

      (4) Line 188 - density-dependent dispersal. The claim here is that "cells in chambers with many cells were more likely to disperse than cells in chambers with less cells." (my emphasis). Looking at the data in Figure 2C it appears that about 40% of the cells disperse irrespective of the density, before the switch to digested alginate. So it would seem that there is not a higher likelihood of dispersal at higher cell densities. For the very highest cell density, it does appear that this fraction is larger, but I'd be concerned about making this claim from what I understand to be a single experiment. To support the claim made should the authors plot Change in Cell number/Starting Cell number on the y-axis of Fig. 2C to show that the fraction is increasing? It would seem some additional data at higher starting cell densities would help support this claim more strongly. 

      We thank the reviewer for this comment, which is in line with a remark made by reviewer 1 in their comment 1. In response to these two comments (and as described above), we have edited Figure 2C and now plot the change in cell number relative to the starting cell number on the y-axis to directly show the density dependence. We observe a positive (approximately linear) relationship between the fraction of dispersed cells and the number of cells present in the chamber at the time of switching. This indicates that there is a density dependence in the dispersal process, with highly populated chambers showing a higher fraction of dispersed cells.
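      The quantity plotted on the revised y-axis can be illustrated with a short sketch. The chamber counts below are invented for illustration and are not data from Figure 2C; the helper names `dispersed_fraction` and `ols_slope` are likewise hypothetical. The sketch only shows how a relative change in cell number and the sign of its trend against starting cell number could be computed.

```python
def dispersed_fraction(n_start, n_end):
    """Relative loss of cells from a chamber: (n_start - n_end) / n_start."""
    if n_start <= 0:
        raise ValueError("starting cell number must be positive")
    return (n_start - n_end) / n_start


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical chambers (NOT data from the study): (cells at the switch, cells afterwards).
chambers = [(100, 90), (200, 160), (400, 240), (800, 320)]

starts = [n_start for n_start, _ in chambers]
fractions = [dispersed_fraction(n_start, n_end) for n_start, n_end in chambers]
slope = ols_slope(starts, fractions)

# A positive slope means that fuller chambers lose a larger fraction of their cells.
print(f"slope of dispersed fraction vs. starting cell number: {slope:.2e} per cell")
```

      Plotting the fraction rather than the absolute change matters here: an absolute change that grows with cell number is also expected if a constant fraction of cells disperses, whereas a rising fraction indicates density dependence.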

      In addition to the change in Figure 2C, we have modified the paragraph around line 208: “We indeed found that the nutrient switch caused few or no cells to disperse from small cell groups (Fig. 2B), whereas a large fraction of cells from large cell groups dispersed (Fig. 2C). In fact, the fraction of cells that dispersed upon imposition of the nutrient switch showed a strong positive relationship with the number of cells present, meaning that cells in chambers with many cells were more likely to disperse than cells in chambers with fewer cells (Fig. 2C).”

      The highest cell number at the time of the switch that we include is about 800 cells. The maximum number of cells that can fit into a chamber is ca. 1000. Thus, 800 resident cells are close to the maximal density.

      (5) A comment -- I find the result of significant chemotaxis towards alginate but not the monomers of alginate to be quite surprising. The ecological relevance of this (line 219) seems like an important result that is worth expanding on a bit at least in the discussion. For now, my question is whether the authors know of any mechanism by which chemotaxis receptors could respond to alginate but not the monomer. How can a receptor distinguish between the two? 

      We agree that this result is surprising, given that oligomers can be more easily transported into the periplasm where sensing takes place, and they also provide a more easily accessible nutrient source. Indeed, in the case of the insoluble polymer chitin it has been shown that chemotaxis towards chitin is mediated by chitin oligomers (Bassler et al., 1991), which was suggested as a general motif to locate polysaccharide nutrient sources (Keegstra et al., 2022). However, a recent study has changed this perspective by showing widespread chemotaxis of marine bacteria towards the glucose-based marine polysaccharide laminarin, but not towards laminarin oligomers or glucose (Clerc et al., 2023). Together with our results on chemotaxis towards alginate (but not significantly towards alginate oligomers), this suggests that chemotaxis towards soluble polysaccharides can be mediated by direct sensing of the polysaccharide molecules.

      As recommended, we expanded the discussion of the ecological relevance and also added more information on possible mechanisms of selective sensing of alginate and its breakdown products (around line 479):

      “Direct chemotaxis towards polysaccharides may facilitate the search for new polysaccharide sources after dispersal. We found that the presence of degradation products not only induces cell dispersal but also increases the expression of chemotaxis genes. Interestingly, we found that V. cyclitrophicus ZF270 cells show chemotaxis towards polymeric alginate but not digested alginate. This contrasts with previous findings for bacterial strains degrading the insoluble marine polysaccharide chitin, where chemotaxis was strongest towards chitin oligomers [53], suggesting that oligomers may act as an environmental cue for polysaccharide nutrient sources [55]. However, recent work has shown that certain marine bacteria are attracted to the marine polysaccharide laminarin, and not laminarin oligomers [56]. Together with our results, this indicates that chemotaxis towards soluble polysaccharides may be mediated by the polysaccharide molecules themselves. The mechanism of this behavior is yet to be identified, but could be mediated by polysaccharide-binding proteins as have been found in Sphingomonas sp. A1 facilitating chemotaxis towards pectin [57]. Direct polysaccharide sensing adds complexity to chemosensing as polysaccharides cannot freely diffuse into the periplasm, which can lead to a trade-off between chemosensing and uptake [58]. Furthermore, most polysaccharides are not immediately metabolically accessible as they require degradation. But direct polysaccharide sensing can also provide certain benefits compared to using oligomers as sensory cues. First, it could enable bacterial strains to preferably navigate to polysaccharide nutrient sources that are relatively uncolonized and hence show little degradation activity. Second, strong chemotaxis towards degradation products could hinder a timely dispersal process as the dispersal then requires cells to travel against a strong attractant gradient formed by the degradation products.
      Overall, this strategy allows cells to alternate between degradation and dispersal to acquire carbon and energy in a heterogeneous world with nutrient hotspots [44,59–61].”

      (6) Comment on lines 287-8 -- that the "positive enrichment of the gene set containing bacterial motility proteins matched the increase in motile cells that we observe in Fig 3E." I'm confused about what is meant by the word "matched" here. Is the implication that there is some quantitative correspondence between increased motility in Figure 3 and the change in expression in Figure 4? Or is the statement a qualitative one -- that motility genes are upregulated in the presence of digested alginate? Table S12 didn't help me answer this question. 

      We thank the reviewer for their helpful comment. Our original statement was a qualitative one - observing that gene expression enrichment in genes associated with bacterial motility aligned with our expectations based on the previous observation of an increase in motile cells. We have now changed the wording to highlight the qualitative nature of this statement (line 315):

      “The positive enrichment of the gene set containing bacterial motility proteins aligned with our expectations based on the increase in motile cells that we observed in Figure 3E (Fig. 4A, Table S12).”

      (7) Line 326 - what is the explanation for the production of public enzymes in the presence of digest? How does this square with the previous narrative about cells growing on alginate digest expressing motility genes and chemotaxing towards alginate? It seems like the story is a bit tenuous here in the sense that digested alginates stimulate both motility - which is hypothesized to drive the discovery of new alginate particles - and lyase enzymes which are used to degrade alginate. So do the high motility cells that are chemotaxing towards alginate also express lyases en route? I'm of the opinion that constructing narratives like these in the absence of a more quantitative understanding of the colonization and degradation dynamics of alginate particles presents a major challenge and may be asking more of the data than the data can provide. 

      a. I noted later that this is addressed later around lines 393 in the Discussion section.

      Indeed, the notion that the presence of breakdown products triggers motility and also increases the expression of alginate lyases and other metabolic genes for alginate catabolism seems counterintuitive. We have now expanded our discussion of these results to contextualize these findings (around line 443):

      “One reason for this observation may be that cells primarily rely on intracellular monosaccharide levels to trigger the upregulation of genes associated with polysaccharide degradation and catabolism, as has previously been observed for E. coli across various carbon sources [50,51]. In fact, the majority of carbon sources are sensed by prokaryotes through one‑component sensors inside the cell [50]. In the one‑component internal sensing scheme, the enzymes and transporters for the use of various carbon sources are expressed at basal levels, which leads to an increase in pathway intermediates upon nutrient availability. The pathway intermediates are sensed by an internal sensor, usually a transcription factor, and lead to the upregulation of transporter and enzyme expression [50,51]. This results in a positive feedback loop, which enables small changes in substrate abundance to trigger large transcriptional responses [50,52]. Thus, the presence of alginate breakdown products may likely result in increased expression of all components of the alginate degradation pathway, including the expression of degrading enzymes. As the gene expression analysis was performed on well-mixed cultures in culture medium containing alginate breakdown products, we therefore expect a strong stimulation of alginate catabolism. In a natural scenario, where cells disperse from a polysaccharide hotspot before its exhaustion, the expression of alginate catabolism genes may likely decrease again once the local concentration of breakdown products decreases. However, continued production of alginate lyases could also provide an advantage when encountering a new alginate source and continued production of alginate lyases may thus help cells to prepare for likely future environments.
      Further investigations of bacterial enzyme secretion in changing nutrient environments and at relevant spatial scales are required to improve our understanding of the regulation of enzyme secretion along nutrient gradients.”
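      The amplification that such a positive feedback loop can produce may be illustrated with a toy model. The sketch below is not a model from the manuscript or from the cited references: the rate law (basal expression plus Hill-type induction by the pathway intermediate), the function name `steady_state_expression`, and every parameter value are assumptions chosen purely for illustration.

```python
def steady_state_expression(substrate, basal=0.05, v_max=1.0, k_half=0.5,
                            hill=2, decay=1.0, dt=0.01, t_end=200.0):
    """Integrate a toy one-component sensing loop with forward Euler steps.

    Pathway expression E (enzymes and transporters) starts at its basal level.
    The intermediate level is taken as substrate * E, and the intermediate
    induces further expression through a Hill function (all assumed).
    """
    expression = basal / decay
    for _ in range(int(t_end / dt)):
        intermediate = substrate * expression
        induced = v_max * intermediate ** hill / (k_half ** hill + intermediate ** hill)
        expression += dt * (basal + induced - decay * expression)
    return expression


# Hypothetical substrate levels in arbitrary units: a 1.5-fold difference.
low_expression = steady_state_expression(1.0)
high_expression = steady_state_expression(1.5)

print(f"expression at substrate 1.0: {low_expression:.3f}")
print(f"expression at substrate 1.5: {high_expression:.3f}")
print(f"fold change in expression:   {high_expression / low_expression:.1f}")
```

      With these assumed parameters, the lower substrate level leaves expression near its basal state, whereas the 1.5-fold higher level pushes the loop past a threshold and expression settles at a much higher value. The fold change in expression therefore far exceeds the fold change in substrate, which is the qualitative behavior the quoted passage describes.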

      (8) I like Figure 6, and I think this hypothesis is a good result from this paper, but I think it would be important to emphasize this as a proposal that needs further quantitative analysis to be supported. 

      We have now edited the manuscript to make this point more clear. While both degradation and dispersal are well-appreciated parts of microbial ecology, the transitions and underlying mechanisms are unclear. We have edited the discussion to improve the clarity (line 419): 

      “This cycle of biomass degradation and dispersal has long been discussed in the context of foraging e.g., [44,45,13,46,47], but the cellular mechanisms that drive the cell dispersal remain unclear.”

      Also, we have updated Figure 6 to indicate more clearly which findings are newly proposed by this work (now in bold font) and which previous findings, made in different bacterial taxa and on different carbon sources, align with our work (now in light font). We edited the figure legend accordingly (line 503):

      "By integrating our results with previous studies on cooperative growth on the same system, as well as results on dispersal cycles in other systems, we highlight where the specific results of this work add to this framework (bold font)."

      Minor comments 

      (1) Is there any growth on the enzyme used for alginate digestion? E.g. is the enzyme used to digest the alginate at sufficiently high concentrations that cells could utilize it for a carbon/nitrogen source? 

      We thank the reviewer for raising this point. We added the following paragraph as Supplemental Text to address it (line 179):

      “Protein amount of the alginate lyases added to create digested alginate

      Based on the following calculation, we conclude that the amount of protein added to the growth medium by the addition of alginate lyases is so small that we consider it negligible. In our experiment we used 1 unit/ml of alginate lyases in a 4.5 ml solution to digest the alginate. As the commercially purchased alginate lyases are 10,000 units/g, our 4.5 ml solution contains 0.45 mg of alginate lyase protein. The digested alginate solution was diluted 45x when added to culture medium. This means that we added 0.18 µg alginate lyase protein to 1 ml of culture medium.

      As a comparison, for 1 ml of alginate medium, 1000 µg of alginate is added, and for 1 ml of Lysogeny broth (LB) culture medium, 3,500 µg of LB are added. Thus, the amount of alginate lyase protein that we added is ca. 5,000–20,000 times smaller than the amount of alginate or LB that one would add to support cell growth. Therefore, we expect the growth that the digestion of the added alginate lyases would allow to be negligible.”
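      The comparison at the end of this calculation can be reproduced in a few lines. The sketch takes every input from the Supplemental Text quoted above, including the stated final amount of 0.18 µg lyase protein per ml of culture medium, which is used as given rather than re-derived; the variable names are ours and purely illustrative.

```python
# Inputs as stated in the Supplemental Text quoted above.
UNITS_PER_GRAM = 10_000          # activity of the purchased alginate lyases
digest_volume_ml = 4.5           # volume of the digestion reaction
units_per_ml = 1.0               # lyase activity used in the digestion

# Protein mass in the digestion reaction, converted from g to mg.
lyase_mg_in_digest = units_per_ml * digest_volume_ml / UNITS_PER_GRAM * 1000

# Final lyase amount per ml of culture medium, taken as stated in the text.
lyase_ug_per_ml_medium = 0.18
alginate_ug_per_ml = 1000.0      # alginate in alginate medium
lb_ug_per_ml = 3500.0            # LB in LB medium, as stated in the text

ratio_alginate = alginate_ug_per_ml / lyase_ug_per_ml_medium
ratio_lb = lb_ug_per_ml / lyase_ug_per_ml_medium

print(f"lyase protein in digest: {lyase_mg_in_digest:.2f} mg")
print(f"alginate : lyase = {ratio_alginate:.0f}")
print(f"LB : lyase       = {ratio_lb:.0f}")
```

      Both ratios fall within the range of ca. 5,000–20,000 given in the text.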

      (2) The lines in Figure 2B are very hard to see. 

      We have addressed this comment by using thicker lines in Figure 2B.

      (3) The black background and images in Figure 3A and B are hard to see as well. 

      We have now replaced Figure 3A and B, now using a white background.

      (4) Typo at the beginning of line 251? 

      Unfortunately we failed to find the typo referred to. We are happy to address it if it still exists in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      (1) I think there is not enough experimental evidence to conclude that the underlying cause of increased motility is the accumulation of digested alginate products. To conclusively show that this is the cause and not just some signal linked to cell density, perhaps the experiment should be repeated with a different carbon source. 

      We thank the reviewer for their comment, which made us realize that we did not make the nature of the dispersal cue clear. The gene expression data were obtained from batch cultures and measured at approximately the same bacterial densities, which indeed shows that digested alginate is a sufficient signal for an increase in motility gene expression. This agrees very well with our observation that cells growing on digested alginate in microfluidic chambers have an increased fraction of motile cells in comparison with cells exposed to alginate (Fig. 3E). However, we did not mean to suggest that the observed dispersal by bacterial motility is not influenced by cell density. In fact, we see that dispersal (and hence the increase in cell motility) in microfluidic chambers that are switched from polymeric to digested alginate depends on the bacterial density in the chamber, with higher bacterial densities showing increased dispersal. This shows that the presence of alginate oligomers does trigger dispersal through motility, but this signal affects bacterial groups in a cell-density-dependent manner.

      Similar observations have been made in Caulobacter crescentus, which was found to form cell groups on the polymer xylan while cells disperse when the corresponding monomer xylose becomes available (D’Souza et al., 2021). We reference the additional work in lines 179 and 230. Taken together, these observations indicate a more general phenomenon in dispersal from polysaccharide substrates.

      (2) About the expression data: 

      • Ribosomal proteins and ABC transporters are enriched in cells grown on digested alginate and the authors discuss that this explains the difference in max growth rate between alginate and digested alginate. However, in Figure S2E the authors report no statistical difference between growth rates. 

      We have now edited the manuscript to clarify this point. We found that cells grown on degradation products reached their maximal growth rate around 7.5 hours earlier (Fig. S2D) and showed increased expression of ribosomal biosynthesis and ABC transporters in late-exponential phase (Fig. 4A). We consider this shorter lag time as a sign of a different growth state and therefore a possible reason for the difference in ribosomal protein expression.

      As the reviewer correctly points out, the maximum growth rates that were computed from the two growth curves were not significantly different (Fig. S2E). However, for our gene expression analysis, we harvested the transcriptome of cells that reached OD 0.39-0.41 (mid- to late-exponential phase). At this time point, the cell cultures may have differed in their momentary growth rate.

      We edited the manuscript to make this clearer (line 287):

      “Both observations likely relate to the different growth dynamics of V. cyclitrophicus ZF270 on digested alginate compared to alginate (Fig. S2A), where cells in digested alginate medium reached their maximal growth rate 7.5 hours earlier and thus showed a shorter lag time (Fig. S2D). As a consequence, the growth rate at the time of RNA extraction (mid-to-late exponential phase) may have differed, even though the maximum growth rates of cells grown in alginate medium and digested alginate medium were not found to be significantly different (Fig. S2E).”

      • The increased expression of transporters for lyases in cells grown on digested alginate (lines 273-274 and 325-328) is very confusing and the explanation provided in lines 412-420 is not very convincing. My two cents on this: Expression of more enzymes and induction of motility might be a strategy to be prepared for more likely future environments (after dispersal, alginate is the most likely carbon source they will find). This would be in line with observed increased chemotaxis towards the polymer rather than the monomer (Similar to C. elegans). 

      This comment is in line with reviewer 2, comment 7. In response to these two comments (and as described above), we expanded our discussion of these results to contextualize these findings (around line 443):

      “One reason for this observation may be that cells primarily rely on intracellular monosaccharide levels to trigger the upregulation of genes associated with polysaccharide degradation and catabolism, as has previously been observed for E. coli across various carbon sources [50,51]. In fact, the majority of carbon sources are sensed by prokaryotes through one‑component sensors inside the cell [50]. In the one‑component internal sensing scheme, the enzymes and transporters for the use of various carbon sources are expressed at basal levels, which leads to an increase in pathway intermediates upon nutrient availability. The pathway intermediates are sensed by an internal sensor, usually a transcription factor, and lead to the upregulation of transporter and enzyme expression [50,51]. This results in a positive feedback loop, which enables small changes in substrate abundance to trigger large transcriptional responses [50,52]. Thus, the presence of alginate breakdown products may likely result in increased expression of all components of the alginate degradation pathway, including the expression of degrading enzymes. As the gene expression analysis was performed on well-mixed cultures in culture medium containing alginate breakdown products, we therefore expect a strong stimulation of alginate catabolism. In a natural scenario, where cells disperse from a polysaccharide hotspot before its exhaustion, the expression of alginate catabolism genes may likely decrease again once the local concentration of breakdown products decreases. However, continued production of alginate lyases could also provide an advantage when encountering a new alginate source and continued production of alginate lyases may thus help cells to prepare for likely future environments. 
Further investigations of bacterial enzyme secretion in changing nutrient environments and at relevant spatial scales are required to improve our understanding of the regulation of enzyme secretion along nutrient gradients.”

      Additionally, we agree with the intriguing comment that continued expression of alginate lyases may also prepare cells for likely future environments. Further studies that aim to answer whether marine bacteria are primed by their growth on one carbon source towards faster re-initiation of degradation on a new particle will be an interesting research question. We now address this point in our manuscript (line 458):

      “However, continued production of alginate lyases could also provide an advantage when encountering a new alginate source and continued production of alginate lyases may thus help cells to prepare for likely future environments. Further investigations of bacterial enzyme secretion in changing nutrient environments and at relevant spatial scales are required to improve our understanding of the regulation of enzyme secretion along nutrient gradients.”

      (3) The yield reached by Vibrio on alginate is significantly higher than the yield in digested alginate, not similar, as stated in lines 133-134. Only cell counts are similar. Perhaps the author can correct this statement and speculate on the reason leading to this discrepancy: perhaps cells tend to aggregate in alginate despite the fact that these are well-mixed cultures. 

      We have edited the description of the OD measurements accordingly and agree with the reviewer that aggregation is indeed a possible reason for the discrepancy (line 141):

      “We also observed that the optical density at stationary phase was higher when cells were grown on alginate (Fig. S2B and C). However, colony counts did not show a significant difference in cell numbers (Fig. S3), suggesting that the increased optical density may stem from aggregation of cells in the alginate medium, as observed for other Vibrio species [7].”

      (4) I suggest toning down the importance of the results presented in this study for understanding global carbon cycling. There is a link but at present it is too much emphasized. 

      We have edited our statements regarding the carbon cycle. In the revised manuscript we stress the lack of direct quantifications of carbon cycling. We still refer to carbon flow in the revised manuscript, as we would argue that microbial remineralization of biomass is recognized as an important factor in the marine biological carbon pump (e.g., Chisholm, 2000) and research on marine bacterial foraging investigates how bacterial cells manage to find and utilize this biomass.

      Our revised manuscript contains the following modified statements (line 47 and line 60): “Even though many studies indicate that these degradation-dispersal cycles contribute to the carbon flow in marine systems, we know little about how cells alternate between polysaccharide degradation and motility, and which environmental factors trigger this behavioral switch.”

      “Overall, our findings reveal cellular mechanisms that might also underlie bacterial degradation-dispersal cycles, which influence the remineralization of biomass in marine environments.”

      References

      • Alcolombri, U., Peaudecerf, F. J., Fernandez, V. I., Behrendt, L., Lee, K. S., & Stocker, R. (2021). Sinking enhances the degradation of organic particles by marine bacteria. Nature Geoscience, 14(10), 775–780. https://doi.org/10.1038/s41561-021-00817-x
      • Bassler, B. L., Gibbons, P. J., Yu, C., & Roseman, S. (1991). Chitin utilization by marine bacteria. Chemotaxis to chitin oligosaccharides by Vibrio furnissii. Journal of Biological Chemistry, 266(36), 24268–24275. https://doi.org/10.1016/S0021-9258(18)54224-1
      • Chisholm, S. W. (2000). Stirring times in the Southern Ocean. Nature, 407(6805), 685–686. https://doi.org/10.1038/35037696
      • Chubukov, V., Gerosa, L., Kochanowski, K., & Sauer, U. (2014). Coordination of microbial metabolism. Nature Reviews. Microbiology, 12(5), 327–340. https://doi.org/10.1038/nrmicro3238
      • Clerc, E. E., Raina, J.-B., Keegstra, J. M., Landry, Z., Pontrelli, S., Alcolombri, U., Lambert, B. S., Anelli, V., Vincent, F., Masdeu-Navarro, M., Sichert, A., De Schaetzen, F., Sauer, U., Simó, R., Hehemann, J.-H., Vardi, A., Seymour, J. R., & Stocker, R. (2023). Strong chemotaxis by marine bacteria towards polysaccharides is enhanced by the abundant organosulfur compound DMSP. Nature Communications, 14(1), 8080. https://doi.org/10.1038/s41467-023-43143-z
      • Dal Co, A., van Vliet, S., Kiviet, D. J., Schlegel, S., & Ackermann, M. (2020). Short-range interactions govern the dynamics and functions of microbial communities. Nature Ecology and Evolution, 4(3), 366–375. https://doi.org/10.1038/s41559-019-1080-2
      • D’Souza, G., Ebrahimi, A., Stubbusch, A., Daniels, M., Keegstra, J., Stocker, R., Cordero, O., & Ackermann, M. (2023). Cell aggregation is associated with enzyme secretion strategies in marine polysaccharide-degrading bacteria. The ISME Journal. https://doi.org/10.1038/s41396-023-01385-1
      • D’Souza, G. G., Povolo, V. R., Keegstra, J. M., Stocker, R., & Ackermann, M. (2021). Nutrient complexity triggers transitions between solitary and colonial growth in bacterial populations. The ISME Journal, 15(9), 2614–2626. https://doi.org/10.1038/s41396-021-00953-7
      • D’Souza, G., Schwartzman, J., Keegstra, J., Schreier, J. E., Daniels, M., Cordero, O. X., Stocker, R., & Ackermann, M. (2023). Interspecies interactions determine growth dynamics of biopolymer-degrading populations in microbial communities. Proceedings of the National Academy of Sciences of the United States of America, 120(44), e2305198120. https://doi.org/10.1073/pnas.2305198120
      • Fenchel, T. (2002). Microbial Behavior in a Heterogeneous World. Science, 296(5570), 1068–1071. https://doi.org/10.1126/science.1070118
      • Jiao, N., Luo, T., Chen, Q., Zhao, Z., Xiao, X., Liu, J., Jian, Z., Xie, S., Thomas, H., Herndl, G. J., Benner, R., Gonsior, M., Chen, F., Cai, W.-J., & Robinson, C. (2024). The microbial carbon pump and climate change. Nature Reviews Microbiology. https://doi.org/10.1038/s41579-024-01018-0
      • Keegstra, J. M., Carrara, F., & Stocker, R. (2022). The ecological roles of bacterial chemotaxis. Nature Reviews Microbiology, 20(8), 491–504. https://doi.org/10.1038/s41579-022-00709-w
      • Konishi, H., Hio, M., Kobayashi, M., Takase, R., & Hashimoto, W. (2020). Bacterial chemotaxis towards polysaccharide pectin by pectin-binding protein. Scientific Reports, 10(1), 3977. https://doi.org/10.1038/s41598-020-60274-1
      • Li, Y., Sun, H., Ma, X., Lu, A., Lux, R., Zusman, D., & Shi, W. (2003). Extracellular polysaccharides mediate pilus retraction during social motility of Myxococcus xanthus. Proceedings of the National Academy of Sciences, 100(9), 5443–5448. https://doi.org/10.1073/pnas.0836639100
      • Martínez-Antonio, A., Janga, S. C., Salgado, H., & Collado-Vides, J. (2006). Internal sensing machinery directs the activity of the regulatory network in Escherichia coli. Trends in Microbiology, 14(1), 22–27. https://doi.org/10.1016/j.tim.2005.11.002
      • McDougald, D., Rice, S. A., Barraud, N., Steinberg, P. D., & Kjelleberg, S. (2012). Should we stay or should we go: Mechanisms and ecological consequences for biofilm dispersal. Nature Reviews Microbiology, 10(1), 39–50. https://doi.org/10.1038/nrmicro2695
      • Nguyen, T. T. H., Zakem, E. J., Ebrahimi, A., Schwartzman, J., Caglar, T., Amarnath, K., Alcolombri, U., Peaudecerf, F. J., Hwa, T., Stocker, R., Cordero, O. X., & Levine, N. M. (2022). Microbes contribute to setting the ocean carbon flux by altering the fate of sinking particulates. Nature Communications, 13(1), 1657. https://doi.org/10.1038/s41467-022-29297-2
      • Norris, N., Alcolombri, U., Keegstra, J. M., Yawata, Y., Menolascina, F., Frazzoli, E., Levine, N. M., Fernandez, V. I., & Stocker, R. (2022). Bacterial chemotaxis to saccharides is governed by a trade-off between sensing and uptake. Biophysical Journal, 121(11), 2046–2059. https://doi.org/10.1016/j.bpj.2022.05.003
      • Povolo, V. R., D’Souza, G. G., Kaczmarczyk, A., Stubbusch, A. K., Jenal, U., & Ackermann, M. (2022). Extracellular appendages govern spatial dynamics and growth of Caulobacter crescentus on a prevalent biopolymer. bioRxiv, 2022.06.13.495907. https://doi.org/10.1101/2022.06.13.495907
      • Preheim, S. P., Boucher, Y., Wildschutte, H., David, L. A., Veneziano, D., Alm, E. J., & Polz, M. F. (2011). Metapopulation structure of Vibrionaceae among coastal marine invertebrates. Environmental Microbiology, 13(1), 265–275. https://doi.org/10.1111/j.1462-2920.2010.02328.x
      • Schwartzman, J. A., Ebrahimi, A., Chadwick, G., Sato, Y., Orphan, V., & Cordero, O. X. (2021). Bacterial growth in multicellular aggregates leads to the emergence of complex lifecycles. bioRxiv, 2021.11.01.466752. https://doi.org/10.1101/2021.11.01.466752
      • Singh, P. K., Bartalomej, S., Hartmann, R., Jeckel, H., Vidakovic, L., Nadell, C. D., & Drescher, K. (2017). Vibrio cholerae Combines Individual and Collective Sensing to Trigger Biofilm Dispersal. Current Biology, 27(21), 3359-3366.e7. https://doi.org/10.1016/j.cub.2017.09.041
      • Ulrich, L. E., Koonin, E. V., & Zhulin, I. B. (2005). One-component systems dominate signal transduction in prokaryotes. Trends in Microbiology, 13(2), 52–56. https://doi.org/10.1016/j.tim.2004.12.006
      • Wall, M. E., Hlavacek, W. S., & Savageau, M. A. (2004). Design of gene circuits: Lessons from bacteria. Nature Reviews Genetics, 5(1), 34–42. https://doi.org/10.1038/nrg1244
      • Yawata, Y., Carrara, F., Menolascina, F., & Stocker, R. (2020). Constrained optimal foraging by marine bacterioplankton on particulate organic matter. Proceedings of the National Academy of Sciences, 117(41), 25571–25579. https://doi.org/10.1073/pnas.2012443117
      • Yawata, Y., Cordero, O. X., Menolascina, F., Hehemann, J.-H., Polz, M. F., & Stocker, R. (2014). Competition–dispersal tradeoff ecologically differentiates recently speciated marine bacterioplankton populations. Proceedings of the National Academy of Sciences, 111(15), 5622–5627. https://doi.org/10.1073/pnas.1318943111
      • Zöttl, A., & Yeomans, J. M. (2019). Enhanced bacterial swimming speeds in macromolecular polymer solutions. Nature Physics, 15(6), 554–558. https://doi.org/10.1038/s41567-019-0454-3
    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer 1

      (Cys25)PTH(1-84) does not show efficacy surpassing that of the previously used rhPTH(1-34). This needs to be discussed biologically and clinically.

      Thank you very much for your valuable comments, which helped us improve the manuscript. We appreciate your input and acknowledge that this aspect was not addressed in the discussion. We have added the following paragraph to the Discussion section.

      “This biological difference is thought to be due to dimeric R25CPTH(1-34) exhibiting a more preferential binding affinity for the RG versus R0 PTH1R conformation, despite having a diminished affinity for either conformation. Additionally, the potency of cAMP production in cells was lower for dimeric R25CPTH compared to monomeric R25CPTH, consistent with its lower PTH1R-binding affinity (Noh et al., 2024). One of the potential clinical advantages of dimeric R25CPTH(1-34) is its partial agonistic effect in pharmacodynamics. This property may allow for a more fine-tuned regulation of bone metabolism, potentially reducing the risk of adverse effects associated with full agonism, such as hypercalcemia and bone resorption by osteoclast activity. Moreover, the dimeric form may offer a more sustained anabolic response, which could be beneficial in the context of long-term treatment strategies (Noh et al., 2024). Also, the effects of the dimer were prominent, as shown by better bone formation than in the control group.” (2nd paragraph, Discussion section)

      The terms (Cys25)PTH(1-84) and Dimeric R25CPTH(1-34) are being used interchangeably and incorrectly. A unification of these terms is necessary.

      We fully agree with the reviewer. R25CPTH(1-84) denotes the mutated human PTH, whereas rhPTH(1-34) and dimeric R25CPTH(1-34) are synthesized PTH analogs. To clarify the terminology, we have unified it throughout the manuscript; the changes appear in red.

      The figure legend is incorrect. Not all figures are described, and even though there are figures from A to I, only up to E is explained, or the content is different.

      We apologize for our negligence. As the reviewer suggested, we have corrected all figure legends, which are placed before the list of references in the manuscript, as follows.

      “Figure legends

      Figure 1. Micro-CT analysis. (A-D) Experimental design for the controlled delivery of rhPTH(1-34) and dimeric R25CPTH(1-34) in the ovariectomized beagle model. Representative images of injection and placement of the titanium implant. (E) Micro-CT analysis: bone mineral density (BMD), bone volume (TV; mm³), trabecular number (Tb.N; 1/mm), trabecular thickness (Tb.Th; µm), trabecular separation (Tb.Sp; µm). Error bars indicate standard deviation. Data are shown as mean ± s.d. *p<0.05, **p<0.01, ***p<0.001, n.s., not significant. P, posterior. R, right.

      Figure 2. (A-I) Histological analysis of the different groups stained with Goldner’s trichrome. Bone is marked in green and soft tissue in red. Red arrows indicate positions where soft tissue without bone surrounds the implant threads. The area of bone formed was widest in the rhPTH(1-34)-treated group. In the dimeric R25CPTH(1-34)-treated group, there is a greater amount of bone than in the vehicle-treated group. Green arrows indicate the bone formed over the implant. Blue dotted line, margin of bone and soft tissue. Scale bars: 1 mm.

      Figure 3. Histological analysis using Masson trichrome staining in the rhPTH(1-34)- and dimeric R25CPTH(1-34)-treated groups. (A-L) Masson trichrome-stained sections of cancellous bone in the mandibular bone. The formed bone is marked in red. Collagen is stained blue. Black dotted box, magnified region of trabecular bone in the mandible. Scale bars, A-C, G-I: 1 mm; D-F, J-L: 200 µm.

      Figure 4. Immunohistochemical analysis using TRAP staining for bone remodeling activity. (A-L) TRAP staining is used to evaluate bone remodeling by staining osteoclasts. Osteoclasts are shown in purple. Black dotted box, magnified region of trabecular bone in the mandible. (M, N) The number of TRAP-positive cells in the mandible of the rhPTH(1-34)- and dimeric R25CPTH(1-34)-treated beagles. Scale bars, A-C, G-I: 1 mm; D-F, J-L: 200 µm. Error bars indicate standard deviation. Data are shown as mean ± s.d. *p<0.05, **p<0.01, n.s., not significant.

      Figure 5. Measurement of biochemical marker dynamics in serum. The serum levels of calcium, phosphorus, P1NP, and CTX across three time points (T0, T1, T2) following treatment with dimeric R25CPTH(1-34), rhPTH(1-34), or control. (A-B) Calcium and phosphorus levels exhibit an upward trend in response to both PTH treatments compared to control, suggesting enhanced bone mineralization. (C) P1NP levels, indicative of bone formation, remain relatively unchanged across time and treatments. (D) CTX levels, associated with bone resorption, show no significant differences between groups. Data points for the dimeric R25CPTH(1-34), rhPTH(1-34), and control are marked by squares, circles, and triangles, respectively, with error bars representing confidence intervals.

      Supplementary Figure. Three-dimensional reconstructed image of the bone surrounding the implants. Three-dimensional reconstructed images of the peri-implant bone depicting the osseointegration after different therapeutic interventions. (A) Represents the bone response to recombinant human parathyroid hormone fragment (rhPTH 1-34) treatment, showing the most robust degree of bone formation around the implant in the three groups. (B) Shows the bone response to a modified PTH fragment (dimeric R25CPTH(1-34)), indicating a similar level of bone growth and integration as seen with rhPTH(1-34), although to a slightly lesser extent. (C) Serves as the control group, demonstrating the least amount of bone formation and osseointegration. The upper panel provides a top view of the bone-implant interface, while the lower panel offers a cross-sectional view highlighting the extent of bony ingrowth and integration with the implant surface.”

      In Figure 5, although the descriptions of T0, T1, T2 are mentioned in the method section, it would be more clear if there was a timeline like in Figure 1.

      Based on the reviewer’s advice, we have indicated the timing of T0, T1, and T2 in the materials & methods section describing the serum biochemical assay, and we have shown a timeline in figure 5.

      In Figure 5, instead of having calcium, phosphorus, P1NP, CTX graphs all under Figure 5, it would be more convenient for referencing in the text to label them as Figure 5A, Figure 5B, Figure 5C, Figure 5D.

      We understand the reviewer’s comment. As suggested, we have corrected the panel labeling for Figure 5 in the text as follows.

      “The levels of calcium, phosphorus, CTX, and P1NP were analyzed over time using RM-ANOVA (Figure 5). There were no significant differences between the groups for calcium and phosphorus at time points T0 and T1 (Figure 5A). However, after the PTH analog was administered at T2 (Figure 5A), the levels were highest in the rhPTH(1-34) group, followed by the dimeric R25CPTH(1-34) group, and then, lowest in the control group, which was statistically significant (Figure 5B,C). (P < 0.05) The differences between the groups over time for CTX and P1NP were not statistically significant (Figure 5D, E).”

      Significance should be indicated in the figure (no asterisk present).

      Following the reviewer’s comment, we have added asterisks indicating significance to Figure 5.

      Addition of Figures in Text:

      Line 112: change from "figure 2" to "figure 1" / Line 115: mention "figure 1. E"

      Line 120: refer to "figure 1. E" / Line 123: change from "figure 3" to "figure 2"

      Line 128: refer to "figure 2.A-C" / Line 137: mention "figure 3"

      Line 138: refer to "figure 3. A-L" / Line 143: mention "figure 3. A-L"

      Line 144: refer to "figure 3. E,F,K,L" / Line 148: mention "figure 4"

      Line 150: refer to "figure 4 M,N" / Line 152: mention "figure 4. M,N"

      Line 155: refer to "figure 5" / Line 157: mention "figure 5"

      Line 159: refer to "figure 5" / Line 171: mention "figure 1 E"

      Line 175: refer to "figure 2 M, N" / Line 194: mention "figure 3"

      We thank the reviewer for these suggestions. We have corrected the figure references in the text, and the changes are marked in red.

      Response to Reviewer 2

      First, the authors should clarify why they compared the effects of rhPTH(1-34) and of dimeric R25C2 PTH(1-34)? In most of the parameters, rhPTH(1-34) seems to be superior to dimeric R25C2 PTH(1-34). Why did the authors insist that the anabolic effects of dimer were prominent? Even though implication of dimeric R25C2 PTH(1-34) was drawn from genetic mutation studies, the authors should describe more clearly in the discussion the potential clinical benefits of the dimeric R25C2 PTH(1-34) compared to rhPTH(1-34), especially if dimeric R25C2 PTH(1-34) has just partial agonistic effect in pharmacodynamics.

      Thank you for your insightful comments and questions regarding our results for rhPTH(1-34) and dimeric R25CPTH(1-34). rhPTH(1-34) is a well-characterized therapy for osteoporosis. In this study, although rhPTH(1-34) generally showed superior outcomes in most parameters tested, dimeric R25CPTH(1-34) exhibited specific anabolic effects that are not as pronounced with rhPTH(1-34). We therefore regard dimeric R25CPTH(1-34) as an anabolic effector. One of the potential advantages of dimeric R25CPTH(1-34) is its partial agonistic effect in pharmacodynamics. This property may allow for a more fine-tuned regulation of bone metabolism, potentially reducing the risk of adverse effects associated with full agonism, such as hypercalcemia and bone resorption by osteoclast activity. Moreover, the dimeric form may offer a more sustained anabolic response, which could be beneficial in the context of long-term treatment strategies. Based on our results, we also note that the effects of the dimer were prominent, as shown by better bone formation than in the control group. We appreciate your input and acknowledge that this aspect was not addressed in the discussion. We have therefore added the following paragraph to the Discussion section.

      “This biological difference is thought to be due to dimeric R25CPTH(1-34) exhibiting a more preferential binding affinity for the RG versus R0 PTH1R conformation, despite having a diminished affinity for either conformation. Additionally, the potency of cAMP production in cells was lower for dimeric R25CPTH compared to monomeric R25CPTH, consistent with its lower PTH1R-binding affinity (Noh et al., 2024). One of the potential clinical advantages of dimeric R25CPTH(1-34) is its partial agonistic effect in pharmacodynamics. This property may allow for a more fine-tuned regulation of bone metabolism, potentially reducing the risk of adverse effects associated with full agonism, such as hypercalcemia and bone resorption by osteoclast activity. Moreover, the dimeric form may offer a more sustained anabolic response, which could be beneficial in the context of long-term treatment strategies (Noh et al., 2024). Also, the effects of the dimer were prominent, as shown by better bone formation than in the control group.” (2nd paragraph, Discussion section)

      Second, please describe the intermittent and continuous application of PTH analogues. Many of the readers may misunderstand that the authors' daily injection of PTHs were actually to mimic the clinical intermittent application or continuous one. Incorporation of the author's intention for experimental design would be more helpful for readers.

      Thank you for your insightful comments regarding the need for clearer differentiation between intermittent and continuous applications of PTH analogs in this study. We appreciate your concern that the readers may not fully grasp whether our daily injection protocol was intended to mimic clinical intermittent or continuous PTH administration. To address this, we have revised the manuscript to explicitly clarify that the daily injections of rhPTH(1-34) and dimeric R25CPTH(1-34) were designed to simulate the intermittent dosing regimen commonly used in clinical practice. This regimen is known to maximize the anabolic effects on bone while minimizing potential catabolic actions associated with more frequent or continuous hormone exposure. We have added detailed explanations in the Introduction, Methods, and Discussion sections to help readers understand our experimental design and its relevance to clinical settings.

      Introduction section

      “Administration of parathyroid hormone (PTH) analogs can be categorized into two distinct protocols: intermittent and continuous. Intermittent rhPTH(1-34) therapy, typically characterized by daily injections, is clinically used to enhance bone formation and strength. This method leverages the anabolic effects of rhPTH(1-34) without significant bone resorption, which can occur with more frequent or continuous exposure. On the other hand, continuous rhPTH(1-34) exposure, often modeled in research as constant infusion, tends to accelerate bone resorption activities, potentially leading to bone loss (Silva and Bilezikian, 2015; Jilka, 2007). Understanding these differences is crucial for interpreting the therapeutic implications of rhPTH(1-34) in bone health.”

      Silva, B. C., & Bilezikian, J. P. (2015). Parathyroid hormone: anabolic and catabolic actions on the skeleton. Current Opinion in Pharmacology, 22, 41-50.

      Jilka, R. L. (2007). Molecular and cellular mechanisms of the anabolic effect of intermittent PTH. Bone, 40(6), 1434-1446.

      Materials and Methods section

      “Each animal received one injection per day, aimed at replicating the intermittent rhPTH(1-34) exposure proven beneficial for bone regeneration and overall skeletal health in clinical settings (Neer et al., 2001; Kendler et al., 2018). This regimen was chosen to investigate the potential anabolic effects of these specific PTH analogs under conditions closely resembling therapeutic use.”

      Neer, R. M., Arnaud, C. D., Zanchetta, J. R., Prince, R., Gaich, G. A., Reginster, J. Y., Hodsman, A. B., Eriksen, E. F., Ish-Shalom, S., Genant, H. K., Wang, O., and Mitlak, B. H. (2001). Effect of Parathyroid Hormone (1-34) on Fractures and Bone Mineral Density in Postmenopausal Women with Osteoporosis. The New England Journal of Medicine, 344(19), 1434-1441.

      Kendler, D. L., Marin, F., Zerbini, C. A. F., Russo, L. A., Greenspan, S. L., Zikan, V., Bagur, A., Malouf-Sierra, J., Lakatos, P., Fahrleitner-Pammer, A., Lespessailles, E., Minisola, S., Body, J. J., Geusens, P., Moricke, R., & Lopez-Romero, P. (2018). Effects of Teriparatide and Risedronate on New Fractures in Post-Menopausal Women with Severe Osteoporosis (VERO): A Multicenter, Double-Blind, Double-Dummy, Randomized Controlled Trial. The Lancet, 391(10117), 230-240.

      Discussion section

      “The use of daily injections in this study was intended to simulate intermittent PTH therapy, a well-established clinical approach for managing osteoporosis and enhancing bone regeneration. Intermittent administration of PTH, as opposed to continuous exposure, is critical for maximizing the anabolic response while minimizing the catabolic effects that are associated with higher frequency or continuous hormone levels. Our findings support the notion that even with daily administration, both rhPTH(1-34) and dimeric R25CPTH(1-34) promote bone formation and osseointegration, consistent with the outcomes expected from intermittent therapy. It is important for future research to consider the dosage and timing of administration to further optimize the therapeutic benefits of PTH analogs (Dempster et al., 2001; Hodsman et al., 2005).”

      Dempster, D. W., Cosman, F., Kurland, E. S., Zhou, H., Nieves, J., Woelfert, L., Shane, E., Plavetic, K., Müller, R., Bilezikian, J., & Lindsay, R. (2001). Effects of Daily Treatment with Parathyroid Hormone on Bone Microarchitecture and Turnover in Patients with Osteoporosis: A Paired Biopsy Study. Journal of Bone and Mineral Research, 16(10), 1846-1853.

      Hodsman, A. B., Bauer, D. C., Dempster, D. W., Dian, L., Hanley, D. A., Harris, S. T., Kendler, D. L., McClung, M. R., Miller, P. D., Olszynski, W. P., Orwoll, E., Yuen, C. K. (2005). Parathyroid Hormone and Teriparatide for the Treatment of Osteoporosis: A Review of the Evidence and Suggested Guidelines for Its Use. Endocrine Reviews, 26(5), 688-703.

      Third, please unify the nomenclature. Ensure consistency in the nomenclature throughout the article. Unify the naming conventions for PTH analogues, such as rhPTH(1-34) vs teriparatide and (Cys25)PTH(1-84) vs R25CPTH(1-34) vs R25CPTH(1-34) vs (1-84). Choose one nomenclature for each analogue and use it consistently throughout the article.

      We fully agree with the reviewer. R25CPTH(1-84) denotes the mutated human PTH, whereas rhPTH(1-34) and dimeric R25CPTH(1-34) are synthesized PTH analogs. To clarify the terminology, we have unified it throughout the manuscript; the changes appear in red.

      Response to Reviewer 3

      I would recommend to rewrite the manuscript in a form that is more understandable to the readers. In fact, it appears to me that this work was originally formatted in a way that would need the Materials and Methods to precede the results. As presented (and as requested by the eLife formatting) the Materials and Methods are available only at the end of the reading and, as a consequence, the readers needs to refer to the Materials and Methods to have a general and initial understanding of the study design (i.e. type of treatment for each group, etc are not well specified in the Results section).

      Thank you for your constructive comments and suggestions regarding the organization of the manuscript. As the reviewer mentioned, the Materials and Methods were placed after the Discussion section in accordance with the eLife format. To give readers an initial understanding of the study design, a description of each experimental group has been added to the Results section as follows. Thank you again for your valuable comments.

      “The aim of this study was to evaluate and compare the efficacy of rhPTH(1-34) and dimeric R25CPTH(1-34) in promoting bone regeneration and healing in a clinically relevant animal model. Beagle dogs were selected as the model due to their anatomical similarity to human oral structures, suitable size for surgeries, human-like bone turnover rates, and established oral health profiles, ensuring comparable and ethically sound research outcomes. Three groups were used: a control group injected with normal saline, a group injected with 40 µg/day PTH (Forsteo, Eli Lilly), and a group injected with 40 µg/day of the PTH analog. Animals in each group were injected subcutaneously for 10 weeks.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this fundamental study, the authors use innovative fine-scale motion capture technologies to study visual vigilance with high-acuity vision, to estimate the visual fixation of free-feeding pigeons. The authors present convincing evidence for use of the fovea to inspect predator cues, the behavioral state influencing the latency for fovea use, and the use of the fovea decreasing the latency to escape of both the focal individual and other flock members. The work will be of broad interest to behavioral ecologists.

      We thank the editor for their interest and feedback on the manuscript. We address the reviewer’s comments below.

      Reviewer #1 (Public Review):

      Summary:

      The authors used an innovative technique to study visual vigilance based on high-acuity vision (the fovea). By combining motion-capture features with the visual space around the head, the authors were able to estimate the visual fixation of free-feeding pigeons at any moment. Simulating predator attacks on screens, they showed that 1) pigeons used their fovea to inspect predator cues, 2) the behavioural state (feeding or head-up) influenced the latency to use the fovea, and 3) use of the fovea decreased the latency to escape, both for the individual that foveated on the predator cues and for the other flock members.

      Strengths:

      The paper is very interesting and uses an innovative technique that is well suited to studying the importance of high-acuity vision for spotting a predator and for improving the behavioural response (escaping). The results are strong and the models used are appropriate. This paper is a major contribution to our understanding of the use of visual adaptations in a foraging context when at risk. It is also a major contribution to the understanding of individual interactions in a flock.

      Weaknesses:

      I have identified only two weaknesses:

      (1) The authors often mixed the methods and the results, which reduces the readability and fluidity of the manuscript. I would recommend that the authors restructure the manuscript.

      (2) In some parts, the authors stated that they reconstructed the visual field of the pigeon, which is not true. They identified the foveal positions, but not the visual fields, which involve different sectors (binocular, monocular or blind). Similarly, they sometimes mix up the area centralis and the fovea, which are two different visual adaptations.

      Thank you for your positive feedback. We addressed these comments by restructuring the Methods and Results sections as suggested, and by checking the terminology and specific vocabulary used throughout the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      First, I would like to say that I really enjoyed the manuscript. This is a great contribution to the field.

      Thank you for the positive feedback, we highly appreciate it.

      Then, I have some comments that I hope, would help the authors to improve the manuscript.

      Major comments :

      I would recommend the authors to restructure the methods and the results section. In many parts, the models used are presented in the results section, while this should be presented in the methods section.

      Thank you for the suggestion. We have now ensured that the model descriptions are presented in the statistics section of the Methods.

      To me, the introduction is too long (more than 5 pages). It would be beneficial to reduce it considerably. Furthermore, the introduction is missing some information about the visual abilities of your species (visual acuity, visual field, temporal resolution, contrast sensitivity, etc.).

      We agree that the introduction was very long and reduced it by removing the “Methodological issues” as well as strongly reducing the “Experimental rationales” to a minimum. We also added the missing information on the visual abilities of the pigeons in the “Experimental rationales” section (see L135-150). Please note, however, that we refer to the temporal resolution of pigeon vision in the method section, to associate it with the information of the used monitor’s resolution.

      Minor comments :

      Lines 37-39: This needs a reference.

      A reference has been added (McFarland, 1977)

      Lines 39-41: But see some papers published recently on Harris's hawks.

      Thank you for the references, we added the citation as well as a few more papers (Kane et al., 2015; Kano et al., 2018; Miñano et al., 2023; Yorzinski & Platt, 2014).

      Lines 41-43: This sentence needs a reference as well.

      A reference has been added (Cresswell, 1994; M. H. R. Evans et al., 2018; Inglis & Lazarus, 1981)

      Lines 56-103: In this paragraph, head down and head up also depend on the retinal map of the birds! Some birds have a visual streak that allows them to see potential threats while foraging. Please add more information about the importance of photoreceptor distribution.

      Thank you for pointing out this issue. We rewrote the sentence in L65-69 as follows to include the importance of retinal structures.

      “In several species, especially those with a broad visual field and specific retinal structures such as the visual streaks, individuals can simultaneously engage in foraging activities while remaining vigilant (Fernández-Juricic, 2012), likely using peripheral vision to detect approaching threats (Bednekoff & Lima, 2005; Cresswell et al., 2003; Kaby & Lind, 2003; Lima & Bednekoff, 1999).”

      Lines 76-79: you wrote : ".... favor alternative hypotheses based on their findings". Which findings? You need to explain.

      We rewrote this part as follows (L80-81).

      “other studies found evidence for the risk dilution (Beauchamp & Ruxton, 2008) and the edge effect (Inglis & Lazarus, 1981) in their study systems.”

      Lines 109-110: It would be good to have a representation of what an area and a fovea are, how they are placed in the eye, what types of fovea exist, and how they relate to the visual field. Where does the fovea project?

      We now give a better description of the pigeon’s visual field in the experimental rationales section, which we hope will help the reader understand the key features of pigeon vision (see L135-150). Specifically, we now say in L137-138:

      “they have one fovea centrally located in the retina of each eye, with an acuity of 12.6 c/deg (Hodos et al., 1985). Their fovea projects laterally at ~75° into the horizon in their visual field.”

      Lines 109-113: You might need to see some new papers here about the fovea. See for instance Bringmann 2019.

      Thank you for the suggestion, we now give a more precise definition of the fovea and refer to Bringmann’s paper for more details (L113-114):

      “a pit-like area in the retina with high concentration of cone cells where visual acuity is highest, and is responsible for sharp, detailed, and color vision.”

      Lines 113-120: Please explain how the visual field is related to the fovea. Where does the fovea project in the visual field?

      Similarly to the question above, we now give a more precise description of the pigeon’s visual field (see L135-150).

      Line 131-134: For a non-expert, you would need to explain what is micro, meso and macro scale?

      These sentences have been removed when shortening the introduction and we are not referring to micro, meso and macro scales anymore.

      Lines 134-136: Please explain in one sentence the technique here.

      We now explain in one sentence how motion capture enables the tracking of head and body orientation (L130-132):

      “Motion capture cameras track with high accuracy the 3D position of markers, which, when attached to the pigeon’s head and body, enables to reconstruct the rotations of the head and body in all directions.”

      Line 140: You presented here for the first time the word "foveation". Has this term been used before? If so, please add a reference. If not, please explain what you mean by foveation precisely.

      Thank you for pointing out this omission. We now provide the following definition, “directing visual focus to the fovea to achieve the clearest vision”, where we first mention the term foveation (L149-150).

      Lines 146-148: Please explain why this proves that it is appropriate to not record eyes movements, and is this true for every behaviours?

      We acknowledge that some small eye movements might occur and reduce the accuracy of the method. This error is accounted for in the system by using a ±10 degree range around the foveas. The lines the reviewer referred to were removed when shortening the introduction, but we added an explanation in the paragraph describing pigeon vision to make it clearer (L147-150):

      “Yet, it should be noted that their eye movement was not tracked in our system, although it is typically confined within a 5 degrees range (Wohlschläger et al., 1993). We thus considered this estimation error of the foveation (directing visual focus to the fovea to achieve the clearest vision) in our analysis, as a part of the error margin (see Methods).”
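
      The error-margin logic described above can be sketched as follows. This is only an illustration under stated assumptions, not the analysis code used in the study: the head-centred axis convention (x forward, y left, z up), the function names, and the simplification that both foveas project onto the horizon at ±75° azimuth are all hypothetical choices made for the example.

```python
import math

FOVEA_AZIMUTH_DEG = 75.0   # lateral projection of each fovea (from the text)
MARGIN_DEG = 10.0          # error margin around each fovea (from the text)


def unit_vector(azimuth_deg, elevation_deg):
    """Direction in an assumed head frame: x forward, y left, z up."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def angle_between_deg(u, v):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_foveated(object_direction):
    """True if the object lies within the margin of either fovea."""
    foveas = (unit_vector(+FOVEA_AZIMUTH_DEG, 0.0),
              unit_vector(-FOVEA_AZIMUTH_DEG, 0.0))
    return any(angle_between_deg(object_direction, f) <= MARGIN_DEG
               for f in foveas)
```

      With these assumptions, an object 5° away from the left foveal axis counts as foveated, whereas an object straight ahead of the beak does not.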

      Lines 161-163: What is the frontal and binocular field for? You would need to explain the different fields of view and what they are supposed to be for.

      Furthermore, does the visual field of pigeon have been studied? If so, you would need to add more information about it.

      This information is now given in the new paragraph describing the pigeon’s vision in the  “Experimental rationales” section (see L135-150).

      Figure 1: It is not clear here which panels correspond to a, b or c. Please use some boxes to clarify it.

      Thank you for the comment, we now have made the figure’s sub-panels clearer.

      Lines 193-194: You wrote "... such as foveas (also known as the area centralis). No, this is not the same.

      (1) In some species, you have two foveas, one placed centrally in the retina, one place temporally. So the fovea is not the area centralis.

      (2) Second, some species do have an area centralis but without a fovea.

      Thank you for pointing out the inaccuracy. In this case, we were referring specifically to the pigeon’s fovea, which is sometimes referred to as “area centralis”, but we have now changed the sentence as follows to avoid any confusion (L174-175):

      “The initial two hypotheses (Hypotheses 1 and 2) aim to examine whether foveation correlates with predator detection.”

      Lines 192-212: I did not understand the logic of the hypotheses numbers? Why do you have 2.1 but not 3.1 for instance? And if you have two hypotheses for the within a global one (for instance, 2.1 and 2.2), what is the main hypothesis 2? You should explain more here because we get lost here and in the result section as well.

      We recognize this section might have appeared confusing to the reader. In short, we had four main hypotheses: 1) the fovea is used to evaluate predator cues, 2) the latency to foveate is related to vigilance behaviors. These first 2 hypotheses aimed to determine if the latency to foveate on the predator cue could be related to the detection. 3) foveation is related to the escape response of the pigeons and 4) there is a collective influence in the escape response. We further divided some of the hypotheses into 2 sub-hypotheses whenever 2 different tests were used to answer the same question. We have modified this section to be clearer.

      Lines 224-229: Where are the figures and statistics for these results?

      These results are presented in Table S1. We apologize for forgetting to add this reference and have now added it (L211).

      Lines 229-231: This should be in the method section.

      This model explanation (as well as all others mentioned hereafter) has been moved to the method section as suggested.

      Lines 248-252: This should be in the method section. Furthermore, you should better explain the model selection.

      Please see earlier comment. Additionally, we are now better explaining how the model has been built.

      Figure 2: It is not clear on the figure which letters correspond to which panels. Please improve the readability of the figure.

      It was modified accordingly.

      Lines 274-278: This should be in the method section.

      Please see earlier comment.

      Line 281: The "Fig.3" should be mentioned in the previous sentence.

      It was modified accordingly.

      Figure 3: Please explain why the latency to foveate had negative values in Fig.2 but not here, and not in Fig. 4 as well. This again highlights that we missed a number of information in the methods about the transformation of the data and the model selection.

      The variable presented in Fig 2d is not the latency to foveate but the “Normalized frequency at which the object was observed within foveal regions” (hypothesis 1). It represents the amount of time the object was lying within one of the foveal regions of the individual (“how long the pigeons foveated on it”), further normalized to unit sum to make all objects comparable. This variable was indeed logit-transformed (hence the negative values) to improve residual fit in the model, but this information (as well as other transformations) is always clearly stated in the axis captions of the graphs. Additionally, we have now improved the statistical analysis section to make the model used for each hypothesis test clearer. Please let us know if you have suggestions for further improving the presentation.
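
      To illustrate why a logit-transformed proportion can be negative, a minimal sketch is given below. The durations are invented for the example and the function names are hypothetical; only the two steps (normalization to unit sum, then logit transformation) come from the description above.

```python
import math


def normalize_to_unit_sum(durations):
    """Scale non-negative durations so that they sum to 1."""
    total = sum(durations)
    return [d / total for d in durations]


def logit(p):
    """Log-odds of a proportion p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Hypothetical time (s) that four objects spent within a foveal region.
durations = [6.0, 2.0, 1.5, 0.5]
proportions = normalize_to_unit_sum(durations)
logits = [logit(p) for p in proportions]
```

      Any proportion below 0.5 maps to a negative logit, so negative axis values are expected whenever an object accounts for less than half of the total foveation time.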

      Lines 297-301: This should be in the method section.

      Please see earlier comment.

      Lines 301-305: Fig. 3 b and c only referred to the two first factors. Please add more figures for the other factors. This could be in supp. Mat.

      We added the 3 graphs for the proportion of time foveating on the monitor, the saccade rate and the proportion of time foveating on conspecifics in the supplementary (Fig S6).

      Lines 306-309: This should be in methods, and you should have explained in methods how you performed your model selection.....

      We prefer leaving this paragraph in the result section, as it was intended to give the reader extra information on the predictive power of the different variables (by comparing the effectiveness of the models including one variable at a time, all the rest being equal) and not on the model selection per se. However, we now explain our goal better in the statistics section regarding this analysis (L635-636):

      “We further tested the relative predictive power of the different test variables by comparing the resulting models’ efficiency using AIC scores.”
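
      The comparison described in this sentence can be sketched with the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and L is the maximized likelihood. The predictor names and log-likelihood values below are placeholders invented for illustration; they are not results from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) for models that differ
# only in the single test variable they include.
candidates = {
    "predictor_A": (-101.2, 4),
    "predictor_B": (-104.9, 4),
    "predictor_C": (-103.0, 4),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
# Difference from the best model; the best model has a difference of 0.
delta_aic = {name: score - scores[best] for name, score in scores.items()}
```

      Because all candidate models here have the same number of parameters, the ranking by AIC follows the ranking by log-likelihood; the penalty term matters only when model complexity differs.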

      Lines 317-319: This should be in the method section.

      Please see earlier comment.

      Lines 320-322: This should be in the method section.

      Please see earlier comment.

      Lines 332-334: This should be in the method section.

      Please see earlier comment.

      Lines 334-336: Then, if this is not significant, you cannot say that.

      Thank you for noticing the inaccuracy, we have now rephrased it as (L298-299):

      “Earlier foveation of the first pigeon was not significantly related to an earlier escape responses among the other flock members, although there was a trend (χ2(1) = 3.66, p = 0.0559).”
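
      As a quick arithmetic check of the reported statistic, the upper-tail probability of a chi-square distribution with 1 degree of freedom can be computed with the standard library alone. The sketch assumes that the test statistic follows a chi-square distribution with 1 degree of freedom, as the notation χ2(1) suggests; the function name is hypothetical.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df.

    For 1 df, X is the square of a standard normal variable, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_sf_df1(3.66)
```

      The result is roughly 0.056, which agrees with the reported p = 0.0559 once rounding of the test statistic to two decimals is taken into account, and it lies just above the conventional 0.05 threshold.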

      Line 336: Please explain why you did different models. We missed a lot of information in the method about your strategy for statistics.?

      We have now added a lot more information on the models in the statistics section, according to this comment as well as the previous ones. We hope the explanations of the analyses are now clearer to the reader.

      Lines 339-349: This should be in the method section.

      Please see earlier comment.

      Results section: As you may have understood, there are too many sentences that should be moved into the method section. Furthermore, I would recommend modifying the headings so that they are more biologically meaningful, similar to what you have done in the discussion section.

      Thank you for the comments. We agree with most of them, and have modified the manuscript accordingly. Additionally, we now use the same headings in the results section as the ones used in the discussion to make the text easier to follow.

      Lines 500-501: What were the body weight of the pigeon? At which weight of their full weight they were?

      This information is now added (492 ± 41g; mean ± SD). We did not control the amount of food during our experiments and only ensured 24h without food by feeding the pigeons after the experiment was completed. This information was added as follows (L454-456):

      “On experimental days, they were fed only after the experiments was completed; this ensures 24-hour no feeding at the time of the experiment, although we did not control the amount of the food over the course of the experimental periods.”

      Line 522-523: Those screens are very good for pigeons.

      Thank you for the positive comment; we indeed tried to match bird vision as closely as possible.

      Lines 527-528: At which frequency was produced the moving stimulus? Your screen can display up to 144Hz, which is very good. But can your laptop do it? If not, it is important to mention it as pigeons may have a temporal resolution of vision up to 149Hz.

      Our laptop indeed supports 144Hz display. In addition, we now mention the temporal resolution of pigeon vision (L480-482):

      “We specifically chose a monitor with high temporal resolution to match the pigeon’s Critical Flicker Fusion Frequency (threshold at which a flickering light is perceived by the eye as steady) that reaches up to 143Hz (Dodt & Wirth, 1954).”

      Lines 555-572: Did you use a control shape in your experiment? Indeed, they may escape because of a moving pattern but not a predator shape?

      We did not use a control shape, as the aim of the experiment was not to directly test the effect of the shape itself. We designed the predator cue to resemble an approaching predator to ensure a response from the pigeons, but it might be that other shapes would have worked as well.

      Lines 588-589: Please explain why the coordinate system of the pigeon's head is considered as the visual field?

      From what I have understood, you did not reconstruct the visual fields, but only the position of the fovea. This should be noted like this as visual field involves more than a sphere around the head (binocular and monocular sectors, blind sectors, vertical extension....).

      Thank you for noticing the inaccuracy, we indeed did not consider other sectors of the visual field and therefore rephrased it as (L551): “the location of the objects and conspecifics from the pigeon’s perspective”.

      Lines 601-604: How much does it represent?

      As this was estimated by visual inspection, we do not have the exact percentage of data loss that was caused by grooming. However, because of the number of cameras in the SMART BARN motion capture system, it is reliable in detecting markers inside the space in “ideal” conditions (without occlusion). For example, a similar set-up found marker track loss of <1% using a model bird (Itahara & Kano 2022).

      Itahara, A., & Kano, F. (2022). “Corvid Tracking Studio”: A custom-built motion capture system to track head movements of corvids. Japanese Journal of Animal Psychology, 72(1), 1–16. https://doi.org/10.2502/janip.72.1.1

      Lines 610-612: You would need to cite Wood 1917 and Hodos et al. 1991 who described the presence of a fovea in this species.

      We added both citations to the manuscript.

      Line 611: Again, the fovea is not equal to the area centralis.

      Thank you, we changed it as well.

      Lines 625-626: you wrote "... in a few instances....". Please explain more. How many? What proportion?

      This happened in 9 observations out of 120. We now specify it in the text as well (L587-589):

      “in a few instances (9 out of 120 observations), pigeons foveated on the model predator after the looming stimulus had disappeared, but these cases were excluded from our analysis.”

      Lines 640-653: We missed a lot of information in the section "statistical analysis". If you moved most of the sentence from the results that describe the methods in the method section, that would be much better. Furthermore, you would need to explain more what statistics you used, which model selection, what type of data transformation....

      We agree this section lacked information, and we moved the information from the result to the statistics section.

      Supplementary materials: boxplots from Fig. S1 and S2 are too small and impossible to read. Please improve the readability.

      We now have enlarged these plots to make them more readable.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript "Engineering of PAClight1P78A: A High-Performance Class-B1 GPCR-Based Sensor for PACAP1-38" by Cola et al. presents the development of a novel genetically encoded sensor, PAClight1P78A, based on the human PAC1 receptor. The authors provide a thorough in vitro and in vivo characterization of this sensor, demonstrating its potential utility across various applications in life sciences, including drug development and basic research.

      The diverse methods to validate PAClight1P78A demonstrate a comprehensive approach to sensor engineering by combining biochemical characterization with in vivo studies in rodent brains and zebrafish. This establishes the sensor's biophysical properties (e.g., sensitivity, specificity, kinetics, and spectral properties) and demonstrates its functionality in physiologically relevant settings. Importantly, the inclusion of control sensors and the testing of potential intracellular downstream effects such as G-protein activation underscore a careful consideration of specificity and biological impact.

      Strengths:

      The fundamental development of PAClight1P78A addresses a significant gap in sensors for Class-B1 GPCRs. The iterative design process -starting from PAClight0.1 to the final PAClight1P78A variant - demonstrates compelling optimization. The innovative engineering results in a sensor with a high apparent dynamic range and excellent ligand selectivity, representing a significant advancement in the field. The rigorous in vitro characterization, including dynamic range, ligand specificity, and activation kinetics, provides a critical understanding of the sensor's utility. Including in vivo experiments in mice and zebrafish larvae demonstrates the sensor's applicability in complex biological systems.

      Weaknesses:

      The manuscript shows that the sensor fundamentally works in vivo, albeit in a limited capacity. The titration curves show sensitivity in the nmol range at which endogenous detection might be possible. However, perhaps the sensor is not sensitive enough or there are not any known robust paradigms for PACAP release. A more detailed discussion of the sensor's limitations, particularly regarding in vivo applications and the potential for detecting endogenous PACAP release, would be helpful.

      We thank the reviewer for carefully analyzing our in vivo data and highlighting the limitation of our results regarding the sensor’s applicability in detecting endogenous PACAP. We added several sections discussing future possibilities for optimization in the discussion (see paragraphs 2-4). We agree that a more specific discussion of the limitations of our study is an important addition to help design future experiments.

      There are several experiments with an n=1 and other low single-digit numbers. I assume that refers to biological replicates such as mice or culture wells, but it is not well defined. n=1 in experimental contexts, particularly in Figure 1, raises significant concerns about the exact dynamic range of the sensor, data reproducibility, and the robustness of conclusions drawn from these experiments. Also, ROI for cell cultures, like in Figure 1, is not well defined. The methods mentioned ROIs were manually selected, which appears very selective, and the values in Figure 1c become unnecessarily questionable. The lack of definition for "ROI" is confusing. Do ROIs refer to cells, specific locations on the cell membrane, or groups of cells? It would be best if the authors could use unbiased methods for image analysis that include the majority of responsive areas or an explanation of why certain ROIs are included or excluded.

      We thank the reviewer for the helpful suggestions. We have increased the number of replicates to n=3 for both HEK293T and neuron data depicted in Fig.1c. Furthermore, we have added Fig.1c’, which contains the quantification of the maximum responses obtained in the dataset shown in Fig.1c and also depicts the single values for each replicate. To clarify the definition of an ROI in our manuscript, we have detailed the process of ROI selection in the Methods section “Cell culture, imaging and quantification”. Additionally, we also increased mouse numbers for in vivo PACAP infusions in mice (see Figure 4g).

      Reviewer #2 (Public Review):

      Summary:

      The PAClight1 sensor was developed using an approach successful for the development of other fluorescence-based GPCR sensors, which is the complete replacement of the third intracellular loop of the receptor with a circularly-permuted green fluorescent protein. When expressed in HEK cells, this sensor showed good expression and a weak but measurable response to the extracellular presence of PACAP1-38 (a F/Fo of 43%). Additional mutation near the site of insertion of the linearized GFP, at the C-terminus of the receptor, and within the second intracellular loop produced a final optimized sensor with F/Fo of >1000%. Finally, screening of mutational libraries that also included alterations in the extracellular ligand-binding domain of the receptor yielded a molecule, PAClight1P78A, that exhibited a high ligand-dependent fluorescence response combined with a high differential sensitivity to PACAP (EC50 30 nM based on cytometric sorting of stably transfected HEK293 cells) compared to its congener VIP, (with which PACAP shares two highly related receptors, VPAC1 and VPAC2) as well as several unrelated neuropeptides, and significantly slowed activation kinetics by PACAP in the presence of a 10-fold molar excess of the PAC1 antagonist PACAP6-38. A structurally highly similar control construct, PAClight1P78Actl, showed correspondingly similar basal expression in HEK293 cells, but no PACAP-dependent enhancement in fluorescent properties.

      PAClight1P78A was expressed in neurons of the mouse cortex via AAV9.hSyn-mediated gene transduction. Slices taken from PAClight1P78A-transfected cortex, but not slices taken from PAClight1P78Actl-transfected cortex exhibited prompt and persistent elevation of F/Fo after 2 minutes of perfusion with PACAP1-38 which persisted for up to 14 minutes and was statistically significant after perfusion with 3000, but not 300 or 30 nM, of peptide. Likewise, microinfusion of 200 nL of 300 uM PACAP1-38 into the cortex of optical fiber-implanted freely moving mice elicited a F/Fo (%) of greater than 15, and significantly higher than that elicited by application of similar concentrations of VIP, CRF, or enkephalin, or vehicle alone. In vivo experiments were carried out in zebrafish larvae by the introduction of PAClight1P78A into single-cell stage Danio rerio embryos using a Tol2 transposase-based plasmid with a UAS promoter via injection (of plasmid and transposase mRNA), and sorting of post-fertilization embryos using a marker for transgenesis carried in the UAS:PAClight1P78A construct. Expression of PAClight1P78A was directed to cells in the olfactory bulb which express the fish paralog of the human PAC1 receptor by using the Tg(GnRH3:gal4ff) line, and fluorescent signals were elicited by intracerebroventricular administration of PACAP1-38 at a single concentration (1 mM), which were specific to PACAP and to the presence of PAClight1P78A per se, as controlled by parallel experiments in which PAClight1P78Actl instead of PAClight1P78A was contained in the transgenic plasmid.

      Major strengths and weaknesses of the methods and results

      The report represents a rigorous demonstration of the elicitation of fluorescent signals upon pharmacological exposure to PACAP in nervous system tissue expressing PAClight1P78A in both mammals (mice) and fish (zebrafish larvae). Figure 4d shows a change in GFP fluorescence activation by PACAP occurring several seconds after the cessation of PACAP perfusion over a two-minute period, and its persistence for several minutes following. One wonders if one is apprehending the graphical presentation of the data incorrectly, or if the activation of fluorescence efficiency by ligand presentation is irreversible in this context, in which case the utility of the probe as a real-time indicator, in vivo, of released peptide might be diminished.

      We thank the reviewer for their careful consideration of our manuscript and agree that the activation of PAClight persisting for several minutes at micromolar concentrations could be a potential limitation for in vivo applications. We added a possible explanation for the persisting sensor activation in response to artificial application of PACAP38 in paragraph 3 of the discussion. We agree that this addition eases the interpretation of PAClight signals detected in vivo. 

      Appraisal of achievement of aims, and data support of conclusions:

      Small cavils with controls are omitted for clarity; the larger issue of appraisal of results based on the scope of the designed experiments is discussed in the section below. An interesting question related to the time dependence of the PACAP-elicited activation of PAClight1P78A is its onset and reversibility, and additional data related to this would be welcome.

      We agree that the reversibility of the sensor’s fluorescence is indeed an important feature, especially for detecting endogenous PACAP release. Our data indicate that the sensor’s fluorescence is reversible when detecting small to medium doses of PACAP38 (see Figure 4d – Application of 30-300nM) that are presumably closer to physiological concentrations than the non-reversible concentration of 3000nM. Please see also our new discussion on peptide concentrations in paragraph 4 of our discussion. For future experiments, it is indeed advisable to adjust the interval of repeated applications to the decay of the response at the respective concentration. Considering the long-lasting downstream effects of endogenous signaling, longer intervals between ligand applications are generally preferred to match more closely the physiological range in which endogenous PAC1 is most likely effective.

      Discussion of the impact of the work, and utility of the methods and data:

      Increasingly, neurotransmitter function may be observed in vivo, rather than by inferring in vivo function from in vitro, in cellular, or ex vivo experimentation. This very valuable report discloses the invention of a genetically encoded sensor for the class B1 GPCR PAC1. PAC1 is the major receptor for the neuropeptide PACAP, which in turn is a major neurotransmitter involved in brain response to psychogenic stress, or threat, in vertebrates as diverse as mammals and fishes. If this sensor possesses the sensitivity to detect endogenously released PACAP in vivo it will indeed be an impactful tool for understanding PACAP neurotransmission (and indeed PACAP action in general, in immune and endocrine compartments as well) in future experiments.

      However, the sensor has not yet been used to detect endogenously released PACAP. Until this has been done, one cannot answer the question as to whether the levels of exogenously perfused/administered PACAP used here merely to calibrate the sensor's sensitivity are indeed unphysiologically high. If endogenous PACAP levels don't get that high, then the sensor will not be useful for its intended purpose. The authors should address this issue and allude to what kind of experiments would need to be done in order to detect endogenous PACAP release in living tissue in intact animals. The authors could comment upon the success of other GPCR sensors that have been used to observe endogenous ligand release, and where along the pathway to becoming a truly useful reagent this particular sensor is.

      We thank the reviewer for pointing out that the manuscript did not make clear that detecting endogenous PACAP release was outside the intended scope of this paper. We therefore expanded our discussion to encompass the intended purpose of detecting artificially infused or applied PAC1 agonists, such as conducting fundamental tests of drug specificity and developing new pharmacological ligands to selectively target PAC1. This includes a more detailed discussion of our in vivo findings and clearer phrasing that stresses the potential application for applied drugs and not endogenous PACAP (see last paragraph in the discussion).

      We also agree that little is known about endogenous concentrations of PACAP in the brain. However, we have supplemented our discussion with several references estimating lower concentrations of PACAP and other peptides in vivo, suggesting average PACAP levels below the detection threshold of the sensor. Importantly, within certain brain regions and in closer proximity to release sites, significantly higher concentrations might be reached. Additionally, our data indicate that the concentrations observed under our current conditions do not saturate the sensor in vivo.  

      We therefore acknowledge the reviewer’s comment on the sensor’s potential limitations under our current experimental conditions. Hence, we expanded our discussion and suggest the use of higher resolution imaging to potentially reveal loci of high PACAP concentrations, which should be validated by future studies (see also our added discussion in paragraph 4). 

      Reviewer #3 (Public Review):

      Summary:

      The manuscript introduces PAClight1P78A, a novel genetically encoded sensor designed to facilitate the study of class-B1 G protein-coupled receptors (GPCRs), focusing on the human PAC1 receptor. Addressing the significant challenge of investigating these clinically relevant drug targets, the sensor demonstrates a high dynamic range, excellent ligand selectivity, and rapid activation kinetics. It is validated across a variety of experimental contexts including in vitro, ex vivo, and in vivo models in mice and zebrafish, showcasing its utility for high-throughput screening, basic research, and drug development efforts related to GPCR dynamics and pharmacology.

      Strengths:

      The innovative design of PAClight1P78A successfully bridges a crucial gap in GPCR research by enabling realtime monitoring of receptor activation with high specificity and sensitivity. The extensive validation across multiple models emphasizes the sensor's reliability and versatility, promising significant contributions to both the scientific understanding of GPCR mechanisms and the development of novel therapeutics. Furthermore, by providing the research community with detailed methodologies and access to the necessary viral vectors and plasmids, the authors ensure the sensor's broad applicability and ease of adoption for a wide range of studies focused on GPCR biology and drug targeting.

      Weaknesses

      To further strengthen the manuscript and validate the efficacy of PAClight1P78A as a selective PACAP sensor, it is crucial to demonstrate the sensor's ability to detect endogenous PACAP release in vivo under physiological conditions. While the current data from artificial PACAP application in mouse brain slices and microinfusion in behaving mice provide foundational insights into the sensor's functionality, these approaches predominantly simulate conditions with potentially higher concentrations of PACAP than naturally occurring levels.

      We thank the reviewer for their valuable comments and agree that the use of PAClight for detecting endogenous PACAP will be of great interest to the scientific community and should be a goal for future research. Considering the time, equipment and additional animal licenses necessary, we are convinced that these questions go beyond the scope of the current paper and would be better addressed in a follow-up publication. We therefore rephrased the discussion and added more details to further clarify the intended purpose of the current study. Additionally, we added a paragraph in the discussion suggesting experiments needed to validate PAClight for putative future in vivo applications.

      Although the sensor's specificity for the PAC1 receptor and its primary ligand is a pivotal achievement, exploring its potential application to other GPCRs within the class-B1 family or broader categories could enhance the manuscript's impact, suggesting ways to adapt this technology for a wider array of receptor studies. Additionally, while the sensor's performance is convincingly demonstrated in short-term experiments, insights into its long-term stability and reusability in more prolonged or repeated measures scenarios would be valuable for researchers interested in chronic studies or longitudinal behavioral analyses. Addressing these aspects could broaden the understanding of the sensor's practical utility over extended research timelines.

      We extend our gratitude to the reviewer for diligently assessing our results. 

      Indeed, the very high level of sensitivity that we could achieve in PAClight leads us to think that potentially a grafting-based approach, such as the one we’ve recently described for class-A GPCR-based sensors (PMID: 37474807) could also work for the direct generation of multiple class-B1 sensors based on the optimized fluorescent protein module present in PAClight. Unfortunately, considering the amount of work that testing this hypothesis would entail, we are not able to perform these experiments in the context of this revision, and would rather pursue them as a future project. Nevertheless, we have expanded the discussion of the manuscript with a paragraph with these considerations.

      While we lack comprehensive data on the long-term stability of the sensor, our preliminary findings from photometry recordings optimization indicate consistent baseline expression of PAClight and PACLight ctrl over several weeks. Conducting experiments to systematically assess stability would require several months, which is currently impractical due to limitations in tools and licenses for repeated in vivo infusions. Hence, we intend to include these experiments in potential follow-up studies.

      Furthermore, the current in vivo experiments involving microinfusion of PACAP near sensor-expressing areas in behaving mice are based on a relatively small sample size (n=2), which might limit the generalizability of the findings. Increasing the number of subjects in these experimental groups would enhance the statistical power of the results and provide a more robust assessment of the sensor's in vivo functionality. Expanding the sample size will not only validate the findings but also address potential variability within the population, thereby reinforcing the conclusions drawn from these crucial experiments.

      We agree with the reviewer that a sample size of N=2 is not sufficient for in vivo recordings. We therefore increased the sample size and now present recordings from 5 PAClight1P78A and 4 PAClight-control mice. Of note, the new data validate our previous findings and conclusions and give a better idea of the in vivo variability, which we now cover in much more detail in the discussion (see paragraph 2).

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      The lower potency of maxadilan activation might reflect broader implications for ligand-receptor dynamics. Perhaps the authors could discuss the maxadilan binding from a structural perspective, including AlphaFold models. Also, discussing how these findings might influence sensor application in diverse biological contexts would be insightful. Clear definitions and consistent use of these terms are crucial for ensuring that readers understand the methods and results.

      We would like to thank the reviewer for the comments. As part of this work, we did not obtain a dose-response curve for the maxadilan peptide and only reported the maximal response of the sensor to a high concentration of the peptide (10 µM). Thus, our findings inform us about the maximal efficacy of the peptide rather than its potency towards the PAC1R. Furthermore, we would like to point out that, because no structural details are available for any GPCR-based sensor published to date, we cannot draw any molecularly accurate conclusion about the precise reasons why a different ligand (in this case the sandfly maxadilan) induces a lower maximal response than the endogenous cognate ligand of the receptor. We do not believe that AlphaFold models can accurately replace structural information in this regard, especially because the amino acid linker regions between the GPCR and the fluorescent protein, which are a critical determinant of allosteric chromophore modulation by ligand-induced conformational changes, typically receive the lowest confidence scores in all AlphaFold-predicted structural models of GPCR-based sensors. Finally, we would like to refer the reviewer to a very nice recent publication (PMID: 32047270) that resolved the structures of each of these peptides bound to the PAC1 receptor-Gs protein complex and provides accurate molecular details on the different modalities of receptor binding and activation by PACAP1-38 versus maxadilan.

      Reviewer #2 (Recommendations For The Authors):

      The authors are congratulated on the meticulous achievement of their aim, i.e. a fluorescence-based sensor for the detection of PACAP with in vivo utility. Whether or not this sensor will have the requisite sensitivity to detect the release of endogenous PACAP within various regions of the nervous system, in response to specific environmental stimuli or changes in brain or physiological state, remains to be determined.

      We thank the reviewer for the very positive evaluation of our manuscript and for the suggested additions that will improve the strength of our arguments.

      We agree that the in vivo detection of endogenous PACAP will be an important objective for future studies. Due to time, resource, and animal license constraints, we are not able to address this objective in our current study, but we now detail possible future experiments in the discussion section. Please also see our earlier answer to the suggested discussion points.

      Reviewer #3 (Recommendations For The Authors):

      To comprehensively assess the sensor's sensitivity and specificity to endogenous PACAP, I recommend conducting additional in vivo experiments where PAClight1P78A is expressed in neurons that endogenously express the Pac1r receptor (using Adcyap1r1-Cre mouse line). These experiments should involve applying sensory or emotional stimuli known to evoke PACAP release or activating upstream PACAP-expressing neurons. Such studies would offer valuable data on the sensor's performance under natural physiological conditions and its potential utility for exploring PACAP's roles in vivo.

      We express our gratitude to the reviewer for providing detailed methodological approaches to examine endogenous PACAP release. These suggestions will prove invaluable for future investigations and would be important additions to a follow-up publication. As mentioned earlier, we have incorporated some of these approaches into our discussion. Additionally, we have underscored the existing limitations in detecting endogenous PACAP in vivo and emphasized the relevance of PAClight for drug development purposes.