  1. Feb 2025
    1. Reviewer #1 (Public review):

      Summary:

      Jirouskova and colleagues have carried out an in-depth proteomic characterization of the dynamics of the liver fibrotic response and its subsequent resolution in two distinct models of liver injury: the CCl4-induced model of hepatotoxicity and pericentral/bridging liver fibrosis, and the DDC feeding model of obstructive cholestasis and periportal fibrosis. They focussed on both the insoluble extracellular matrix (ECM) components and the soluble secreted factors produced by hepatic stellate cells (HSCs) and/or portal fibroblasts (PFs). They identified compartment- and time-resolved proteomic signatures in the two models with disease-specific factors or matrisomes. Their study also identified phenotypic differences between the models: while the CCl4 model induced profound hepatotoxicity followed by resolution, the DDC model induced more lasting liver damage and proteomic changes that resembled advanced human liver fibrosis favouring hepatocarcinogenesis.

      Overall, this comprehensive and very well conducted study is rigorous and well planned. The conclusions are supported by compelling studies and analyses. One caveat is the lack of mechanistic experiments to prove causality, but this can be carried out in follow-up studies.

      Strengths:

      • A major strength in the study is that the experiments are rigorous and very well conducted. For instance, the authors utilized two models of liver fibrosis to study different aspects of the pathology - hepatotoxicity vs cholestasis. In addition, 4 time points for each model were investigated - 2 for fibrosis development and 2 for fibrosis resolution. They have taken 3 components for proteomic analyses - total lysates, insoluble ECM components as well as the soluble secreted factors. Thus, the authors provide a comprehensive overview of the fibrosis and resolution process in these models.

      • Another great strength of the study is that the methodology was able to dissect unique pathways relevant for each model as well as common targets. For example, the authors identified known pathways such as mTOR signalling to be differentially regulated in the CCl4 vs DDC model. mTOR signalling was increased in the DDC model, which is associated with hyperproliferation. Showing that the approach is specific enough to distinguish between two similar (both induce fibrosis) but mechanistically distinct (hepatotoxicity vs cholestasis) models is thus a strong point of the study.

      Weaknesses:

      • A caveat of the study is that the authors have not conducted mechanistic (gain of function/loss of function) studies on any of their identified targets to truly prove causality. This remains one of the limitations of this study, and future studies should investigate this point in detail. For instance, it would have been intriguing to dissect whether knocking out genes specific to one model or genes common to both would yield distinct phenotypic outcomes.

    2. Reviewer #2 (Public review):

      Summary:

      The authors suggest that ECM abundance and composition change depending on the aetiology of liver fibrosis. To understand this they have investigated the proteome in two models of animal fibrosis and resolution. They suggest their findings could provide a foundation for future anti-fibrotic therapies.

      The revised version has been improved. Although some areas remain (described below), it is perhaps the dataset that will be most valuable.

      Strengths:

      The dataset appears well supported and will be valuable.

      Weaknesses:

      The manuscript is still fairly descriptive but on balance this is a useful dataset and appears to have broad support in that regard.

      There are no conclusions that can be drawn from their rebuttal regarding the human data they included as it is one patient per group and will most likely change dramatically with more patients. As such this area is still an issue but they have improved some of the data elsewhere.

    1. eLife Assessment

      This valuable study suggests that capsaicin nanoparticle administration in rats activates the transcription factor Nrf2 by directly binding to its repressor, KEAP1, leading to the induction of cytoprotective genes and preventing alcohol-induced gastric damage, offering a potential avenue for treating alcoholism-related gastric disorders. Although improvements were made following the first revision, the evidence supporting capsaicin as an Nrf2 activator remains incomplete, as some methodological aspects still require revision and the interpretation of key data needs further clarification.

    2. Reviewer #1 (Public review):

      The paper by Gao et al. describes the effect of capsaicin on the NRF2/KEAP1 pathway. The authors carried out a set of in vitro and in vivo experiments that addressed the mechanisms of the protective effect of capsaicin on ethanol-induced cytotoxicity.

      The authors conclude that capsaicin activates NRF2, which leads to the induction of cytoprotective genes, preventing oxidative damage. The paper shows that capsaicin may directly bind to KEAP1 and that it is a noncovalent modification of the Kelch domain.

      The authors also designed new albumin-coated capsaicin nanoparticles, which were tested for the therapeutic effect in vivo.

      I appreciate the authors' experimental efforts to strengthen the study's conclusions. However, in my opinion, the paper is still not fully technically sound, which weakens the strength of the evidence.

    3. Reviewer #2 (Public review):

      Summary:

      In this paper the authors wanted to show that capsaicin can disrupt the interaction between Keap1 and Nrf2 by directly binding to Keap1 at an allosteric site. The resulting stabilization of Nrf2 would protect CAP-treated gastric cells from alcohol-induced redox stress and damage as well as inflammation (both in vitro and in vivo).

      Strengths:

      One major strength of the study is the use of multiple methods (CoIP, SPR, BLI, deuterium exchange MS, CETSA, MS simulations, target gene expression) that consistently show for the first time that capsaicin can disrupt the Nrf2/Keap1 interaction at an allosteric site and lead to stabilization and nuclear translocation of Nrf2.
      Moreover, the efforts to show causal involvement of the Keap/Nrf2 axis in the observed cellular effects, as well as to address potential off-target effects of the polypharmacological CAP, are appreciated.

      One point that still somewhat hampers full appreciation of the capsaicin effect in cells is that capsaicin is not investigated alone, but mostly in combination with alcohol only.
      Moreover, the true add-on value of the developed nanoparticles remains obscure.
      The in part relatively high levels of NRF2 in putatively unstressed cells call into question the validity of the models used.

      The rationale for switching between different CAP concentrations is unclear/not entirely convincing.

      The language and introduction could be improved.

      Overall, the authors are convinced that capsaicin can bind to Keap1 (although weakly) and release Nrf2 from degradation, with relevance for biological settings. With this, the authors provide a significant finding with marked relevance for the redox/Nrf2 as well as natural products/hit discovery communities.

      - Figure 2C: It is still not clear why naïve (unstressed/untreated) cells already show rather high nuclear abundance of Nrf2 (shouldn't Nrf2 be continuously tagged for degradation by Keap1?)
      - Figure 2G-H: Why switch to rather high concentrations?
      - Figure 2I: in the pics of mitochondria the control mitochondria look way more punctuated (likely fissed) than the ones treated with EtOH or EtOH + CAP. Wouldn't one expect that EtOH leads to mitochondrial fission and CAP can prevent it?
      - Figure 3H: High basal Nrf2 levels in unstressed/untreated HEK WT cells, why?
      - Figure 4a: Inclusion of an additional Keap1 binding protein (one with an ETGE motif) would have been desirable (to get information on specificity/risks of off-target (unwanted) effects of CAP)
      - Figure 4D: Why is there no stabilization of Nrf2 by CAP in lane 2?
      - Figure 4f: 5% DMSO is a rather high solvent concentration, why so high (the solvent alone seems to have quite marked effects!)
      - Figure 6/7: not expert enough to judge formulations and histology scores. However, the benefit of the encapsulated capsaicin does not become entirely clear to me, as CAP and IRHSA@CAP mostly do not significantly differ in their elicited response.
      - Figure 7: Rebamipide was introduced as positive control in the text with an activating effect on Nrf2, but there is no induction of hmox and nqo in Figure 7f, why? It does not look as if the positive control was wisely chosen.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Major concerns:

      For studies investigating capsaicin binding to KEAP1, the authors used capsaicin concentrations that are toxic to cells (Figures S1D and 4F, G). In vivo studies were performed only in 3 rats per group. The T-test was used for the comparison of more than two groups. Given the well-known issues with the specificity of the NRF2 antibody, the authors should provide appropriate controls, especially for IF and IHC staining.

      We sincerely appreciate your valuable comments. We repeated the CCK8 (Figure S1d) and pull-down (Figure 4g) experiments and updated the results. In September 2022, GES-1 cells were more sensitive to capsaicin (CAP) because Gibco serum from North America was used. Later, in 2024, we switched to serum from Australia (Gibco: 10099-141) and found that the GES-1 cells grew better, so we re-ran the test; the IC50 was 304.8 μM, so the concentrations used in this paper have no obvious toxicity to cells. In addition, we repeated the pull-down experiment with more reasonable concentrations of 32 μM and 100 μM, and the results were still in line with expectations. In summary, we conclude that the effect of CAP on GES-1 cells is closely related to the cell state, and that CAP at 32 to 100 μM can hinder the interaction between NRF2 and the Kelch domain of KEAP1. Moreover, at the cellular level, the experimental concentration of CAP did not exceed 32 μM, which is a relatively safe concentration for cells.

      Thank you very much for your comments. We also aim to use more replicates to increase the reliability of the results in animal experiments. Therefore, we recently added the experiment with Nfe2l2-knockout mice in Figure 9 (6 mice per group). Additionally, thank you very much for your comments on the use of the t-test; we reviewed the statistics and replaced the t-tests with one-way ANOVA.

      Finally, regarding your concern about the specificity of the NRF2 antibody, we used a commercial NRF2 antibody that has been KO/KD validated (Cat No. 16396-1-AP, Proteintech) and can be used for IF and IHC staining. Each fluorescence result was accompanied by western blotting of NRF2 in its active form at 105-110 kDa for statistical analysis; the trend was consistent with the IF and IHC results, which fully proves the correctness of the results presented (Figure 2c and Figure S8j).

      Reviewer #2 (Public Review):

      Weaknesses:

      One major weakness of the study is that plausibility is taken as proof for causality. The finding that capsaicin directly binds to Keap1 and releases Nrf2 from its fate of degradation (in vitro) is taken for granted as the sole explanation for the observed improved gastric health upon alcohol exposure (in vivo). There is no consideration or exclusion of any potential unrelated off-target effect of capsaicin, or proteins other than Nrf2 that are also controlled by Keap1. 

      Another point that hampers full appreciation of the capsaicin effect in cells is that capsaicin is not investigated alone, but mostly in combination with alcohol only.

      Thank you very much for this comment. In the introduction, we clarified as follows: “Currently, experiments conducted in rats have demonstrated that red pepper/capsaicin (CAP) had significant protective effects on ethanol-induced gastric mucosal damage, and the mechanism may be related to the promotion of vasodilation(6,7), increased mucus secretion(8) and the release of calcitonin gene-related peptide (CGRP)(9,10). However, it is noteworthy that whether the antioxidant activity of CAP works has not been fully investigated.” Therefore, we also recognize that CAP does not exert its effects through the KEAP1-NRF2 pathway alone. Your advice is very useful. We further examined TRPV1 and DPP3 to probe potential off-target effects of CAP. Capsazepine (CAPZ), a TRPV1 receptor antagonist, did not affect the protection of GES-1 cells by CAP (Fig S4f and S4g), which may indicate that CAP activation of NRF2 does not depend on TRPV1. The binding of CAP to DPP3, which contains an ETGE motif and can bind to KEAP1, was measured by BLI; the K_D between CAP and DPP3 was 1.653 mM (>100 μM), which may indicate that the potential off-target effect of CAP is low, because CAP binds KEAP1 considerably more strongly (K_D about 31.45 μM) (Fig S4h and S4i).
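      To make the selectivity argument above concrete, the two reported dissociation constants can be compared directly. The sketch below is only an illustration: it assumes a simple 1:1 equilibrium binding model with free ligand approximately equal to total ligand, which the response does not state, and the function and variable names are hypothetical. The numerical inputs (31.45 μM, 1.653 mM, and the 32 μM cellular CAP concentration) are taken from the response.

```python
def fractional_occupancy(ligand_um, kd_um):
    """Fraction of protein bound under an assumed 1:1 equilibrium model.

    theta = [L] / ([L] + K_D), with free ligand ~ total ligand (assumption).
    """
    return ligand_um / (ligand_um + kd_um)


# Values quoted in the author response, converted to micromolar.
KD_KEAP1_UM = 31.45    # CAP binding to KEAP1
KD_DPP3_UM = 1653.0    # CAP binding to DPP3 (1.653 mM)
CAP_UM = 32.0          # highest CAP concentration used in cells

occ_keap1 = fractional_occupancy(CAP_UM, KD_KEAP1_UM)
occ_dpp3 = fractional_occupancy(CAP_UM, KD_DPP3_UM)
kd_ratio = KD_DPP3_UM / KD_KEAP1_UM

print(f"KEAP1 occupancy at 32 uM CAP: {occ_keap1:.2f}")
print(f"DPP3 occupancy at 32 uM CAP:  {occ_dpp3:.3f}")
print(f"K_D ratio (DPP3 / KEAP1):     {kd_ratio:.1f}")
```

      Under these assumptions, roughly half of KEAP1 but only about 2% of DPP3 would be occupied at 32 μM CAP. This is a back-of-the-envelope reading of the reported constants, not a substitute for a cellular selectivity experiment.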

      Thank you very much for the second point. Multiple experiments have shown that CAP significantly up-regulates NRF2 in the presence of additional stimuli such as EtOH (Figure 1i), H₂O₂ (Figure 1l), PS-341 (Figure 2e) and DTT (Figure 4d), a pattern that is consistent with our understanding of allosteric regulation and is as expected. In the PS-341 and DTT experiments in particular, we included a CAP-only group, and the addition of CAP alone did not significantly up-regulate NRF2, which is completely different from traditional NRF2 activators (especially artificially designed covalent binding peptides, which have serious side effects).

      Reviewer #3 (Public Review):

      Weaknesses:

      While the study provides valuable insights into the molecular mechanisms and in vivo effects of CAP, further clinical studies are needed to validate its efficacy and safety in human subjects. The study primarily focuses on the acute effects of CAP on ethanol-induced gastric mucosa damage. Long-term studies are necessary to assess the sustained therapeutic effects and potential side effects of CAP treatment.

      Furthermore, the study primarily focuses on the interaction between CAP and the KEAP1-NRF2 axis in the context of ethanol-induced gastric mucosa damage. It may be beneficial to explore the broader effects of CAP on other pathways or conditions related to oxidative stress. CAP has been known for its interaction with the Transient Receptor Potential Vanilloid type 1 (TRPV1) channel and subsequent NRF2 signaling pathway activation. Those receptors are also expressed within the gastric mucosa and could potentially cross-react with CAP leading to the observed outcome. Including experiments to investigate this route of activation could strengthen the present study.

      While the design of CAP nanoparticles is innovative, further research is needed to optimize the nanoparticle formulation for enhanced efficacy and targeted delivery to specific tissues.

      Addressing these weaknesses through additional research and clinical trials can strengthen the validity and applicability of CAP as a therapeutic agent for oxidative stress-related conditions.

      Thank you very much for these suggestions. We also believe that CAP is very valuable and promising for protecting against EtOH-induced gastric mucosal injury, and we are actively pursuing patent applications; if conditions permit, longer-term drug research on biosafety is essential. Because the binding of CAP to KEAP1 is a new discovery, and given the important role of NRF2 in various oxidative stress-related diseases, we used human umbilical cord mesenchymal stem cells (HUC-MSCs) and H₂O₂ to explore the potential broader effects of CAP related to oxidative stress in cells (Figure 1l and 1m). We also performed TRPV1-related experiments and were surprised to find that inhibiting TRPV1 did not affect the effect of CAP (Supplementary Figure 4f and 4g). We hope that more people can read this article and do more interesting research together.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      Although this study has been conducted in rats, a direct proof that albumin-coated capsaicin nanoparticles act through activation of Nrf2 in protecting gastric mucosa against alcohol toxicity could be well conducted in commercially available Nrf2-deficient mice.

      Thank you very much for your suggestion; the comment is very constructive for improving this paper. We purchased Nrf2-deficient mice (Cat. NO. NM-KO-190433) and performed experiments. The results showed that Nrf2-knockout mice were more sensitive to EtOH and that the effects of CAP were partially eliminated (Figure 9), which further validated the role of the Nrf2-related signaling pathway in EtOH-induced gastric mucosal injury and in the therapeutic effect of CAP.

      Reviewer #1 (Recommendations For The Authors):

      Minor concerns include proofreading the paper. Actinomycin is not an inhibitor of translation.

      Thank you for your comment. We have revised “Actinomycin” to “Cycloheximide”.

      Reviewer #2 (Recommendations For The Authors):

      - Please have a careful look at your conclusions: just because two effects happen at the same time and may be plausible explanations for each other, it does not mean that they are really in a causative relationship in your given test system (unless unambiguously proven by additional experiments).

      Your suggestions are very constructive for us to improve this paper.

      We further examined the role of capsaicin using TRPV1, DPP3 and Nrf2-deficient mice, hoping to make our conclusions more credible to some extent.

      - You may want to frankly discuss other targets of capsaicin (e.g. the TrpV1 receptor) that possibly could also account for your observations, and that binding to Keap1 not only releases Nrf2 from proteasomal degradation.

      Thank you for your comment. We further examined TRPV1 and DPP3 to probe potential off-target effects of CAP. Capsazepine (CAPZ), a TRPV1 receptor antagonist, does not affect the protection of GES-1 cells by CAP (Fig S4f and S4g). Binding of CAP to DPP3, which has an ETGE motif, was measured by BLI; the K_D between CAP and DPP3 was 1.653 mM, which may indicate that the potential off-target effect of CAP is low (Fig S4h and S4i). At the same time, the activation of NRF2 by non-classical pathways, such as CAP regulation of DPP3 or other proteins, also deserves more discussion and experimental verification.

      - For Figure 1G it does not become entirely clear what has been done (and thus deduction of conclusions is hampered).

      Thank you for your comment. Network target analysis (Figure 1g) was performed to identify the potential mechanism underlying the effects of CAP on ROS. The biological effect profile of CAP was predicted with our previous network-based algorithm, drugCIPHER. Enrichment analysis was conducted with the R package clusterProfiler v4.9.1, and pathways or biological processes with a significant P value below 0.05 (Benjamini-Hochberg adjustment) were retained for further study. Significantly enriched pathways or biological processes related to ROS were then filtered and classified into three modules: ROS, inflammation and immune expression. Network targets of CAP against ROS were constructed from the above analyses, and finally we combined these with proteomics to determine the research idea of this paper.
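      For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned in this response, the following is a minimal sketch of the procedure in plain Python. It is not the authors' pipeline (they report using the R package ClusterProfiler); the pathway names and raw p-values are hypothetical and chosen only to show that a term with raw p < 0.05 can fail the 0.05 cutoff after adjustment.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw enrichment p-values (illustrative only, not from the paper).
raw = {
    "pathway_A": 0.001,
    "pathway_B": 0.008,
    "pathway_C": 0.039,
    "pathway_D": 0.041,
    "pathway_E": 0.60,
}
adj = benjamini_hochberg(list(raw.values()))
kept = [name for name, q in zip(raw, adj) if q < 0.05]
print(kept)
```

      In this toy input, pathway_C and pathway_D have raw p-values below 0.05 but adjusted values slightly above it, so only pathway_A and pathway_B are retained.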

      -  Figure 1L: is there a reason/explanation why UC.MSC needs a comparably very high concentration of capsaicin.

      Thank you for your comment. Because the experimental results of 8 μM and 32 μM on this cell were more stable, and the activation effect of NRF2 downstream was more obvious.

      -  Figure 2C: it is surprising that naïve (unstressed/untreated) cells already show a rather high nuclear abundance of Nrf2 (shouldn't Nrf2 be continuously tagged for degradation by Keap1?).

      Thank you for your comment. This is a real experimental result, and we have found in many experiments that the untreated group can also show NRF2 when immunoblotting. We think that this phenomenon may be related to the cell state at that time.

      -  Figure 2E: the claim of synergy between CAP and the proteasome inhibitor is not justified with this single figure.

      Thank you for your comment. Multiple experiments have shown that CAP significantly up-regulates NRF2 in the presence of additional stimuli such as EtOH (Figure 1i), H₂O₂ (Figure 1l), PS-341 (Figure 2e) and DTT (Figure 4d), a pattern that is consistent with our understanding of allosteric regulation and is as expected. However, this synergy does warrant more research.

      -  CHX is cycloheximide (in the main text it is referred to as actinomycin).

      Thank you very much for your comment. We have revised “Actinomycin” to “Cycloheximide”.

      -  Figures 2G-H: why switch to rather high concentrations? Is it due to the overexpression of Keap1?

      Thank you for your comment. At the time of this part of the experiment, we had obtained in vitro data on the interaction of CAP and the Kelch domain of KEAP1 (about 32 μM). To keep the results uniform and valid, we chose a relatively higher concentration.

      -  Figure 2I: in the pics of mitochondria the control mitochondria look way more punctuated (likely fissed) than the ones treated with EtOH or EtOH + CAP. Wouldn't one expect that EtOH leads to mitochondrial fission and CAP can prevent it?

      Thank you for your comment. MitoTracker® Red CMXRos (M9940, Solarbio, China) is a cell-permeable X-rosamine derivative containing a weakly sulfhydryl-reactive chloromethyl functional group that labels mitochondria. It is an oxidized red fluorescent stain (Ex=579 nm, Em=599 nm) that only requires incubation with the cells: it is passively transported across the cell membrane and accumulates directly in active mitochondria. Therefore, red does not represent broken mitochondria, but active mitochondria. The mean branch length of mitochondria was quantified using the MiNA software (https://github.com/ScienceToolkit/MiNA) developed for ImageJ.

      -  Figure 3C: figure legend is somewhat poor.

      Thank you for your comment. We have revised: “KEAP1-NRF2 interaction was detected with Surface plasmon resonance (SPR) in vitro.”

      -  Figure 3E: given that CAP disrupts Nrf2/Keap1- PPI, why is there no Nrf2 stabilization seen in the fourth lane (input/lysate)?

      Thank you for your comment. The fourth lane may promote the degradation of NRF2 due to overexpression of KEAP1.

      -  Figure 3H: high basal Nrf2 levels in unstressed/untreated HEK WT cells, why?

      Thank you for your comment. This is a real experimental result, and we have found in many experiments that the untreated group can also show NRF2 when immunoblotting in 293T cells. We think that this phenomenon may be related to the cell state at that time.

      -  Figure 3G/I: this data suggests to me that the alcohol-mediated toxicity is Keap1-dependent (rather than the protection by CAP), doesn't it?

      Thank you for your comment. We can see that KEAP1-KO cells had a high expression of NRF2, which was also in line with our expectations, and EtOH-induced GES-1 damage may be closely related to oxidative stress.

      -  Figure 4a: the inclusion of an additional Keap1 binding protein (one with an ETGE motif) would have been desirable (to get information on specificity/risks of off-target (unwanted) effects of CAP). 

      Thank you for your comment. Binding of CAP to DPP3, which has an ETGE motif, was measured by BLI; the K_D between CAP and DPP3 was 1.653 mM, which may indicate that the potential off-target effect of CAP is low (Fig S4h and S4i).

      -  Figure 4D: why is there no stabilization of Nrf2 by CAP in lane 2 ? How can the DTT-mediated boost on Nrf2 levels be explained?

      Thank you for your comment. Multiple experiments have shown that CAP significantly up-regulates NRF2 in the presence of additional stimuli such as EtOH (Figure 1i), H₂O₂ (Figure 1l), PS-341 (Figure 2e) and DTT (Figure 4d), a pattern that is consistent with our understanding of allosteric regulation and is as expected. However, this synergy does warrant more research.

      -  Figure 4f: 5% DMSO is a rather high solvent concentration, why so high (the solvent alone seems to have quite marked effects).

      Thank you for your comment. The DMSO concentration was high because our maximum CAP concentration was set relatively high. We recognized this problem and repeated the more critical pull-down experiment (Figure 4g). The current DMSO concentration of 0.2% had no effect on the experimental results.

      -  Figure 5: it should be described in the figure legend which mutant is used. Based on the previous data, I would expect an investigation of mutants carrying amino acid exchanges at the newly identified allosteric site.

      Thank you for your comment. The mutant carried the substitutions Y334A, R380A, N382A, N414A, R415A, Y572A, and S602A (the orthosteric site), which are residues reported to engage NRF2 and classic Keap1 inhibitors. The exploration of newly discovered allosteric sites is worthy of further study.

      -  Figure 6/7: I am not expert enough to judge formulations and histology scores. However, the benefit of the encapsulated capsaicin does not become entirely clear to me, as CAP and IRHSA@CAP mostly do not significantly differ in their elicited response.

      Thank you for your comment. On the one hand, nanomedicine improves the safety of administration: it helps to reduce the intense spicy irritation of CAP itself when administered in the stomach. On the other hand, the drug dosage is reduced to a certain extent while achieving a better therapeutic effect.

      -  Figure 7: rebamipide was introduced as positive control in the text with an activating effect on Nrf2, but there is no induction of hmox and nqo in Figure 7f, why?

      Thank you for your comment. The effect of addition of positive control drug (Rebamipide) on NRF2 activation is not the focus of this paper. We speculate that the transcription and translation of related genes may not be completely synchronized when Rebamipide was taken at the same time.

      -  Figure 8: the CAP effect on inflammation is visible, however, a clear causal connection between ROS/Nrf2/Keap1 is not given in the presented experiments.

      Thank you for your comment. The basic mechanism of this paper is illustrated in the graphical diagram. The activation of NRF2 exerts both anti-inflammatory and antioxidant functions, which has been reported in many articles, but the causal relationship is still open to exploration.

      Points related to presentation:  

      -  The data with the encapsulated CAP appear a little as a sidearm that does not bolster your main message (maybe take out and elaborate on this topic more extensively in another manuscript).

      -  Revise the introduction on the Nrf2 signaling pathway as it is written at the moment, someone outside the Nrf2 field might have trouble understanding it.

      -  The use of language requires proofreading and revision.

      Thank you for your comment. We rearranged and proofread it.

      Reviewer #3 (Recommendations For The Authors):

      Overall, the manuscript is well-written and the results are presented in a concise and comprehensible manner.

      Some recommendations on the experimental evidence and further suggestions:

      • The authors should state how they assessed the distribution of the data. Description of data with mean and standard deviation as well as comparisons between different groups with t-test assumes that the underlying data is normally distributed.

      Your suggestions are very constructive for us to improve the paper. The differences in the mean values between two groups were analyzed using Student's t-test, while the differences among multiple groups were analyzed using one-way ANOVA in the GraphPad Prism software.

      Therefore, we checked and proofread the statistical analysis.
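      As background to the statistics discussed here and in the earlier exchange with Reviewer #1, running separate t-tests on every pair of groups inflates the overall false-positive rate when more than two groups are compared, whereas one-way ANOVA first asks a single question: do the group means differ at all? The sketch below computes the ANOVA F statistic by hand in plain Python. It is illustrative only: the authors report using GraphPad Prism, the measurements are made up, and the function name is hypothetical. Like the t-test, this analysis assumes approximately normal data, which is the point of the reviewer's question.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up measurements for three hypothetical treatment groups.
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f_stat, df_b, df_w)
```

      Turning F into a p-value requires the F distribution with (df_between, df_within) degrees of freedom, which statistics packages provide; a significant result is then usually followed by a multiple-comparison post hoc test.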

      • Additional experiments further characterising and validating the activation of CAP via direct KELCH1-binding could include parallel experiments with similar agonists like dimethyl fumarate. It would be interesting to know how CAP activation compares to DMF activation.

      Thank you very much for your comment. We believe that the activation of NRF2 by DMF has been widely reported and well-studied, so we did not purchase this drug for comparative study here. If it can be promoted clinically in the future, we may consider comparing with DMF.

      • Also, the knock-down of NRF2 would be a suggested experiment to do because it rules out that the benefit of CAP is independent of KEAP1-NRF2 binding and activation.

      Thank you very much for your suggestions. We purchased Nrf2-deficient mice and performed experiments. The results showed that Nrf2-knockout mice were more sensitive to ethanol and that the effects of CAP were partially eliminated (Figure 9), which further validated the role of the Nrf2-related signaling pathway in alcohol-induced gastric mucosal injury and in the therapeutic effect of CAP.

      Some corrections on text and figures:

      • Figure 1b: incorrect spelling of DNA stain. Should be Hoechst33324.

      Thank you very much for your comment. We have revised.

      • Figure 1c: don't put the label inside the plot.

      Thank you very much for your comment. We have revised.

      • Figure 1d: choose less verbose axes titles (this also applies to other figures).

      Thank you very much for your comment. We have revised.

      • Figures 1e and 1f: please state the units.

      Thank you very much for your comment. The enzyme activity of SOD and the content of MDA were compared with that of the control group.

      • Heading 2.2: NRF2-ARE instead of NRF-ARE.

      Thank you very much for your comment. We have revised.

      • Line 118: missing expression after immune.

      Thank you very much for your comment. We have revised.

      • Figure 1g: names of proteins are not readable.

      Thank you very much for your comment. We have revised.

      • Line 120: You performed transcriptomic analyses to identify differentially expressed GENES not proteomic.

      Thank you very much for your comment. This part of the work we do is proteomics.

      • Line 122: Fold change should be stated in both directions, i.e. absolute FC like |FC| > 1. Or did you select only upregulated DEGs? Is it not log2 FC?

      Thank you very much for your comment. We have revised.
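      The reviewer's point about Line 122 is that a fold-change cutoff should be applied symmetrically on the log2 scale so that both up- and down-regulated hits are captured. A minimal sketch in Python, assuming a hypothetical threshold of |log2 FC| >= 1 (the actual cutoff used in the manuscript is not given here) and made-up ratios:

```python
import math


def classify_by_log2fc(ratio, threshold=1.0):
    """Classify a treatment/control ratio using a symmetric log2 cutoff.

    threshold=1.0 (a two-fold change) is an assumed, illustrative value.
    """
    log2fc = math.log2(ratio)
    if log2fc >= threshold:
        return "up"
    if log2fc <= -threshold:
        return "down"
    return "unchanged"


# Made-up abundance ratios (treatment / control) for three hypothetical proteins.
ratios = {"protein_X": 4.0, "protein_Y": 0.25, "protein_Z": 1.5}
calls = {name: classify_by_log2fc(r) for name, r in ratios.items()}
print(calls)
```

      A filter written as FC > 1 on the raw ratio would keep only up-regulated entries and would also admit trivially small increases, which is why stating the scale and the direction matters.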

      • Figure 1h (and Supplementary Figure 1a): Missing heatmap legend for FC.

      What do the colors show? Sample (column) description missing.

      Thank you very much for your comment. We used red to indicate up-regulation and blue to indicate down-regulation. The labels on the right-hand vertical axis are antioxidant genes such as GSS and SOD1, and the ratio between the treatment group and the model group (CAP + EtOH/EtOH) has been calculated and labeled.

      • Line 145: A Western blot is not a proteomic analysis.

      Thank you very much for your comment. We have revised: “Concurrently, the elevated expression levels of GSS and Trx proteins, which were also downstream targets of NRF2, further validated by western blotting (Figure 1j).”

      • Supplementary Figure 2e-j: expression fold change is not the right quantity. The signal of the actual protein was quantified. And what are you comparing to with the statistics? The stars on one bar are not clear.

      Thank you very much for your comment. The expression levels in this part were normalized to those of the control group. Statistical significance was assessed relative to the model group.

      • What was the concentration of H₂O₂ used?

      Thank you very much for your comment. 200 μM H₂O₂ was used.

      • Figure 2d: use a more precise y-axis label.

      Thank you very much for your comment. We do want to compare the amount of NRF2 entering the nucleus, so the relative expression is reported with respect to the internal reference.

      • Figure 2g: missing molecular weight markers.

      Thank you very much for your comment. Because the ubiquitination signal spans the whole membrane, marking only the sizes of HA and GAPDH would not look clean here.

      • Line 221: lactate is the endproduct of the anaerobic glycolytic pathway.

      Thank you very much for your comment. We have revised.

      • Supplementary Figure 3d: should it be PKM2 (instead of PKM) and LDHA (instead of LDH). Should fit with the text in the manuscript.

      Thank you very much for your comment. We have revised.

      • Supplementary Figures 3 e-f: brackets in y-axis labels are too bold.

      Thank you very much for your comment. We have revised.

      • Figures 3a and b. Brackets should only be used if two conditions are being compared statistically. Remove the one line with ns as it could imply that you have compared the first with the last condition only.

      Thank you very much for your comment. We have revised.

      • Consistent labeling of kDa in figures (no capital K in KDa).

      Thank you very much for your comment. We have revised.

      • Figure 4a. Move kDa on top of 70.

      Thank you very much for your comment. We have revised.

      • Figure 3 g-h: Why 2% EtOH. Used 5% previously?

      Thank you very much for your comment. Here we switched to the 293T cell line, and a 5% EtOH concentration is too high for these cells.

      • Supplementary Figure b-e: correct typo in y-axis label: expression.

      Thank you very much for your comment. We have revised.

      • Figure 4a: correct x-axis label for temperature unit. Too bold. Not readable.

      Add a clear label and unit for y-axis.

      Thank you very much for your comment. We have revised.

      • Figure 4 b-c: should have a legend explaining colors.

      Thank you very much for your comment. Our Figure legend already contains the meaning of colors: “(b) Computational docking of CAP molecule to KEAP1 surface pockets. The Keap1 protein is represented in gray, while the CAP molecule is shown in yellow. The seven key amino acids predicted to be crucial for the interaction are highlighted in blue. (c) Partial overlap of CAP-binding pocket with KEAP1-NRF2 interface. The KEAP1-NRF2 interaction interface is represented in purple.”

      • Supplementary Figure 5a. Add axis units.

      Thank you very much for your comment. We have revised.

      • Figure 4e: Missing b ions value for number 19.

      Thank you very much for your comment. This part is not missing, but corresponds to 19 of y ions.

      • Figure 7f: adjust brackets - they are too bold.

      Thank you very much for your comment. We have revised.

      • Supplementary Figure 8b-i: labels not readable. c should be spleen.

      Thank you very much for your comment. We have revised.

      • Line 787: specify BH adjustment to Benjamini-Hochberg.

      Thank you very much for your comment. We have revised.

      • Check spelling of µl throughout the Methods section e.g. line 854 - shouldn't be "ul".

      Thank you very much for your comment. We have revised.

      • Line 974: correct spelling of species names: E. coli should be in italics.

      Thank you very much for your comment. We have revised all of these corrections on text and figures. In the future, I will write papers more rigorously and carefully.
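
      The points raised above for Line 122 (state the fold change in both directions, and on a log2 scale) and Line 787 (Benjamini-Hochberg adjustment) can be illustrated with a minimal sketch. It is not the authors' pipeline: the protein names, intensities, p-values, and the cutoffs of |log2 FC| > 1 and adjusted p < 0.05 are all hypothetical and chosen only to show the selection logic.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_differential(records, lfc_cutoff=1.0, alpha=0.05):
    """records: list of (name, mean_treated, mean_control, pvalue).

    Keeps entries with |log2 FC| > lfc_cutoff and adjusted p < alpha,
    so both up- and down-regulated entries are reported.
    """
    adj = benjamini_hochberg([r[3] for r in records])
    hits = []
    for (name, treated, control, _), q in zip(records, adj):
        log2fc = math.log2(treated / control)
        if abs(log2fc) > lfc_cutoff and q < alpha:
            direction = "up" if log2fc > 0 else "down"
            hits.append((name, round(log2fc, 3), direction))
    return hits


# Hypothetical values for illustration only.
records = [
    ("P1", 8.0, 2.0, 0.001),  # four-fold increase
    ("P2", 1.0, 4.0, 0.004),  # four-fold decrease
    ("P3", 3.0, 2.5, 0.03),   # small change
    ("P4", 6.0, 2.0, 0.6),    # large change, not significant
]
print(select_differential(records))
```

      Filtering on the absolute value of log2 FC treats a four-fold decrease the same way as a four-fold increase, which is the symmetry the reviewer asks to be made explicit.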

    1. eLife Assessment

      This fundamental study reports the effects of the psychedelic drug psilocin on iPSC-derived human cortical neurons, analyzing different aspects of structural and functional neuronal plasticity. The evidence is convincing, integrating a comprehensive characterization of 5-HT2A expression and its subcellular distribution upon treatment with psilocin at different time points. The study supports the value of using iPSC-derived human cortical neurons for testing the potentially translational effects of psilocin and other psychedelic-related compounds.

    2. Reviewer #1 (Public review):

      Summary:

      This study reports the effects of psilocin on iPSC-derived human cortical neurons.

      Strengths:

      The characterization was comprehensive, involving immunohistochemistry of various markers, 5-HT2A receptors, BDNF, and TrkB, transcriptomics analyses, morphological determination, electrophysiology, and finally synaptic protein measurements. The results are in close agreement with prior work (PMID 29898390) on rat-cultured cortical neurons. Nevertheless, there is value in confirming those earlier findings and furthermore demonstrating the effects in human neurons, which are important for translation. The genetic, proteomics, and cell structure analyses used in this paper are its major strengths. The study supports the value of using iPSC-derived human cortical neurons for drug development involving psychedelics-related compounds.

      Weaknesses:

      (1) Line 140: 5-HT2A receptor expression was found via immunocytochemistry to reside in the somatodendritic and axonal compartments. However, prior work from ex vivo tissue using electron microscopy has found predominantly 5-HT2A receptor expression in the somatodendritic compartment (PMID: 12535944). Was this antibody validated to be 5-HT2A receptor-specific? Can the authors reason why the discrepancy may arise, and if the axonal expression is specific to the cultured neurons?

      (2) Line 143: It would be helpful to specify the dose of psilocin tested, and describe how this dose was chosen.

      (3) Figure 1: The interpretation is that the differential internalization in the axonal and somatodendritic compartments is time-dependent. However, given that only one dose is tested, it is also possible that this reflects dose dependence, with the longer time exposure leading to higher dose exposure, so these variables are related. That is, if a higher dose is given, internalization may also be observed after 10 minutes in the dendritic compartment.

      (4) Figure 3 & 4: What is the 'control' here? A more appropriate control for the 24 hours after psilocin application would be 24 hours after vehicle application. Here the authors are looking at before and after, but the factor of time elapsed and perturbation via application is not controlled for.

      (5) The sample size was not clearly described. In the figure legend, N = the number of neurites is provided, but it is unclear how many cells were analyzed, and how many of those cells belong to the same culture. This is important sample size information that should be provided. Relatedly, statistical analyses should consider that neurites from the same cell are not independent. If the neurites indeed come from the same cells, then the effective sample size is much smaller and a statistical analysis that accounts for the nested nature of the data should be used.
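
      The nesting concern in point (5) can be illustrated with a minimal sketch. It shows one simple way to respect the hierarchy (average neurites within each cell, then cells within each culture) rather than the mixed-effects modelling a full analysis might use. The culture names, cell names, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def collapse(measurements):
    """measurements: list of (culture_id, cell_id, value), one row per neurite.

    Returns {culture_id: mean of per-cell means}, so that each culture
    contributes a single value to downstream statistics.
    """
    per_cell = defaultdict(list)
    for culture, cell, value in measurements:
        per_cell[(culture, cell)].append(value)

    per_culture = defaultdict(list)
    for (culture, _cell), values in per_cell.items():
        per_culture[culture].append(mean(values))

    return {culture: mean(cell_means) for culture, cell_means in per_culture.items()}


# Hypothetical neurite-level measurements for illustration only.
neurites = [
    ("cultureA", "cell1", 10.0),
    ("cultureA", "cell1", 14.0),
    ("cultureA", "cell2", 20.0),
    ("cultureB", "cell3", 30.0),
    ("cultureB", "cell3", 34.0),
    ("cultureB", "cell3", 32.0),
]
summary = collapse(neurites)
print(len(neurites), "neurites ->", len(summary), "independent cultures")
print(summary)
```

      In this toy case six neurite measurements reduce to two independent values, which is why reporting only the neurite count can overstate the sample size.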

    3. Reviewer #2 (Public review):

      In this article, Schmidt et al use iPSC-derived human cortical neurons to test the effects of the psychedelic psilocin in different models of neuroplasticity.

      Using human iPSC-derived cortical neurons, the authors test the expression of 5-HT2A and its subcellular distribution, as well as the effect of different times of exposure to psilocin on 5-HT2A expression. The authors evaluated the effect of the 5-HT2 antagonist ketanserin, as well as the inhibition of dynamin-dependent endocytic pathways with dynasore. Gene expression and plasticity (structural and functional) were also evaluated after different times of exposure to psilocin.

      In general, results are interesting since they use the iPSC to evaluate the potentially translationally relevant effects of psilocin (the active metabolite of the psychedelic psilocybin). However, there are a few concerns that need to be addressed:

      (1) My main critique is the lack of experimental validation of selectivity and/or specificity of the anti-5-HT2A antibody targeting the extracellular loop of the 5-HT2A receptor (Alomone labs, cat # ASR-033). Most of the primary antibodies targeting class A GPCRs (including the 5-HT2A receptor) have very limited selectivity. Without validation (using for example knockdown techniques to decrease expression of 5-HT2A in their iPSC-derived human cortical neurons), the experiments using this antibody should be excluded from the manuscript.

      (2) Did the authors evaluate whether 5-HT is present in the cell media? If it is, this may affect the functional outcomes evaluated throughout, since as the endogenous ligand it would in principle activate the 5-HT2A receptor.

      (3) Some of the datasets are not statistically analyzed (or quantified), such as Figure S1F.

      (4) Another important concern is the experimental design used to evaluate the effect of psilocin at different time points (24h, 4 days and 10 days). One of the unique and translationally interesting effects of psychedelics including psilocybin is that the in vivo plasticity-related effects (increased structural or synaptic plasticity for example) are observed post-acutely, or once the active compound psilocin is fully metabolized, or not present in the CNS directly targeting the 5-HT2A. Using the iPSC, it seems that the authors continuously exposed cells to psilocin (for hours or even days) at least for some of the experimental techniques. Since this is not the model of what occurs using an in vivo model (such as a single dose of psilocybin to mice, collecting frontal cortex samples 24-h after drug administration, once the active compound is fully metabolized), the authors' findings lack translational validity. Can the authors comment on this?

      (5) In Figure 2E, it seems that ketanserin by itself is reducing BDNF density. How, then, do the authors conclude that ketanserin blocks psi-induced effects? Using a more selective 5-HT2A antagonist such as M100907 could also improve the outcome (in terms of selectivity) of this experiment.

      (6) To evaluate neurite complexity, the authors used the AAV-CamKII-mCherry viral vector, but mCherry (Fig 4A) seems to be retained in the nucleus.

      (7) Minor: Reference 36- this is a review article that does not mention the psychedelic psilocin

    4. Author response:

      We sincerely thank the reviewers for their thorough and constructive evaluation of our manuscript. We particularly appreciate their recognition of our comprehensive characterization approach, which integrates immunohistochemistry, transcriptomics, morphological assessments, and electrophysiology to understand psilocin's effects on human neurons. The reviewers highlighted that our findings closely align with and validate prior work on rat cortical neurons, while importantly extending these insights to human cells. We are encouraged by their acknowledgment that our study demonstrates the value of using iPSC-derived human cortical neurons for testing potentially translatable effects of psychedelic compounds. Their positive assessment of our work's implications for psychedelic drug development is particularly valuable, as it supports our goal of advancing the understanding of these compounds' therapeutic potential and their possible application in treating neuropsychiatric disorders.

      We are also very grateful for the reviewers' constructive criticism which will help strengthen our manuscript significantly. Based on their detailed feedback, we plan to perform several additional experiments for inclusion in the revised manuscript.

      The most important concern raised by both reviewers is about the specificity of the antibody used to detect the expression pattern and abundance of 5-HT2A receptors at the cells' surface. We acknowledge that GPCR antibodies, including those targeting 5-HT2A receptors, can be challenging in terms of specificity and reliability, particularly given the structural similarities within this receptor family. To address these concerns comprehensively, we propose the following systematic validation strategy:

      (1) Cell-Type Specific Expression Analysis: We will systematically evaluate the antibody across different developmental stages and cell lines. The results from the stainings will be correlated with RNA sequencing data to provide quantitative validation of expression patterns. Cell types to be included will be:

      · iPSCs (expected negative)

      · Neural progenitors (expected positive)

      · Mature neurons (expected positive)

      · HEK cells (expected negative)

      This multi-stage analysis will allow us to track receptor expression through development and verify antibody specificity across distinct cellular contexts.

      (2) Peptide Competition Study: We will perform blocking experiments using the specific peptide sequence against which the antibody was raised. By pre-incubating the antibody with its cognate peptide at the established working concentration, and then documenting in detail the signal reduction in the peptide-blocked condition versus standard staining, we can demonstrate binding specificity. This approach will provide direct evidence of antibody selectivity for its intended target.

      (3) Sequence Analysis and Specificity: We will perform a comprehensive protein BLAST analysis of the antigenic peptide sequence, assess potential cross-reactivity with related receptors, and evaluate species conservation and specificity. This in silico approach will complement our experimental validation and help identify any potential off-target binding sites.

      (4) Additional Validation: While technically challenging, we will attempt knockdown studies using siRNA/shRNA approaches to provide additional validation of antibody specificity. This molecular intervention will offer another layer of validation through targeted reduction of the receptor.

      We plan to present these results in a new supplementary figure that will provide a comprehensive overview of our validation efforts. Should we not be able to convincingly demonstrate the specificity of the antibody, we will discuss with the editors and reviewers to modify Figure 1 and exclude critical parts from the manuscript. While we find the results interesting and important to communicate, an omission would not critically impact the key message of the manuscript, which is the structural and molecular changes elicited by psilocin on human neurons. The strength of our multi-modal approach means that our core findings are supported by several independent lines of evidence beyond antibody-based detection.

    1. eLife Assessment

      The authors aimed to quantify feral pig interactions in eastern Australia to inform disease transmission networks. They used GPS tracking data from 146 feral pigs across multiple locations to construct proximity-based social networks and analyze contact rates within and between pig social units. This fundamental study shows that targeting adult males in feral pig control programs could help global efforts to contain disease. The methods are compelling and the paper should be of interest to the fields of veterinary medicine, public health, and epidemiology.

    2. Reviewer #2 (Public review):

      Summary:

      The paper attempts to elucidate how feral (wild) pigs cause distortion of the environment in over 54 countries of the world, particularly Australia.

      The paper displays proof that over $120 billion worth of facilities were destroyed annually in the United States of America.

      The authors have tried to infer that the findings of their work were fundamental and possessing a compelling strength of evidence.

      Strengths:

      (1) Clearly stating feral (wild) pigs as a problem in the environment.

      (2) Stating how 54 countries were affected by the feral pigs.

      (3) Mentioning how $120 billion was lost in the US, annually, as a result of the activities of the feral pigs.

      (4) Amplifying the fact that 14 species of animals were being driven into extinction by the feral pigs.

      (5) Feral pigs possessing zoonotic abilities.

      (6) Feral pigs acting as reservoirs for endemic diseases like brucellosis and leptospirosis.

      (7) Understanding disease patterns by the social dynamics of feral pig interactions.

      (8) The use of 146 GPS-monitored feral pigs to establish their social interaction among themselves.

      Weaknesses:

      None, as the weaknesses had been already addressed.

    3. Reviewer #3 (Public review):

      Summary:

      The authors sought to understand social interactions both within and between groups of feral pigs, with the intent of applying their findings to models of disease transmission. The authors analyzed GPS tracking data from across various populations to determine patterns of contact that could support the transmission of a range of zoonotic and livestock diseases. The analysis then focused on the effects of sex, group dynamics, and seasonal changes on contact rates that could be used to base targeted disease control strategies which would prioritize the removal of adult males for reducing intergroup disease transmission.

      Strengths:

      It utilized GPS tracking data from 146 feral pigs over several years, effectively capturing seasonal and spatial variation in the social behaviors of interest. Using proximity-based social network analysis, this work provides a highly resolved snapshot of contact rates and interactions both within and between groups, substantially improving research in wildlife disease transmission. Results were highly useful and provided practical guidance for disease management, showing that control targeted at adult males could reduce intergroup disease transmission, hence providing an approach for the control of zoonotic and livestock diseases.

      Weaknesses:

      None, as the authors have already addressed the identified weaknesses.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors aimed to quantify feral pig interactions in eastern Australia to inform disease transmission networks. They used GPS tracking data from 146 feral pigs across multiple locations to construct proximity-based social networks and analyse contact rates within and between pig social units.

      Strengths:

      (1) Addresses a critical knowledge gap in feral pig social dynamics in Australia.

      (2) Uses robust methodology combining GPS tracking and network analysis.

      (3) Provides valuable insights into sex-based and seasonal variations in contact rates.

      (4) Effectively contextualizes findings for disease transmission modeling and management.

      (5) Includes comprehensive ethical approval for animal research.

      (6) Utilizes data from multiple locations across eastern Australia, enhancing generalizability.

      Weaknesses:

      (1) Limited discussion of potential biases from varying sample sizes across populations

      This is a really good comment, and we will address this in the discussion as one of the limitations of the study.

      (2) Some key figures are in supplementary materials rather than the main text.

      We will move some of our supplementary material to the main text as suggested.

      (3) Economic impact figures are from the US rather than Australia-specific data.

      We included the impact figures that are available for Australia (for FDM), and we will include the estimated impact of ASF in Australia in the introduction.

      (4) Rationale for spatial and temporal thresholds for defining contacts could be clearer.

      We will improve the explanation of why we chose the spatial and temporal thresholds based on literature, the size of animals and GPS errors.

      (5) Limited discussion of ethical considerations beyond basic animal ethics approval.

      This research was conducted under an ethics committee's approval for collaring the feral pigs. This research is part of an ongoing pest management activity, and all the ethics approvals have been highlighted in the main manuscript.

      The authors largely achieved their aims, with the results supporting their conclusions about the importance of sex and seasonality in feral pig contact networks. This work is likely to have a significant impact on feral pig management and disease control strategies in Australia, providing crucial data for refining disease transmission models.

      Reviewer #2 (Public review):

      Summary:

      The paper attempts to elucidate how feral (wild) pigs cause distortion of the environment in over 54 countries of the world, particularly Australia.

      The paper displays proof that over $120 billion worth of facilities were destroyed annually in the United States of America.

      The authors have tried to infer that the findings of their work were important and possess a convincing strength of evidence.

      Strengths:

      (1) Clearly stating feral (wild) pigs as a problem in the environment.

      (2) Stating how 54 countries were affected by the feral pigs.

      (3) Mentioning how $120 billion was lost in the US, annually, as a result of the activities of the feral pigs.

      (4) Amplifying the fact that 14 species of animals were being driven into extinction by the feral pigs.

      (5) Feral pigs possessing zoonotic abilities.

      (6) Feral pigs acting as reservoirs for endemic diseases like brucellosis and leptospirosis.

      (7) Understanding disease patterns by the social dynamics of feral pig interactions.

      (8) The use of 146 GPS-monitored feral pigs to establish their social interaction among themselves.

      Weaknesses:

      (1) Unclear explanation of the association of either the female or male feral pigs with each other, seasonally.

      This will be better explained in the methods.

      (2) The "abstract paragraph" was not justified.

      We have justified the abstract paragraph as requested by the reviewer.

      (3) Typographical errors in the abstract.

      Typographical errors have been corrected in the Abstract.

      Reviewer #3 (Public review):

      Summary:

      The authors sought to understand social interactions both within and between groups of feral pigs, with the intent of applying their findings to models of disease transmission. The authors analyzed GPS tracking data from across various populations to determine patterns of contact that could support the transmission of a range of zoonotic and livestock diseases. The analysis then focused on the effects of sex, group dynamics, and seasonal changes on contact rates that could be used to base targeted disease control strategies that would prioritize the removal of adult males for reducing intergroup disease transmission.

      Strengths:

      It utilized GPS tracking data from 146 feral pigs over several years, effectively capturing seasonal and spatial variation in the social behaviors of interest. Using proximity-based social network analysis, this work provides a highly resolved snapshot of contact rates and interactions both within and between groups, substantially improving research in wildlife disease transmission. Results were highly useful and provided practical guidance for disease management, showing that control targeted at adult males could reduce intergroup disease transmission, hence providing an approach for the control of zoonotic and livestock diseases.

      Weaknesses:

      Despite their reliability, populations can be skewed by small sample sizes and limited generalizability due to specific environmental and demographic characteristics. Further validation is needed to account for additional environmental factors influencing social dynamics and contact rates.

      This is a really good point, and we thank the reviewer for pointing out this issue. We will discuss the potential biases due to sample size in our discussion. We agree that environmental factors need to be incorporated and tested for their influence on social dynamics. This will be added to the discussion, as we plan to expand this research and conduct the analysis to determine whether environmental factors are influencing social dynamics.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Consider moving some key figures from supplementary materials to the main text to strengthen the presentation of results.

      We included a new figure to strengthen the presentation of results (Figure 3a-b), which shows the node level measures by sex and for direct and indirect networks.

      (2) Expand discussion of limitations, particularly addressing potential biases from varying sample sizes across populations.

      We added more detail and clarity about this potential bias to the limitations section within the discussion: “Different populations in our study had varying numbers of collared individuals, with some populations having only two individuals at certain times. This variability in sample size across populations is a limitation when interpreting the results. Small populations are often the result of a few individuals being trapped and collared, and this does not necessarily reflect the actual number of individuals in those groups.” Moreover, while reviewing the effect of the potential bias, we found that a General Linear Mixed Effect Model (Table 1) was not optimal for analysing the effect of sex on the network measures, and therefore this analysis has been repeated using a non-parametric test (Wilcoxon rank-sum test) for direct and indirect networks based on a 5-metre threshold (Table 1).

      (3) If available, include Australia-specific economic impact data in the introduction.

      We included the impact figures that are available for Australia (for FDM) in the introduction.

      (4) Clarify the rationale for chosen spatial and temporal thresholds for defining contacts.

      This has been added in the methodology: “Direct contact was defined when two individuals interacted either at 2, 5, or 350-metre buffers within a five-minute interval [36]. A previous study used 350 metres as a spatial threshold [16], while others use the approximate average body length of an individual [36]”

      (5) Consider adding a brief discussion of ethical considerations beyond basic animal ethics approval, addressing aspects like animal welfare during collaring and potential environmental impacts.

      Feral pigs are an invasive species in Australia, and managing their population is crucial to protecting native ecosystems. The trapping and collaring of these animals have been conducted following the stringent animal welfare requirements necessary to obtain animal ethics approval in Australia. However, it is important to consider the broader ethical implications. Animal welfare during collaring is a critical aspect and involves minimising stress and physical harm to the animals. The collars used are lightweight and properly fitted, and only adults are collared because collaring juveniles raises welfare issues.

      (6) Add a statement about data availability/accessibility.

      The GPS data cannot be shared; however, the R codes will be deposited in GitHub (https://github.com/Tatianaproboste/Feral-Pig-Interactions) and the link has been added in the final version.

      (7) Expand on the implications of seasonal variation in contact rates for disease management strategies in the discussion.

      We have added this information in the discussion: “For example, controlling an outbreak during summer would potentially require more resources than an outbreak in other seasons due to the higher number of contacts between individuals during summer.”
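
      The direct-contact definition quoted in response (4) above (two individuals within a 2, 5, or 350-metre buffer within a five-minute interval) can be sketched as follows. This is an illustration, not the authors' R code: the fix format, the assumption that coordinates are already projected to metres, the pairwise scan, and the example values are all hypothetical.

```python
from itertools import combinations
from math import hypot


def direct_contacts(fixes, max_dist_m=5.0, max_gap_s=300):
    """fixes: list of (animal_id, time_s, x_m, y_m) in a projected coordinate system.

    Returns the sorted list of (id_a, id_b) pairs that have at least one
    pair of fixes within max_dist_m metres and max_gap_s seconds.
    """
    pairs = set()
    for a, b in combinations(fixes, 2):
        if a[0] == b[0]:
            continue  # skip fixes from the same animal
        close_in_time = abs(a[1] - b[1]) <= max_gap_s
        close_in_space = hypot(a[2] - b[2], a[3] - b[3]) <= max_dist_m
        if close_in_time and close_in_space:
            pairs.add(tuple(sorted((a[0], b[0]))))
    return sorted(pairs)


# Hypothetical GPS fixes for illustration only.
fixes = [
    ("pig1", 0, 0.0, 0.0),
    ("pig2", 120, 3.0, 4.0),    # 5 m from pig1, two minutes later
    ("pig3", 200, 100.0, 0.0),  # about 100 m from both others
    ("pig2", 1000, 0.0, 0.0),   # same spot as pig1, but far later in time
]
for threshold in (2.0, 5.0, 350.0):
    print(threshold, direct_contacts(fixes, max_dist_m=threshold))
```

      Running the same fixes through the three buffers shows how strongly the resulting network depends on the spatial threshold, which is why the reviewers asked for the rationale behind it. The all-pairs scan is quadratic in the number of fixes and is only suitable for small examples.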

      Reviewer #2 (Recommendations for the authors):

      The typographical errors in the abstract to be corrected are:

      (1) Line 22: Remove the "are" before "threaten".

      This has been corrected.

      (2) Line 24: Replace the "to" before "extinction" with "into".

      This has been corrected.

      (3) Line 28: Rephrase the sentence.

      ‘Yet social dynamics are known to vary enormously from place to place, so knowledge generated for example in USA and Europe might not easily transfer to locations such as Australia.’

      (4) Line 29: Insert a "comma" after "Here".

      This has been corrected.

      (5) Lines 33-34: Explain, clearly, the contact rates; is it between females to females or females to males?

      We have improved this phrase and now it reads: “…. with females demonstrating higher group cohesion (female-female) and males acting as crucial connectors between independent groups.”

      (6) Line 36: Make yourselves clear about what you mean by "targeting adult male".

      We believe “targeting adult males” is correct in this context.

      Reviewer #3 (Recommendations for the authors):

      (1) Lines 22 and 44: in "are threaten", I think "are" should be removed for better clarity.

      This has been corrected.

      (2) Line 71, the source and not "force" of infection.

      The force of infection is correct here.

      (3) Line 72, population "of".

      This has been corrected.

      (4) Under statistical analysis, the software version should be included.

      R has been updated through multiple versions since we started this analysis.

      (5) Terminological consistency: as far as possible try to be consistent with the terms used in the text, such as using "contact rate" instead of "interaction rate" in order not to puzzle the readers.

      We have changed most of the “interactions” to “contact” instead as suggested.

      (6) Correct Typos: Identify typos and grammatical inconsistencies of any kind, especially in those complex sentences that may be hard to follow.

      The typos have been checked.

      (7) Under the methodology, briefly describe why specific thresholds were chosen and any limitations.

      We added the following into the method: “Direct contact was defined when two individuals interacted either at 2, 5, or 350-metre buffers within a five-minute interval [36]. A previous study used 350 metres as a spatial threshold [16], while others use the approximate average body length of an individual [36]”

      (8) The discussion should be strengthened by drawing clear links between the findings and actionable management strategies.

      We have strengthened the discussion by adding more specific actionable management strategies. For example, controlling an outbreak during summer would potentially require more resources than an outbreak in other seasons due to the higher number of contacts between individuals during summer.

      (9) Did you consider additional environmental factors, such as rainfall, food availability, or habitat features, to better understand how these influence seasonal variations in pig interactions and contact rates?

      This is something that we have in mind and will explore in future research. It has been partially explored, but only in terms of how environmental factors and seasons affect the home range (Wilson et al., 2023).

      (10) Figure Legends: Add more detailed descriptions in figure legends, especially for those figures showing network metrics or contact rates.

      More information has been added to the figure legends.

      (11) The paper includes too many figures, and thus, it is recommended to simplify or merge some figures where appropriate. In particular, this is recommended for those figures that plot more network measures across thresholds. Adding clear, summarized captions with interpretation on threshold and measure significance would be a great help in interpreting complicated visualizations.

      The figure that shows the comparison between global network measures, including average local transitivity, edge density, global transitivity, mean distance and number of edges for direct and indirect networks has been moved to supplementary material (Figure S3). We also included direct and indirect model-level measures by sex as in Figure 3 and improved the captions of the figures presented in the main document.

    1. eLife Assessment

      This is an important study demonstrating that anosmia in Parkinson's disease patients is due to dysfunction in cholinergic neurons. This study provides compelling evidence, using scRNA sequencing, that cholinergic olfactory projection neurons (OPN) are consistently affected in five different fruit fly models of Parkinson's disease, exhibiting synaptic dysfunction before the onset of motor deficits. Comparisons with scRNA sequencing of patients' human brain samples reveal similar synaptic gene deregulation in cholinergic neurons of patients. This study points to the possibility that targeting cholinergic neurons could be a potential avenue for early diagnosis and intervention in PD.

    2. Reviewer #1 (Public review):

      In Pech et al. the authors take advantage of a genetic model organism to investigate the convergent impact of multiple mutations linked to Parkinson's Disease (PD). To investigate this question they leverage Drosophila genetics to create wild type and mutant alleles for five different mutations linked to PD. An additional novel focus of this work is an examination of the animals in an early phase before apparent dopaminergic degeneration. Having generated this resource, the authors apply an impressive array of experiments including behavioural assays, calcium imaging and single-cell profiling. They also cross-validate their findings in human PD brains. Strikingly, the authors discover common dysregulated genes between fly and human that converge on synaptic dysregulation. Finally, they demonstrate that even in early timepoints, there is extensive dysfunction of olfactory projection neuron calcium.

      This is a fantastic, comprehensive, timely and landmark pan-species work that demonstrates the convergence of multiple familial PD mutations onto a synaptic program. It is extremely well written and the authors have addressed all my comments in this review. I recommend this work be published as soon as possible.

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates the cellular and molecular events leading to hyposmia, an early dysfunction in Parkinson's disease (PD), which develops up to 10 years prior to motor symptoms. The authors use five Drosophila knock-in models of familial PD genes (LRRK2, RAB39B, PINK1, DNAJC6 (Aux), and SYNJ1 (Synj)), three expressing human genes and two Drosophila genes with equivalent mutations.

      The authors carry out single-cell RNA sequencing of young fly brains and single-nucleus RNA sequencing of human brain samples. The authors found that cholinergic olfactory projection neurons (OPN) were consistently affected across the fly models, showing synaptic dysfunction before the onset of motor deficits, known to be associated with dopaminergic neuron (DAN) dysfunction.

      Single-cell RNA sequencing revealed significant transcriptional deregulation of synaptic genes in OPNs across all five fly PD models. This synaptic dysfunction was confirmed by impaired calcium signalling and morphological changes in synaptic OPN terminals. Furthermore, these young PD flies exhibited olfactory behavioural deficits that were rescued by selective expression of wild-type genes in OPNs.

      Single-nucleus RNA sequencing of post-mortem brain samples from PD patients with LRRK2 risk mutations revealed similar synaptic gene deregulation in cholinergic neurons, particularly in the nucleus basalis of Meynert (NBM). Gene ontology analysis highlighted enrichment for processes related to presynaptic function, protein homeostasis, RNA regulation, and mitochondrial function.

      This study provides compelling evidence for the early and primary involvement of cholinergic dysfunction in PD pathogenesis, preceding the canonical DAN degeneration. The convergence of familial PD mutations on synaptic dysfunction in cholinergic projection neurons suggests a common mechanism contributing to early non-motor symptoms like hyposmia. The authors also emphasise the potential of targeting cholinergic neurons for early diagnosis and intervention in PD.

      Strengths:

      This study presents a novel approach, combining multiple mutants to identify salient disease mechanisms. The quality of the data and analysis is of a high standard, providing compelling evidence for the role of OPN neurons in olfactory dysfunction in PD. The authors also provide evidence to show that early olfactory defects lead to later dopaminergic neuron dysfunction. The comprehensive single-cell RNA sequencing data from both flies and humans is a valuable resource for the research community. The identification of consistent impairments in cholinergic olfactory neurons, at early disease stages, is a powerful finding that highlights the convergent nature of PD progression. The comparison between fly models and human patients' brains provides strong evidence of the conservation of molecular mechanisms of disease, which can be built upon in further studies using flies to prove causal relationships between the defects described here and neurodegeneration.

      The identification of specific neurons involved in olfactory dysfunction opens up potential avenues for diagnostic and therapeutic interventions.

    1. eLife Assessment

      This study provides important findings on the nature of eye movement choices by human subjects. The study uses a novel approach and provides relatively clear and convincing results of the relationship between pupil size and saccade production. The results should be of interest to a broad audience interested in sensorimotor integration and sensory-guided decision-making.

    2. Reviewer #3 (Public review):

      Summary:

      This manuscript extends previous research by this group by relating variation in pupil size to the endpoints of saccades produced by human participants under various conditions including trial-based choices between pairs of spots and search for small items in natural scenes. Based on the premise that pupil size is a reliable proxy of "effort", the authors conclude that less costly saccade targets are preferred. Finding that this preference was influenced by the performance of a non-visual, attention-demanding task, the authors conclude that a common source of effort animates gaze behavior and other cognitive tasks.

      Strengths:

      Strengths of the manuscript include the novelty of the approach, the clarity of the findings, and the community interest in the problem.

      Weaknesses:

      Enthusiasm for this manuscript is reduced by the following weaknesses:

      (1) A relationship between pupil size and saccade production seems clear based on the authors' previous and current work. What is at issue is the interpretation. The authors test one, preferred hypothesis, and the narrative of the manuscript treats the hypothesis that pupil size is a proxy of effort as beyond dispute or question. The stated elements of their argument seem to go like this:

      PROPOSITION 1: Pupil size varies systematically across task conditions, being larger when tasks are more demanding.

      PROPOSITION 2: Pupil size is related to the locus coeruleus.

      PROPOSITION 3: The locus coeruleus NE system modulates neural activity and interactions.

      CONCLUSION: Therefore, pupil size indexes the resource demand or "effort" associated with task conditions.

      How the conclusion follows from the propositions is not self-evident. Proposition 3, in particular, fails to establish the link that is supposed to lead to the conclusion.

      (2) The authors test one, preferred hypothesis and do not consider plausible alternatives. Is "cost" the only conceivable hypothesis? The hypothesis is framed in very narrow terms. For example, the cholinergic and dopamine systems that have been featured in other researchers' consideration of pupil size modulation are missing here. Thus, because the authors do not rule out plausible alternative hypotheses, the logical structure of this manuscript can be criticized as committing the fallacy of affirming the consequent.

      (3) The authors cite particular publications in support of the claim that saccade selection is influenced by an assessment of effort. Given the extensive work by others on this general topic, the skeptic could regard the theoretical perspective of this manuscript as too impoverished. Their work may be enhanced by consideration of other work on this general topic, e.g., (i) Shenhav A, Botvinick MM, Cohen JD. (2013) The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron. 2013 Jul 24;79(2):217-40. (ii) Müller T, Husain M, Apps MAJ. (2022) Preferences for seeking effort or reward information bias the willingness to work. Sci Rep. 2022 Nov 14;12(1):19486. (iii) Bustamante LA, Oshinowo T, Lee JR, Tong E, Burton AR, Shenhav A, Cohen JD, Daw ND. (2023) Effort Foraging Task reveals a positive correlation between individual differences in the cost of cognitive and physical effort in humans. Proc Natl Acad Sci U S A. 2023 Dec 12;120(50):e2221510120.

      (4) What is the source of cost in saccade production? What is the currency of that cost? The authors state (page 13), "... oblique saccades require more complex oculomotor programs than horizontal eye movements because more neuronal populations in the superior colliculus (SC) and frontal eye fields (FEF) [76-79], and more muscles are necessary to plan and execute the saccade [76, 80, 81]." This statement raises questions and concerns. First, the basis of the claim that more neurons in FEF and SC are needed for oblique versus cardinal saccades is not established in any of the publications cited. Second, the authors may be referring to the fact that oblique saccades require coordination between pontine and midbrain circuits. This must be clarified. Third, the cost is unlikely to originate in extraocular muscle fatigue because the muscle fibers are so different from skeletal muscles, being fundamentally less fatigable. Fourth, if net muscle contraction is the cost, then why are upward saccades, which require the eyelid, not more expensive than downward? Thus, just how some saccades are more effortful than others is not clear.

      (5) The authors do not consider observations about variation in pupil size that seem to be incompatible with the preferred hypothesis. For example, at least two studies have described systematically larger pupil dilation associated with faster relative to accurate performance in manual and saccade tasks (e.g., Naber M, Murphy P. Pupillometric investigation into the speed-accuracy trade-off in a visuo-motor aiming task. Psychophysiology. 2020 Mar;57(3):e13499; Reppert TR, Heitz RP, Schall JD. Neural mechanisms for executive control of speed-accuracy trade-off. Cell Rep. 2023 Nov 28;42(11):113422). Is the fast relative to the accurate option necessarily more costly?

      (6) The authors draw conclusions based on trends across participants, but they should be more transparent about variation that contradicts these trends. In Figures 3 and 4 we see many participants producing behavior unlike most others. Who are they? Why do they look so different? Is it just noise, or do different participants adopt different policies?

      Comments on revisions:

      The authors have addressed the concerns and questions raised in the original review.

    1. eLife Assessment

      This manuscript presents valuable findings showing that rapamycin directly activates the cool-sensing ion channel, TRPM8, acting through a different binding site than other small-molecule cooling agents such as menthol. The use of Ca2+-imaging, electrophysiology, and computational biology provides solid evidence to support the finding. The authors also present a novel NMR-based method to help identify details of the binding site interactions. In this revised version, some analysis and the presentation have been corrected and improved. Their findings provide insights into TRP channel pharmacology and may indicate previously unknown physiological effects or therapeutic mechanisms of the immunosuppressant, rapamycin.

    2. Reviewer #1 (Public review):

      Summary:

      In this valuable study, the authors found that the macrolide drug rapamycin, which is an important pharmacological tool in the clinic and the research lab, is less specific than previously thought. They provide solid functional evidence that rapamycin activates TRPM8 and begin to develop an NMR method to measure the specific binding of a ligand to a membrane protein.

      Strengths:

      The authors use a variety of complementary experimental techniques in several different systems, and their results support the conclusions drawn.

      Weaknesses:

      The proposed location of the rapamycin binding pocket within the membrane means that molecular docking approaches designed for soluble proteins alone do not provide solid evidence for a rapamycin binding pocket location in TRPM8, but the authors are appropriately careful in stating that the model is consistent with their functional experiments. The novel STTD method is intriguing and supportive of the functional results and docking predictions, but further validation of this method is needed.

      Impact:

      This work provides still more evidence for the polymodality of TRP channels, reminding both TRP channel researchers and those who use rapamycin in other contexts that the adjective "specific" is only meaningful in the context of what else has been explicitly tested.

      Comments on revisions:

      The authors have addressed my major concerns from the previous round of revision, and I agree that those things that remain un-done are outside the scope of this manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      Tóth and Bazeli et al. find rapamycin activates heterologously-expressed TRPM8 and dissociated sensory neurons in a TRPM8-dependent way with Ca2+-imaging. With electrophysiology and STTD-NMR, they confirmed the activation is through direct interaction with TRPM8. Using mutants and computational modeling, the authors localized the binding site to the groove between S4 and S5, different than the binding pocket of cooling agents such as menthol. The hydroxyl group on carbon 40 within the cyclohexane ring in rapamycin is indispensable for activation, while other rapalogs with its replacement, such as everolimus, still bind but cannot activate TRPM8. Overall, the findings provide new insights into TRPM8 functions and may indicate previously-unknown physiological effects or therapeutic mechanisms of rapamycin.

      Strengths:

      The authors spent extensive effort on demonstrating that the interaction between TRPM8 and rapamycin is direct. The evidence is solid. In probing the binding site and the structural-function relationship, the authors combined computational simulation and functional experiments. It is very impressive to see that "within" a rapamycin molecule, the portion shared with everolimus is for "binding", while the hydroxyl group in the cyclohexane ring is for activation. Such detailed dissection represents a successful trial in computational biology-facilitated, functional experiment-validated study of TRP channel structural-activity relationship. The research draws the attention of scientists, including those outside the TRP channel field, to previously-neglected effects of rapamycin, and therefore the manuscript deserves broad readership.

      Weaknesses:

      The significance of the research could be improved by showing or discussing whether a similar binding pocket is present in other TRP channels, and hence rapalogs might bind to or activate these TRP channels. Additionally, while the finding on TRPM8 is novel, it is worthwhile to perform more comprehensive pharmacological characterization, including single-channel recording and a few more mutant studies to offer further insight into the mechanism of rapamycin binding to S4~S5 pocket driving channel opening. It is also necessary to know if rapalogs have independent or synergistic effects on top of other activators, including cooling agents and lower temperature, and its dependence on regulators such as PIP2.

      Additional discussion that might be helpful:

      The authors did confirm that rapamycin does not activate TRPV1, TRPA1 and TRPM3. But other TRP channels, particularly other structurally-similar TRPM channels, should be discussed or tested. Alignment of the amino acid sequences or structures at the predicted binding pocket might predict some possible outcomes. In particular, rapamycin is known to activate TRPML1 in a PI(3,5)P2-dependent manner, which should be highlighted in comparison among TRP channels (PMID: 35131932, 31112550).

      After revision:

      I acknowledge that the authors have addressed some of the questions in their revised version. They have explained that additional experiments might be beyond the scope of the current study. I appreciate their effort in doing their best to improve the manuscript and to leave the rest in discussion.

    4. Reviewer #3 (Public review):

      Summary:

      Rapamycin is a macrolide of immunologic therapeutic importance, proposed as a ligand of mTOR. It is also employed in assays to probe protein-protein interactions.

      The authors serendipitously found that the drug rapamycin and some related compounds potently activate the cationic channel TRPM8, which is the main mediator of cold sensation in mammals. The authors show that rapamycin might bind to a novel binding site that is different from the binding site for menthol, the prototypical activator of TRPM8. These convincing results are important to a wide audience, since rapamycin is a widely used drug and is also employed in assays to probe protein-protein interactions, which could be affected by potential specific interactions of rapamycin with other membrane proteins, as illustrated herein.

      Strengths:

      The authors employ several experimental approaches to convincingly show that rapamycin directly activates the TRPM8 cation channel and not an accessory protein or the surrounding membrane. In general, the electrophysiological, mutational and fluorescence imaging experiments are adequately carried out and cautiously interpreted, presenting a clear picture of the direct interaction with TRPM8. In particular, the authors convincingly show that the interactions of rapamycin with TRPM8 are distinct from interactions of menthol with the same ion channel.

      Weaknesses:

      The main weakness of the manuscript was the NMR method employed to show that rapamycin binds to TRPM8. The authors developed and deployed a novel signal processing approach based on subtraction of several independent NMR spectra to show that rapamycin binds to the TRPM8 protein and not to the surrounding membrane or other proteins. In this revised version the authors have strengthened the evidence that the method gives solid results and have improved the clarity of the presentation.

      Comments on revisions:

      The authors have greatly improved the quality of the presentation of the NMR data and have answered my concerns regarding the new methodology. The manuscript is improved and represents an important contribution.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this valuable study, the authors found that the macrolide drug rapamycin, which is an important pharmacological tool in the clinic and the research lab, is less specific than previously thought. They provide solid functional evidence that rapamycin activates TRPM8 and develop an NMR method to measure the specific binding of a ligand to a membrane protein.

      Strengths:

      The authors use a variety of complementary experimental techniques in several different systems, and their results support the conclusions drawn.

      Weaknesses:

      Controls are not shown in all cases, and a lack of unity across the figures makes the flow of the paper disjointed. The proposed location of the rapamycin binding pocket within the membrane means that molecular docking approaches designed for soluble proteins alone do not provide solid evidence for a rapamycin binding pocket location in TRPM8, but the authors are appropriately careful in stating that the model is consistent with their functional experiments.

      Impact:

      This work provides still more evidence for the polymodality of TRP channels, reminding both TRP channel researchers and those who use rapamycin in other contexts that the adjective "specific" is only meaningful in the context of what else has been explicitly tested.

      Reviewer #2 (Public Review):

      Summary:

      Tóth and Bazeli et al. find rapamycin activates heterologously-expressed TRPM8 and dissociated sensory neurons in a TRPM8-dependent way with Ca2+-imaging. With electrophysiology and STTD-NMR, they confirmed the activation is through direct interaction with TRPM8. Using mutants and computational modeling, the authors localized the binding site to the groove between S4 and S5, different than the binding pocket of cooling agents such as menthol. The hydroxyl group on carbon 40 within the cyclohexane ring in rapamycin is indispensable for activation, while other rapalogs with its replacement, such as everolimus, still bind but cannot activate TRPM8. Overall, the findings provide new insights into TRPM8 functions and may indicate previously unknown physiological effects or therapeutic mechanisms of rapamycin.

      Strengths:

      The authors spent extensive effort on demonstrating that the interaction between TRPM8 and rapamycin is direct. The evidence is solid. In probing the binding site and the structural-function relationship, the authors combined computational simulation and functional experiments. It is very impressive to see that "within" a rapamycin molecule, the portion shared with everolimus is for "binding", while the hydroxyl group in the cyclohexane ring is for activation. Such detailed dissection represents a successful trial in the computational biology-facilitated, functional experiment-validated study of TRP channel structural-activity relationship. The research draws the attention of scientists, including those outside the TRP channel field, to previously neglected effects of rapamycin, and therefore the manuscript deserves broad readership.

      Weaknesses:

      The significance of the research could be improved by showing or discussing whether a similar binding pocket is present in other TRP channels, and hence rapalogs might bind to or activate these TRP channels. Additionally, while the finding on TRPM8 is novel, it is worthwhile to perform more comprehensive pharmacological characterization, including single-channel recording and a few more mutant studies to offer further insight into the mechanism of rapamycin binding to S4~S5 pocket driving channel opening. It is also necessary to know if rapalogs have independent or synergistic effects on top of other activators, including cooling agents and lower temperature, and their dependence on regulators such as PIP2.

      Additional discussion that might be helpful:

      The authors did confirm that rapamycin does not activate TRPV1, TRPA1 and TRPM3. But other TRP channels, particularly other structurally similar TRPM channels, should be discussed or tested. Alignment of the amino acid sequences or structures at the predicted binding pocket might predict some possible outcomes. In particular, rapamycin is known to activate TRPML1 in a PI(3,5)P2-dependent manner, which should be highlighted in comparison among TRP channels (PMID: 35131932, 31112550).

      Reviewer #3 (Public Review):

      Summary:

      Rapamycin is a macrolide of immunologic therapeutic importance, proposed as a ligand of mTOR. It is also employed in assays to probe protein-protein interactions.

      The authors serendipitously found that the drug rapamycin and some related compounds potently activate the cationic channel TRPM8, which is the main mediator of cold sensation in mammals. The authors show that rapamycin might bind to a novel binding site that is different from the binding site for menthol, the prototypical activator of TRPM8. These solid results are important to a wide audience since rapamycin is a widely used drug and is also employed in assays to probe protein-protein interactions, which could be affected by potential specific interactions of rapamycin with other membrane proteins, as illustrated herein.

      Strengths:

      The authors employ several experimental approaches to convincingly show that rapamycin directly activates the TRPM8 cation channel and not an accessory protein or the surrounding membrane. In general, the electrophysiological, mutational and fluorescence imaging experiments are adequately carried out and cautiously interpreted, presenting a clear picture of the direct interaction with TRPM8. In particular, the authors convincingly show that the interactions of rapamycin with TRPM8 are distinct from interactions of menthol with the same ion channel.

      Weaknesses:

      The main weakness of the manuscript is the NMR method employed to show that rapamycin binds to TRPM8. The authors developed and deployed a novel signal processing approach based on subtraction of several independent NMR spectra to show that rapamycin binds to the TRPM8 protein and not to the surrounding membrane or other proteins. While interesting and potentially useful, the method is not well developed (several positive controls are missing) and is not presented in a clear manner, such that the quality of data can be assessed and the reliability and pertinence of the subtraction procedure evaluated.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major points

      (1) Given the novelty of the STTD NMR approach, please provide more details and data for the reader.

      • I would like to see all of the collected spectra so that readers can see and judge the effect sizes for themselves, perhaps as an additional supplementary figure.

      We agree with the reviewer that the data transparency of the NMR measurements should be improved. We changed panel C of Figure 2 in the main text and provided all the STD and the computed STDD and STTD spectra recorded on one set of experiments. We carried out additional experimental replicas on new samples and addressed the variability of cell samples by rescaling the STD effects based on reference ¹H measurements. We provided supplementary spectra of the reference experiments without saturation (Figure S5) and the obtained STTD spectra from the three parallel NMR sessions (Figure S6).

      • I appreciate the labels for STDD-1, STDD-2, and STTD on the lower two spectra of Figure 2C. Is the top spectrum from STD-1 or is it prior to saturation? In Figure 2C, what do the x1 and x2 notations on the right-hand side of the spectra indicate?

      We showed the top spectrum as an overview and a demonstration of the spectral complexity of the samples. ¹H experiments were run before the STD measurements to assess the sample quality and stability. The demonstrated spectrum on sample 1 (TRPM8 with rapamycin in HEK cells) was recorded with more transients than the corresponding STDs, thus it is only visually comparable with the difference spectra after scaling (2x). Figure 2 was changed and all the spectra were replaced as mentioned before. All the recorded ¹H experiments without saturation, including the one removed, are now available in the supplementary information (Figure S5).

      • The STTD NMR results with WT TRPM8 are consistent with rapamycin binding directly to the channel. Testing whether rapamycin binding observed with STTD NMR is disrupted by one of the most compelling mutations (D796A, D802A, G805A, or Q861A) would be a further test of this direct interaction.

      We thank the reviewer for the suggestion and agree that testing the most compelling mutants would be a promising next step. These mutations were generated in plasmid vectors and only transiently transfected into HEK cells. For NMR analysis we would need a high amount of cells stably overexpressing the mutant channels which were not available for experimentation.

      • Given that this is not a methods paper, it is probably outside the scope to further validate the STTD NMR measurements by performing parallel ITC, SPR, MST, or radiolabeled ligand experiments. Nevertheless, I would be excited to see such a comparison since STTD NMR appears to have promise as an experimental technique for assessing ligand binding to membrane proteins that does not require large amounts of purified protein or radioactive isotopes.

      We agree with the reviewer that additional independent biophysical measurements on the interactions are necessary to further validate the STTD methodology. This paper is a preliminary demonstration of the STTD concept and our group is currently working on the challenges of on-cell NMR (e.g., sample and spectral complexity) and the standardization of the proposed workflow.     

      (2) Please clarify the methods used to model rapamycin binding. Docking can be imprecise in TRP channels, even with a sophisticated docking scheme (Hughes et al., 2019, doi: https://doi.org/10.7554/eLife.49572.001).

      Thank you for mentioning this point and providing the reference. We have further clarified our methods and included the reference in our discussion, indicating the limitations of our approach.

      • As a positive control, does the docking strategy accurately predict binding of known compounds (menthol, icilin, etc.) to TRPM8 consistent with cryo-EM structures?  

      Yes, the binding site for menthol, based on a similar docking strategy as for rapamycin, is also presented, and matches with predictions from other publications. This is now clarified in the revised manuscript.

      • Why was homology modeling to the human sequence used with the mouse structure but not the avian structure?  

      At the onset of the project, only the avian structure was available, and it was used in the primary docking. Later, to get more precise docking relevant for human TRPM8 pharmacology, we did revert to the then available structure of the mouse ortholog.

      • How many rapamycin structural clusters were built, and how many structures were there in each cluster? How many were used? "most populated" is unspecific.  

      Thank you for your comment. We have added the following highlighted information to the methods section to address your comment:

      “Representative conformations of rapamycin were identified by clustering of the 1000-membered pools, having the macrocycle backbone atoms compared with 1.0 Å RMSD cut-off. Middle structures of the ten most populated clusters, accounting for more than 90% of the total conformational ensemble generated by simulated annealing, were used for further docking studies. To refine initial docking results and to identify plausible binding sites, the above selected rapamycin structures were docked again, following the same protocol as above, except for the grid spacing which was set to 0.375 Å in the second pass. The resultant rapamycin-TRPM8 complexes were, again, clustered and ranked according to the corresponding binding free energies. Selected binding poses were subjected to further refinement. The three most populated and plausible binding poses were further refined by a third pass of docking, where amino acid side chains of TRPM8, identified in the previous pass to be in close contact with rapamycin (< 4 Å), were kept flexible. Grid volumes were reduced to these putative binding sites including all flexible amino acid side chains (21.0-26.2 Å x 26.2-31.5 Å x 24.8-29.2 Å).”

      However, it is important to clarify that the clusters are not built, and their number is not specified by the user. The number of clusters found depends on how similar the structures are in the structural ensemble analyzed by clustering. A high number of clusters indicates a diverse structural ensemble, whereas a low number suggests a uniform one. The number also depends on the similarity cutoff, which is chosen by the user. If the cutoff is selected well, the clusters differ in size: there are some highly populated clusters and a few that contain only one structure. The number of cluster representatives used is usually chosen so that the summed population of the selected clusters sufficiently covers the mapped conformational space.
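
      The selection rule described above (rank clusters by population, then keep representatives until the chosen clusters cover enough of the ensemble) can be sketched as follows. This is a minimal illustration only: the function name, the 90% coverage threshold as a parameter, and the cluster populations are hypothetical and are not taken from the authors' pipeline.

```python
def select_clusters(populations, coverage=0.90):
    """Return indices of the most populated clusters whose combined
    population reaches the requested fraction of the whole ensemble."""
    total = sum(populations)
    # Rank cluster indices from most to least populated.
    ranked = sorted(range(len(populations)),
                    key=lambda i: populations[i], reverse=True)
    chosen, covered = [], 0
    for idx in ranked:
        chosen.append(idx)
        covered += populations[idx]
        if covered / total >= coverage:
            break
    return chosen, covered / total


# Hypothetical cluster sizes for a 1000-member conformer pool.
populations = [240, 160, 120, 100, 80, 60, 50, 40, 30, 25,
               20, 18, 15, 12, 10, 8, 5, 3, 2, 1, 1]
chosen, fraction = select_clusters(populations)
print(len(chosen), round(fraction, 3))
```

      With these made-up numbers, the ten largest clusters are needed to pass the 90% threshold; a different RMSD cutoff would change the populations and therefore the number of representatives selected.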

      • Additionally, the rapamycin poses were generated using a continuum solvent model that is unlikely to replicate the conditions existing in the lipid bilayer or in a lipid-exposed binding pocket as is predicted here. It is therefore possible that the rapamycin poses chosen for docking do not represent the physiological rapamycin binding pose, hampering the ability of the docking algorithm to find an appropriate docking pocket.  

      • Furthermore, accurately docking ligands that may bind to membrane-exposed pockets is a challenging problem, particularly because many scoring algorithms, including those employed by Autodock, do not distinguish between solvent-exposed and membrane-exposed faces of the protein. This affects the predicted binding energies.

      We appreciate the reviewer's insightful comments. We have added a note to the Discussion mentioning these important limitations.

      • In Figure 4, it appears that the proposed rapamycin binding pocket is located at the interface between two subunits, but only one is shown. Is there any contact with residues in the neighboring subunit? Based on Figure S4, I assume not, but am unsure.

      Based on the estimated distances, we do not think that there are any relevant interactions with residues from neighboring subunits. This is now indicated in the results section.

      • Consider uploading the rapamycin-docked model to a public repository such as Zenodo for readers to examine and manipulate themselves  

      As suggested, the model has been uploaded to a public repository, and a link to the file on Zenodo is now included.

      (3) Please discuss the spatial location of the proposed rapamycin binding pocket relative to the vanilloid binding pocket in TRPV1.

      • The mutagenesis indicates that D745, D802, G805, and Q861 are most important for rapamycin sensitivity in TRPM8. Interestingly, the proposed rapamycin binding pocket appears to overlap spatially with the vanilloid binding pocket in TRPV1. Consistent with this, Q861 aligns with E570 in TRPV1, which is a critical residue for resiniferatoxin sensitivity. Indeed, similar to Q861's modeled proximity to the cyclohexyl ring, the hydroxyl group of the vanillyl moiety of capsaicin (4DY in 7LR0, for example) is in proximity to E750 in TRPV1. Additionally, searching PubChem by structural similarity suggests that the vanillyl head groups of the TRP channel modulators capsaicin and eugenol are structurally similar to the trans-2-methoxycyclohexan-1-ol ring. Without overlaying the two structures myself, it is difficult to say more than that, but I encourage the authors to comment on any similarities and differences they observe.

      • If the proposed rapamycin pocket is indeed similar to the location of the vanilloid binding site, the authors may wish to discuss other TRPM channel structures that show ligands and lipids bound to this pocket because this provides evidence that this pocket influences TRPM channel function. For example, how does the proposed rapamycin binding pocket compare to TRPM8 bound to agonist AITC (PDBID 8e4l), TRPM5 bound to inhibitor NDNA (7mbv), and TRPM2 bound to phosphatidylcholine (6co7)?

      • Other TRP channel structures with ligands or lipids modeled in this region include TRPV1 bound to resiniferatoxin, capsaicin, or phosphatidylinositol (7l2j, 7l24, 7l2s, 7l2t, 7l2u, 7lp9, 7lpc, 7lqy, 7mz6, 7mz9, 7mza); TRPV3 bound to phosphatidylcholine (7mij, 7mik, 7mim, 7min, 7ugg); TRPV5 bound to econazole (6b5v) or ZINC9155 (6pbf); TRPV6 bound to piperazine (7d2k, 7k4b, 7k4c, 7k4d, 7k4e, 7k4f) or cholesterol hemisuccinate (7s8c); TRPC6 bound to BTDM (7dxf) or phosphatidylcholine (6uza); and TRP1 bound to PIP2 (6pw5).

      We thank the reviewer for these valuable insights. We have included some additional discussion highlighting the similarities between the proposed rapamycin binding site and some of the other ligand-channel interactions in the TRP superfamily, in particular the well-known vanilloid binding site in TRPV1. However, to keep the discussion focused and within the scope of the manuscript, we have not discussed all of the indicated interactions in full.

      (4) I would like to see negative control calcium imaging and electrophysiology data with untransfected HEK cells to confirm that the observed activation is mediated by TRPM8 to parallel the TRPM8 KO sensory neuron experiments.  

      This important information is now included in the revised manuscript (Figure S2).

      (5) The DM-nitrophen Ca uncaging experiments are an interesting method to test Ca sensitivity of rapamycin, but the results make these experiments more complex to interpret. Ca has been shown to be an obligate cofactor for icilin sensitivity in TRPM8 under conditions where both the internal and external Ca concentrations are tightly controlled (Kuhn et al., 2009, doi: https://doi.org/10.1074/jbc.M806651200), which is necessary because TRPM8 allows Ca permeation through the pore when open. The large icilin-evoked currents in Figure 5A and 5B indicate that the effective intracellular calcium concentration is not zero prior to calcium uncaging, which may be high enough to mask any Ca-dependence of rapamycin that occurs at low Ca concentrations. Given this ambiguity, the inside-out patch clamp configuration would provide more control over the internal and external Ca concentration than is achieved in the Ca uncaging experiments. Because the authors have already demonstrated their ability to perform such experiments (Figure 2 panel B), it would be nice to see tests of Ca dependence using inside-out patch clamp.

      As already shown in Figure 2, rapamycin activates TRPM8 in inside-out patches, and these experiments were performed using calcium-free cytosolic and extracellular solutions. Note that earlier studies have already shown that icilin activates outward TRPM8 currents in the full absence of calcium (see, e.g., Janssens et al., eLife, 2016; Chuang et al., 2004). In the case of icilin, increased calcium further potentiates the current, which is more prominent for the inward current.

      In the Ca uncaging experiments, considering the Kd of DM-nitrophen of 5 nM, we expect that the intracellular calcium concentration before the UV flash would be approximately 15 nM. Taken together, both the inside-out experiments and the flash uncaging experiments confirm that rapamycin responses are not directly regulated by intracellular calcium, contrary to icilin.
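
      For readers who want to see how a resting free calcium estimate follows from a chelator's Kd, a minimal sketch is given below. It assumes simple 1:1 binding at equilibrium, so that free [Ca] = Kd × (bound chelator / free chelator). The loaded fraction of 0.75 is back-calculated here to reproduce the quoted ~15 nM and is an assumption; the actual loading of DM-nitrophen is not stated in this response.

```python
def free_calcium_nM(kd_nM, loaded_fraction):
    """Free Ca2+ (nM) for a 1:1 chelator at equilibrium, given its Kd
    and the fraction of chelator molecules carrying Ca2+."""
    if not 0.0 <= loaded_fraction < 1.0:
        raise ValueError("loaded_fraction must be in [0, 1)")
    # Kd = [Ca][B]/[CaB]  =>  [Ca] = Kd * [CaB]/[B] = Kd * f / (1 - f)
    return kd_nM * loaded_fraction / (1.0 - loaded_fraction)


# Kd of 5 nM is quoted in the response; 0.75 loading is an assumed value.
estimate = free_calcium_nM(5.0, 0.75)
print(estimate)
```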

      (6) Sequence conservation within TRPM channels could be used in combination with the binding pocket model and mutagenesis to predict rapamycin selectivity for TRPM8 over other TRPMs. For example, some important residues, specifically G805 and Q861, are not conserved in TRPM3, which agrees with the lack of rapamycin sensitivity observed in TRPM3 (Figure S1). Further sequence comparison would provide testable hypotheses for future exploration of rapamycin sensitivity in other TRPMs that could validate the proposed binding pocket.

      Thank you for the suggestion. We now indicate in the discussion that only some of the key residues are conserved and make suggestions for future studies.  

      (7) Please unify the color scheme across the figures to improve clarity.

      • The authors frequently use the colors blue, red, and green to represent menthol and rapamycin in the figures, but they are inconsistent in which one represents menthol and which represents rapamycin. It would be clearer for the audience if, for example, rapamycin is always represented with red and menthol is always represented with blue.  

      Thank you for pointing this out. We have made the coloring schemes more uniform.

      • In Figure 1, panel E, the coloring for Menthol and Pregnenolone Sulfate changes between the TRPM8+/+ and TRPM8-/- panels.  

      Thank you for pointing this out. We have updated the coloring schemes to ensure consistency between the TRPM8+/+ and TRPM8-/- panels.

      • Figure 3 B and E, perhaps color the plot background as a 3-color gradient (blue to white to red) rather than yellow and aqua. Center the white at the WT ratio, keeping the dashed line, with diverging gradients to, for example, blue for mutations that selectively affect menthol sensitivity and red for rapamycin.

      Thank you for the suggestion – we have changed the figure accordingly.  

      • Figure 4 panels A and B use the same color (green) to show two different things (menthol molecule and mutated residues that affect rapamycin sensitivity). It would be clearer for readers to change these colors to agree with a unified color scheme such that, for example, the menthol molecule is colored blue and the rapamycin-neighboring residues are colored red.

      Thank you for the suggestion. We have updated the figure to use a unified color scheme, with the menthol molecule now colored green and the rapamycin-neighboring residues colored cyan, to enhance clarity for readers.

      • I recommend adding a figure or panel that shows side chains for all mutations, colored by menthol/rapamycin selectivity, as indicated by the functional data in Figure 3B and 3E. This will highlight spatial patterns of the selective residues that are discussed in the text.

      Thank you for your suggestion. We have added the side chains of all mutated residues in Figure S10.

      Minor points

      (1) It would be nice to have one more concentration data point in the middle of the dose response curve shown in Figure 1 panel B. The response is not saturating at the top or foot of the curve in Figure 1 panel D, precluding a confident fit to a two-state Boltzmann function.

      Instead of adding a single data point to this figure, we performed independent measurements on a plate reader system, comparing concentration responses at room temperature and at 37 °C. These data are now included as Figure S1.

      (2) The cartoon in Figure 2 panel B should be made more accurate. For example, only the transmembrane helices should be depicted embedded in the membrane, not the whole protein including the intracellular domain. Because the experiment was performed with cells, change the orientation of TRPM8 in the cartoon to show the intracellular domain of the protein facing away from the extracellular side of the membrane where the rapamycin is applied.

      Thank you for this comment. We have corrected the cartoon accordingly.

      (3) Perhaps put the yellow circles under or around the carbon atoms to which the identified hydrogen atoms belong in Figure 2 panel E and Figure 4 panel C. I found it difficult to visualize and compare the STTD NMR results with the predicted binding pocket.

      Thank you for the feedback. We have added yellow circles around the carbon atoms corresponding to the identified hydrogen atoms in Figure S9.  

      (4) Regarding the sentence on p. 12 beginning "In agreement with this notion..."

      • Include icilin, Cooling Agent-10, and WS-3 as other cooling agents whose sensitivity has been modulated by mutation of Y745

      • Cryosim-3 responses were not tested in either of the two papers cited; please add citation to Yin et al., 2022, doi: https://doi.org/10.1126/science.add1268 .

      • Other relevant papers include:

      – Malkia et al., 2009, doi: https://doi.org/10.1186/1744-8069-5-62 which includes molecular docking showing the hydroxyl group of menthol interacting with Y745

      – Beccari et al., 2017, doi: https://doi.org/10.1038/s41598-017-11194-0 Figure 5 shows disruption of icilin and Cooling Agent-10 sensitivity by Y745A

      – Palchevskyi et al., 2023, doi: https://doi.org/10.1038/s42003-023-05425-6 Figure 3 shows disruption of icilin, cooling agent-10, WS-3, and menthol sensitivity by Y745A

      – Plaza-Cayon et al., 2022, https://doi.org/10.1002%2Fmed.21920 Review of TRPM8 mutations

      • typo: Y754H should be Y745H

      Thank you for these suggestions. We have added the above references to the text and corrected the typo.

      (5) The authors use the competitive action of everolimus on rapamycin activation as evidence that the different macrolides are binding to the same binding pocket. In addition, prior work showed that Y745H and N799A mutations (which render TRPM8 insensitive to menthol and icilin, respectively) do not affect TRPM8 sensitivity to the structurally-related compound tacrolimus (Arcas et al., 2019). This is consistent with the docking and mutagenesis results presented here.

      Thank you for this valuable suggestion. We discuss these data in the revised version.

      (6) Rapamycin sensitivity has also been observed in TRPML1 (Zhang et al. 2019, doi: https://doi.org/10.1371/journal.pbio.3000252).

      We added a short reference to this interesting finding in the discussion.

      (7) The whole-cell currents are very large in several of the electrophysiology experiments (for example Figure 3 panel D and Figure S1), which could lead to artifacts of voltage errors as well as ion accumulation/depletion. However, because this paper is not relying on reversal potential measurements or trying to quantify V1/2, these errors are unlikely to affect the qualitative conclusions drawn.

      This is a fair point, but indeed unlikely to affect our main conclusions. Note that we compensated between 70 and 90% of the series resistance, so we don’t expect voltage errors exceeding ~10 mV.
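
      The residual voltage error after series resistance compensation can be estimated as current × series resistance × (1 − compensated fraction). The sketch below illustrates this arithmetic; the 70 to 90% compensation range comes from the response above, whereas the current amplitude (3 nA) and series resistance (10 MΩ) are hypothetical values chosen only for illustration.

```python
def voltage_error_mV(current_nA, series_resistance_MOhm, compensation):
    """Residual voltage error in mV. Note that nA x MOhm gives mV directly."""
    if not 0.0 <= compensation <= 1.0:
        raise ValueError("compensation must be a fraction between 0 and 1")
    return current_nA * series_resistance_MOhm * (1.0 - compensation)


# Hypothetical whole-cell current and series resistance.
current_nA, rs_MOhm = 3.0, 10.0
for comp in (0.7, 0.9):
    err = voltage_error_mV(current_nA, rs_MOhm, comp)
    print(f"{int(comp * 100)}% compensation: {err:.1f} mV")
```

      Under these assumed values the error stays below 10 mV across the stated compensation range; larger currents or a higher series resistance would scale the error proportionally.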

      (8) Ligand sensitivity is frequently species-dependent in TRP channels, so it is interesting that multiple species were used here and that both human and mouse isoforms exhibit rapamycin sensitivity. It should be emphasized that human TRPM8 was used in the calcium imaging and electrophysiology experiments, as well as some docking models, while the mouse isoform was used in the sensory neuron experiments and a mutated avian isoform was used for some docking models.

      This information is available in the Methods and we believe it is clear for the readers.

      (9) Perhaps discuss the unclear mechanism of G805A action in icilin (but not menthol, cold, or praziquantel) sensitivity because it is not in direct contact with the ligand. For example, Yin et al., 2019 propose flexibility allowing Ca binding site and larger binding site for icilin.

      Yin et al. (2019) suggest that the G805A mutation impacts icilin sensitivity by influencing the flexibility of the binding site and possibly affecting calcium binding. In our study, we found that G805A significantly reduces rapamycin sensitivity, likely due to its direct role in the rapamycin binding pocket rather than an effect on calcium binding. This is now briefly mentioned in the results section.

      (10) The Figure S1 legend indicates that n=5 for all panels, so please show normalized population IV curves rather than individual examples. Additionally, it would be interesting to see what happens when each agonist is co-applied with rapamycin. Does rapamycin potentiate or inhibit agonist activation in these channels and/or TRPM8?

      We believe that normalized population IVs are not ideal for representing whole-cell currents, considering the substantial variation in current densities. We therefore prefer to show example traces in Figure S3 of the revised version but include mean values of current densities for all tested cells in the text.

      While the effects of co-application of rapamycin with activating ligands could be of interest, we consider this somewhat outside the scope of the present manuscript. The combination of HEK293 cell experiments, along with results obtained in WT and TRPM8-deficient mice does, in our opinion, sufficiently describe the selectivity of rapamycin towards TRPM8 compared to other sensory TRP channels.

      (11) Figure S1 panel A does not contain units for Rapamycin or AITC concentrations.

      Thank you for pointing this out. The units were added to the figure.  

      (12) It would be nice if the authors characterized the different mutations as predicted to contribute to site 1 (D796, H845, Q861, based on Figure S4), site 2 (D796, M801, F847, and R851), and/or site 3 (F847, V849, and R851).

      The indicated mutants were all tested, as shown in Figure 3.

      (13) The numbering scheme in Figure S4 does not appear to match the residue numbers in the rest of the paper for certain residues (HIS-844 rather than H845, PHE-846 rather than F847, VAL-848 rather than V849, ARG-850 rather than R851, and GLN-860 rather than Q861), and labels are often overlapping and difficult to see. I also find the transparent spheres very difficult to distinguish from the transparent background, which makes it difficult to appreciate the STTD NMR data overlay.

      We apologize for the confusing numbering scheme. The lower numbers refer to the initial docking that was done using the avian TRPM8 ortholog. We have made a newer, clearer version of Figure S4 and inserted it as Figure S9.

      (14) Please superpose the Ligplots in Figure S5 panels E and F as described in the LigPlus manual (https://www.ebi.ac.uk/thornton-srv/software/LigPlus/manual/manual.html) to facilitate easier comparison.

      Thank you for the suggestion. We followed the suggestion to superpose the Ligplots as described but found that the result was visually cluttered and difficult to interpret. To avoid confusion, we instead decided to remove panels E and F from Figure S5, as we believe that the visualization in panels A-D is clear and informative.

      (15) Some n values are missing in figure legends.

      We checked all legends and added n numbers where they were missing.

      (16) There is an inconsistent specification of error bars as SEM in the figure legends, though it is specified in methods.

      A question for my own edification: Here, you have looked at ligand interactions with the protein by saturating the protein resonances and observing transfer to the ligand. Would it be possible to instead saturate lipid or solute resonances and observe transfer to a ligand? I am curious whether this would be one way to measure equilibrium partitioning of ligand into a membrane and/or determine the effective concentration of a ligand in the membrane. Additionally, could one determine whether the compound is fully partitioned into the center of the membrane or just sitting on the surface?

      The reviewer highlights an interesting aspect. The widely used WaterLOGSY NMR experiment (doi: 10.1023/a:1013302231549) saturates water molecules, and the magnetization is then transferred to the ligand of interest. Characteristic changes in ligand resonances are observed in the case of a binding event with proteins. On the other hand, the selective saturation of lipids is, while theoretically possible, technically challenging, mainly because of the inherently low signal dispersion of lipids and peak overlap with ligand resonances. Additionally, lipid systems are more dynamic than proteins, and ligand-lipid interactions could be weaker and less specific, significantly affecting the sensitivity of STD experiments.

      Reviewer #2 (Recommendations For The Authors):

      Major:

      • Is it feasible to test rapamycin on TRPM8 with single-channel recording? This will allow us to better probe the mechanism of rapamycin activation and compare it with menthol, with parameters of single-channel conductance and maximal open probability.

      In our experience, it is very difficult to obtain single-channel recordings from TRPM8. The channel expresses at high densities, typically leading to patches that contain multiple channels, which makes a proper analysis of mean open and closed times very difficult. Therefore, we have decided not to include such measurements in the manuscript.

      • The authors classified rapamycin as a type I agonist, the type that stabilizes the open conformation, same as menthol but more prominent. Does that indicate that rapamycin works synergistically (rather than independently) with menthol, because co-application can allow them to add to each other in stabilizing the open conformation? I wonder if the authors agree that this could be tested with experiments as in Figure S3, by showing a much more prolonged deactivation with co-application of menthol and rapamycin than with each applied alone.

      Thank you for the insightful suggestion. We conducted co-application experiments, and our results show that the deactivation time is indeed significantly prolonged when both compounds are applied together compared to each alone. In fact, very little deactivation is seen when both compounds are co-applied, which made it virtually impossible to perform reliable fits to the deactivation time course for the Menthol+Rapamycin condition. Instead, we have now included summary results showing the percentage of deactivation after 100 ms. We included these findings in Figure S8.
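
      One way such a fit-free summary measure could be computed is sketched below, assuming that "percentage of deactivation" means the fractional loss of tail current relative to its amplitude at the start of repolarization. The function name and the current amplitudes are hypothetical, and the exact procedure used for Figure S8 is not described in this response.

```python
def percent_deactivation(i_start, i_at_t):
    """Percentage of the initial tail current that has decayed by time t."""
    if i_start == 0:
        raise ValueError("initial tail current must be non-zero")
    return 100.0 * (1.0 - i_at_t / i_start)


# Hypothetical inward tail currents (pA) at 0 ms and 100 ms.
single_agonist = percent_deactivation(-2000.0, -400.0)    # large decay
co_application = percent_deactivation(-2000.0, -1900.0)   # almost no decay
print(round(single_agonist, 1), round(co_application, 1))
```

      Unlike an exponential fit, this measure remains well defined when almost no decay occurs within the measurement window.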

      • It could be tested whether rapamycin activation of TRPM8 requires or overrides the requirement of PIP2 with inside-out patch by briefly exposing the patch to poly-lysine to sequester PIP2.

      This is certainly a good suggestion for further follow-up studies. However, we considered that examination of the (potential) interaction between ligands and PIP2 was outside the scope of the current manuscript.

      • Figure 1C suggests that the authors test rapamycin when there is a relatively high baseline TRPM8 activation (prior to rapamycin). This raises the possibility that rapamycin is more a potentiator than an activator. I wonder if the following two experiments could address it: (1) perfuse rapamycin while holding at different membrane potentials, wash off rapamycin in the solution and quickly (in a few seconds) test the activated current magnitude (before rapamycin dissociation), to compare whether a more depolarized membrane potential (high baseline open probability) allows rapamycin to potentiate more. (2) Perform the experiment at a higher temperature (low baseline open probability) and test whether rapamycin EC50 shifts to the right.

      Thank you for the thoughtful suggestion. Overall, we are not really in favor of making a distinction between a potentiator and an activator, since it is not really feasible to create a situation where TRPM8 activity is zero. As suggested, we performed the dose-response experiment at a higher temperature (37 °C) and observed that rapamycin’s EC<sub>50</sub> shifts to the right (Figure S2). This is similar to what has been observed for menthol on TRPM8 and for many other ligands on other temperature-sensitive TRP channels.

      Minor:

      (1) The authors should report the Hill coefficient together with the EC50 when showing dose-responses.

      We have added Hill coefficients for all the fits.
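
      For reference, the Hill equation that relates EC50 and the Hill coefficient to a concentration-response curve can be written as a short function. This is a generic sketch, not the authors' fitting code, and the parameter values in the example are hypothetical.

```python
def hill_response(conc, ec50, hill_n, e_max=1.0):
    """Hill equation: response = e_max * c^n / (EC50^n + c^n)."""
    if conc < 0 or ec50 <= 0:
        raise ValueError("conc must be >= 0 and ec50 must be > 0")
    return e_max * conc ** hill_n / (ec50 ** hill_n + conc ** hill_n)


# Hypothetical EC50 of 10 (arbitrary concentration units).
half = hill_response(10.0, 10.0, 2.0)      # at EC50 the response is half-maximal
shallow = hill_response(30.0, 10.0, 1.0)   # Hill coefficient 1
steep = hill_response(30.0, 10.0, 2.0)     # Hill coefficient 2
print(half, shallow, steep)
```

      Two curves with the same EC50 can differ markedly in steepness, which is why the Hill coefficient is informative alongside the EC50.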

      (2) In Figure 1 (E, F), it might be clearer to use Venn-diagram to show whether there is overlapping among rapamycin-, menthol-, and cinnamaldehyde-responsive neurons. According to the authors' explanation, we can predict that rapamycin-insensitive, menthol-sensitive neurons should predominantly be cinnamaldehyde-responsive.

      Thank you for your suggestion. In these experiments, we applied several agonists, and combining them would result in a visually crowded Venn diagram that is difficult to interpret. However, we agree with the reviewer’s suggestion and now discuss the percentage of cinnamaldehyde+ neurons in the rapa- menthol+ population in Trpm8<sup>-/-</sup> neurons.

      (3) In Figure 3(C), since F847 does not respond to either menthol or rapamycin, it should be excluded from (B). Otherwise it is misleading.

      Thank you for pointing this out. To clarify, we have included a calcium imaging trace for the F847 mutant, demonstrating a clear response to rapamycin, in Figure S9. These additional data show that F847 does respond to rapamycin, albeit with a more modest response amplitude. This is now also clarified in the results section.

      (4) The word "potency" in pharmacology usually refers to a smaller EC50 number in dose-dependent experiments. In the "Effect of rapamycin analogs on TRPM8" section, the authors use "potency" to refer to the response to different compounds in a single-dose experiment. The experiment does not measure potency.

      Thank you for pointing out this mistake. We have corrected the text and replaced “potency” with “efficacy”.

      (5)  "2-methoxyl-" is misspelled in the text body.

      We have corrected the typo.

      (6) It would be nice to include "vehicle" in Figure 6B, or alternatively normalize all individual traces to vehicle. In Figure 6C and D, everolimus has almost no effect compared to vehicle, and should not be shown as if it had ~8% in Figure 6B.

      We have added the vehicle values to Figure 6B from the same experiments.

      Reviewer #3 (Recommendations For The Authors):

      (1) The NMR method presented here as novel and employed to identify a proposed molecule bound to a membrane protein (TRPM8 in this case) is not well explained and presented. Since several spectra need to be subtracted, the authors should present the raw data and the results of the subtractions step by step. Also, it seems that the height of the peaks in each spectrum will be highly variable, and thus a reliable criterion must be employed to scale spectra before subtraction. None of these problems are discussed or described.

      The reviewer is right that data transparency should be improved and that, given the high molecular complexity of the samples, the size of the STD effects should be carefully scaled. We carried out additional experimental replicates on new samples and addressed the inherent sample/peak height variability by rescaling the STD effects based on reference <sup>1</sup>H measurements. We provided supplementary spectra of the reference experiments without saturation (Figure S5) and the computed STTD spectra from three parallel NMR sessions (Figure S6). We changed panel C of Figure 2 in the main text and provided all the STD and the computed STDD and STTD spectra recorded on one set of NMR experiments. We added the following paragraph to the main text: “To address the effect of the inherent variability of cellular samples on peak heights, STD effects were normalized based on the comparison of independent <sup>1</sup>H experiments (Figure S5). Three STTD replicates were computed, unambiguously confirming direct binding to TRPM8 in two datasets (Figure S6 A,B)”.
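
      The general idea of rescaling one spectrum to a reference measurement before taking a difference can be sketched as follows. This is an illustration under stated assumptions, not the authors' processing workflow: it assumes that each sample has a reference 1H peak intensity recorded without saturation, that the ratio of those reference intensities is used as a single scale factor, and that spectra are represented as lists of peak intensities at matching chemical shifts. All names and numbers are hypothetical.

```python
def scale_factor(ref_sample, ref_control):
    """Factor that brings the sample's reference 1H intensity
    to the level of the control's reference intensity."""
    if ref_sample == 0:
        raise ValueError("reference intensity must be non-zero")
    return ref_control / ref_sample


def scaled_difference(std_sample, std_control, factor):
    """Peak-by-peak difference after rescaling the sample spectrum."""
    if len(std_sample) != len(std_control):
        raise ValueError("spectra must share the same peak list")
    return [factor * s - c for s, c in zip(std_sample, std_control)]


# Hypothetical STD peak intensities at three ligand resonances.
std_trpm8_cells = [2.0, 4.0, 1.0]
std_control_cells = [1.0, 1.0, 1.0]
factor = scale_factor(ref_sample=50.0, ref_control=100.0)
difference = scaled_difference(std_trpm8_cells, std_control_cells, factor)
print(factor, difference)
```

      Without the scale factor, differences in cell number or sample loading between the two preparations would appear as apparent binding signal in the subtracted spectrum.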

      Importantly since this signal subtraction method is proposed as a new development, control experiments employing well-established pairs of ligand and membrane protein receptor should be performed to demonstrate the reliability of the method.

      We agree with the reviewer that the STTD experiment, as a new development, needs further validation. However, this paper is a preliminary demonstration of a new strategy building on the well-established STD and STDD NMR methodologies. Our group is actively engaged in studying additional biological samples to enhance our understanding of the applicability of STTD NMR. These efforts also aim to address challenges such as sample and spectral complexity by refining and standardizing the proposed workflow.

      (2) The tail currents shown in supplementary figure 3 are clearly not monoexponential. The fit to a single exponential can be seen to be inadequate and thus the comparison of kinetics of control, rapamycin and menthol is incorrect. At least two exponentials should be fitted and their values compared.

      We agree that the decay in the (combined) presence of agonists deviates from a simple monoexponential behavior. While we agree that fitting with two (or more) exponentials would provide a better fit, this also comes with greater variations/uncertainties in the fit parameters. This is particularly the case when inactivation is very slow and incomplete, or when the difference between slow and fast exponential time constants is <5, as seen with rapamycin and rapamycin +menthol. Therefore, we decided to provide monoexponential time constants as a proxy to describe the clear slowing down of activation and deactivation time courses in the presence of Type I agonists.   

      Also related to this aspect, recordings of TRPM8 currents can not be leak subtracted with a p/n protocol, thus a large fraction of the initial tail current must be the capacitive transient. There is no indication in the methods of how was this dealt with for the fitting of tail currents.

      As explained in the methods, capacitive transients and series resistance were maximally compensated. Therefore, we do not agree that a large fraction of the initial tail current must be capacitive. This can also be clearly seen in experiments such as that shown in Figure 1C, where the inward tail current is fully abolished in the presence of a TRPM8 antagonist. Likewise, very small and rapidly inactivating tail currents can be seen during voltage steps under control conditions (e.g., Figures S7 and S8 in the revised version).

      (3) The docking procedure employed, as the authors show, is not appropriate for membrane proteins since it does not include a lipid membrane. It is not clear in the methods section if the MD minimization described applies only to the rapamycin molecule or to rapamycin bound to TRPM8.  

      It is also not clear if the important residue Q861 (and other residues that are identified as interacting with rapamycin) were identified from dockings or proposed based on other evidence.

      (4) Identifying amino acid residues that diminish the response to a ligand, does not uniquely imply that they form a binding site or even interact with said ligand. It is entirely possible that they can be involved in the allosteric networks involved in the activating conformational change. This caveat should be clearly posited by the authors when discussing their results.

      In our study, we identified several residues that significantly reduce the response to rapamycin when mutated, while retaining robust responses to menthol, which indicates that these mutations do not affect crucial conformational changes leading to channel gating. While our cumulative data suggest that these residues may be involved in direct interaction with rapamycin, we recognize the alternative possibility that they allosterically affect rapamycin-induced channel gating. This is now clearly stated in the first paragraph of the discussion.

    1. eLife Assessment

      This important study by Liu et al. presents a comprehensive structure-function analysis of the presynaptic protein UNC-13, leading to new insights into how its distinct domains control neurotransmitter release. The methods, data, and analyses are convincing, and the genetic and electrophysiological approaches support many of their conclusions. The work will be of interest to neuroscientists studying synaptic transmission, as it provides a foundation for future mechanistic studies of Munc13/UNC-13 family proteins.

    2. Joint Public Review:

      Summary:

      In this manuscript, the authors investigate how different domains of the presynaptic protein UNC-13 regulate synaptic vesicle release in the nematode C. elegans. By generating numerous point mutations and domain deletions, they propose that two membrane-binding domains (C1 and C2B) can exhibit "mutual inhibition," enabling either domain to enhance or restrain transmission depending on its conformation. The authors also explore additional N-terminal regions, suggesting that these domains may modulate both miniature and evoked synaptic responses. From their electrophysiological data, they present a "functional switch" model in which UNC-13 potentially toggles between a basal state and a gain-of-function state, though the physiological basis for this switch remains partly speculative.

      Strengths:

      (1) The authors conduct a thorough exploration of how mutations in the C1, C2B, and other regulatory domains affect synaptic transmission. This includes single, double, and triple mutations, as well as domain truncations, yielding a large, informative dataset.

      (2) The study includes systematically measuring both spontaneous and evoked synaptic currents at neuromuscular junctions, under various experimental conditions (e.g., different Ca²⁺ levels), which strengthens the reliability of their functional conclusions.

      (3) Findings that different domain disruptions produce distinct effects on mEPSCs, mIPSCs, and evoked EPSCs suggest UNC-13 may adopt an elevated functional state to regulate synaptic transmission.

      Weaknesses:

      It remains unclear whether the various domain alterations truly converge on a single "gain-of-function" state or instead represent multiple pathways for enhancing UNC-13 activity. Different mutations selectively affect spontaneous or evoked release, suggesting that each variant may not share the same underlying mechanism. Moreover, many conclusions rely on combining domain deletions or point mutations, yet the electrophysiological data show distinct outcomes across EPSCs, IPSCs, mini, and evoked responses. This raises questions about whether these manipulations all act on the same pathway and whether their observed additivity or suppression genuinely reflects a single mechanistic process. A unifying model, or at least a clearer explanation of why the authors infer one mechanistic state across different domain manipulations, would strengthen the paper's conclusions.

      The manuscript proposes that UNC-13 toggles from a basal to a "gain-of-function" state under normal synaptic activity. However, it does not address when or how this switch might occur in vivo, since it is demonstrated principally via artificial mutations. Providing direct evidence or additional discussion of such switching under physiological conditions would be particularly informative.

      What is the physiological significance of the proposed gain-of-function state? The data suggest that certain mutants (e.g., HK+D1-5N) lacking the gain-of-function state can still support synaptic transmission at wild-type levels. How do the authors reconcile this with the idea that the gain-of-function state plays a critical role at the synapse?

      The authors determined the fluorescence intensity of mApple-tagged UNC-13 variants (Figure 1J-K and Figure 7J-K), finding no significant changes compared to the wild-type. However, a more detailed analysis of the density or distribution of fluorescent puncta in axons could clarify whether certain mutations alter the localization of UNC-13 at synapses. Demonstrating colocalization with wild-type UNC-13 (or another presynaptic marker) would help rule out mislocalization effects.

      The study mainly relies on extrachromosomal transgenes, which can show variable copy numbers and expression levels among individual worm strains. This variability might complicate interpretation, as differences in expression could mask or exaggerate certain phenotypes.

      Finally, the discussion is somewhat diffuse. Streamlining the text to focus on the most direct connections would help readers pinpoint the key conclusions and open questions.

    1. eLife Assessment

      This important study uses advanced computational methods to elucidate how environmental dielectric properties influence the interaction strengths of tyrosine and phenylalanine in biomolecular condensates. The evidence supporting the claims of the authors is solid, as the simulations are performed rigorously providing mechanistic insights into the origin of the differences between the two aromatic amino acids considered. This study will be of broad interest to researchers studying biomolecular phase separation.

    2. Reviewer #1 (Public review):

      This is an interesting and timely computational study using molecular dynamics simulation as well as quantum mechanical calculation to address why tyrosine (Y), as part of an intrinsically disordered protein (IDP) sequence, has been observed experimentally to be stronger than phenylalanine (F) as a promoter for biomolecular phase separation. Notably, the authors identified the aqueous nature of the condensate environment and the corresponding dielectric and hydrogen bonding effects as a key to understanding the experimentally observed difference. This principle is illustrated by the difference in computed transfer free energy of Y- and F-containing pentapeptides into a solvent with various degrees of polarity. The elucidation offered by this work is important. The computation appears to be carefully executed, the results are valuable, and the discussion is generally insightful. However, there is room for improvement in the accuracy and clarity of some parts of the presentation: the logic of the narrative should be clarified with additional information (and possibly additional computation), and the current effort should be better placed in the context of prior relevant theoretical and experimental works on cation-π interactions in biomolecules and dielectric properties of biomolecular condensates. Accordingly, this manuscript should be revised to address the following, with added discussion as well as inclusion of the references mentioned below.

      (1) Page 2, line 61: "Coarse-grained simulation models have failed to account for the greater propensity of arginine to promote phase separation in Ddx4 variants with Arg to Lys mutations (Das et al., 2020)". As it stands, this statement is not accurate, because the cited reference to Das et al. showed that although some coarse-grained models, namely the HPS model of Dignon et al., 2018 PLoS Comput did not capture the Arg to Lys trend, the KH model described in the same Dignon et al. paper was demonstrated by Das et al. (2020) to be capable of mimicking the greater propensity of Arg to promote phase separation than Lys. Accordingly, a possible minimal change that would correct the inaccuracy of this statement in the manuscript would be to add the word "Some" in front of "coarse-grained simulation models ...", i.e., it should read "Some coarse-grained simulation models have failed ...". In fact, a subsequent work [Wessén et al., J Phys Chem B 126: 9222-9245 (2022)] that applied the Mpipi interaction parameters (Joseph et al., 2021, already cited in the manuscript) showed that Mpipi is capable of capturing the rank ordering of phase separation propensity of Ddx4 variants, including a charge scrambled variant as well as both the Arg to Lys and the Phe to Ala variants (see Figure 11a of the above-cited Wessén et al. 2022 reference). The authors may wish to qualify their statements in the introduction to take note of these prior results. For example, they may consider adding a note immediately after the next sentence in the manuscript "However, by replacing the hydrophobicity scales ... (Das et al., 2020)" to refer to these subsequent findings in 2021-2022.

      (2) Page 8, lines 285-290 (as well as the preceding discussion under the same subheading & Figure 4): "These findings suggest that ... is not primarily driven by differences in protein-protein interaction patterns ..." The authors' logic in terms of physical explanation is somewhat problematic here. In this regard, "Protein-protein interaction patterns" appear to be a straw man, so to speak. Indeed, who (reference?) has argued that the difference in the capability of Y and F in promoting phase separation should be reflected in the pairwise amino acid interaction pattern in a condensate that contains either only Y (and G, S) and only F (and G, S) but not both Y and F? Also, this paragraph in the manuscript seems to suggest that the authors' observation of similar contact patterns in the GSY and GSF condensates is "counterintuitive" given the difference in Y-Y and F-F potentials of mean force (Joseph et al., 2021); but there is nothing particularly counterintuitive about that. The two sets of observations are not mutually exclusive. For instance, consider two different homopolymers, one with a significantly stronger monomer-monomer attraction than the other. The condensates for the two different homopolymers will have essentially the same contact pattern but very different stabilities (different critical temperatures), and there is nothing surprising about it. In other words, phase separation propensity is not "driven" by contact pattern in general, it's driven by interaction (free) energy. The relevant issue here is total interaction energy or the critical point of the phase separation. If it is computationally feasible, the authors should attempt to determine the critical temperatures for the GSY condensate versus the GSF condensate to verify that the GSY condensate has a higher critical temperature than the GSF condensate. That would be the most relevant piece of information for the question at hand.

      (3) Page 9, lines 315-316: "...Our ε [relative permittivity] values ... are surprisingly close to that derived from experiment on Ddx4 condensates (45 ± 13) (Nott et al., 2015)". For accuracy, it should be noted here that the relative permittivity provided in the supplementary information of Nott et al. was not a direct experimental measurement but based on a fit using Flory-Huggins (FH) theory, and FH is not the most appropriate theory for a polymer with long-spatial-range Coulomb interactions. To this reviewer's knowledge, no direct measurement of relative permittivity in biomolecular condensates has been made to date. Explicit-water simulation suggests that a Ddx4 condensate with protein volume fraction ≈ 0.4 can have a relative permittivity ≈ 35-50 (Das et al., PNAS 2020, Fig.7A), which happens to agree with the ε = 45 ± 13 estimate. This information should be useful to include in the authors' manuscript.

      (4) As for the dielectric environment within biomolecular condensates, coarse-grained simulation has suggested that whereas condensates formed by essentially electrically neutral polymers (as in the authors' model systems) have relative permittivities intermediate between that of bulk water and that of pure protein (ε = 2-4, or at most 15), condensates formed by highly charged polymers can have relative permittivity higher than that of bulk water [Wessén et al., J Phys Chem B 125:4337-4358 (2021), Fig.14 of this reference]. In view of the role of aromatic residues (mainly Y and F) in the phase separation of IDPs such as A1-LCD and LAF-1 that contain positively and negatively charged residues (Martin et al., 2020; Schuster et al., 2020, already cited in the manuscript), it should be useful to address briefly how the relationship between the relative phase-separation promotion strength of Y vs F and the dielectric environment of the condensate may or may not change with higher relative permittivities.

      (5) The authors applied the dipole moment fluctuation formula (Eq.2 in the manuscript) to calculate relative permittivity in their model condensates. Does this formula apply only to an isotropic environment? The authors' model condensates were obtained from a "slab" approach (page 4) and thus the simulation box has a rectangular geometry. Did the authors apply Equation 2 to the entire simulation box or only to the central part of the box with the condensate (see, e.g., Figure 3C in the manuscript)? If the latter is the case, is it necessary to use a different dipole moment formula that distinguishes between the "parallel" and "perpendicular" components of the dipole moment (see, e.g., Equation 16 in the above-cited Wessén et al. 2021 paper)? A brief added comment will be useful.
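
      For orientation, the isotropic dipole-fluctuation estimate discussed in point (5) is commonly written as ε = 1 + (⟨M·M⟩ − ⟨M⟩·⟨M⟩) / (3 ε₀ V k_B T), where M is the total dipole moment of the sampled volume V. The minimal sketch below implements this textbook form in SI units; that it matches Equation 2 of the manuscript term by term is an assumption, and the function name and example values are hypothetical. It deliberately treats all three axes as equivalent, which is exactly the assumption the reviewer questions for a slab geometry.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
KB = 1.380649e-23        # Boltzmann constant, J/K

def relative_permittivity(dipoles, volume, temperature):
    """Isotropic fluctuation estimate of the relative permittivity.

    dipoles: sequence of (Mx, My, Mz) total dipole moments in C*m,
             one per trajectory frame
    volume: volume of the sampled region in m^3
    temperature: temperature in K
    """
    n = len(dipoles)
    mean = [sum(m[k] for m in dipoles) / n for k in range(3)]
    mean_sq = sum(m[0] ** 2 + m[1] ** 2 + m[2] ** 2 for m in dipoles) / n
    fluctuation = mean_sq - sum(c * c for c in mean)  # <M.M> - <M>.<M>
    return 1.0 + fluctuation / (3.0 * EPS0 * volume * KB * temperature)

# Hypothetical two-frame "trajectory" with equal and opposite dipoles.
m = 2.0e-27  # C*m
frames = [(m, 0.0, 0.0), (-m, 0.0, 0.0)]
eps = relative_permittivity(frames, volume=1.0e-25, temperature=300.0)
print(eps)
```

      Because the three Cartesian components are pooled, this estimate cannot distinguish directions parallel and perpendicular to a slab interface; a geometry-aware treatment would handle those components separately.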

      (6) With regard to the general role of Y and F in the phase separation of biomolecules containing positively charged Arg and Lys residues, the relative strength of cation-π interactions (cation-Y vs cation-F) should be addressed (in view of the generality implied by the title of the manuscript), or at least discussed briefly in the authors' manuscript if a detailed study is beyond the scope of their current effort. It has long been known that in the biomolecular context, cation-Y is slightly stronger than cation-F, whereas cation-tryptophan (W) is significantly stronger than either cation-Y or cation-F [Wu & McMahon, JACS 130:12554-12555 (2008)]. Experimental data from a study of EWS (Ewing sarcoma) transactivation domains indicated that Y is a slightly stronger promoter than F for transcription, whereas W is significantly stronger than either Y or F [Song et al., PLoS Comput Biol 9:e1003239 (2013)]. In view of the subsequent general recognition that "transcription factors activate genes through the phase-separation capacity of their activation domain" [Boija et al., Cell 175:1842-1855.e16 (2018)], which is applicable to EWS in particular [Johnson et al., JACS 146:8071-8085 (2024)], the experimental data in Song et al. 2013 (see Figure 3A of this reference) suggest that cation-Y interactions are stronger than cation-F interactions in promoting phase separation, thus generalizing the authors' observations (which focus primarily on Y-Y, Y-F and F-F interactions) to most situations in which cation-Y and cation-F interactions are relevant to biomolecular condensation.

      (7) Page 9: The observation of weaker effective F-F (and a few other nonpolar-nonpolar) interactions in a largely aqueous environment (as in an IDP condensate) than in a nonpolar environment (as in the core of a folded protein) is intimately related to (and expected from) the long-recognized distinction between "bulk" and "pair" as well as size dependence of hydrophobic effects that have been addressed in the context of protein folding [Wood & Thompson, PNAS 87:8921-8927 (1990); Shimizu & Chan, JACS 123:2083-2084 (2001); Proteins 49:560-566 (2002)]. It will be useful to add a brief pointer in the current manuscript to this body of relevant resources in protein science.

    3. Reviewer #2 (Public review):

      Summary:

      In this preprint, De Sancho and López use alchemical molecular dynamics simulations and quantum mechanical calculations to elucidate the origin of the observed preference of Tyr over Phe in phase separation. The paper is well written, and the simulations conducted are rigorous and provide good insight into the origin of the differences between the two aromatic amino acids considered.

      Strengths:

      The study addresses a fundamental discrepancy in the field of phase separation, where the ranking of aromatic amino acids observed experimentally differs from the ranking anticipated from contact statistics of folded proteins. While it has been hypothesized that the difference between the microenvironment of the condensed phase and that of the hydrophobic core of folded proteins underlies the different observations, this study provides a quantification of this effect. Further, the demonstration of the crossover between Phe and Tyr as a function of the dielectric is interesting and provides further support for the hypothesis that the differing microenvironments within the condensed phase and the core of folded proteins are the origin of the difference between contact statistics and experimental observations in the phase separation literature. The simulations performed in this work systematically investigate several possible explanations and therefore provide depth to the paper.

      Weaknesses:

      While the study is quite comprehensive and the paper well written, there are a few instances that would benefit from additional details. In the methods section, it is unclear whether the GGXGG peptides on which the alchemical transforms are conducted are positionally restrained within the condensed/dilute phase or not. If they are not, how would the position of the peptides within the condensate alter the calculated free energies reported? It would also be interesting to see what the variation in the transfer free energy is across multiple independent replicates of the transform, to assess the convergence of the simulations. Additionally, since the authors use a slab for the calculation of these free energies, are the transfer free energies from the dilute phase to the interface significantly different from those calculated from the dilute phase to the interior of the condensate? The authors mention that the contact statistics of Phe and Tyr do not show significant differences and thereby conclude that the more favorable transfer of Tyr primarily originates from the dielectric of the condensate. However, the calculation of contacts neglects the differences in the strength of interactions involving Phe vs. Tyr. Though the authors consider the calculation of the energy of contact formation later in the manuscript, the scope of these interactions is quite limited (Phe-Phe, Tyr-Tyr, Tyr-Amide, Phe-Amide), which is not sufficient to make a universal conclusion regarding the underlying driving forces. A more appropriate statement would be that, in the context of the minimal peptide investigated, the driving force seems to be the difference in dielectric. However, it is worth mentioning that the authors do a good job of mentioning some of these caveats in the discussion section.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, the authors address the paradox of how tyrosine can act as a stronger sticker for phase separation than phenylalanine, despite phenylalanine being higher on the hydrophobicity scale and exhibiting more prominent pairwise contact statistics in folded protein structures compared to tyrosine.

      Strengths:

      This is a fascinating problem for the protein science community with special relevance for the biophysical condensate community. Using atomistic simulations of simple model peptides and condensates as well as quantum calculations, the authors provide an explanation that relies on the dielectric constant of the medium and the hydration level that either tyrosine or phenylalanine can achieve in highly hydrophobic vs. hydrophilic media. The authors find that as the dielectric constant decreases, phenylalanine becomes a stronger sticker than tyrosine. The conclusions of the paper seem to be solid; it is well written and also recognises the limitations of the study. Overall, the paper represents an important contribution to the field.

      Weaknesses:

      How can the authors ensure that a condensate of GSY or GSF peptides is a representative environment of a protein condensate? First, the composition in terms of amino acids is highly limited, second the effect of peptide/protein length compared to real protein sequences is also an issue, and third, the water concentration within these condensates is really low as compared to real experimental condensates. Hence, how can we rely on the extracted conclusions from these condensates to be representative for real protein sequences with a much more complex composition and structural behaviour?

    1. eLife Assessment

      This study presents important methodologies for repeated brain ultrasound localization microscopy (ULM) in awake mice and a set of results indicating that wakefulness reduces vascularity and blood flow velocity. The data supporting these findings are solid. This study is relevant for scientists investigating vascular physiology in the brain.

    2. Reviewer #1 (Public review):

      Summary:

      Wang and colleagues present a study aimed at demonstrating the feasibility of repeated ultrasound localization microscopy (ULM) recording sessions on mice chronically implanted with a cranial window transparent to US. They provided quantitative information on their protocol, such as the number of contrast-enhancing microbubbles (MBs) required to get a clear image of the vasculature of a brain coronal section. They also quantified the co-registration quality across sessions separated in time and the vasodilator effect of isoflurane.

      Strengths:

      The study showed a remarkable performance in recording precisely the same brain coronal section over repeated imaging sessions. In addition, it sheds light on the vasodilator effect of isoflurane (an anesthetic whose effects are not fully understood) on the different brain vasculature compartments, although, as the Authors stated, some insights in this aspect have already been published with other imaging techniques. The experimental setting and protocol are very well described.

      Wang and co-authors submitted a revised version of their study, which shows improvements in the clarity of the data description.
      However, the flaws and limitations of this study are substantially unchanged.

      The main issues are:
      - Statistics are still inadequate. The TOST test proposed in this revised version is not equivalent to an ANOVA. Indeed, multivariate analyses should be the most appropriate, given that some quantifications were probably made on multiple vessels from different mice. The 3 reviewers mentioned the flaws in statistics as the primary concern.
      - No new data has been added, such as testing other anesthetics.
      - The Authors still insist on using the term Vascularity, which they define as: 'proportion of the pixel count occupied by blood vessels within each ROI, obtained by binarizing the ULM vessel density maps and calculating the percentage of the pixels with MB signal'. Why not use apparent cerebral blood volume or just CBV? Introducing an unnecessary and redundant term is not scientifically acceptable. In this revised version, vascularity is also used to indicate a higher vascular density (Line 275), which does not make sense: blood vessels do not form in the few minutes between the isoflurane and the awake condition. Rev2 also raised this point.
      - The long-term recordings mentioned by the Authors refer to the 3-week time frame analyzed in this study. However, within each acquisition, the time available for imaging is only a few minutes (< 10', referring to most of the plots showing time courses) after the animals' arousal from isoflurane and before bubbles disappear. This limitation should be acknowledged.
      - The more precise description of the number of mice and blood vessels analyzed in Figure 6 makes apparent the limited number of independent samples used to support the findings of this work. This limitation should be acknowledged. The newly provided information added as Supplementary Figure 1 should be moved to the main text, possibly into the figure legends. The limited data in support of the findings was also highlighted by Rev2 and, indirectly, by Rev3.
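
      The vascularity definition quoted in the list above (binarize the vessel density map, then take the percentage of ROI pixels with MB signal) can be written out as a short sketch. This is an illustrative reading of that one sentence, not the authors' pipeline: the function name, the nested-list representation of the maps, and the binarization threshold of zero are assumptions.

```python
def vascularity(density_map, roi_mask, threshold=0.0):
    """Percentage of ROI pixels whose MB density exceeds `threshold`.

    density_map: 2D list of per-pixel microbubble counts or densities
    roi_mask:    2D list of truthy/falsy values selecting the ROI
    """
    roi_pixels = 0
    vessel_pixels = 0
    for density_row, mask_row in zip(density_map, roi_mask):
        for density, in_roi in zip(density_row, mask_row):
            if in_roi:
                roi_pixels += 1
                if density > threshold:  # binarization step
                    vessel_pixels += 1
    if roi_pixels == 0:
        raise ValueError("ROI mask selects no pixels")
    return 100.0 * vessel_pixels / roi_pixels

# Hypothetical 3x3 density map with a 2x2 ROI in the top-left corner.
density_map = [[0, 3, 0],
               [1, 0, 0],
               [0, 0, 5]]
roi_mask = [[1, 1, 0],
            [1, 1, 0],
            [0, 0, 0]]
result = vascularity(density_map, roi_mask)
print(result)
```

      Written this way, the quantity is an area fraction of detected signal, so it depends on the binarization threshold and on how many microbubbles were accumulated, which is relevant to the reviewer's question of whether it should be interpreted as a blood-volume measure.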

    3. Reviewer #2 (Public review):

      Summary:

      The authors present a very interesting collection of methods and results using brain ultrasound localization microscopy (ULM) in awake mice. They emphasize the effect of the level of anesthesia on the quantifiable elements assessable with this technique (i.e. vessel diameter, flow speed, in veins and arteries, area perfused, in capillaries) and demonstrate the possibility of achieving longitudinal cerebrovascular assessment in one animal during several weeks with their protocol.
      The authors did a good job of rewriting the article based on the reviewers' comments. One of the messages of the first version of the manuscript was that variability in measurements (vessel diameter, flow velocity, vascularity) was much more pronounced under changes of anesthesia than when considering longitudinal imaging across several weeks. This message is now quite mitigated, as longitudinal imaging seems to show a certain variability close to the order of magnitude observed under anesthesia. In that sense, the review process was useful in avoiding hasty conclusions and calls for further caution in ULM awake longitudinal imaging, in particular regarding precision of positioning and cancellation of tissue motion.

      Strengths:

      Even if the methods elements considered separately are not new (brain ULM in rodents, setup for longitudinal awake imaging similar to those used in fUS imaging, quantification of vessel diameters/bubble flow/vessel area), when masterfully combined as it is done in this paper, they answer two questions that have been long-running in the community: what is the impact of anesthesia on the parameters measured by ULM (and indirectly in fUS and other techniques)? Is it possible to achieve ULM in awake rodents for longitudinal imaging? The manuscript is well constructed, well written, and graphics are appealing.
      The manuscript has been much strengthened by the round of review, with more animals for the longitudinal imaging study.

      Weaknesses:

      Some weaknesses remain, not hindering the quality of the work, that the authors might want to answer or explain.
      - When considering fig 4e and fig 4j together: it seems that in fig 4e the vascularity reduction in the cortical ROI is around 30% for downward flow, and around 55% for upward flow; but when grouping both cortical flows in fig 4j, the reduction is much smaller (~5%), even at the individual level (only mouse 1 is used in fig 4e). Can you comment on that?
      - When considering fig 4e, fig 4j, fig 6e and fig 6i altogether, it seems that vascularity can be highly variable, whether it be under anesthesia or vascular imaging, with changes between 5 and 40%. Is this vascularity quantification worth it (namely, reliable enough, for example, to quantify changes in a pathological model requiring longitudinal imaging)?

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public Review):

      • While the title is fair with respect to the data shown, in the summary and the rest of the paper, the comparison between anesthetized and awake conditions is systematically stated, while more caution should be used.

      First, isoflurane is one of the (many) anesthetics commonly used in pre-clinical research, and its effect on the brain vasculature cannot be generalized to all the anesthetics. Indeed, other anesthesia approaches do not produce evident vasodilation; see ketamine + medetomidine mixtures. Second, the imaged awake state is head-fixed and body-constrained in mice. A condition that can generate substantial stress in the animals. In this study, there is no evaluation of the stress level of the mice. In addition, the awake imaging sessions were performed a few minutes after the mouse woke up from isoflurane induction, which is necessary to inject the MB bolus. It is known that the vasodilator effects of isoflurane last a long time after its withdrawal. This aspect would have influenced the results, eventually underestimating the difference with respect to the awake state.

      These limitations should be clearly described in the Discussion.

      Looking at Figure 2e, it takes more than 5' to reach the 5 million MB count useful for good imaging. However, the MB count per pixel drops to a few % at that time. This information tells me that (i) repeated measurements are feasible but with limited brain coverage, since a single 'wake up' is needed to acquire a single brain section, and (ii) this approach cannot fit the requirements of functional ULM, which requires merging the responses to multiple stimuli to get a complete functional image. Of course, a chronic i.v. catheter would fix the issue, but this configuration is not trivial to test in the experimental setup proposed by the authors, hindering the extension of the approach to fULM.

      Thank you for highlighting these limitations, as they address aspects that were not fully considered during the experimental design and manuscript writing. In response, we have added the following paragraphs to the discussion section, addressing these limitations of our study:

      (Line 310) “Although isoflurane is widely used in ultrasound imaging because it provides long-lasting and stable anesthetic effects, it is important to note that the vasodilation observed with isoflurane is not representative of all anesthetics. Some anesthesia protocols, such as ketamine combined with medetomidine, do not produce significant vasodilation and are therefore preferred in experiments where vascular stability is essential, such as functional ultrasound imaging (47). Therefore, in future studies, it would be valuable to design more rigorous control experiments with larger sample sizes to systematically compare the effects of isoflurane anesthesia, awake states, and other anesthetics that do not induce vasodilation on cerebral blood flow.

      Our proposed method enabled repeatable longitudinal brain imaging over a three-week period, addressing a key limitation of conventional ULM imaging and offering potential for various preclinical applications. However, there are still some limitations in this study. 

      One of the limitations is the lack of objective measures to assess the effectiveness of head-fix habituation in reducing anxiety. This may introduce variability in stress levels among mice. Recent studies suggest that tracking physiological parameters such as heart rate, respiratory rate, and corticosterone levels during habituation can confirm that mice reach a low stress state prior to imaging (48). This approach would be highly beneficial for future awake imaging studies. Furthermore, alternative head-fixation setups, such as air-floated balls or treadmills, which allow the free movement of limbs, have been shown to reduce anxiety and facilitate natural behaviors during imaging (30). Adopting these approaches in future studies could enhance the reliability of awake imaging data by minimizing stress-related confounds.

      Another limitation of this study is the potential residual vasodilatory effect of isoflurane anesthesia on awake imaging sessions. The awake imaging sessions were conducted shortly after the mice had emerged from isoflurane anesthesia, required for the MB bolus injections. The lasting vasodilatory effects of isoflurane may have influenced vascular responses, potentially contributing to an underestimation of differences in vascular dynamics between anesthetized and awake state. Future applications of awake ULM in functional imaging using an indwelling jugular vein catheter presents a promising alternative to enable more accurate functional imaging in awake animals, addressing current limitations associated with anesthesia-induced vascular effects.”

      • Statistics are often poor or not properly described. 

      The legend and the text referring to Figure 2 do not report any indication of the number of animals analyzed. I assume it is only one, which makes the findings strongly dependent on the imaging quality of THAT mouse in THAT experiment. Three mice have been displayed in Figure 3, as reported in the text, but it is not clear whether it is a mouse for each shown brain section. Figure 5 reports quantitative data on blood vessels in awake VS isoflurane states but: no indication about the number of tested mice is provided, nor the number of measured blood vessels per type and if statistics have been done on mice or with a multivariate method.

      Also, a T-test is inappropriate when the goal is to compare different brain regions and blood vessel types.

      Similar issues partially apply to Figure 6, too.

      Thank you for bringing this to our attention. 

      We acknowledge that the statistical analyses were not clearly explained in the original version. In the revised manuscript, we have ensured that the statistical methods are clearly described. 

      (Fig.4 caption) “b,c, Comparisons of vessel diameter (b) and flow velocity (c) for the selected arterial and venous segments. Statistical analysis was conducted using t-test at each measurement point along the segments.”

      (Fig.6 caption) “b,c, Comparisons of vessel diameter (b) and flow velocity (c) for the selected arterial and venous segments. Statistical analysis was conducted using the two one-sided test (TOST) procedure, which evaluates the null hypothesis that the difference between the two weeks is larger than three times the standard deviation of one week.”
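      To make the equivalence logic in the Fig. 6 caption concrete, the following is a minimal sketch of a TOST comparison between two weeks of measurements. It is not the analysis code used in the study. It assumes two independent samples, takes the equivalence margin as three times the sample standard deviation of the first week, and swaps in a normal (z) approximation for the t distribution so that it runs with the Python standard library only. The function name `tost_equivalence` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tost_equivalence(week_a, week_b, margin_sd_multiple=3.0):
    """Return the TOST p-value for equivalence of two weekly samples.

    Assumptions (illustrative only): independent samples, equivalence
    margin = margin_sd_multiple * SD of week_a, normal approximation.
    """
    margin = margin_sd_multiple * stdev(week_a)
    diff = mean(week_b) - mean(week_a)
    # Standard error of the difference with unequal variances.
    se = sqrt(stdev(week_a) ** 2 / len(week_a) + stdev(week_b) ** 2 / len(week_b))
    if se == 0:
        raise ValueError("standard error is zero; samples have no spread")
    phi = NormalDist().cdf
    p_lower = 1.0 - phi((diff + margin) / se)  # H0: diff <= -margin
    p_upper = phi((diff - margin) / se)        # H0: diff >= +margin
    # Equivalence is claimed only if BOTH one-sided nulls are rejected,
    # so the overall p-value is the larger of the two.
    return max(p_lower, p_upper)


# Hypothetical flow-velocity measurements (arbitrary units).
week_1 = [10.0, 10.5, 9.5, 10.2, 9.8]
week_2 = [10.1, 10.4, 9.6, 10.3, 9.7]
week_2_shifted = [v + 5.0 for v in week_2]

p_similar = tost_equivalence(week_1, week_2)
p_shifted = tost_equivalence(week_1, week_2_shifted)
```

      A small p-value here supports equivalence within the margin, which is the reverse of the usual reading of a difference test. With the small per-segment sample sizes typical of this kind of data, a t-based version would be more appropriate than the normal approximation used in this sketch.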

      Additionally, we corrected an error in the previous comparison of the violin plots on flow velocities, where a t-test was incorrectly applied; this has now been removed.

      We acknowledge that the original version did not clearly indicate the number of animals used in the statistical analysis. In the revised manuscript, we have added Supplementary Figure 1 to specify the mice used, and we have labeled each mouse accordingly in the figures or captions. In the revised Figures 4 and 6, we have ensured that each quantitative analysis figure or its caption clearly indicates the specific mice.

      The original Figures 1 and 2 are presented as case studies to illustrate the methodology. Because the anesthesia time required for tail vein injection varies slightly between animals, the time each mouse takes to recover from anesthesia cannot be kept consistent across all mice. For instance, in Figure 1, the mouse took nearly 500 seconds to recover from anesthesia, but this duration is not consistent across all animals, which is a limitation of the bolus injection technique. We have noted this point in the discussion (discussion on the limitation of bolus injection), and we have also clarified in the results section and figure captions that these figures represent a case study of a single mouse rather than a standardized recovery time for all animals.

      We further clarified this point in the end of the Figure 2 caption:

      (Fig.2 caption) “This figure presents a case study based on the same mouse shown in Fig 1. The x-axis for d-f begins at 500 seconds because, at this point, the mouse’s pupil size stabilized, indicating it had recovered to an awake state. Consequently, ULM images were accumulated starting from this time. It is important to note that not every mouse requires 500 seconds to fully awaken; the time to reach a stable awake state varies across individual mice.”

      We added the following statement before introducing Figure 1e:

      (Line 93) “Due to differences in tail vein injection timing and anesthesia depth, the time required for each mouse to fully awaken varied. Although it was not feasible to get pupil size stabilized just after 500 seconds for each animal, ULM reconstruction only used the data that acquired after the animal reached full pupillary dilation, to ensure that ULM accurately captures the cerebrovascular characteristics in the awake state.”

      We added the following statement before introducing Figure 2d:

      (Line 139) “To further verify that the proposed MB bolus injection method can help to achieve ULM image saturation shortly after mice awaken from anesthesia, an analysis on the change in MB concentration over time was conducted once pupil size had stabilized (T = 500s).”

      For Figures 3, 4, and 5 (in the revised version, Figures 4 and 5 have been combined into a single Figure 4), the data represents results from three individual mice, with each coronal plane corresponding to a different mouse. In the revised version, we have added labels to indicate the specific mouse in each image to improve clarity. We also recognize that some analyses in the original submission (original Figure 5) may have lacked sufficient statistical power due to the small sample size. Therefore, in the revised version, we have focused only on findings that were consistently observed across the three mice to ensure robust conclusions.

      Reviewer 1 (Recommendations For the Authors):

      • If the study's main goal is to compare awake vs anesthetized ULM, the authors should test at least another anesthetic with no evident vasodilator effect.

      Thank you for this valuable suggestion. We would like to clarify that the primary aim of our study is not to comprehensively compare the effects of anesthesia versus the awake state, as a rigorous comparison would indeed require a more controlled experimental design, including additional anesthetics, a larger cohort of mice, and broader controls to ensure sufficient statistical power. We have also added the following statement in the Discussion to clarify this point:

      (Line 314) “Therefore, in future studies, it would be valuable to design more rigorous control experiments with larger sample sizes to systematically compare the effects of isoflurane anesthesia, awake states, and other anesthetics that do not induce vasodilation on cerebral blood flow.”

      We acknowledge that the initial organization of Figures 3–5 placed excessive emphasis on comparisons between the awake and anesthetized states, but without yielding consistently significant findings. Meanwhile, our longitudinal observations in original Figure 6 were underrepresented, despite their potential importance.

      In the revised version, we shifted our focus toward the main goal of awake longitudinal imaging. By consolidating the previous Figures 4 and 5 into the new Figure 4, we emphasize conclusions that are both more consistent and broadly applicable, avoiding areas that may lack sufficient rigor or consensus. Additionally, we expanded the quantitative analysis related to longitudinal imaging, highlighting its role as the ultimate objective of this study. The awake vs. anesthetized ULM comparison was intended to demonstrate the value of awake imaging and introduce the importance of awake longitudinal imaging. In the revised text, we have reframed this comparison to emphasize the specific response to isoflurane rather than a general response to anesthesia. For example, in Figures 3 and 4, we have replaced the original term "Anesthetized" with "Isoflurane". We have also added a discussion noting that isoflurane may induce more vasodilation than other anesthetic agents.

      (Line 310) “Although isoflurane is widely used in ultrasound imaging because it provides long-lasting and stable anesthetic effects, it is important to note that the vasodilation observed with isoflurane is not representative of all anesthetics. Some anesthesia protocols, such as ketamine combined with medetomidine, do not produce significant vasodilation and are therefore preferred in experiments where vascular stability is essential, such as functional ultrasound imaging(47).”

      • The claims made about the proposed experimental protocol to be suitable for the "long-term" (line 255) are not supported by the data and should be modified according to the presented evidence.

      Thank you for your valuable feedback. We agree that our current three-week experimental results do not yet fulfill the requirements for extended longitudinal imaging that may span several months. We have revised the relevant text accordingly. For instance, the phrase “Our proposed method enabled long-term, repeatable longitudinal brain imaging” has been modified to “Our proposed method enabled repeatable longitudinal brain imaging over a three-week period.” (Similar changes were also made in Line 67, Line 318, and Line 337.) Additionally, we have added the following paragraph in the discussion section to indicate that extending the monitoring period to several months is a meaningful direction for future exploration:

      (Line 337) “In our longitudinal study, consistent imaging results were obtained over a three-week period, demonstrating the feasibility of awake ULM imaging for this duration. However, for certain research applications, a monitoring period of several months would be valuable. Extending the duration of longitudinal awake ULM imaging to enable such long-term studies is a potential direction for future development.”

      Recommendations for improving the writing and presentation:

      • Reporting the number of mice and blood vessels and statistics for each quantitative figure.

      Thank you for highlighting this issue. We acknowledge that the quantitative figures in the previous version lacked clarity in specifying the number of mice, vessels, and associated statistics. In the revised version, we have ensured that each quantitative figure or its caption clearly indicates the specific mice, vessels, and statistical methods used. To further minimize any potential confusion, we have also added Supplementary Figure 1 to clearly label and reference each individual mouse included in the study.

      Minor corrections to the text and figures.

      • Line 22: "vascularity reduction from anesthesia" is not clear, nor it is a codified property of brain vasculature. Explain or rephrase.

      Thank you for your comment. We apologize for any confusion caused by the phrase “vascularity reduction from anesthesia” in the abstract. We agree that this phrasing was unclear without context. To improve clarity, we have revised this statement in the abstract to make it more straightforward and easier to understand. 

      (Line 24) “Vasodilation induced by isoflurane was observed by ULM. Upon recovery to the awake state, reductions in vessel density and flow velocity were observed across different brain regions.” 

      Additionally, we have added a section in the Methods titled Quantitative Analysis of ULM Images to provide a clear definition of vascularity. This section outlines how vascularity is quantified in our study, ensuring that our terminology is well-defined. 

      The following sentence shows the definition of vascularity:

      (Line 547) “Vascularity was defined as the proportion of the pixel count occupied by blood vessels within each ROI, obtained by binarizing the ULM vessel density maps and calculating the percentage of the pixels with MB signal.”
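      The definition above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `vascularity_percent`, the threshold of zero for binarization, and the toy density map are assumptions made for illustration. The map and ROI mask are represented as plain nested lists so the example needs no external libraries.

```python
def vascularity_percent(density_map, roi_mask, threshold=0.0):
    """Percentage of ROI pixels that carry microbubble (MB) signal.

    density_map: 2D list of MB counts per pixel (hypothetical input).
    roi_mask:    2D list of booleans, True inside the region of interest.
    threshold:   a pixel counts as vessel if its density exceeds this
                 value (assumed to be 0 here).
    """
    roi_pixels = 0
    vessel_pixels = 0
    for density_row, mask_row in zip(density_map, roi_mask):
        for density, in_roi in zip(density_row, mask_row):
            if not in_roi:
                continue
            roi_pixels += 1
            if density > threshold:  # binarization step
                vessel_pixels += 1
    if roi_pixels == 0:
        raise ValueError("ROI mask selects no pixels")
    return 100.0 * vessel_pixels / roi_pixels


# Toy 3x4 density map; the ROI covers the three right-hand columns.
density = [
    [0, 3, 0, 0],
    [0, 5, 1, 0],
    [0, 2, 0, 0],
]
roi = [
    [False, True, True, True],
    [False, True, True, True],
    [False, True, True, True],
]
example_vascularity = vascularity_percent(density, roi)  # 4 of 9 ROI pixels
```

      Because the measure depends only on whether a pixel contains any signal above the threshold, it is sensitive to the choice of threshold and to how long the ULM image was accumulated, so both would need to be held fixed when comparing conditions.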

      We have also added a brief definition where the term is first used in the Results section:

      (Line 161) “When comparing vessel density maps, ULM images that are acquired in the awake state demonstrate a global reduction of vascularity, which refers to percentage of pixels that occupied by blood vessels.”

      • Line 76: putting the mice in a tube is also intended "To further reduce animal anxiety and minimize tissue motion" I agree with tissue motion, not with animal anxiety, which, indeed, I expect to be higher than if it could, for example, run on a ball or a treadmill.

      Thank you for pointing this out. We acknowledge the limitations of our setup regarding reducing animal anxiety. We have replaced the original phrase “to further reduce animal anxiety and minimize tissue motion” with “to further minimize tissue motion.” (Line 78) Additionally, we have added the following paragraph in the Discussion section to address the limitations of our setup in reducing anxiety.

      (Line 321) “One of the limitations is the lack of objective measures to assess the effectiveness of head-fix habituation in reducing anxiety. This may introduce variability in stress levels among mice. Recent studies suggest that tracking physiological parameters such as heart rate, respiratory rate, and corticosterone levels during habituation can confirm that mice reach a low stress state prior to imaging(48). This approach would be highly beneficial for future awake imaging studies. Furthermore, alternative head-fixation setups, such as air-floated balls or treadmills, which allow the free movement of limbs, have been shown to reduce anxiety and facilitate natural behaviors during imaging(30). Adopting these approaches in future studies could enhance the reliability of awake imaging data by minimizing stress-related confounds.”

      • Line 79: PMP has been used by Sieu et al., Nat Methods, 2015; it should be acknowledged.

      Thank you for highlighting this. We have now included the reference to Sieu et al. Nat Methods, 2015 to appropriately acknowledge their use of PMP. (Line 81)

      • Figure: is there a reason why the plots start at 500 sec? What happened before that time?

      Thank you for your question regarding the starting time in the plots. Figures 1 and 2 are case studies using a single mouse to demonstrate the feasibility of our method. The “zero” timepoint was defined as the moment when anesthesia was stopped, and the microbubble injection began. However, the mouse does not fully recover immediately after anesthesia is stopped. As shown in Figure 1e, there is a period of approximately 500 seconds during which the pupil gradually dilates, indicating recovery. Only after this period does the mouse reach a relatively stable physiological state suitable for ULM imaging, which is why the plots in Figure 2 begin at T = 500 seconds.

      We recognize that this was not sufficiently explained in the main text and figure captions. In the revised manuscript, we have clarified this timing rationale in both the Results section and the figure captions. We added the following sentence to the Results section to introduce Fig.2d:

      (Line 139) “To further verify that the proposed MB bolus injection method can help to achieve ULM image saturation shortly after mice awaken from anesthesia, an analysis on the change in MB concentration over time was conducted once pupil size had stabilized (T = 500s).”

      We also added the following statement to note that this recovery time varies across individual mice:

      (Line 154, Fig.2 caption) “This figure presents a case study based on the same mouse shown in Fig 1. The x-axis for d-f begins at 500 seconds because, at this point, the mouse’s pupil size stabilized, indicating it had recovered to an awake state. Consequently, ULM images were accumulated starting from this time. It is important to note that not every mouse requires 500 seconds to fully awaken; the time to reach a stable awake state varies across individual mice.”

      Reviewer 2 (Public Review):

      • The only major comment (calling for further work) I would like to make is the relative weakness of the manuscript regarding longitudinal imaging (mostly Figure 6), compared to the exhaustive review of the effect of isoflurane on the vasculature (3 rats, 3 imaging planes, quantification on a large number of vessels, in 9 different brain regions). The 6 cortical vessels evaluated in Figure 6 feel really disappointing. As longitudinal imaging is supposed to be the salient element of this manuscript (first word appearing in the title), it should be as good and trustworthy as the first part of the paper. Figure 6c. is of major importance, and should be supported by a more extensive vessel analysis, including various brain areas, and validated on several animals to validate the robustness of longitudinal positioning with several instances of the surgical procedure. Figure 6d estimates the reliability of flow measurements on 3 vessels only. Therefore I recommend showing something similar to what is done in Figures 4 and 5: 3 animals, and more extensive quantification in different brain regions.

      We thank the reviewer for pointing out this issue. We acknowledge that the first version of the manuscript lacked in-depth quantitative analysis in the section on the longitudinal study, which should have been a focal point. It also did not provide a sufficient number of animals to demonstrate the reproducibility of the technique. In this revised version, we have included results from more animals and conducted a more comprehensive quantitative analysis, with the corresponding text updated accordingly. Specifically, we combined the previous Figures 4 and 5 into the current Figure 4 (corresponding revised text from Line 169 to Line 207). The revised Figures 5 and 6 compare the results of the longitudinal study, presenting data from three mice (corresponding revised text from Line 224 to Line 258). Detailed information about the mice used has been added to Supplementary Figure 1, and Supplementary Figure 4 further provides a detailed display of the results for the three mice in the longitudinal study. We hope that these adjustments will provide a more thorough validation of the longitudinal imaging.

      Reviewer 2 (Recommendations For The Authors):

      Minor comments:

      • The statistical analyses are not always explained: could they be stated briefly in the legends of each figure, or gathered in a statistical methods section with details for each figure? Be sure to use the appropriate test (e.g. student t-test is used in Fig 5 k whereas normality of distribution is not guaranteed.)

      Thank you for pointing this out. We acknowledge that the statistical analyses were not clearly explained in the original version. In the revised manuscript, we have ensured that the statistical methods are clearly described. 

      (Fig.4 caption) “b,c, Comparisons of vessel diameter (b) and flow velocity (c) for the selected arterial and venous segments. Statistical analysis was conducted using t-test at each measurement point along the segments.”

      (Fig.6 caption) “b,c, Comparisons of vessel diameter (b) and flow velocity (c) for the selected arterial and venous segments. Statistical analysis was conducted using the two one-sided test (TOST) procedure, which evaluates the null hypothesis that the difference between the two weeks is larger than three times the standard deviation of one week.”

      Additionally, we corrected an error in the previous comparison of the violin plots on flow velocities, where a t-test was incorrectly applied; this has now been removed.

      • The authors use early in the manuscript the term vascularity, e.g. in "vascularity reduction", it is not exactly clear what they mean by vascularity, and would require a proper definition at that moment. If I am correct, a quantification of that "vascularity reduction" (page 5 line 132), is then done in Figures 5 d e f and j.

      Thank you for highlighting this issue. We acknowledge that our initial use of the term “vascularity” may have been unclear and potentially confusing. In the revised manuscript, we have included a clear definition of “vascularity” in the Methods section under Quantitative Analysis of ULM Images (Line 534). 

      The following sentence shows the definition of vascularity:

      (Line 547) “Vascularity was defined as the proportion of the pixel count occupied by blood vessels within each ROI, obtained by binarizing the ULM vessel density maps and calculating the percentage of the pixels with MB signal.”

      We have also added a brief definition where the term is first used in the Results section:

      (Line 161) “When comparing vessel density maps, ULM images that are acquired in the awake state demonstrate a global reduction of vascularity, which refers to percentage of pixels that occupied by blood vessels.”

      • There is very little motion in the images presented, except for the awake "Bregma -4.2 mm" (Figure 3, directional maps), especially in the area including colliculi and mesencephalon, while the cortical vessels do not move. Can you comment on that?

      Thank you for highlighting this important aspect of motion in awake animal imaging. Motion correction is indeed a critical factor in such studies. In the original version of our discussion, we briefly addressed this issue (from Line 342 to Line 346), but we agree that a more detailed discussion is needed.

      To minimize motion artifacts, we conducted habituation to acclimate the animals to the head-fixation setup, which helps reduce anxiety during imaging. With thorough head-fixed habituation, the imaging quality is generally well-preserved. We also applied correlation-based motion correction techniques based on ULM images, which can partially correct for overall brain motion, as stated in the previous version. However, this ULM-images-based correction is limited to addressing only rigid motion.
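      As an illustration of what a correlation-based rigid correction does, the sketch below estimates a single integer-pixel translation between a reference image and a displaced image by exhaustive search over small shifts, then undoes it. The response does not describe the actual implementation, so this is only a toy under stated assumptions: the function names `estimate_rigid_shift` and `apply_shift`, the search range, and the unnormalized correlation score are all hypothetical choices.

```python
def estimate_rigid_shift(reference, moving, max_shift=3):
    """Find the integer (dy, dx) that maximizes the correlation between
    reference[r][c] and moving[r + dy][c + dx] over the overlap."""
    rows, cols = len(reference), len(reference[0])
    best_shift, best_score = (0, 0), float("-inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = 0.0
            for r in range(rows):
                for c in range(cols):
                    rr, cc = r + dy, c + dx
                    if 0 <= rr < rows and 0 <= cc < cols:
                        score += reference[r][c] * moving[rr][cc]
            if score > best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift


def apply_shift(moving, shift):
    """Resample `moving` so that it lines up with the reference.
    Pixels shifted in from outside the image are filled with zeros."""
    dy, dx = shift
    rows, cols = len(moving), len(moving[0])
    corrected = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dy, c + dx
            if 0 <= rr < rows and 0 <= cc < cols:
                corrected[r][c] = moving[rr][cc]
    return corrected


# Toy 5x5 reference with a small asymmetric pattern.
reference_image = [[0] * 5 for _ in range(5)]
reference_image[1][1], reference_image[1][2], reference_image[2][1] = 1, 2, 3

# Displace the pattern by one row down and two columns right.
moving_image = [[0] * 5 for _ in range(5)]
moving_image[2][3], moving_image[2][4], moving_image[3][3] = 1, 2, 3

estimated_shift = estimate_rigid_shift(reference_image, moving_image)
corrected_image = apply_shift(moving_image, estimated_shift)
```

      A single translation applied to the whole frame cannot compensate for displacement that differs between cortex and deeper structures, which is the limitation of rigid correction noted in this response.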

      In the revised discussion, we have expanded on the limitations of our current motion correction approach and referenced recent work about more advanced motion correction methods:

      (Line 346) “While rigid motion correction is often effective in anesthetized animals, awake animal imaging presents greater challenges due to the more prominent non-rigid motion, particularly in deeper brain regions. This is evidenced in Supplementary Fig. 1 (Mouse 7), where cortical vessels remain relatively stable, but regions around the colliculi and mesencephalon exhibit more noticeable motion artifacts, indicating that displacement is more pronounced in deeper areas. To address these deeper, non-rigid motions, recent studies suggest estimating nonrigid transformations from unfiltered tissue signals before applying corrections to ULM vascular images(16,50). Such advanced motion correction strategies may be more effective for awake ULM imaging, which experiences higher motion variability. The development of more robust and effective motion correction techniques will be crucial to reduce motion artifacts in future awake ULM applications.”

      • Figure 1f maybe flip the color bar to have an upward up and downward down.

      Thank you for your suggestion. This display method indeed makes the images more intuitive. In the revised manuscript, all directional flow color bars have been flipped to ensure that upward flow is displayed as ‘up’ and downward flow as ‘down.’

      • Figure 2b the figure is a bit confusing in what is displayed between dashed lines, solid lines, dots... maybe it would be easier to read with

      - bigger dots and dashed lines in color for each of the 4 series

      - and so in the legend, thin solid lines in the corresponding color for the fit, but no solid line in the legend (to distinguish data/fit)

      - no lines for FWHM as they are not very visible, and the FWHM values are not mentioned for these examples.

      Thank you for your detailed suggestions. We agree that the original Fig. 2b appeared messy and confusing. Based on this feedback and other comments, we decided to replace the FWHM-based vessel diameter measurement with a more stable binarization-based approach. In the revised version, we selected a specific segment of each vessel and measured the diameter by calculating the distance from the vessel’s centerline to both sides after binarization. Each point on the centerline of this segment provides a diameter measurement, which can be further used to calculate the mean and standard error. This updated method is more stable and reproducible, providing reliable measurements even for vessels that are not fully saturated. It also facilitates comparison across more vessels, helping to further demonstrate the generalizability of our saturation standard. We believe these adjustments make the revised Fig. 2b clearer and more readable.
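      The binarization-based measurement described above can be sketched as follows. This is an illustrative simplification rather than the authors' code: it assumes the vessel segment runs roughly vertically in the image so that the width can be read along each row, and the function names, the pixel size, and the toy mask are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def diameters_along_centerline(mask, centerline, pixel_size_um=1.0):
    """Return one diameter per centerline point of a binarized vessel.

    mask:       2D list of 0/1 values (1 = vessel after binarization).
    centerline: list of (row, col) points along the vessel segment.
    The width at each point is the run of vessel pixels extending from
    the centerline to both sides within the same row.
    """
    diameters = []
    for row, col in centerline:
        line = mask[row]
        if not line[col]:
            continue  # centerline point fell outside the binarized vessel
        left = col
        while left - 1 >= 0 and line[left - 1]:
            left -= 1
        right = col
        while right + 1 < len(line) and line[right + 1]:
            right += 1
        diameters.append((right - left + 1) * pixel_size_um)
    return diameters


def mean_and_sem(values):
    """Mean and standard error of the mean of the diameter samples."""
    return mean(values), stdev(values) / sqrt(len(values))


# Toy binarized vessel segment, 4 rows by 6 columns.
vessel_mask = [
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
]
centerline_points = [(0, 2), (1, 2), (2, 3), (3, 2)]

segment_diameters = diameters_along_centerline(
    vessel_mask, centerline_points, pixel_size_um=5.0  # assumed pixel size
)
diameter_mean, diameter_sem = mean_and_sem(segment_diameters)
```

      For vessels that are oblique in the image, the width would need to be measured perpendicular to the local centerline direction rather than along image rows.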

      • Page 7, lines 144-147. This passage is not really clear when linking going up or down and going from the stem to the branches that it is specific to Figure 4a (and therefore to this particular location).

      Thank you for your insightful comments on our vessel classification method. We recognize the limitations of the previous approach and, in order to enhance the rigor of the study, we have opted not to continue using this method in the revised manuscript. We have removed all content related to vessel classification based on branch-in and branch-out criteria. This includes the original Classification of Cerebral Vessels section in the Methods, the relevant descriptions in the Results section under “ULM reveals detailed cerebral vascular changes from anesthetized to awake for the full depth of the brain”, the limitation of this classification method in the Discussion section, as well as related content in the original Figures 4 and 5.

      In the revised analysis, for the comparison between arteries and veins, we focus solely on penetrating vessels in the cortex. For these vessels, it is generally accepted that downward-flowing vessels are arterioles, while upward-flowing vessels are venules. Accordingly, in the revised Figures 4 and 6, we analyze arterioles and venules exclusively in the cortex, without relying on the previous classification method that could be considered controversial.

      • Page 11 line 222 "higher vascular density" seems unprecise.

      Thank you for pointing this out. We have revised the sentence to more precisely convey our observations regarding changes in vascular diameter and vascularity within the ROI. We present these findings as evidence of the vasodilation effect under isoflurane, in alignment with existing research. The revised statement is as follows:

      (Line 275) “Statistical analysis from Fig. 4 shows that certain vessels exhibit a larger diameter under isoflurane anesthesia, and the vascularity, calculated as the percentage of vascular area within selected brain region ROIs, is also higher in the anesthetized state. These findings suggest a vasodilation effect induced by isoflurane, consistent with existing research(20,40,41,43,44).”

      • Discussion: page 12, lines 257-267: it is not exactly clear how 3D imaging will help for the differentiation of veins/arteries. However, some methods have already been proposed to discriminate between arteries and veins using pulsatility (Bourquin et al., 2022) or 3D positioning when vessels are overlapped (Renaudin et al., 2023). The latter can also help estimate the out-of-plane positioning during longitudinal imaging.

      Bourquin, C., Poree, J., Lesage, F., Provost, J., 2022. In Vivo Pulsatility Measurement of Cerebral Microcirculation in Rodents Using Dynamic Ultrasound Localization Microscopy. IEEE Trans. Med. Imaging 41, 782-792. https://doi.org/10.1109/TMI.2021.3123912

      Renaudin, N., Pezet, S., Ialy-Radio, N., Demene, C., Tanter, M., 2023. Backscattering amplitude in ultrasound localization microscopy. Sci. Rep. 13, 11477. https://doi.org/10.1038/s41598-023-38531-w

      Thank you for pointing this out. We have revised the relevant paragraph in the discussion to clarify the potential advantages of advances in ULM imaging methods, such as those based on pulsatility (as described by Bourquin et al., 2022) or backscattering amplitude (as demonstrated by Renaudin et al., 2023). These established methods could be helpful for longitudinal imaging. Below is the revised text in the discussion section:

      (Line 370) “Advances in ULM imaging methods can benefit longitudinal awake imaging. For instance, dynamic ULM can differentiate between arteries and veins by leveraging pulsatility features(51). 3D ULM, with volumetric imaging array(52,53), enables the reconstruction of whole-brain vascular network, providing a more comprehensive understanding of vessel branching patterns. Meanwhile, 3D ULM also helps to mitigate the challenge of aligning the identical coronal plane for longitudinal imaging, a process that requires precise manual alignment in 2D ULM to ensure consistency. Additionally, this alignment issue can also be alleviated in 2D imaging using backscattering amplitude method, which may assist in estimating out-of-plane positioning during longitudinal imaging(54).”

      Reviewer 3 (Public Review):

      • It is unclear whether multiple animals were used in the statistical analysis.

      Thank you for bringing this to our attention. We acknowledge that the original version did not clearly indicate the use of animals in the statistical analysis. In the revised manuscript, we have added Supplementary Figure 1 to specify the mice used, and we have labeled each mouse accordingly in the figures or captions. In the revised Figures 4 and 6, we have ensured that each quantitative analysis figure or its caption clearly indicates the specific mice.

      • Generalizations are sometimes drawn from what seems to be the analysis of a single vessel.

      Thank you for pointing this out. To enhance the generalizability of our conclusions, we have expanded our analysis beyond single vessels in several parts of the study. For instance, in Figure 2, we analyzed three vessels at different depths within the same brain region of a single mouse, and we have included additional results in Supplementary Figure 2 to further support these findings. Additionally, we have revised the language in the manuscript to ensure that conclusions are appropriately qualified and avoid overgeneralization.

      In Figures 4 and 6, we extended the analysis from single vessels to larger region-of-interest (ROI) analyses across entire brain regions. Unlike single-vessel measurements, which are susceptible to bias based on specific measurement locations, ROI-based analyses are less influenced by the operator and provide more objective, generalizable insights.

      • The description of the statistical analysis is mostly qualitative.

      We recognize that some aspects of the original statistical analysis (Figures 4 and 5 in the previous version) lacked rigor and that the description was largely qualitative. The revised statistical analysis (Figures 4 and 6) presents our findings at multiple scales, from individual vessels, to cortical ROIs of arteries and veins, to broader brain regions. For instance, as illustrated in the revised Figure 4f, the average cortical arterial flow speed decreases by approximately 20% from anesthesia to wakefulness, while venous flow speed decreases by an average of 40%, with the reduction in venous flow speed being significantly greater than that of arterial flow. We believe that this kind of description makes the analysis more quantitative.

      For more examples, please refer to the Results section where Figure 4 (Line 169 to Line 207) and Figure 6 (Line 224 to Line 258) are described. These sections have been extensively rewritten to emphasize quantitative interpretation of the data. Each part of the analysis now focuses more heavily on quantitative analyses that consistently show similar trends across all animals.

      • Some terms used are insufficiently defined.

      • Additional limitations should be included in the discussion.

      • Some technical details are lacking. 

      Thank you for highlighting these issues. We have made several improvements in the revised manuscript to address them. We have clarified terms such as “vascularity” (Line 547) and “saturation point” (Line 112) to ensure precision and prevent ambiguity. We have expanded the discussion (Line 310 to Line 377) to include limitations such as motion correction challenges and advances in ULM imaging methods, including dynamic ULM and backscattering amplitude techniques. We have added further details on interleaved sampling (Line 494 to Line 497), ULM tracking (Line 517 to Line 529), and quantitative analysis (Line 535 to Line 551) in the Methods section to provide a clearer understanding of our approach.

      Please refer to our other responses for more specific adjustments.

      • Without information about whether the results obtained come from multiple animals, it is difficult to conclude that the authors generally achieved their aim. They do achieve it in a single animal. The results that are shown are interesting and could have an impact on the ULM community and beyond. In particular, the experimental setup they used along with the high reproducibility they report could become very important for the use of ULM in larger animal cohorts.

      We thank the reviewer for recognizing the impact of our work. We also acknowledge that the original version did not provide sufficient proof of reproducibility. In the revised version, we have included results from additional animal experiments to show that the conclusions are not drawn from a single animal and that the stated aim is achieved across animals. (See Supplementary Figure 1 for details on how the animals were used.)

      Reviewer 3 (Recommendations For The Authors):

      • The manuscript would be more convincing by removing some of the superlatives used in the text. For instance, shouldn't "super-resolution ultrasound localization microscopy" simply be "ultrasound localization microscopy"? Expressions such as "first study", "essential", and "invaluable", etc could be replaced by more factual terms. The word "significant" is also used sometimes with statistics to back it up and sometimes without.

      Thank you for highlighting this issue. We have removed the superlatives throughout the manuscript to make the language more precise. For instance, we have simplified “super-resolution ultrasound localization microscopy” to “ultrasound localization microscopy” throughout the main text and removed expressions such as “first study” and “invaluable”. We also reviewed all uses of “essential” and “significant,” replacing “essential” with more modest alternatives where it does not indicate a strict requirement. Similarly, where “significant” does not refer to statistical significance, we have used other terms to avoid any ambiguity.

      • The section "Microbubble count serves as a quantitative metric for awake ULM image reconstruction" had several issues that I think should be addressed. Mainly, the authors make the case that after detecting 5 million microbubbles, there is no clear gain in detecting more. The argument is not very convincing as we know many vessels will not have had a microbubble circulate in them within that timeframe, which will be especially true in smaller vessels. While the analysis in Figure 2 shows nicely that the diameter estimate for vessels in the 20-30 um range is stable at 5 million microbubbles, it is not necessarily the case for smaller vessels. A better approach here might be to select, e.g., a total of 5 million detected microbubbles for practical reasons and then to determine which vessel parameters estimation (e.g., diameter, flow velocity) remain stable. In addition:

      a. Terms such as 'complete ULM reconstruction', 'no obvious change', 'ULM image saturation' are not well defined within the manuscript.

      Thank you for pointing out these issues and for offering a more rigorous approach. We completely agree with your suggestion. While our analysis demonstrated stable diameter estimates for vessels with diameter around 20 µm at 5 million microbubbles, this does not necessarily ensure stability for smaller vessels. Therefore, the choice of 5 million microbubbles was primarily for practical reasons. In the revised version, we have provided a more objective description and clarification of this limitation. We also recognize that terms such as “complete ULM reconstruction,” “no obvious change,” and “ULM image saturation” were not well defined and may have caused confusion, reducing the rigor of this manuscript. Based on your feedback, we have clearly defined “ULM image saturation” within the context of our study, removed absolute and ambiguous terms like “complete ULM reconstruction” and “no obvious change”. We revised the entire section accordingly:

      (Line 109) “To facilitate equitable comparison of brain perfusion at different states, a practical saturation point enabling stable quantification of most vessels needs to be established. Our observations indicated that when the cumulative MB count reached 5 million, ULM images achieved a relatively stable state. Accordingly, in this study, the saturation point was defined as a cumulative MB count of 5 million. There are also possible alternatives for ULM image normalization. For example, different ULM images can be normalized to have the same saturation rate. However, the proposed method of using the same number of cumulative MB count for normalization enables the analysis of blood flow distribution across different brain regions from a probabilistic perspective. The following analysis substantiates this criterion.

      Fig. 2a compares ULM directional vessel density maps and flow speed maps generated with 1, 3, 5, and 6 million MBs, using the same animal as shown in Fig. 1. To quantitatively confirm saturation, multiple vessel segments were selected for further analysis. Fig. 2b presents the measured vessel diameter for a specific segment at various MB counts. After binarizing the ULM map, the vessel diameter was measured by calculating the distance from the vessel centerline to the edge. Each point along the centerline of the segment provided a diameter measurement, enabling calculation of the mean and standard error. At low MB counts, vessels appeared incompletely filled, leading to inaccurate estimation of vessel diameter due to incomplete profiles. For example, at 1–2 million MBs, the binarized ULM map displayed a width of only one or two pixels along the segment. As a result, the measurements always yielded the same diameter values (two pixels, ~10um) with a consistently low standard error of the mean across the entire segment. With increased MB counts, the measured vessel diameter gradually rose, ultimately reaching saturation. The plots in Fig. 2b show that vessel diameter stabilized at 5 million MB count. Additionally, Fig. 2c illustrates the changes in flow velocity measured at different cumulative MB counts. The violin plots display the distribution of flow speed estimates for all valid centerline pixels within the selected segment. At low MB counts (1–3 million), flow velocity estimates fluctuated, but they stabilized as the MB count increased (4–6 million MBs). At 5 million MBs, flow velocity estimates were nearly identical to those at 6 million MBs, corroborating previous findings that vessel velocity measurements stabilize as MB count grows(39). To assess the generalizability of the 5 million MB saturation condition, vessel segments from three different mice across various brain regions were examined. The results, shown in Supplementary Fig. 2, confirm that this saturation criterion applies broadly. Although the 5 million MB threshold may not ensure absolute saturation for all vessels, it is generally effective for vessels larger than 15 μm. This MB count threshold was therefore adopted as a practical criterion.”
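
      As a rough illustration of the per-segment statistics described in the quoted passage, the mean diameter and its standard error could be computed as sketched below. This is a minimal sketch and not the authors' code: the function name, the 5 µm pixel size (inferred from "two pixels, ~10um"), and the sample values are assumptions.

```python
import math
import statistics

def diameter_mean_sem_um(diameters_px, pixel_size_um=5.0):
    """Return (mean, standard error of the mean) of vessel diameter in microns.

    diameters_px: one diameter estimate (in pixels) per centerline point.
    pixel_size_um: assumed pixel size; 5 um is inferred from the quoted text.
    """
    if len(diameters_px) < 2:
        raise ValueError("need at least two centerline measurements")
    diameters_um = [d * pixel_size_um for d in diameters_px]
    mean = statistics.mean(diameters_um)
    sem = statistics.stdev(diameters_um) / math.sqrt(len(diameters_um))
    return mean, sem

# Hypothetical sparsely filled segment: every centerline point is two pixels wide.
sparse_mean, sparse_sem = diameter_mean_sem_um([2, 2, 2, 2, 2])
```

      With identical two-pixel widths the standard error is zero, which is consistent with the passage's remark that a low standard error at low MB counts reflects an under-filled vessel rather than a reliable measurement.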

      b. The choice of 10 consecutive tracking frames is arbitrary and should be described as such unless a quantitative optimization study was conducted. Was there a gap-filling parameter? What was the maximum linking distance and what is its impact on velocity estimation?

      Thank you for your comment. We acknowledge that the choice of 10 consecutive tracking frames was based on our common practice rather than a specific quantitative optimization. Additionally, with the uTrack algorithm, we set both the gap-filling parameter and maximum linking distance to 10 pixels. Setting these parameters too high could potentially overestimate velocity. These details have now been added to the Methods section for clarity:

      (Line 517) “The choice of 10 consecutive frames (10 ms) was based on established practice but can be adjusted as needed. For the uTrack algorithm, two additional key parameters were specified: the maximum linking distance and the gap-filling distance, both set to 10 pixels (~50 microns). This configuration means that only bubble centroids within 10 pixels of each other across consecutive frames are considered part of the same bubble trajectory. Additionally, when the start and end points of two tracks fall within this threshold, the gap-filling parameter merges them into a single, continuous track. It is important to select these parameters carefully, as overly large values could lead to an overestimation of flow velocity. By setting the maximum linking distance to 10 pixels, we effectively limited the measurable velocity to 50 mm/s, under the assumption that no bubble would exceed a 50-micron displacement within the 1 ms interval between frames. After determining bubble tracks with the specified parameters for uTrack algorithm, accumulating the MB tracks resulted in the flow intensity map. Considering the velocity distribution across the mouse brain, this 50 mm/s limit ensures that the vast majority of blood flow is captured accurately.”
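
      The 50 mm/s ceiling quoted above follows directly from the linking distance and the frame interval. The arithmetic can be sketched as follows; the function name and the 5 µm pixel size (inferred from "10 pixels (~50 microns)") are assumptions for illustration only.

```python
def max_trackable_speed_mm_per_s(max_link_px, pixel_size_um, frame_interval_ms):
    """Largest speed a tracker can follow if one frame-to-frame step is capped.

    A bubble may move at most max_link_px pixels between consecutive frames.
    Microns per millisecond is numerically equal to millimetres per second.
    """
    max_displacement_um = max_link_px * pixel_size_um
    return max_displacement_um / frame_interval_ms

# Values from the quoted Methods text: 10 pixels, ~5 um per pixel, 1 ms per frame.
speed_limit = max_trackable_speed_mm_per_s(10, 5.0, 1.0)
```

      Halving the linking distance would halve this ceiling, which is the trade-off between rejecting false links and capturing fast flow that the response alludes to.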

      c. 'The plots (Figure 2b) clearly indicate that the vessel diameter stabilized beyond 5 million MB count.' This is true for one vessel. To generalize that claim, the analysis should be performed quantitatively on a larger sample of vessels in various areas of the brain, across multiple animals.

      Thank you for pointing out this limitation. We agree that conclusions drawn from a single vessel cannot be generalized across all regions. Following your suggestion, we have added Supplementary Figure 2, where we analyzed multiple vessels from different brain regions across three mice. This expanded analysis further confirms that a 5 million MB count is sufficient to stabilize vessel diameter measurements across various samples.

      (Line 133) “To assess the generalizability of the 5 million MB saturation condition, vessel segments from three different mice across various brain regions were examined. The results, shown in Supplementary Fig. 2, confirm that this saturation criterion applies broadly. Although the 5 million MB threshold may not ensure absolute saturation for all vessels, it is generally effective for vessels larger than 15 μm. This MB count threshold was therefore adopted as a practical criterion.” 

      • "Statistical analysis validates the increase in blood flow induced by anesthesia" is a very interesting section but even though a quantitative analysis was conducted in Figure 5, the language used remains mostly qualitative. I think this section should include quantitative conclusions from the statistical analysis to increase the impact of this work.

      Thank you for your valuable feedback. We recognize that some aspects of the original quantitative analysis (Figures 4 and 5 in the previous version) lacked rigor, such as the classification of arteries, veins, and capillaries, and that the data presented in each row of Figure 5 represented only one mouse per coronal section, limiting the generalizability of statistical conclusions.

      In response to the reviewers’ feedback, the revised version incorporates a new approach by merging the previous Figure 4 and Figure 5 into a single, consolidated figure (now Figure 4). This updated figure aims to present our findings from multiple dimensions, ranging from individual vessels to individual cortical ROI of arteries and veins, and ultimately to broader brain regions. We have focused on quantitative analyses that consistently show similar trends across all animals. For instance, as illustrated in the revised Figure 4f, the average cortical arterial flow speed decreases by approximately 20% from anesthesia to wakefulness, while venous flow speed decreases by an average of 40%, with the reduction in venous flow speed being significantly greater than that of arterial flow. We believe that this approach offers more insightful analysis and enhances the overall impact of the study.

      For more examples, please refer to the revised Results section where Figure 4 is described (from Line 169 to Line 212). This section has been extensively rewritten to emphasize quantitative interpretation of the data. Each part of the analysis now focuses more heavily on quantitative analyses that consistently show similar trends across all animals.

      • In the methods, it is claimed that 6 healthy female C57 mice were used in the study, but it is hard to tell whether more than one animal is shown in the figures. It is also unclear whether the statistics were performed within or across animals. Since one of the major strengths of the manuscript is that it shows the feasibility of performing reproducible measurements using ULM, most figures should be repeated for each individual animal and provided in supplementary data and statistics should be performed across animals.

      Thank you for bringing this to our attention. We acknowledge that the original version did not clearly indicate the use of individual animals. In the revised manuscript, we have added Supplementary Figure 1 to specify the mice used, and we have labeled each mouse accordingly in the figures or captions. Additionally, we included statistics across animals in the revised Figures 4 and 6, and detailed data for each individual mouse are now provided in Supplementary Figures 3 and 4.

      • The effect of aliasing should be discussed given that 1) a high-frequency probe is used along with a correspondingly relatively low frame rate (1000 fps) and 2) Doppler filtering is used to separate upward from downward-moving microbubbles. There will be microbubbles that circulate faster than the Nyquist limit, which will thus appear as moving in the opposite direction in the Doppler spectrum. It would be important to double-check that the effect is not too important and to report this as a limitation in the discussion.

      Thank you for highlighting this important point. Aliasing is indeed a relevant issue to consider, especially for higher flow velocities in large vessels. We have added a discussion on this limitation in the revised manuscript:

      (Line 359) “Based on the maximum linking distance and gap closing parameters outlined in the Methods section, blood flow with velocities below 50 mm/s can be detected. However, the use of a directional filter to estimate flow direction may introduce aliasing. MBs moving at higher velocities may be subject to incorrect flow direction estimation due to aliasing effects. Given that the compounded frame rate is 1000 Hz, with an ultrasound center frequency of 20 MHz and a sound speed of 1540 m/s, the relationship between Doppler frequency and the axial blood flow velocity(12) indicates that aliasing will not occur for axial flow velocities below 19.25 mm/s. In all flow velocity maps presented in this study, the range is limited to a maximum of 15 mm/s, remaining below the critical threshold for aliasing. Additionally, all vessels analyzed in the violin plots for arteriovenous flow comparisons fall within this range. While cortical arterioles and venules generally exhibit moderate flow speeds, aliasing remains a factor to consider when combining directional filtering with velocity analysis.”
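
      The 19.25 mm/s figure can be reproduced from the standard Doppler relation f_d = 2 v f0 / c together with the Nyquist condition |f_d| < PRF / 2, which gives v < c · PRF / (4 f0). The sketch below only checks that arithmetic; the function name is an assumption, and the inputs are the values stated in the quoted passage.

```python
def nyquist_axial_velocity_mm_per_s(sound_speed_m_s, frame_rate_hz, center_freq_hz):
    """Axial velocity above which the Doppler shift aliases.

    Doppler shift: f_d = 2 * v * f0 / c.
    No aliasing while |f_d| < frame_rate / 2, so v < c * frame_rate / (4 * f0).
    """
    v_m_per_s = sound_speed_m_s * frame_rate_hz / (4.0 * center_freq_hz)
    return v_m_per_s * 1000.0  # convert m/s to mm/s

# Values from the quoted Discussion text: 1540 m/s, 1000 Hz, 20 MHz.
v_alias = nyquist_axial_velocity_mm_per_s(1540.0, 1000.0, 20e6)
```

      Note that this limit applies to the axial velocity component only, so a vessel running obliquely to the beam can carry faster flow before the directional filter misassigns it.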

      • The method used to classify vessels may be incorrect and may not be needed. I would recommend the authors not use it and describe the vessels as vessels that branch in or out, etc. Applying an arbitrary threshold of 2 to detect capillaries is also not very convincing. I understand that the authors might decide to maintain this nomenclature, in which case I would recommend clearly explaining it at the beginning of the manuscript along with some of the caveats that are already reported in the discussion.

      Thank you for your comments on our vessel classification method. We recognize the limitations of the previous approach and, in order to enhance the rigor of the study, we have opted not to continue using this method in the revised manuscript.

      In the revised analysis of arteries and veins, we focus solely on penetrating vessels in the cortex. For these vessels, it is generally accepted that downward-flowing vessels are arterioles, while upward-flowing vessels are venules. Accordingly, in the revised Figures 4 and 6, we analyze arterioles and venules exclusively in the cortex, without relying on the previous classification method that could be considered controversial.

      Additionally, we agree that classifying vessels with values below 2 as capillaries was not a robust approach. Thus, we have removed all related analyses from the revised manuscript.

      Minor comments:

      • Line 16: "resolves capillary-scale ..."; it is not clear that the resolution that is achieved in this work is at the capillary scale.

      Thank you for your valuable feedback. We understand that “capillary-scale” may overstate the achieved resolution in our work. To clarify, we have revised the sentence as follows:

      (Line 18) “Ultrasound localization microscopy (ULM) is an emerging imaging modality that resolves microvasculature in deep tissues with high spatial resolution.” 

      This adjustment more accurately reflects the resolution capabilities of ULM as used in our study.

      • Line 22: 'vascularity' is not well defined in the manuscript. Consider defining or using another term.

      Thank you for pointing out the need for clarification on vascularity. We acknowledge that our initial use of the term “vascularity” may have been unclear and potentially confusing. In the revised manuscript, we have included a clear definition of “vascularity” in the Methods section under Quantitative Analysis of ULM Images (Line 534). 

      The following sentence shows the definition of vascularity:

      (Line 547) “Vascularity was defined as the proportion of the pixel count occupied by blood vessels within each ROI, obtained by binarizing the ULM vessel density maps and calculating the percentage of the pixels with MB signal.”
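
      The definition above amounts to a pixel-fraction calculation. A minimal sketch, assuming the density map and ROI mask are given as equally sized nested lists and that any count above a threshold of zero counts as MB signal (the function name, data layout, and threshold are all assumptions, not the authors' implementation):

```python
def vascularity_percent(density_map, roi_mask, threshold=0):
    """Percentage of ROI pixels that contain microbubble signal.

    density_map: 2D nested list of MB counts per pixel.
    roi_mask: 2D nested list of truthy/falsy values marking the ROI.
    threshold: counts strictly above this value are binarized to "vessel".
    """
    pixels_in_roi = 0
    pixels_with_signal = 0
    for density_row, mask_row in zip(density_map, roi_mask):
        for count, inside in zip(density_row, mask_row):
            if inside:
                pixels_in_roi += 1
                if count > threshold:
                    pixels_with_signal += 1
    if pixels_in_roi == 0:
        raise ValueError("ROI mask selects no pixels")
    return 100.0 * pixels_with_signal / pixels_in_roi

# Hypothetical 2x3 map; the ROI covers the left two columns.
example_density = [[0, 2, 0], [1, 0, 0]]
example_roi = [[1, 1, 0], [1, 1, 0]]
example_vascularity = vascularity_percent(example_density, example_roi)
```

      Because the measure depends on how many pixels have received at least one bubble, it is sensitive to the cumulative MB count, which is why the comparison between states relies on a fixed count.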

      We have also added a brief definition where the term is first used in the Results section:

      (Line 161) “When comparing vessel density maps, ULM images that are acquired in the awake state demonstrate a global reduction of vascularity, which refers to percentage of pixels that occupied by blood vessels.”

      • Line 30: I'm not convinced the first two sentences are useful.

      Thank you for pointing out this issue. The opening sentence of the article lacked focus and was too broad. We have rewritten the sentence as follows:

      (Line 34) “Sensitive imaging of correlates of activity in the awake brain is fundamental for advancing our understanding of neural function and neurological diseases.”

      • Line 37: 'micron-scale capillaries': this expression is unclear. Capillaries are typically micron-scaled, so it gives the impression that ULM can image at the one-micron scale, which is not the case.

      Thank you for your helpful comment. We agree that “micron-scale capillaries” could be misleading, as it might imply a resolution at the single-micron level. To clarify, we have revised the sentence as follows:

      (Line 40) “ULM is uniquely capable of imaging microvasculature situated in deep tissue (e.g., at a depth of several centimeters).”

      This revised wording more accurately describes ULM’s capability without implying single-micron level resolution.

      • Line 74: I don't think motion-free imaging is possible in the context of awake animals. Consider 'limiting motion' instead.

      Thank you for pointing out the potential issue with the term “motion-free”. We agree that achieving entirely motion-free imaging is challenging, especially in the context of awake animals. In response to your suggestion, we have revised the sentence to better reflect this limitation:

      (Line 76) “To achieve consistent ULM brain imaging while allowing limited movement in awake animals, a headfixed imaging platform with a chronic cranial window was used in this study.”

      This revised wording more accurately conveys our approach to minimizing motion without implying that motion is completely eliminated.

      • Line 134:'clearly reveals decreased vessel diameter' How was that demonstrated?

      • Line 153: 'significant' according to which statistical test?

      • Line 167: 'slight increase', by how much, is it significant?

      • Line 183: 'smaller vessels' the center of the distribution is not at 10mm/s, and velocity is not necessarily correlated with diameter.

      • Line 184: 'more large vessels', see above. What is a large vessel, and how was this measured?

      • Line 205: 'significantly lower', according to which statistical test?

      We acknowledge that the original version did not use statistical terminology properly. In the revised manuscript, we have deleted the related statements and rewritten the statistical analysis to ensure the terms are used correctly. Please refer to the revised section “ULM reveals an increase in blood flow induced by isoflurane anesthesia” (Line 169 to Line 209). In the revised Figures 4 and 6, we have also ensured that each quantitative analysis is clearly explained in the figure or its caption.

      • Line 398: the interleaved sampling scheme should be described in more detail.

      Thank you for pointing out this issue. The previous version did not clearly explain the details of interleaved sampling. We have now added the following paragraph to the Ultrasound imaging sequence section in Methods:

      (Line 494) “Interleaved sampling is employed to capture high-frequency echoes more effectively. With the system’s sampling rate limited to 62.5 MHz, the upper limit of the center frequency of the transducer passband is 15.625 MHz. To mitigate aliasing, two transmissions are sent per angle, staggered in time. This approach effectively doubles the sampling rate, ensuring more accurate image reconstruction.”
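
      The numbers in this passage are consistent with a rule of four samples per cycle of the center frequency (62.5 / 4 = 15.625). That ratio is inferred here from the quoted figures and is an assumption, as are the function name and the idea of expressing interleaving as a multiplier; the sketch only illustrates the arithmetic.

```python
def max_center_frequency_mhz(sampling_rate_mhz, interleave_factor=1, samples_per_cycle=4):
    """Highest center frequency supported for a given effective sampling rate.

    interleave_factor: number of time-staggered transmissions per angle;
    two staggered acquisitions give twice the effective sampling rate.
    samples_per_cycle: assumed 4, inferred from 62.5 MHz -> 15.625 MHz.
    """
    effective_rate_mhz = sampling_rate_mhz * interleave_factor
    return effective_rate_mhz / samples_per_cycle

single_limit = max_center_frequency_mhz(62.5)
interleaved_limit = max_center_frequency_mhz(62.5, interleave_factor=2)
```

      Under these assumptions, two interleaved transmissions raise the limit above the 20 MHz center frequency mentioned elsewhere in the response.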

      • Figure 1: Which mouse is it? Are these results consistent across all animals?

      • Figure 2: Which mouse is it? Are these results consistent across all animals?

      • Figure 3: Which mouse is it? Are these results consistent across all animals?

      • Figure 4: Which mouse is it? Are these results consistent across all animals?

      • Figure 5: Is it a single mouse or multiple mice? Are these results consistent across all animals?

      We acknowledge that the original version did not clearly indicate the number of animals used in the statistical analysis. In the revised manuscript, we have added Supplementary Figure 1 to specify the mice used, and we have labeled each mouse accordingly in the figures or captions. In the revised Figures 4 and 6, we have ensured that each quantitative analysis figure or its caption clearly indicates the specific mice.

      The original Figures 1 and 2 are presented as case studies to illustrate the methodology. Because the anesthesia time required for tail vein injection varies slightly between animals, the time each mouse takes to recover from anesthesia cannot be kept consistent across all mice. For instance, in Figure 1, the mouse took nearly 500 seconds to recover from anesthesia, but this duration is not consistent across all animals, which is a limitation of the bolus injection technique. We have noted this point in the discussion (discussion on the limitation of bolus injection), and we have also clarified in the results section and figure captions that these figures represent a case study of a single mouse rather than a standardized recovery time for all animals.

      We further clarified this point at the end of the Figure 2 caption:

      (Fig.2 caption) “This figure presents a case study based on the same mouse shown in Fig 1. The x-axis for d-f begins at 500 seconds because, at this point, the mouse’s pupil size stabilized, indicating it had recovered to an awake state. Consequently, ULM images were accumulated starting from this time. It is important to note that not every mouse requires 500 seconds to fully awaken; the time to reach a stable awake state varies across individual mice.”

      We added the following statement before introducing Figure 1e:

      (Line 93) “Due to differences in tail vein injection timing and anesthesia depth, the time required for each mouse to fully awaken varied. Although it was not feasible to get pupil size stabilized just after 500 seconds for each animal, ULM reconstruction only used the data that acquired after the animal reached full pupillary dilation, to ensure that ULM accurately captures the cerebrovascular characteristics in the awake state.”

      We added the following statement before introducing Figure 2d:

      (Line 139) “To further verify that the proposed MB bolus injection method can help to achieve ULM image saturation shortly after mice awaken from anesthesia, an analysis on the change in MB concentration over time was conducted once pupil size had stabilized (T = 500s).”

      For Figures 3, 4, and 5 (in the revised version, Figures 4 and 5 have been combined into a single Figure 4), the data represents results from three individual mice, with each coronal plane corresponding to a different mouse. In the revised version, we have added labels to indicate the specific mouse in each image to improve clarity. We also recognize that some analyses in the original submission (original Figure 5) may have lacked sufficient statistical power due to the small sample size. Therefore, in the revised version, we have focused only on findings that were consistently observed across the three mice to ensure robust conclusions.

      Minor corrections and typos from all reviewers:

      We would like to sincerely thank the reviewers for their careful reading of our manuscript. We appreciate the time and effort taken to point out the minor typographical errors. We have carefully addressed and corrected all the identified typos, as listed below:

      From Reviewer #1:

      • Line 316: "insensate": correct, please.

      (Line 409) “After confirming that the mouse was anesthetized, the head of the animal was fixed in the stereotaxic frame.”

      From Reviewer #3:

      • Line 15: Super-resolution ultrasound localization microscopy -- consider removing super-resolution as it gives the impression that it is different from standard ULM.

      (Line 18) “Ultrasound localization microscopy (ULM) is an emerging imaging modality that resolves microvasculature in deep tissues with high spatial resolution.”

      • Line 39: typo: activities should be activity.

      (Line 41) “ULM can also be combined with the principles of functional ultrasound (fUS) to image whole-brain neural activity at a microscopic scale.”

      • Line 47: typo: over under.

      (Line 50) “Therefore, in neuroscience research, brain imaging in the awake state is often preferred over imaging under anesthesia.”

      Once again, we are grateful for the reviewers’ thorough review and valuable input, which have helped us improve the clarity and precision of the manuscript.

    1. eLife Assessment

      This valuable paper explores the idea that transient modulations of neural gain promote switches between distinct perceptual interpretations of ambiguous stimuli. The authors provide solid evidence for this idea by pupillometry (an indirect proxy of neuromodulatory activity), fMRI, neural network modeling, and dynamical systems analyses. The highly integrative nature of this approach is rare in the field.

    2. Reviewer #1 (Public review):

      Summary:

      This paper proposes a neural mechanism underlying the perception of ambiguous images: neuromodulation changes the gain of neural circuits promoting a switch between two possible percepts. Converging evidence for this is provided by indirect measurements of neuromodulatory activity and large-scale brain dynamics which are linked by a neural network model. However, both the data analysis and the computational modeling are incomplete and would benefit from a more rigorous approach.

      This is a revised version of the manuscript which, in my view, is a considerable step forward compared to the original submission.

      In particular, the authors now model phasic gain changes in the RNN, based on the network's uncertainty. This is original and much closer to what is suggested by the phasic pupil responses. They also show that switching is actually a network effect because switching times depend on network configuration (Fig 2). This resolves my main comments 1 and 2 about the model.

      The mechanism, as I understand it, is different from what the authors described before in the RNN with tonic gain changes. As uncertainty increases, the network enters a regime in which the two excitatory populations start to oscillate. My intuition is that this oscillation arises from the feedback loop created by the new gain control mechanism. If my intuition is correct, I think it would be worth explaining this mechanism more explicitly in the paper.

      Overall, the modeling part of the paper has changed quite a lot and I think it is now more solid which is why I have updated my "strength of evidence" rating.

    3. Reviewer #2 (Public review):

      This paper tests the hypothesis that perceptual switches during the presentation of ambiguous stimuli are accompanied by changes in neuromodulation that alter neural gain and trigger abrupt changes in brain activity. To test this hypothesis, the study combines pupillometry, artificial recurrent network (RNN) analysis and fMRI recording. In particular, the study uses methods of energy landscape analysis inspired by physics, which is particularly interesting.

      Strengths

      - The authors should be commended for combining different methods (pupillometry, RNNs, fMRI) to test their hypothesis. This combination provides a mechanistic insight into perceptual switches in the brain and artificial neural networks.
      - The study combines different viewpoints and fields of scientific literature, including neuroscience, psychology, physics, and dynamical systems. In order to make this combination more accessible to the reader, the different aspects are presented in a pedagogical way to be accessible to all fields.
      - This combination of methods and viewpoints is rarely done, so it is very useful.
      - The authors introduce dynamic gain modulation in their recurrent neural network, which is novel. They devote a section of the paper to studying the dynamics, fixed points and convergence of this type of network.

      Weaknesses

      - The study may not be specific to perceptual switches. This is because the study relies on a paradigm in which participants report when they identify a switch in the item category. Therefore, it is unclear whether the effects reported in the paper are related to the perceptual switch itself, to attention, or to the detection of behaviourally relevant events. The authors are cautious and explicitly acknowledge this point in their study.
      - The demonstration of the causal role of gain modulation in perceptual switches is partial. This causality is clearly demonstrated in the simulation work with the RNN. However, it is not fully demonstrated in the pupil analysis and the fMRI analysis. One reason is that this work is correlative (which is already very informative). An analysis of the timing of the effect might have overcome this limitation. For example, in a previous study, the same group showed that fMRI activity in the LC region precedes changes in the energy landscape of fMRI dynamics, which is a step towards investigating causal links between gain modulation, changes in the energy landscape and perceptual switches.
      - Some effects may reflect the expectation of a perceptual switch rather than the perceptual switch itself. To mitigate this risk, the design of the fMRI task included catch trials, in which no switch occurs, to reduce the expectation of a switch. The pupil study, however, did not include such catch trials.
      - The paper uses RNN-based modelling to provide mechanistic insight into the role of gain modulation in perceptual switches. However, the RNN solves a task that differs markedly from that performed by human participants, which may limit the explanatory value of the model. The RNN is provided with two inputs characterising the sensory evidence supporting the first and last image category in the sequence (e.g. plane and shark). In contrast, observers in the task were naïve as to the identity of the last image at the beginning of the sequence. The brain first receives sensory evidence about the image category (e.g. plane) with which the sequence begins, which is very easy to recognise, then it sees a sequence of morphed images and has to discover what the final image category will be. To discover the final image category, the brain has to search a vast space of possible second images (is it a shark?, a frog?, a bird?, etc.), rather than comparing the likelihood of just two categories. This search process and the perceptual switch in the task appear to be mechanistically different from the competition between two inputs in the RNN.
      - Another aspect of the motivation for the RNN model remains unclear. The authors introduce dynamic gain modulation in the RNN, but it is not clear what the added value of dynamic gain modulation is. Both static (Fig. S1) and dynamic (Fig. 2F) gain modulation lead to the predicted effect: faster switching when the gain is larger.
      - The authors are to be commended for addressing their research questions with multiple tools and approaches. There are links between the different parts of the study. The RNN and the pupil are linked by the notion of gain modulation, the RNN and the fMRI analysis are linked by the study of the energy landscape, the fMRI study and the pupil study are indirectly linked by previous work from this group showing that the peak in LC fMRI activity precedes a flattening of the energy landscape. These links are very interesting but could have been stronger and more complete.

    1. eLife Assessment

      This useful study aimed to examine the relationship of spatial frequency selectivity of single macaque inferotemporal (IT) neurons to category selectivity. Interesting findings in this report suggest a shift in preferred spatial frequency during the response, from low to high spatial frequencies. This agrees with a coarse-to-fine processing strategy, which is in line with multiple studies in the early visual cortex. Some of the findings were difficult to evaluate because the methods are incomplete. The conclusion that single-unit spatial frequency selectivity can predict object coding requires further evidence to confirm.

    2. Reviewer #1 (Public Review):

      This study reports that spatial frequency representation can predict category coding in the inferior temporal cortex. The original conclusion was based on likely problematic stimulus timing (33 ms which was too brief). Now the authors claim that they also have a different set of data on the basis of longer stimulus duration (200 ms).

      One big issue in the original report was that the experiments used a stimulus duration that was too brief and could have weakened the effects of high spatial frequencies and confounded the conclusions. Now the authors provided a new set of data on the basis of a longer stimulus duration and made the claim that the conclusions are unchanged. These new data and the data in the original report were collected at the same time as the authors report.

      The authors may provide an explanation why they performed the same experiments using two stimulus durations and only reported one data set with the brief duration. They may also explain why they opted not to mention in the original report the existence of another data set with a different stimulus duration, which would otherwise have certainly strengthened their main conclusions.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper aimed to examine the spatial frequency selectivity of macaque inferotemporal (IT) neurons and its relation to category selectivity. The authors suggest in the present study that some IT neurons show a sensitivity for the spatial frequency of scrambled images. Their report suggests a shift in preferred spatial frequency during the response, from low to high spatial frequencies. This agrees with a coarse-to-fine processing strategy, which is in line with multiple studies in the early visual cortex. In addition, they report that the selectivity for faces and objects, relative to scrambled stimuli, depends on the spatial frequency tuning of the neurons.

      Strengths:

      Previous studies using human fMRI and psychophysics studied the contribution of different spatial frequency bands to object recognition, but as pointed out by the authors little is known about the spatial frequency selectivity of single IT neurons. This study addresses this gap and shows spatial frequency selectivity in IT for scrambled stimuli that drive the neurons poorly. They related this weak spatial frequency selectivity to category selectivity, but these findings are premature given the low number of stimuli they employed to assess category selectivity.

      The authors revised their manuscript and provided some clarifications regarding their experimental design and data analysis. They responded to most of my comments but I find that some issues were not fully or poorly addressed. The new data they provided confirmed my concern about low responses to their scrambled stimuli. Thus, this paper shows spatial frequency selectivity in IT for scrambled stimuli that drive the neurons poorly (see main comments below). They related this (weak) spatial frequency selectivity to category selectivity, but these findings are premature given the low number of stimuli to assess category selectivity.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study reports that spatial frequency representation can predict category coding in the inferior temporal cortex.

      Thank you for taking the time to review our manuscript. We greatly appreciate your valuable feedback and constructive comments, which have been instrumental in improving the quality and clarity of our work.

      The original conclusion was based on likely problematic stimulus timing (33 ms which was too brief). Now the authors claim that they also have a different set of data on the basis of longer stimulus duration (200 ms).

      One big issue in the original report was that the experiments used a stimulus duration that was too brief and could have weakened the effects of high spatial frequencies and confounded the conclusions. Now the authors provided a new set of data on the basis of a longer stimulus duration and made the claim that the conclusions are unchanged. These new data and the data in the original report were collected at the same time as the authors report.

      The authors may provide an explanation why they performed the same experiments using two stimulus durations and only reported one data set with the brief duration. They may also explain why they opted not to mention in the original report the existence of another data set with a different stimulus duration, which would otherwise have certainly strengthened their main conclusions.

      Thank you for your comments regarding the stimulus duration used in our experiments. We appreciate the opportunity to clarify and provide further details on our methodology and decisions.

      In our original report, we focused on the early phase of the neuronal response, which is less affected by the duration of the stimulus. Observations from our data showed that certain neurons exhibited high firing rates even with the brief 33 ms stimulus duration, and the results we obtained were consistent across different durations. To avoid redundancy, we initially chose not to include the results from the 200 ms stimulus duration, as they reiterated the findings of the 33 ms duration.

      However, we acknowledge that the brief stimulus duration could raise concerns regarding the robustness of our conclusions, particularly concerning the effects of high spatial frequencies. Upon reflecting on the reviewer’s comments during the first revision, we recognized the importance of addressing these potential concerns directly. Therefore, we have included the data from the 200 ms stimulus duration in our revised manuscript.

      Furthermore, our team is actively investigating the differences between fast (33 ms) and slow (200 ms) presentations in terms of SF processing. Our preliminary observations suggest similar processing of HSF in the early phase of the response for both fast and slow presentations, but different processing of HSF in the late phase. This was another reason we initially opted to publish the results from the brief stimulus duration separately, as we intended to explore the different aspects of SF processing in fast and slow presentations in subsequent studies.

      I suggest the authors upload both data sets and analyzing codes, so that the claim could be easily examined by interested readers.

      Thank you for your suggestion to make both data sets and the analyzing codes available for examination by interested readers.

      We have created a repository that includes a sample of the dataset along with the necessary codes to output the main results. While we cannot provide the entire dataset at this time due to ongoing investigations by our team, we are committed to ensuring transparency and reproducibility. The data and code samples we have provided should enable interested readers to verify our claims and understand our analysis process.

      Repository: https://github.com/ramintoosi/spatial-frequency-selectivity

      Reviewer #2 (Public Review):

      Summary:

      This paper aimed to examine the spatial frequency selectivity of macaque inferotemporal (IT) neurons and its relation to category selectivity. The authors suggest in the present study that some IT neurons show a sensitivity for the spatial frequency of scrambled images. Their report suggests a shift in preferred spatial frequency during the response, from low to high spatial frequencies. This agrees with a coarse-to-fine processing strategy, which is in line with multiple studies in the early visual cortex. In addition, they report that the selectivity for faces and objects, relative to scrambled stimuli, depends on the spatial frequency tuning of the neurons.

      Strengths:

      Previous studies using human fMRI and psychophysics studied the contribution of different spatial frequency bands to object recognition, but as pointed out by the authors little is known about the spatial frequency selectivity of single IT neurons. This study addresses this gap and shows spatial frequency selectivity in IT for scrambled stimuli that drive the neurons poorly. They related this weak spatial frequency selectivity to category selectivity, but these findings are premature given the low number of stimuli they employed to assess category selectivity.

      Thank you for your thorough review and insightful feedback on our manuscript. We greatly appreciate your time and effort in providing valuable comments and suggestions, which have significantly contributed to enhancing the quality of our work.

      The authors revised their manuscript and provided some clarifications regarding their experimental design and data analysis. They responded to most of my comments but I find that some issues were not fully or poorly addressed. The new data they provided confirmed my concern about low responses to their scrambled stimuli. Thus, this paper shows spatial frequency selectivity in IT for scrambled stimuli that drive the neurons poorly (see main comments below). They related this (weak) spatial frequency selectivity to category selectivity, but these findings are premature given the low number of stimuli to assess category selectivity.

      While we acknowledge that the number of instances per condition is relatively low, the overall dataset is substantial. Specifically, our study includes a total of 180 stimuli (6 spatial frequencies × 2 scrambled/non-scrambled conditions × 15 instances, including 9 fixed and 6 non-fixed) and 5400 trials (180 stimuli × 2 durations × 15 repetitions). Conducting these trials requires approximately one hour of experimental time per session.
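      The stimulus and trial counts quoted above follow directly from the design factors. A minimal sketch of that arithmetic is given below; the variable names are illustrative and are not taken from the authors' repository.

```python
# Design factors as stated in the response above.
n_spatial_frequencies = 6
n_scramble_conditions = 2   # scrambled / non-scrambled
n_instances = 15            # 9 fixed + 6 non-fixed
n_durations = 2             # 33 ms and 200 ms presentations
n_repetitions = 15

# Unique stimuli, then total trials across durations and repetitions.
n_stimuli = n_spatial_frequencies * n_scramble_conditions * n_instances
n_trials = n_stimuli * n_durations * n_repetitions

print(n_stimuli, n_trials)
```

      Adding instances per condition scales the trial count by the same factor, which is the session-length trade-off the response describes.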

      Extending the number of stimuli, while potentially addressing this limitation, would significantly compromise the quality of the experiment by increasing the duration and introducing potential fatigue effects in the subjects. Despite this limitation, our findings lay important groundwork by offering novel insights into object recognition through the lens of spatial frequency. We believe this work can serve as a foundation for future experiments designed to further explore and validate these theories with expanded stimulus sets.

      Main points.

      (1) They have provided now the responses of their neurons in spikes/s and present a distribution of the raw responses in a new Figure. These data suggest that their scrambled stimuli were driving the neurons rather poorly and thus it is unclear how well their findings will generalize to more effective stimuli. Indeed, the mean net firing rate to their scrambled stimuli was very low: about 3 spikes/s. How much can one conclude when the stimuli are driving the recorded neurons that poorly? Also, the new Figure 2- Appendix 1 shows that the mean modulation by spatial frequency is about 2 spikes/s, which is a rather small modulation. Thus, the spatial frequency selectivity the authors describe in this paper is rather small compared to the stimulus selectivity one typically observes in IT (stimulus-driven modulations can be at least 20 spikes/s).

      To address the concerns regarding the firing rates and the modulation of neuronal responses by spatial frequency (SF), we emphasize several key points:

      (1) Significance of Firing Rate Differences: While it is true that the mean net firing rate to our scrambled stimuli was relatively low, the firing rate differences observed were statistically significant, with p-values approximately at 1e-5. This indicates that despite the low firing rates, the observed differences are reliable and unlikely to have occurred by chance.

      (2) Classification Rate and Modulation by SF: Our analysis showed that the difference between various SF responses led to a classification rate of 44.68%, which is 24.68 percentage points above the 20% chance level. This substantial increase above the chance level demonstrates that SF significantly modulates IT responses, even if the overall firing rates are modest.

      (3) Effect Size and SF Modulation: While the effect size in terms of firing rate differences may be small, it is significant. The significant modulation of IT responses by SF, as evidenced by our statistical analyses and classification rate, supports our conclusions regarding the role of SF in driving IT responses.

      (4) Expectations for Noise-like Pure SF Stimuli: We acknowledge that IT responses are typically higher for various object stimuli. Given the nature of our pure SF stimuli, which resemble noise-like patterns, we did not anticipate high responses in terms of spikes per second. The low firing rates are consistent with the expectation for such stimuli and do not undermine the significance of the observed modulation by SF.

      We believe that these points collectively support the validity of our findings and the significance of SF modulation in IT responses, despite the low firing rates. We appreciate your insights and hope this clarifies our stance on the data and its implications.
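      The chance level is not stated explicitly in the response, but it can be recovered from the two figures in point (2). The sketch below does this, assuming balanced classes so that chance equals 100% divided by the number of classes; the inferred class count is an assumption of this sketch, not a number given by the authors.

```python
accuracy_pct = 44.68       # reported classification rate
above_chance_pct = 24.68   # reported margin over chance

# Chance level implied by the two reported numbers.
chance_pct = round(accuracy_pct - above_chance_pct, 2)

# Under balanced classes, chance = 100 / n_classes (assumption of this sketch).
implied_n_classes = round(100 / chance_pct)

print(chance_pct, implied_n_classes)
```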

      We added the following description to the Appendix 1 - “Strength of SF selectivity” section:

      “While the firing rates and net responses to scrambled stimuli were modest (e.g., 2.9 Hz in T1), the differences across spatial frequency (SF) bands were statistically significant (p ≈ 1e-5) and led to a classification accuracy 24.68% above chance. This demonstrates the robustness of SF modulation in IT neurons despite low firing rates. The modest responses align with expectations for noise-like stimuli, which are less effective in driving IT neurons, yet the observed SF selectivity highlights a fundamental property of IT encoding.”

      (2) Their new Figure 2-Appendix 1 does not show net firing rates (baseline-subtracted; as I requested) and thus is not very informative. Please provide distributions of net responses so that the readers can evaluate the responses to the stimuli of the recorded neurons.

      We understand the reviewer’s concern about the presentation of net firing rates. In T2 (the late time interval), the average response rate falls below the baseline, resulting in negative net firing rates, which might confuse readers. To address this, we have added the net responses to the text for clarity. Additionally, we have included the average baseline response in the figure to provide a more comprehensive view of the data.

      “To check the SF response strength, the histogram of IT neuron responses to scrambled, face, and non-face stimuli is illustrated in this figure. A Gamma distribution is also fitted to each histogram. To calculate the histogram, the neuron response to each unique stimulus is calculated for each neuron in spikes per second (Hz). In the early phase, T1, the average firing rate to scrambled stimuli is 26.3 Hz, which is significantly higher than the baseline response of 23.4 Hz in the −50 to 50 ms window. In comparison, the mean response to intact face stimuli is 30.5 Hz, while non-face stimuli elicit an average response of 28.8 Hz. The average net responses to the scrambled, face, and non-face stimuli are 2.9 Hz, 7.1 Hz, and 5.4 Hz, respectively. Moving to the late phase, T2, the responses to scrambled, face, and object stimuli are 19.5 Hz, 19.4 Hz, and 22.4 Hz, respectively. The corresponding average net responses are 3.9 Hz, 4.0 Hz, and 1.0 Hz below the baseline response.”

      (3) The poor responses might be due to the short stimulus duration. The authors report now new data using a 200 ms duration which supported their classification and latency data obtained with their brief duration. It would be very informative if the authors could also provide the mean net responses for the 200 ms durations to their stimuli. Were these responses as low as those for the brief duration? If so, the concern of generalization to effective stimuli that drive IT neurons well remains.

      The firing rates for the 200 ms stimulus duration are as follows: 27.7 Hz, 30.7 Hz, and 30.4 Hz for scrambled, face, and object stimuli in T1, respectively; and 26.2 Hz, 29.1 Hz, and 33.9 Hz in T2. The average baseline firing rate (−50 to 50 ms) is 23.4 Hz. Therefore, the net responses are 4.3 Hz, 7.3 Hz, and 7.0 Hz for T1; and 2.8 Hz, 5.7 Hz, and 10.5 Hz for T2 for scrambled, face, and object stimuli, respectively.

      Notably, the impact of stimulus duration is more pronounced in T2, which is consistent with T2 being the later time interval. However, the firing rates in T1 do not show substantial changes with the longer duration. As we discussed in our response to the first comment, it is important to note that high net responses are not typically expected for scrambled or noise-like stimuli in IT neurons. Instead, the key findings of this study lie in the statistical significance of these responses and their meaningful relationship to category selectivity. These results highlight the broader implications for understanding the role of spatial frequency in object recognition.

      We added the firing rates to the Appendix 1 section “Extended stimulus duration supports LSF-preferred tuning” as follows.

      “For the 200 ms stimulus duration, the firing rates were 27.7 Hz, 30.7 Hz, and 30.4 Hz for scrambled, face, and object stimuli in T1, respectively, and 26.2 Hz, 29.1 Hz, and 33.9 Hz in T2. The corresponding net responses were 4.3 Hz, 7.3 Hz, and 7.0 Hz in T1, and 2.8 Hz, 5.7 Hz, and 10.5 Hz in T2. While the longer stimulus duration did not substantially increase firing rates in T1, its impact was more pronounced in T2.”
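      The net responses quoted in the responses to points (2) and (3) are baseline-subtracted firing rates. As a minimal check, the sketch below recomputes them from the mean rates and the 23.4 Hz baseline given above; the dictionary layout and function name are illustrative and are not taken from the authors' code.

```python
BASELINE_HZ = 23.4  # mean rate in the -50 to 50 ms window, as stated above

# Mean firing rates (Hz) for scrambled, face, and object stimuli,
# keyed by (stimulus duration, response window).
rates_hz = {
    ("33 ms", "T1"): (26.3, 30.5, 28.8),
    ("33 ms", "T2"): (19.5, 19.4, 22.4),
    ("200 ms", "T1"): (27.7, 30.7, 30.4),
    ("200 ms", "T2"): (26.2, 29.1, 33.9),
}


def net_response(rate_hz, baseline_hz=BASELINE_HZ):
    """Baseline-subtracted response in Hz, rounded to one decimal place."""
    return round(rate_hz - baseline_hz, 1)


net_hz = {
    key: tuple(net_response(rate) for rate in values)
    for key, values in rates_hz.items()
}

for (duration, window), values in net_hz.items():
    print(duration, window, values)
```

      Negative values indicate responses below baseline, which is the case for the 33 ms presentation in T2.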

      (4) I still do not understand why the analyses of Figures 3 and 4 provide different outcomes on the relationship between spatial frequency and category selectivity. I believe they refer to this finding in the Discussion: "Our results show a direct relationship between the population's category coding capability and the SF coding capability of individual neurons. While we observed a relation between SF and category coding, we have found uncorrelated representations. Unlike category coding, SF relies more on sparse, individual neuron representations.". I believe more clarification is necessary regarding the analyses of Figures 3 and 4, and why they can show different outcomes.

      Figure 3 explores the relationship between SF coding and category coding at both the single-neuron and population levels.

      ● Figures 3(a) and 3(b) examine the relationship between a single neuron’s response pattern and object decoding in the population.

      ● Figure 3(c) investigates the relationship between a single neuron’s SF decoding capabilities and object decoding in the population.

      ● Figure 3(d) assesses the relationship between a single neuron’s object decoding capabilities and SF decoding in the population.

      In summary, Figure 3 demonstrates a relation between SF coding/response pattern at the single-neuron level and category coding at the population level.

      Figure 4, on the other hand, addresses the uncorrelated nature of SF and category coding.

      ● Figure 4(a) shows the uncorrelated relation between a single neuron’s SF decoding capability and its object decoding capability. This suggests that a neuron's ability to decode SF does not predict its ability to decode object categories.

      ● Figure 4(b) illustrates that the contribution of a neuron to the population decoding of SF is uncorrelated with its contribution to the population decoding of object categories. This further supports the idea that the mechanisms behind SF coding and object coding are uncorrelated.

      In summary, Figure 4 suggests that while there is a relation between SF coding and category coding as illustrated in Figure 3, the mechanisms underlying SF coding and object coding operate independently (in terms of correlation), highlighting the distinct nature of these processes.

      We hope this explanation clarifies why the analyses in Figures 3 and 4 present different outcomes. Figure 3 provides insight into the relationship between SF and category coding, while Figure 4 emphasizes the uncorrelated nature of these processes.

      Following your comment, and to clarify the presentation of the work, we added the following description to the “Uncorrelated mechanisms for SF and category coding” section:

      “Figures 3 and 4 examine different aspects of the relationship between SF and category coding. Figure 3 highlights a relationship between SF coding at the single-neuron level and category coding at the population level. Conversely, Figure 4 demonstrates the uncorrelated mechanisms underlying SF and category coding, showing that a neuron’s ability to decode SF is not predictive of its ability to decode object categories. This distinction underscores that while SF and category coding are related at broader levels, their underlying mechanisms are independent, emphasizing the distinct processes driving each form of coding.”

      (5) The authors found a higher separability for faces (versus scrambled patterns) for neurons preferring high spatial frequencies. This is consistent for the two monkeys but we are dealing here with a small number of neurons. Only 6% of their neurons (16 neurons) belonged to this high spatial frequency group when pooling the two monkeys. Thus, although both monkeys show this effect I wonder how robust it is given the small number of neurons per monkey that belong to this spatial frequency profile. Furthermore, the higher separability for faces for the low-frequency profiles is not consistent across monkeys, which should be pointed out.

      We appreciate the reviewer’s concern regarding the relatively small number of neurons in the high spatial frequency group (16 neurons, 6% of the total sample across the two monkeys) and the consistency of the results. While we acknowledge this limitation, it is important to note that findings involving sparse subsets of neurons can still be meaningful. For example, Dalgleish et al. (2020) demonstrated that perception can arise from the activity of as few as ~14 neurons in the mouse cortex, supporting the sparse coding hypothesis. This underscores the potential robustness of results derived from small neuronal populations when the activity is statistically significant and functionally relevant.

      Regarding the higher separability for faces among neurons preferring high spatial frequencies, the consistency of this finding across both monkeys suggests that this effect is robust within this subgroup. For neurons preferring low spatial frequencies, we agree that the lack of consistency across monkeys should be explicitly noted. These differences may reflect individual variability or differences in sampling across subjects and merit further investigation in future studies.

      To address this concern, we have updated the text to explicitly discuss the small size of the high spatial frequency group, its implications, and the observed inconsistency in the low spatial frequency profiles between monkeys. We have added the following description to the discussion.

      “Next, according to Figure 3(a), 6% of the neurons are HSF-preferred and their firing rate in HSF is comparable to the LSF firing rate in the LSF-preferred group. This analysis is carried out in the early phase of the response (70-170ms). While most of the neurons prefer LSF, this observation shows that there is an HSF input that excites a small group of neurons. Importantly, findings involving small neuronal populations can still be meaningful, as studies like Dalgleish et al. (2020) have demonstrated that perception can arise from the activity of as few as ~14 neurons in the mouse cortex, emphasizing the robustness of sparse coding.”

      Regarding the separability of faces for the low-frequency profiles, we added the following to the appendix section:

      “For neurons preferring LSF, LP profile, it is important to note the lack of consistency in responses across monkeys. This variability may reflect individual differences in neural processing or variations in sampling between subjects.”

      And in the discussion:

      “Our results are based on grouping the neurons of the two monkeys; however, the results remain consistent when looking at the data from individual monkeys as illustrated in Appendix 2. However, for neurons preferring LSF, we observed inconsistency across monkeys, which may reflect individual differences or sampling variability. These findings highlight the complexity of SF processing in the IT cortex and suggest the need for further research to explore these variations.”

      * Henry WP Dalgleish, Lloyd E Russell, Adam M Packer, Arnd Roth, Oliver M Gauld, Francesca Greenstreet, Emmett J Thompson, Michael Häusser (2020) How many neurons are sufficient for perception of cortical activity? eLife 9:e58889.

      (6) I agree that CNNs are useful models for ventral stream processing but that is not relevant to the point I was making before regarding the comparison of the classification scores between neurons and the model. Because the number of features and trial-to-trial variability differs between neural nets and neurons, the classification scores are difficult to compare. One can compare the trends but not the raw classification scores between CNN and neurons without equating these variables.

      We appreciate the reviewer’s follow-up comment and agree that differences in the number of features and trial-to-trial variability between IT neurons and CNN units make direct comparisons of raw classification scores challenging. As the reviewer suggests, it is more appropriate to focus on comparing trends rather than absolute scores when analyzing the similarities and differences between these systems. In light of this, we have revised the text to clarify that our intention was not to equate raw classification scores but to highlight the qualitative patterns and trends observed in spatial frequency encoding between IT and CNN units.

      “SF representation in the artificial neural networks

      We conducted a thorough analysis to compare our findings with CNNs. To assess the SF coding capabilities and trends of CNNs, we utilized popular architectures, including ResNet18, ResNet34, VGG11, VGG16, InceptionV3, EfficientNetb0, CORNet-S, CORNet-RT, and CORNet-Z, with both ImageNet-pre-trained and randomly initialized weights. Employing feature maps from the last four layers of each CNN, we trained an LDA model to classify the SF content of input images. Figure 5(a) shows the SF decoding accuracy of the CNNs on our dataset (SF decoding accuracy with random (R) and pre-trained (P) weights, ResNet18: P=0.96±0.01 / R=0.94±0.01, ResNet34: P=0.95±0.01 / R=0.86±0.01, VGG11: P=0.94±0.01 / R=0.93±0.01, VGG16: P=0.92±0.02 / R=0.90±0.02, InceptionV3: P=0.89±0.01 / R=0.67±0.03, EfficientNetb0: P=0.94±0.01 / R=0.30±0.01, CORNet-S: P=0.77±0.02 / R=0.36±0.02, CORNet-RT: P=0.31±0.02 / R=0.33±0.02, and CORNet-Z: P=0.94±0.01 / R=0.97±0.01). Except for CORNet-Z, object recognition training increases the network's capacity for SF coding, with an improvement as large as 64% in EfficientNetb0. Furthermore, except for the CORNet family, LSF content exhibits higher recall values than HSF content, as observed in the IT cortex (p-value with random (R) and pre-trained (P) weights, ResNet18: P=0.39 / R=0.06, ResNet34: P=0.01 / R=0.01, VGG11: P=0.13 / R=0.07, VGG16: P=0.03 / R=0.05, InceptionV3: P<0.001 / R=0.05, EfficientNetb0: P=0.07 / R=0.01). The recall values of CORNet-Z and ResNet18 are illustrated in Figure 5(b). However, while the CNNs exhibited some similarities in SF representation with the IT cortex, they did not replicate the SF-based profiles that predict neuron category selectivity. As depicted in Figure 5(c), although neurons formed similar profiles, these profiles were not associated with the category decoding performances of the neurons sharing the same profile.”
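      The comparison of LSF and HSF recall in the passage above rests on per-class recall, that is, the fraction of items of a given SF class that the classifier labels correctly. The sketch below shows that computation on a toy set of labels; the labels, the predictions, and the function name are hypothetical and do not reproduce the authors' LDA pipeline or data.

```python
from collections import Counter


def recall_per_class(y_true, y_pred):
    """Return {class: correctly predicted items / all items of that class}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical true SF labels and classifier outputs, for illustration only.
y_true = ["LSF", "LSF", "LSF", "LSF", "HSF", "HSF", "HSF", "HSF"]
y_pred = ["LSF", "LSF", "LSF", "HSF", "HSF", "HSF", "LSF", "LSF"]

recalls = recall_per_class(y_true, y_pred)
print(recalls)
```

      In this toy case the LSF class has the higher recall, which is the direction of the trend reported for most of the CNNs and for the IT cortex.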

      Discussion:

      “Finally, we compared SF's representation trends and findings within the IT cortex and the current state-of-the-art networks in deep neural networks.”

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      The mean baseline firing rate of their neurons (23.4 Hz) was rather high for single IT neurons (typically around 10 spikes/s or lower). Were these well-isolated units or mainly multiunit activity?

      We confirm that the recordings in our study included both well-isolated single units and multi-unit activity (the activity remaining after single-unit isolation), sorted with our spike sorting toolbox. The higher baseline firing rate is likely due to the experimental design, particularly the inclusion of the responsive neurons from the selectivity phase. We added the following statement to the methods section.

      “In our analysis, we utilized both well-isolated single units and multi-unit activities (which represent neural activities that could not be further sorted into single units), ensuring a comprehensive representation of neural responses across the recorded population.”

    1. eLife Assessment

      This important study identifies species- and sex-specific neuronal cell types and gene expression in the preoptic area (POA) to help understand the evolutionary divergence of social behaviors. The evidence from single-nucleus RNA sequencing and immunostaining is compelling and suggests that cellular differences in the POA may contribute to behavioral variations such as mating and parental care that are apparent in two closely related deer mouse species. These rich observations provide an entry point for future hypothesis-driven experiments to demonstrate a causal role for these populations in sex- or species-variable behaviors in vertebrates. These data will be a resource that is of value to behavioral neuroscientists.

    2. Reviewer #1 (Public review):

      (1) Summary of the Paper:

      This paper by Chen et al. examines the cellular composition and gene expression of the hypothalamic medial preoptic area (MPOA) in two closely related deer mouse species (P. maniculatus and P. polionotus) that exhibit distinct social behaviors. Through single-nucleus RNA sequencing (snRNA-seq), Chen et al. identify sex- and species-specific neuronal cell types that likely contribute to differences in mating and parental care. By comparing monogamous and promiscuous species, the study provides insights into how neuronal diversity and gene expression changes in the MPOA might underlie the evolution of social behaviors.

      (2) Strengths of the Paper:

      The paper excels in several areas. First, the data presentation is clear and well-organized, making the complex findings easy to follow. The writing is straightforward and highly accessible, which enhances the overall readability. The experimental design is innovative, particularly in how they combined samples from different species into the same dataset and then used post-hoc identification to distinguish cell types by species. This dramatically controls for potential batch effects in my opinion. Additionally, the authors contextualize their findings within the framework of previously published studies on Mus musculus, providing a strong comparative analysis that enhances the significance of their work.

      (3) Weaknesses of the Paper:

      The major limitation of the study is the absence of causal experiments linking the observed changes in MPOA cell types to species-specific social behaviors. While the study provides valuable correlational data, it lacks functional experiments that would demonstrate a direct relationship between the neuronal differences and behavior. For instance, manipulating these cell types or gene expressions in vivo and observing their effects on behavior would have strengthened the conclusions, although I certainly appreciate the difficulty in this, especially in non-musculus mice. Without such experiments, the study remains speculative about how these neuronal differences contribute to the evolution of social behaviors.

    3. Reviewer #2 (Public review):

      Summary:

      The authors report several interesting species and sex differences in cell type expression that may relate to species differences in behavior. The differential cell type abundance findings build on previously observed species/sex differences in behavior and brain anatomy. These data will be a valuable resource for behavioral neuroscientists. These findings are important, but the manuscript goes too far in attributing causal influences to differences in behavior. A second important problem is that the dissections used for the sequencing data include other neuropeptide-rich areas of the hypothalamus, such as the PVN. Although histology is included, the results in the main manuscript often do not include the mPOA, making it hard to know if species/sex differences are consistent across different hypothalamic regions. The manuscript would benefit from more precise language.

      Strengths:

      The data are novel because cell-type atlases are available for only a few species.

      The authors have clearly defined the appropriate steps taken to obtain trustworthy estimates of cell type abundance. Furthermore, the criteria for each cell type assignment were described in a way that lets readers easily replicate them. The rigor in comparing cell abundance provides convincing evidence that these species have differences in MPOA cellular composition.

      The authors have a good explanation for why 19 of the 53 neuron clusters were not classified (possible Mus/Peromyscus anatomical differences; some cell types don't have well-defined transcriptional profiles).

      Validated findings with histology.

    4. Reviewer #3 (Public review):

      Summary:

      The authors performed snRNA-seq in the pre-optic area (POA), a heterogeneous brain region implicated in multiple innate behaviors, comparing two species of Peromyscus mice that possess strikingly different parenting behaviors. P. polionotus show high levels of parental care from both sexes of parent, and P. maniculatus show lower levels of care, predominantly displayed by dams rather than sires. The overall goal of understanding the genomic basis of behavioral variation is significant and of broad interest, and comparative studies of the POA in these two species are an excellent approach to tackle this question. The authors correctly point out that existing studies largely compare species that are highly divergent, such as mice and humans, which confounds the association of specific neuronal populations or gene expression patterns with distinct behaviors. They identify neuronal populations with differential abundance between species and sexes, and additionally report sex and species differences in gene expression within each transcriptomic cell type. Their cell type classification is aided by mapping their Peromyscus cells onto a previously existing POA single cell dataset generated in lab mice. The detection and validation of previously observed sex differences in the Gal/Moxd1 cell type, and species differences in Avp expression, provide additional support that their data are robust. Importantly, the authors demonstrate reduced sexual dimorphism in the POA of P. polionotus, compared to P. maniculatus, and prior knowledge in rats and mice. This finding suggests a potential neural substrate for the increased parental behavior in P. polionotus.

      Strengths:

      This is a pioneering comparative snRNA-seq study that provides a roadmap for similar approaches in non-traditional model organisms.

      The authors have identified populations that may underlie sex and species differences in parenting behavior in rodents.

      A significant strength of the manuscript is the histological validation of their most robust marker genes.

      Weaknesses:

      My primary concern is that the dataset is limited: 52,121 neuronal nuclei across 24 samples, which does not provide many cells per cluster to analyze comparatively across sex and species, particularly given the heterogeneity of the large region dissected, which contains adjacent regions such as the PVN and SCN.

      There is no explanation for the finding that there is a female-bias in gene expression across all cell types in P. polionotus.

    5. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their thoughtful comments.

      Based on their suggestions we will:

      (1) Use more accurate language to describe the hypothalamic regions under investigation in this study. While we aimed to primarily investigate the medial preoptic area (MPOA), our dissections and sequencing data in fact capture several regions of the anterior hypothalamus, including the anteroventral periventricular (AVPV), paraventricular (PVN), supraoptic (SON), and suprachiasmatic (SCN) nuclei, and more. We will revise the language in our manuscript to reflect that our study investigates the cellular evolution of the anterior hypothalamus across behaviorally divergent deer mice.

      (2) Revise our language to clarify that while our study provides a rich dataset for generating hypotheses about which cell types may contribute to behavioral differences, it does not provide any evidence of causal relationships. We hope to investigate this further in future work.

      (3) Clarify specific methodological choices for which reviewers had questions, especially about the hypothalamic regions for which we did histology to validate cell abundance differences and methodological choices related to mapping our cell clusters to Mus cell types.

      Our responses to each reviewer’s specific comments are below.

      Reviewer #1:

      The major limitation of the study is the absence of causal experiments linking the observed changes in MPOA cell types to species-specific social behaviors. While the study provides valuable correlational data, it lacks functional experiments that would demonstrate a direct relationship between the neuronal differences and behavior. For instance, manipulating these cell types or gene expressions in vivo and observing their effects on behavior would have strengthened the conclusions, although I certainly appreciate the difficulty in this, especially in non-musculus mice. Without such experiments, the study remains speculative about how these neuronal differences contribute to the evolution of social behaviors.

      Yes, we agree the study lacks functional experiments. We hope that the dataset is of value for generating hypotheses about how hypothalamic neuronal cell types may govern species-specific social behaviors, and for these hypotheses to be functionally tested by us and others in future work.

      Reviewer #2:

      Some methodology could be further explained, like the decision of a 15% cutoff value for cell type assignment per cluster, or the necessity of a multi-step analysis pipeline for gene enrichment studies.

      A 15% cutoff value for cell type assignment was chosen to include all known homology correspondences between our dataset and the Mus atlas. For example, i14:Avp/Cck cells from the Mus atlas represent Avp cells from the suprachiasmatic nuclei (SCN). Though only 17.3% of cluster 15 maps to i14:Avp/Cck, we know these two clusters correspond based on the expression of Avp and additional SCN marker genes in cluster 15 (Supp Fig 6). We will further explain this cutoff in the revised manuscript.
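
      The cutoff rule described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `assign_homology`, the toy cluster, and the `other_*` placeholder labels are all hypothetical, and the only values taken from the response are the 15% cutoff and the 17.3% share of cluster 15 mapping to i14:Avp/Cck.

```python
from collections import Counter


def assign_homology(mapped_labels, cutoff=0.15):
    """Return {reference_label: fraction} for labels at or above the cutoff.

    mapped_labels holds one reference (Mus atlas) label per cell in a cluster.
    """
    total = len(mapped_labels)
    if total == 0:
        return {}
    counts = Counter(mapped_labels)
    return {label: n / total for label, n in counts.items() if n / total >= cutoff}


# Hypothetical toy cluster of 1,000 cells: 17.3% map to i14:Avp/Cck,
# the rest are spread thinly over ten placeholder labels (about 8% each).
cluster_15 = ["i14:Avp/Cck"] * 173 + [f"other_{i % 10}" for i in range(827)]

print(assign_homology(cluster_15))               # {'i14:Avp/Cck': 0.173}
print(assign_homology(cluster_15, cutoff=0.20))  # {}
```

      In this toy case, a 20% cutoff would leave the cluster unassigned, which is the kind of known correspondence the 15% threshold is meant to retain.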

      Our gene enrichment study includes a multi-step analysis pipeline because we wanted to control for confounders that may be introduced because of gene expression level. Genes that are more highly expressed are more accurately quantified and thus more likely to be identified as differentially expressed. Therefore, we wanted to test for gene enrichments in our set of DE genes against a background of genes with similar expression levels. We will clarify this motivation in the revised manuscript.
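
      One way to test enrichment against an expression-matched background is sketched below. It is an assumption-laden illustration rather than the authors' multi-step pipeline: the function `matched_enrichment`, the rank-based binning, the number of bins, and the permutation scheme are all hypothetical choices. Genes are binned by expression level, and each null draw samples, within every bin, as many genes as there are DE genes in that bin.

```python
import random


def matched_enrichment(expr, is_de, in_set, n_bins=10, n_perm=1000, seed=0):
    """Empirical enrichment of a gene set among DE genes.

    The null distribution is built from random gene draws that match the
    DE genes' distribution across expression bins.
    Returns (observed_overlap, empirical_p_value).
    """
    rng = random.Random(seed)
    n = len(expr)

    # Rank-based bins of (nearly) equal size, from lowest to highest expression.
    order = sorted(range(n), key=lambda g: expr[g])
    pools = {}
    for rank, g in enumerate(order):
        pools.setdefault(rank * n_bins // n, []).append(g)

    observed = sum(1 for g in range(n) if is_de[g] and in_set[g])

    null = [0] * n_perm
    for pool in pools.values():
        k = sum(1 for g in pool if is_de[g])  # DE genes in this bin
        if k == 0:
            continue
        for i in range(n_perm):
            # Draw k genes from the same expression bin, without replacement.
            null[i] += sum(1 for g in rng.sample(pool, k) if in_set[g])

    # Add-one correction keeps the empirical p-value strictly positive.
    p_value = (1 + sum(1 for x in null if x >= observed)) / (1 + n_perm)
    return observed, p_value


# Hypothetical toy data: 1,000 genes, 50 DE genes, gene set identical to the DE set.
expr = list(range(1000))
is_de = [g % 20 == 0 for g in range(1000)]
in_set = list(is_de)
print(matched_enrichment(expr, is_de, in_set))
```

      Because each null draw comes from the same expression bins as the DE genes, a gene set that is simply highly expressed does not appear enriched for that reason alone.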

      The authors should exercise strong caution in making inferences about these differences being the basis of parental behavior. It is possible, given connections to relevant research, but without direct intervention, direct claims should be avoided. There should be clear distinctions of what to conclude and what to propose as possibilities for future research.

      Yes, we agree that we are unable to make direct claims about neuronal differences being the basis of parental behavior. We will revise our language to be clearer about which relationships we are hypothesizing and what we propose as possibilities for future research.

      Histology is not performed on all regions included in the sequencing analysis.

      We apologize that our language describing the hypothalamic regions included in the sequencing analysis and those included in the histology is unclear. We aimed to dissect the medial preoptic region for the sequencing analysis, but additionally captured parts of the anterior hypothalamus, including the paraventricular (PVN), supraoptic (SON), and suprachiasmatic (SCN) nuclei, and more. Our histology was performed across the entire hypothalamus and includes all regions included in the sequencing data. We will revise the manuscript to more accurately describe the hypothalamic regions that we investigated.

      Reviewer #3:

      My primary concern is that the dataset is limited: 52,121 neuronal nuclei across 24 samples, which does not provide many cells per cluster to analyze comparatively across sex and species, particularly given the heterogeneity of the region dissected. The Supplementary table reports lower UMIs/genes per cell than is typically seen as well. Perhaps additional information could be obtained from the data by not restricting the analyses to cells that can be assigned to Mus types. A direct comparison of the two Peromyscus species could be valuable as would a more complete Peromyscus POA atlas.

      Our dataset reports ~1,500 genes and ~1,000 UMIs per nucleus, which is indeed lower than is typically reported in other single-nucleus datasets. Some of this discrepancy is due to a lower-quality genome and annotated transcriptome available for Peromyscus compared to Mus musculus, which results in a lower mapping rate than is typically reported in Mus studies. However, our dataset was sufficient to identify known peptidergic cell types (Supp Fig 6) and to map homology to Mus cell types for 34 (64%) of our 53 clusters. Additionally, although some of our clusters contain small numbers of cells, our differential abundance analysis accounts for the variance in cell numbers observed across samples and should be robust against any increase in variance due to small numbers. In fact, even differential abundance of very small cell clusters such as oxytocin neurons (cell type 40) was validated by histology.

      We would like to clarify that all analyses were performed on all cell clusters, regardless of whether or not they could be assigned homology to a Mus cell type. All the cell types that we identified as differentially abundant or as containing significant sex differences happened to be cell types for which homology to a Mus cell type could be defined. This may arise for a relatively uninteresting reason: cell types that have more distinct transcriptional signatures will be more accurately clustered, leading to more accurate identification of homology as well as more accurate measurements of differential abundance / expression. We will revise the language to make this clearer in our manuscript.

      In Supplement 7, it appears that most neurons can be assigned as excitatory or inhibitory, but then so many of these cells remain in the unassigned "gray blob" seen in panel 1E. Clustering of excitatory and inhibitory neurons separately, as in prior cited work in Mus POA (refs 31 and 57) may boost statistical power to detect sex and species differences in cell types. Perhaps the cells that cannot be assigned to Mus contain too few reads to be useful, in which case they should be filtered out in the QC. The technical challenges of a comparative single-cell approach are considerable, so it benefits the scientific community to provide transparency about them.

      We are not certain why we are unable to cluster and assign homology to many of our cells (i.e. cells in the unassigned “gray blob”). However, we note that even in the Mus atlas, many cells did not belong to obvious clusters by UMAP visualization and that several clusters lacked notable marker genes and were designated simply as “Gaba” and “Glut” clusters. Therefore, it is unsurprising that our own dataset also contains cells that lack the transcriptional signatures needed to be clustered and/or mapped to Mus cell types. We do know, however, that the median number of reads per nucleus is uniform across cell clusters and does not explain why some clusters could not be assigned to Mus. We will add this information to our revised manuscript.

      We do not expect a two-stage clustering (i.e. clustering first by excitatory vs. inhibitory neurons) to increase the power to resolve cell types in this case. Excitatory vs. inhibitory neurons are clearly separable on our UMAP (Supp Fig 7), so that information is already being used by our clustering procedure. However, we will explore this further in our revised manuscript to see if doing so will boost statistical power.

      The Calb1 dimorphism as observed by immunostaining, appears much more extensive in P. maniculatus compared to P. polionotus (Figures 3 E and F). This finding is not reflected in the counts of the i20:Gal/Moxd1 cluster. The use of Calb1 staining as a proxy for the Gal/Moxd1 cluster would be strengthened if the number of POA Calb1+ neurons that are found in each cluster was apparent. There may be additional Calb+ neurons in the cells that are not annotated to a Mus cluster. This clarification would add support to the overall conclusion that there is reduced sexual dimorphism in P. polionotus.

      From the Mus MPOA atlas (which includes both single-cell sequencing data and imaging-based spatial information), it is known that the i20:Gal/Moxd1 cluster comprises sexually dimorphic cells that make up both the BNST and the SDN-POA. These sexually dimorphic cells are well-studied and known to be marked by Calb1, which we used in immunostaining as a proxy for i20:Gal/Moxd1.

      However, we would like to clarify that in our study, the immunostaining of Calb1+ neurons and the sequencing counts of the i20:Gal/Moxd1 cluster are not completely reflective of each other because our sequencing dataset only captured the ventral portion of the BNST. Therefore, our i20:Gal/Moxd1 counts contain a combination of some Calb1+ BNST cells and likely all Calb1+ SDN-POA cells and are difficult to interpret on their own. Our histology, however, covers the entire hypothalamus and is more reliable for identifying sex and species differences in each region. We will clarify this in the revised manuscript.

      The relationship between the sex steroid receptor expression and the sex bias in gene expression would be improved if the sex bias in sex steroid receptor expression was included in Supplementary Figure 10.

      We will include this in the revised manuscript.

      There is no explanation for the finding that there is a female bias in gene expression across all cell types in P. polionotus.

      We also find this observation interesting but do not yet have a good explanation for it. We plan to follow this up in future work.

    1. Author Response:

      We appreciate the reviewers' detailed feedback, which has highlighted several areas where our study could be strengthened. Although we acknowledge the relatively limited scope of our CRISPR-based gene-deletion screen, we successfully demonstrated the immunogenic role of Pccb in our syngeneic pancreatic cancer mouse model. Specifically, loss of PCCB in our mutant KRAS/p53 PIK3CA-null (αKO) cells blocked host T cell killing of tumor cells.

      Furthermore, blocking the PD1/PD-L1 interaction reverses this anti-tumor immunogenic effect. We agree with the reviewers regarding the limitations of our study, such as the sample size in our scTCR sequencing and the lack of direct cytotoxicity assays to confirm tumor-specific T cell clones. However, our results are consistent across multiple experimental approaches that strongly suggest meaningful differences in host T cell response to the three implanted tumor types, KPC, αKO and p-αKO. We agree that future mechanistic studies will be important to determine how PCCB is involved in this immunogenic response. We also agree with the reviewers that future additional studies with other KPC cell lines will strengthen our conclusion regarding PCCB. Finally, we acknowledge the inherent limitations of IHC techniques to assess the involvement of other T cell checkpoints that might also be involved in this anti-tumor immunogenic effect. In summary, despite these limitations, our findings provide novel insight into the role of PCCB in pancreatic tumor immunogenicity and contribute to the ongoing discussion of how to improve therapeutic strategies for this deadly cancer.

      Reviewer 1:

      Weaknesses:

      (1) Clonal expansion of cytotoxic T cells infiltrating the pancreatic αKO tumors

      a. Only two tumor-bearing hosts were evaluated by single-cell TCR sequencing, thus limiting conclusions that may be drawn regarding repertoire diversity and expansion.

      We agree with the reviewer that possible repertoire diversity and expansion could be observed by sequencing more tumor-bearing hosts. However, our current data reveal a marked consistency in the transcriptional expression within the two tumors analyzed per group. Importantly, these features are significantly divergent between the αKO and p-αKO groups. While recognizing the limited sample size, the observed within-group consistency and the clear distinction between groups strongly support the validity of the reported trends.

      b. High abundance clones in the TME do not necessarily have tumor specificity, nor are they necessarily clonally expanded. They may be clones which are tissue-resident or highly chemokine-responsive and accumulate in larger numbers independent of clonal expansion. Please consider softening language to clonal enrichment or refer to clone size as clonal abundance throughout the paper.

      We agree with the reviewer that it’s possible that the high abundance clones are not necessarily tumor specific. Our previous work (N. Sivaram 2019) demonstrated the critical role of increased pancreatic CD8+ T cells in αKO tumor regression within B6 mice. Therefore, antigen specific CD8+ T cell clonal expansion within the pancreas is an anticipated observation. However, as the reviewer pointed out, a portion of this expansion may be attributable to factors independent of tumor antigens. While the low T cell infiltration observed in KPC-implanted mice argues against a purely tissue-resident explanation, further investigation is required to definitively establish the tumor specificity of individual clones. We have revised the manuscript to reflect this nuance, replacing "clonal expansion" with "clonal enrichment".

      c. The whole story would be greatly strengthened by cytotoxicity assays of abundant TCR clones to show tumor antigen specificity.

      As mentioned above, we agree with the reviewer that future studies are needed to investigate each of the specific clones. Due to the extended timeframe required, it’s beyond the scope of the present study.

      (2) A genome-wide CRISPR gene-deletion screen to identify molecules contributing to Pik3ca-mediated pancreatic tumor immune evasion

      a. CRISPR mutagenesis yielded outgrowth of only 2/8 tumors. A more complete screen with an increased total number of tumors would yield much stronger gene candidates with better statistical power. It is unsurprising that candidates were observed in only one of the two tumors. Nevertheless, the authors moved forward successfully with Pccb.

      We agree that by including more mice in the CRISPR screen, it’s possible that we could have identified more candidates. Regardless, we have successfully demonstrated PCCB’s role in pancreatic tumorigenicity with our mouse model.

      (3) T cells infiltrate p-αKO tumors with increased expression of immune checkpoint

      a. In Figure 4D, cell counts are not normalized to total CD8+ T cell counts, making it difficult to directly compare aKO to p-aKO tumors. Based on quantifications from Figure 4D, I suspect normalization will strengthen the conclusion that CD8+ infiltrate is more exhausted in p-aKO tumors.

      Due to the use of distinct tumor sections for quantifying CD8+ cells and T cell checkpoint inhibitory receptor expression, direct normalization of these counts is challenging. However, we observed comparable CD8+ cell numbers between αKO and p-αKO tumors, with p-αKO tumors exhibiting nearly double the expression of immune checkpoint receptors. Therefore, even accounting for potential normalization discrepancies, we anticipate that p-αKO tumors would still demonstrate a significantly higher percentage of immune checkpoint receptor-positive cells compared to αKO tumors.

      b. Flow cytometric analysis to further characterize the myeloid compartment is incomplete (single replicate) and does not strengthen the argument that p-aKO TME is more immunosuppressive. It could, however, strengthen the argument that TIL has less anti-tumor potential if effector molecule expression in CD8+ infiltrating cells were quantified.

      We agree that including more tumor samples will strengthen the argument that p-αKO TME is more immunosuppressive. Future studies need to be done to characterize CD8+ T cells.

      (4) Inhibition of PD1/PD-L1 checkpoint leads to elimination of most p-αKO tumors

      a. It is reasonable to conclude that p-aKO tumors are responsive to immune checkpoint blockade. However, there is no data presented to support the statement that checkpoint blockade reactivates an existing anti-tumor CD8+ T cell response and does not induce a de novo response.

      We agree that future studies exploring the clonotypes of T cells infiltrating tumors in PD-1-treated mice are necessary to determine whether the observed T cell response represents reactivation of existing clones, a de novo response, or a combination of both.

      b. The discussion of these data implies that anti-PD-1 would not improve aKO tumor control, but these data are not included. As such, it is difficult to compare the therapeutic response in aKO versus p-aKO. Further, these data are at best an indirect comparison of the T cell responsiveness against tumor, as the only direct comparison is infiltrating cell count in Figure 4 and there are no public TCR clones with confirmed anti-tumor specificity to follow in the aKO versus p-aKO response.

      Since αKO tumors completely regress with 100% animal survival, we deemed anti-PD1 treatment in this group unnecessary. While we did assess anti-PD1 treatment in KPC-implanted mice, no survival benefit was observed (data not shown). The p-αKO tumor model was the only one in which anti-PD1 treatment improved survival. The complexity of the in vivo tumor microenvironment likely contributes to the lack of shared TCR clones between αKO and p-αKO tumors, even within the same tumor group. Future studies aimed at identifying tumor-specific clones may involve transferring in vivo models to in vitro assays or the generation of novel mouse strains expressing identified TCRs. However, these approaches require substantial time and resources and are beyond the scope of the present study.

      Reviewer 2:

      Weaknesses:

      (1) A major issue is that it seems these data are based on the use of a single tumor cell clone with PIK3CA deleted. Therefore, there could be other changes in this clone in addition to the deletion of PIK3CA that could contribute to the phenotype.

      We have previously tested a different KPC cell line (DT10022) with genetically downregulated PIK3CA and found mice implanted with αKO cells also showed tumor regression. However, we have not tested if deletion of Pccb in the DT10022-aKO cell line will have the same effect.

      (2) The conclusion that the change in the PCCB-deficient tumor cell line is unrelated to mitochondrial metabolic changes may be incorrect based on the data provided. While it is true that in the experiments performed, there was no statistically significant change in the oxygen consumption rate or metabolite levels, this could be due to experimental error. There is a trend in the OCR being higher in the PCCB-deficient cells, although due to a high standard deviation, the change is not statistically significant. There is also a trend for there being more aKG in this cell line, but because there were only 3 samples per cell line, there is no statistically significant difference.

      Although PCCB is known to cause metabolic changes, in the context of this study, we are comparing PCCB-deficient to PCCB & PIK3CA double-deficient cells. We did not address if PCCB loss alone would cause metabolic alteration. We suspect that is the case.

      (3) More data are required to support the authors' conclusion that there are myeloid changes in the PCCB-deficient tumor cells. Flow data are shown from only one tumor of each type.

      We agree that including more tumor samples will strengthen the argument that p-αKO TME is more immunosuppressive.

      (4) The previous published study demonstrated increased MHC and CD80 expression in the PIK3CA-deficient tumors and these differences were suggested to be the reason the tumors were rejected. However, no data concerning the levels of these proteins were provided in the current manuscript.

      Our previous hypothesis for altered MHC and CD80 levels is based on the observation that there is a dramatic increase in the number of infiltrating T cells upon Pik3ca deletion. In this study, similar levels of infiltrating T cells were observed when Pccb was deleted in αKO cells; therefore, we do not expect any changes in MHC and CD80 levels, since these tumors appear to still be recognized by the T cells. Indeed, we are able to detect clonal enrichment in p-αKO tumors.

      Reviewer 3:

      Weaknesses:

      The IHC technique that was used to stain and characterize the exhaustion status of the tumor-infiltrating T cells.

      We agree with the reviewer that incorporating multi-color IHC or flow cytometry to characterize the exhaustion status of specific T cell subtypes would provide more comprehensive information. Unfortunately, we do not have the resources to perform these studies currently.

    1. eLife Assessment

      This is a valuable study, tackling the long-standing issue of the difficulty in imaging the inferior olive and addressing the most relevant questions with a rigorous approach. The technological advance allowed the authors to generate solid experimental evidence with high-quality data. The results are presented clearly and the analyses are rigorous.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript by Guo and Uusisaari describes a series of experiments that employ a novel approach to address long-standing questions on the inferior olive in general and the role of the nucleo-olivary projection specifically. For the first time, they optimized the ventral approach to the inferior olive to facilitate imaging in this area that is notoriously difficult to reach. Using this approach, they are able to compare activity in two olivary regions, the PO and DAO, during different types of stimulation. They demonstrate the difference between the two regions, linked to Aldoc-identities of downstream Purkinje cells, and that there is co-activation resulting in larger events when they are clustered. Periocular stimulation also drives larger events, related to co-activation. Using optogenetic stimulation they activate the nucleo-olivary (N-O) tract and observe a wide range of responses, from excitation to inhibition. Zooming in on inhibition they test the assumption that N-O activation can be responsible for suppression of sensory-evoked events. Instead, they suggest that the N-O input can function to suppress background activity while preserving the sensory-driven responses.

      Strengths:

      This is an important study, tackling the long-standing issue of the impossibility to do imaging in the inferior olive and using that novel method to address the most relevant questions. The experiments are technically very challenging, the results are presented clearly and the analysis is quite rigorous. There is quite a lot of room for interpretation, see weaknesses, but the authors make an effort to cover many options.

      Weaknesses:

      The heavy anesthesia that is required during the experiment could severely impact the findings. Because of the anesthesia, the firing rate of IO neurons is found to be ~0.1 Hz, significantly lower than the 1 Hz found in non-anesthetized mice. This is mentioned and discussed, but the potential consequences cannot be overstated and should be addressed more. Although the methods and results are described in sufficient detail, there are a few points that, when addressed, would improve the manuscript.

    3. Reviewer #2 (Public review):

      The authors developed a strategy to image inferior olive somata via viral GCaMP6s expression, an implanted GRIN lens, and a one-photon head-mounted microscope, providing the first in vivo somatic recordings from these neurons. The main new findings relate to the activation of the nucleo-olivary pathway, specifically that this manipulation does not produce a spiking rebound in the IO, and that it exerts a larger effect on spontaneous IO spiking than on stimulus (airpuff)-evoked spiking. In addition, several findings previously demonstrated in vivo in Purkinje cell complex spikes or inferior olivary axons are confirmed here in olivary somata: differences in event sizes from single cells versus co-activated cells; reduced coactivation when activating the NO pathway; more coactivation within a single zebrin compartment.

      The study presents some interesting findings, and for the most part, the analyses are appropriate. My two principal critiques are that the study does not acknowledge major technical limitations and their impact on the claims; and the study does not accurately represent prior work with respect to the current findings.

      Several significant technical limitations necessarily impact the veracity of several of the claims:

      (1) The authors use GCaMP6s, which has a tau_1/2 of >1 s for a normal spike, and probably closer to 2 s (10.1038/nature12354) for the unique and long type of olivary spikes that give rise to axonal bursts (10.1016/j.neuron.2009.03.023). Indeed, the authors demonstrate as much (Fig. 2B1). This affects at least several claims:

      a. The authors report spontaneous spike rates of 0.1 Hz. They attribute this to anesthesia, yet other studies under anesthesia recording Purkinje complex spikes via either imaging or electrophysiology report spike rates as high as 1.5 Hz (10.1523/JNEUROSCI.2525-10.2011). This discrepancy is not acknowledged and a plausible explanation is not given. Citations are not provided that demonstrate such low anesthetized spike rates, nor are citations provided for the claim that spike rates drop increasingly with increasing levels of anesthesia when compared to awake resting conditions. More likely, this discrepancy reflects spikes that are missed due to a combination of the indicator kinetics and low imaging sensitivity (see (2)), neither of which are presented as possible plausible alternative explanations.

      b. Many claims are made throughout about co-activation ("clustering"), but with the GCaMP6s rise time to peak (0.5 s), there is little technical possibility to resolve co-activation. This limitation is not acknowledged as a caveat and the implications for the claims are not engaged with in the text.

      c. The study reports an ultralong "refractory period" (L422-etc) in the IO, but this again must be tempered by the possibility that spikes are simply being missed due to very slow indicator kinetics and limited sensitivity. Indeed, the headline numeric estimate of 1.5 s (L445) is suspiciously close to the underlying indicator kinetic limitation of ~1-2 s.

      (2) The study uses endoscopic one-photon miniaturized microscope imaging. Realistically, this is expected to permit an axial point spread function (z-PSF) on the order of ~40um, which must substantially reduce resolution and sensitivity. This means that if there *is* local coactivation, the data in this study will very likely have individual ROIs that integrate signals from multiple neighboring cells. The study reports relationships between event magnitude and clustering, etc; but a fluorescence signal that contains photons contributed by multiple neighboring neurons will be larger than a single neuron, regardless of the underlying physiology - the text does not acknowledge this possibility or limitation.

      Second, the text makes several claims for the first multicellular in vivo olivary recordings. (L11; L324, etc). I am aware of at least two studies that have recorded populations of single olivary axons using two-photon Ca2+ imaging up to 6 years ago (10.1016/j.neuron.2019.03.010; 10.7554/eLife.61593). This technique is not acknowledged or discussed, and one of these studies is not cited. No argument is presented for why axonal imaging should not "count" as multicellular in vivo olivary recording: axonal Ca2+ reflects somatic spiking.

    4. Author response:

      Reviewer #1 (Public review):

      Summary:

      This manuscript by Guo and Uusisaari describes a series of experiments that employ a novel approach to address long-standing questions on the inferior olive in general and the role of the nucleo-olivary projection specifically. For the first time, they optimized the ventral approach to the inferior olive to facilitate imaging in this area that is notoriously difficult to reach. Using this approach, they are able to compare activity in two olivary regions, the PO and DAO, during different types of stimulation. They demonstrate the difference between the two regions, linked to Aldoc-identities of downstream Purkinje cells, and that there is co-activation resulting in larger events when they are clustered. Periocular stimulation also drives larger events, related to co-activation. Using optogenetic stimulation they activate the nucleo-olivary (N-O) tract and observe a wide range of responses, from excitation to inhibition. Zooming in on inhibition they test the assumption that N-O activation can be responsible for suppression of sensory-evoked events. Instead, they suggest that the N-O input can function to suppress background activity while preserving the sensory-driven responses.

      Strengths:

      This is an important study, tackling the long-standing issue of the impossibility to do imaging in the inferior olive and using that novel method to address the most relevant questions. The experiments are technically very challenging, the results are presented clearly and the analysis is quite rigorous. There is quite a lot of room for interpretation, see weaknesses, but the authors make an effort to cover many options.

      Weaknesses:

      The heavy anesthesia that is required during the experiment could severely impact the findings. Because of the anesthesia, the firing rate of IO neurons is found to be ~0.1 Hz, significantly lower than the 1 Hz found in non-anesthetized mice. This is mentioned and discussed, but the potential consequences cannot be overstated and should be addressed more. Although the methods and results are described in sufficient detail, there are a few points that, when addressed, would improve the manuscript.

      We sincerely thank the reviewer for their encouraging comments and recognition of our study’s significance. We fully acknowledge the confounding effects of the deep anesthesia used in our experiments, which was necessary to ensure the animals’ welfare while establishing this technically demanding methodology. We elaborate on these effects below and will further clarify them in the revised manuscript.

      Ultimately, the full resolution of this issue will require recordings in awake animals, as we consider our approach an advancement from acute slice preparations but not yet a complete representation of in vivo IO function. However, key findings from our study—such as amplitude modulation with co-activation and the potential role of IO refractoriness in complex spike generation—could be further explored in existing cerebellar cortical recordings from awake, behaving animals. We hope our work will motivate re-examination of such datasets to assess whether these mechanisms contribute to overall cerebellar function.

      Reviewer #1 (Recommendations for the authors):

      On page 10 the authors indicate that 2084 events were included for DAO and 1176 for PO. Is that the total number of events? What was the average and the range per neuron and the average recording duration?

      Thank you for pointing out the lack of clarity. The sentence should say "in total, 2084 and 1176 detected events from DAO and PO were included in the study". We will add the averages and ranges of events detected per neuron in different categories, as well as the durations of the recordings (ranging from 120 s to 270 s), to the tables.

      On page 10 it is also stated that: "events in PO reached larger values than those in DAO even though the average values did not differ". Please clarify that statement. Which parameter + p-value in the table indicates this difference?

      Apologies for the omission. Currently the observation is only visible in the longer tail to the right in the PO data in Figure 2B2. We will add the range of values (3.0-75.2 vs 3.1-39.6 for PO and DAO amplitudes, respectively) to the text and the tables in the revision.

      Abbreviating airpuff to AP is confusing, I would suggest not abbreviating it.

      Understood. We will change AP to airpuff in the text. In figure labels, at least in some panels, the abbreviation will be necessary due to space constraints.

      What type of pulse was used to drive ChrimsonR? Could it be that the pulse caused a rebound-like phenomenon with the pulse duration that drove the excitation?

      As described on line 229 and in the Methods, we used 5-second trains of 5-ms LED light pulses. Importantly, these stimulation parameters were informed by our extensive in vitro examination of various stimulation patterns (Lefler et al., 2014), which consistently produced stable postsynaptic responses without inducing depolarization or rebound effects. Additionally, Loyola et al. (2024) reported no evidence of rebound activity in IO cells following optogenetic activation of N-O axons in the absence of direct neuronal depolarization. We will incorporate these considerations into the discussion, while also acknowledging that unequivocal confirmation of “direct” rebound excitation would require intracellular recordings, such as patch clamp experiments.

      The authors indicate that the excitatory activity was indistinguishable in shape from other calcium activity, but can anything be said about the timing (the scale bar in Figure 4A2 has no value, is it the same 2s pulse)?

      Apologies for the oversight in labeling the scale bar in Figure 4A2 (it is 2 s). While we deliberately refrain from making strong claims regarding the origin of the NO-evoked spikes, their timing can be examined in more detail in Figure 4 - Supplement 1, panels C and D. We will make sure this is clearly stated in the revised text.

      Did the authors check for accidental sparse transfection with ChrimsonR of olivary neurons in the post-mortem analysis?

      Good point! However, we have never seen this AAV9-based viral construct drive trans-synaptic expression in the IO, nor is this version of AAV known to have the capacity for trans-synaptic expression in general.

      No sign of retrograde labeling (via the CF collaterals in the cerebellar nuclei) was seen either. Notably, the hSyn promoter used to drive ChrimsonR expression is extremely ineffective in the IO. Thus, we doubt that such accidental labeling could underlie the excitatory events seen upon N-O stimulation. We will add these mentions with relevant references to the discussion of the revised manuscript.

      On page 18 the authors state that: "The lower SS rate was attributed to intrinsic factors of PNs, while the reduced frequency of CSs was speculated to result from increased inhibition of the IO via the nucleo-olivary (N-O) pathway targeting the same microzone." I think I understand what you mean to say, but this is a bit confusing.

      Agreed. We will rephrase this sentence to clarify that a lower SS rate in a given microzone may lead to increased activation of inhibitory N-O axons that target the region of IO that sends CF to the same microzone.

      Is airpuff stimulation not more likely to activate PO than DAO because of the related modalities (more face vs. more trunk/limbs), and thereby also more likely to drive event co-activation (as stated in the abstract)?

      We agree that the specific innervation patterns of different IO regions likely explain the discrepancy between previous reports of airpuff-evoked complex spikes in cerebellar cortical regions targeted by DAO and the absence of airpuff responses in the particular region of DAO accessible via our surgical approach. As in the present dataset virtually no airpuff-evoked events were seen in DAO regions, we are unable to directly compare airpuff-evoked event co-activation between PO and DAO. The higher co-activation for PO was observed for "spontaneous" activity.

      The Discussion addresses the question of why N-O pathway activation does not remove the airpuff response.

      Given the potentially profound effect, I would propose to expand the discussion on the role of anesthesia, including longer refractory periods but also potential disruption of normal network interactions (even though individually the stimulations work). Briefly indicating what is known about alpha-chloralose would help interpret the results as well.

      We fully agree that the anesthetic state introduces confounding factors that must be considered when interpreting our results. We will expand the discussion to address how anesthesia, particularly alpha-chloralose, as well as tissue cooling, may contribute to prolonged refractory periods and potential disruptions in normal network interactions. However, we recognize that certain aspects cannot be fully resolved without recordings in awake animals. For this reason, we characterize our preparation as an "upgraded" in vitro approach rather than a fully representative in vivo model.

      Please clearly indicate that the age range of P35-45 is for the moment of virus injection and specify the age range for the imaging experiment.

      Apologies for the oversight. We will indicate these age ranges in the Results (they are currently only specified in the Methods). The P35-45 range refers to the moment of virus injection.

      The methods indicate that a low-pass filter of 1 Hz was used. I am sure this helps with smoothing, but does it not remove a lot of potentially interesting information? How would a higher low-pass filter affect the analysis and results?

      We acknowledge that applying a 1 Hz low-pass filter inevitably removes high-frequency components, including potential IO oscillations and fine details such as spike "doublets." However, given the temporal resolution constraints of our recording approach, we prioritized capturing robust, interpretable events over attempting to extract finer features that might be obscured by both the indicator kinetics and imaging speed.

      While a higher cut-off frequency could, in principle, allow more precise measurement of rise times and peak timings, it would also amplify high-frequency noise, complicating automated event detection and reducing confidence in distinguishing genuine neural signals from artifacts. Given these trade-offs, we opted for a conservative filtering approach to ensure stable event detection. Future work, particularly with faster imaging rates and improved sensors (GCaMP8s), will explore the finer temporal structure of IO activity. We will deliberate on these matters more extensively in the revised discussion.
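      The filtering trade-off can be illustrated with a minimal sketch. It assumes the 20 fps sampling rate mentioned elsewhere in this response and uses a first-order (single-pole) low-pass filter purely as a stand-in; the filter type actually used in the manuscript is not stated here, and all function and variable names are hypothetical. The sketch only shows that a 1 Hz cut-off passes slow components almost unchanged while strongly attenuating fast ones.

```python
import math

def lowpass_first_order(samples, cutoff_hz, fs_hz):
    """Single-pole IIR low-pass filter (exponential smoothing).

    Assumption: a stand-in for whatever filter the authors used.
    """
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor in (0, 1)
    out = []
    y = samples[0]
    for x in samples:
        y = y + alpha * (x - y)
        out.append(y)
    return out

def peak_to_peak(samples):
    return max(samples) - min(samples)

FS = 20.0      # frames per second (sampling rate quoted in the response)
CUTOFF = 1.0   # Hz, the cut-off under discussion
N = 400        # 20 s of samples
t = [i / FS for i in range(N)]

# Synthetic test signals (illustrative only, not recorded data).
slow = [math.sin(2 * math.pi * 0.2 * ti) for ti in t]  # 0.2 Hz component
fast = [math.sin(2 * math.pi * 5.0 * ti) for ti in t]  # 5 Hz component

slow_filtered = lowpass_first_order(slow, CUTOFF, FS)
fast_filtered = lowpass_first_order(fast, CUTOFF, FS)

# Compare amplitudes over the second half to skip the start-up transient.
half = N // 2
slow_gain = peak_to_peak(slow_filtered[half:]) / peak_to_peak(slow[half:])
fast_gain = peak_to_peak(fast_filtered[half:]) / peak_to_peak(fast[half:])

print(f"gain at 0.2 Hz: {slow_gain:.2f}")
print(f"gain at 5 Hz:   {fast_gain:.2f}")
```

      Raising `CUTOFF` in this sketch lets more of the fast component through, which is the same mechanism by which a higher cut-off would admit both finer temporal detail and more high-frequency noise.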

      Reviewer #2 (Public review):

      The authors developed a strategy to image inferior olive somata via viral GCaMP6s expression, an implanted GRIN lens, and a one-photon head-mounted microscope, providing the first in vivo somatic recordings from these neurons. The main new findings relate to the activation of the nucleoolivary pathway, specifically that: this manipulation does not produce a spiking rebound in the IO; it exerts a larger effect on spontaneous IO spiking than stimulus (airpuff)-evoked spiking. In addition, several findings previously demonstrated in vivo in Purkinje cell complex spikes or inferior olivary axons are confirmed here in olivary somata: differences in event sizes from single cells versus co-activated cells; reduced coactivation when activating the NO pathway; more coactivation within a single zebrin compartment.

      The study presents some interesting findings, and for the most part, the analyses are appropriate. My two principal critiques are that the study does not acknowledge major technical limitations and their impact on the claims; and the study does not accurately represent prior work with respect to the current findings.

      We thank the reviewer for recognising the value of the findings in our "reduced" in vivo preparation, and apologize for omissions in the work that led to critique. We will elaborate on these matters below and prepare a revised manuscript.

      The authors use GCaMP6s, which has a tau_1/2 of >1 s for a normal spike, and probably closer to 2 s (10.1038/nature12354) for the unique and long type of olivary spikes that give rise to axonal bursts (10.1016/j.neuron.2009.03.023). Indeed, the authors demonstrate as much (Fig. 2B1). This affects at least several claims:

      a. The authors report spontaneous spike rates of 0.1 Hz. They attribute this to anesthesia, yet other studies under anesthesia recording Purkinje complex spikes via either imaging or electrophysiology report spike rates as high as 1.5 Hz (10.1523/JNEUROSCI.2525-10.2011). This discrepancy is not acknowledged and a plausible explanation is not given. Citations are not provided that demonstrate such low anesthetized spike rates, nor are citations provided for the claim that spike rates drop increasingly with increasing levels of anesthesia when compared to awake resting conditions.

      We fully acknowledge that anesthesia is a major confounding factor in our study. Given the unusually invasive nature of our surgical preparation, we prioritized deep anesthesia to ensure the animals’ welfare. This, along with potential cooling effects from tissue removal and GRIN lens contact, likely contributed to the observed suppression of IO activity.

      We recognize that reported complex spike rates under anesthesia vary considerably across studies, and we will expand our discussion to provide a more comprehensive comparison with prior literature. Notably, different anesthetic protocols, levels of anesthesia, and recording methodologies can lead to widely different estimates of firing rates. While we cannot resolve this issue without recordings in awake animals, we will clarify that our observed rates likely reflect both the effects of anesthesia and specific methodological constraints. We will also incorporate additional references to studies examining cerebellar activity under different anesthetic conditions.

      More likely, this discrepancy reflects spikes that are missed due to a combination of the indicator kinetics and low imaging sensitivity (see (2)), neither of which are presented as possible plausible alternative explanations.

      We acknowledge that the combination of slow indicator kinetics and limited optical power in our miniature microscope setup constrains the temporal resolution of our recordings. However, we are confident that we can reliably detect events occurring at intervals of 1 second or longer. This confidence is based on data from another preparation using the same viral vector and optical system, where we observed spike rates an order of magnitude higher.

      That said, we do not make claims regarding the presence or absence of somatic events occurring at very short intervals (e.g., 100-ms "doublets," as described by Titley et al., 2019), as these would likely fall below our temporal resolution. We will clarify this limitation in the revised manuscript to ensure that the constraints of our approach are fully acknowledged.

      While GCaMP6s is not as sensitive as more recent variants (Zhang et al., 2023, PMID 36922596), our previous work (Dorgans et al., 2022) demonstrated that its dynamic range and sensitivity are sufficient to detect both spikes and subthreshold activity in vitro. Although the experimental conditions differ in the current miniscope experiments, we took measures to optimize signal quality, including excluding recordings with a low signal-to-noise ratio (see Methods). This need for high signal fidelity also informed our decision to limit the sampling rate to 20 fps. In future work, we plan to adopt newer GCaMP variants that were not available at the start of this project, which should further improve sensitivity and temporal resolution.

      Many claims are made throughout about co-activation ("clustering"), but with the GCaMP6s rise time to peak (0.5 s), there is little technical possibility to resolve co-activation. This limitation is not acknowledged as a caveat and the implications for the claims are not engaged with in the text.

      As noted in the manuscript (L492-), "interpreting fluorescence signals relative to underlying voltage changes is challenging, particularly in IO neurons with unusual calcium dynamics." We acknowledge that the slow rise time of GCaMP6s (~0.5 s) limits our ability to precisely resolve the timing of co-activation at very short intervals. However, given the relatively slow timescales of IO event clustering and the inherent synchrony in olivary network dynamics, we believe that the observed co-activation patterns remain meaningful, even if finer temporal details cannot be fully resolved.

      To ensure clarity, we will expand this section to explicitly acknowledge the temporal resolution limitations of our approach and discuss their implications for interpreting co-activation. While the precise timing of individual spikes within a cluster may not be resolvable, the observed increase in event magnitude with coarse co-activation suggests that clustering effects remain functionally relevant even when exact spike synchrony is not detectable at millisecond resolution.

      This finding is consistent with the idea that co-activation enhances calcium influx, leading to larger amplitude events — a relationship that does not require perfect temporal resolution to be observed. The fact that this effect persists across a broad range of clustering windows (as shown in Figure 2 Supplement 2) further supports its robustness. While we cannot make strong claims about precise spike timing within these clusters nor about the mechanism underlying enhanced calcium signal, our results demonstrate that co-activation may influence IO activity in a quantifiable way. We will clarify these points in the revised manuscript to ensure that our findings are appropriately framed given the temporal constraints of our imaging approach.
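      To make the notion of a "clustering window" concrete, the following is a minimal sketch of how co-activation could be counted for different window widths. The definition used here (an event counts as co-active with another cell if that cell has any event within ± the window) is an assumption for illustration and is not taken from the manuscript; the event times and names are hypothetical.

```python
def coactivation_counts(event_times_by_cell, window_s):
    """For each event, count how many OTHER cells have an event
    within +/- window_s seconds (assumed definition, for illustration)."""
    counts = {}
    for cell, times in event_times_by_cell.items():
        cell_counts = []
        for t in times:
            n = sum(
                1
                for other, other_times in event_times_by_cell.items()
                if other != cell
                and any(abs(t - u) <= window_s for u in other_times)
            )
            cell_counts.append(n)
        counts[cell] = cell_counts
    return counts

# Hypothetical event times in seconds, not recorded data.
events = {
    "cell_a": [1.0, 10.0],
    "cell_b": [1.2, 20.0],
    "cell_c": [1.9, 10.4],
}

narrow = coactivation_counts(events, window_s=0.25)
wide = coactivation_counts(events, window_s=1.0)

print("narrow window:", narrow)
print("wide window:  ", wide)
```

      With the narrow window only the closely spaced events of `cell_a` and `cell_b` are grouped, whereas the wide window also pulls in `cell_c`. Repeating an amplitude comparison across such window widths is one way an effect could be checked for robustness to the temporal resolution of the indicator.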

      The study reports an ultralong "refractory period" (L422-etc) in the IO, but this again must be tempered by the possibility that spikes are simply being missed due to very slow indicator kinetics and limited sensitivity. Indeed, the headline numeric estimate of 1.5 s (L445) is suspiciously close to the underlying indicator kinetic limitation of ~1-2 s.

      Our findings suggest a potential refractory period limiting the frequency of events in the inferior olive under our recording conditions. This interpretation is supported by the observed inter-event interval distribution, the inability of N-O stimulation to suppress airpuff-evoked events, and lower bounds reported in earlier literature on complex spike intervals recorded in awake animals under various behavioral contexts. Taking into account the likely cooling of tissue, a refractory period of 1.5 s is not unreasonable. Of course, we recognize that the slow decay kinetics of GCaMP6s may cause overlapping fluorescence signals, potentially obscuring closely spaced events. This is in line with data presented in the Chen et al. 2013 manuscript describing GCaMP6s (PMID: 36922596; Figure 3b showing events detected with intervals less than 500 ms).

      The consideration of refractoriness only arose late in the project while we were investigating the explanations for lack of inhibition of airpuff-evoked spikes. Future experiments, particularly in awake animals, will be instrumental in validating this interpretation. To ensure that the refractory period is understood as one possible mechanism rather than a definitive explanation, we will rephrase the discussion to clarify that while our data are compatible with a refractory period, they do not establish it conclusively.

      The study uses endoscopic one-photon miniaturized microscope imaging. Realistically, this is expected to permit an axial point spread function (z-PSF) on the order of ~40um, which must substantially reduce resolution and sensitivity. This means that if there *is* local coactivation, the data in this study will very likely have individual ROIs that integrate signals from multiple neighboring cells. The study reports relationships between event magnitude and clustering, etc; but a fluorescence signal that contains photons contributed by multiple neighboring neurons will be larger than a single neuron, regardless of the underlying physiology - the text does not acknowledge this possibility or limitation.

      We acknowledge that the use of one-photon endoscopic imaging imposes limitations on axial resolution, potentially leading to signal contributions from neighboring neurons. To mitigate this, we applied CNMFe processing, which allows for the deconvolution of overlapping signals and the differentiation of multiple neuronal sources within shared pixels. However, as the reviewer points out, if two neurons are perfectly overlapping in space, they may be treated as a single unit.

      To clarify this limitation, we will expand the discussion to explicitly acknowledge the impact of one-photon imaging on signal separation and to emphasize that, while CNMFe helps resolve some overlaps, perfect separation is not always possible. As already noted in the manuscript (L495-), "the absence of optical sectioning in the whole-field imaging method can lead to confounding artifacts in densely labeled structures such as the IO’s tortuous neuropil." We will further elaborate on how this factor was considered in our analysis and interpretation.

      Second, the text makes several claims for the first multicellular in vivo olivary recordings. (L11; L324, etc).

      I am aware of at least two studies that have recorded populations of single olivary axons using two-photon Ca2+ imaging up to 6 years ago (10.1016/j.neuron.2019.03.010; 10.7554/eLife.61593). This technique is not acknowledged or discussed, and one of these studies is not cited. No argument is presented for why axonal imaging should not "count" as multicellular in vivo olivary recording: axonal Ca2+ reflects somatic spiking.

      We appreciate the reviewer’s point and acknowledge the important prior work using two-photon imaging to record olivary axonal activity in the cerebellar cortex. However, while axonal calcium signals do reflect somatic spiking, these recordings inherently lack information about the local network interactions within the inferior olive itself.

      A key motivation for our study was to observe neuronal activity within the IO at the level of its gap-junction-coupled local circuits, rather than at the level of its divergent axonal outputs. The fan-like spread of climbing fibers across rostrocaudal microzones in the cerebellar cortex makes them relatively easy to record in vivo, but it also means that individual imaging fields contain axons from neurons that may be distributed across different IO microdomains. As a result, while previous work has provided valuable insight into olivary output patterns, it has not allowed for the examination of coordinated somatic activity within localized IO neuron clusters.

      With apologies, we recognize that this distinction was not sufficiently emphasized in our introduction. We will clarify this key point and ensure that the important climbing fiber imaging studies are properly cited and contextualized in the revised manuscript.

      Reviewer #2 (Recommendations for the authors):

      The authors state: "we found no reports that examined coactivation levels between Z+ and Z- microzones in cerebellar complex spike recordings" (L359). Multiple papers (that are not cited) using AldolaseC-tdTomato mice with two-photon Purkinje dendritic calcium imaging showed synchronization (at similar levels) within but not across z+/z- bands (2015, 10.1523/JNEUROSCI.2170-14.2015; 2023, https://doi.org/10.7554/eLife.86340).

      We apologize for the misleading phrasing. We will rephrase this statement to: "While complex spike coactivation within individual zebrin zones has been extensively studied (references), we found no reports directly comparing the levels of intra-zone co-activation between Z+ and Z- microzones."

      Additionally, we will ensure that the relevant studies demonstrating synchronization within zebrin zones, as well as (lack of) interactions between neighboring zones, are properly cited and discussed in the revised manuscript.

      The figures could use more proofreading, and several decisions should be reconsidered:

      Normalizing the amplitude to maximum is not a good strategy, as it can overemphasize noise or extremely small-magnitude signals, and should instead follow standard convention and present in fixed units (3A2, 4B2, and even 2C).

      As noted earlier, we have excluded recordings and cells with high noise or a low signal-to-noise ratio for event amplitudes, ensuring that such data do not influence the color-coded panels. Importantly, all quantitative analyses and traces presented in the manuscript are normalized to baseline noise level, not to maximal amplitude, ensuring that noise or low-magnitude signals do not skew the analysis.

      The decision to use max-amplitude normalization in color-coded panels was made specifically to aid visualization of temporal structure across recordings. This approach allows for clearer comparisons without the distraction of inter-cell variability in absolute signal strength. However, we recognize the potential for confusion and will revise the Results text to explicitly clarify that the color-coded visualizations use a different scaling method than the quantitative analyses.
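      The difference between the two scalings discussed here can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the baseline is taken to be the first few samples of each trace, noise is estimated as the population standard deviation of that baseline, and the traces and function names are hypothetical.

```python
import statistics

def normalize_to_baseline_noise(trace, baseline_len):
    """Express a trace in units of baseline standard deviation.

    Assumption: the first `baseline_len` samples contain no events.
    """
    baseline = trace[:baseline_len]
    mu = statistics.mean(baseline)
    sigma = statistics.pstdev(baseline)
    return [(x - mu) / sigma for x in trace]

def normalize_to_max(trace):
    """Scale a trace so that its largest value equals 1."""
    peak = max(trace)
    return [x / peak for x in trace]

# Two hypothetical cells with identical baseline noise but different event sizes.
baseline = [1, -1, 1, -1, 1, -1, 1, -1]
cell_large = baseline + [20]  # one large event
cell_small = baseline + [4]   # one small event

noise_large = normalize_to_baseline_noise(cell_large, len(baseline))
noise_small = normalize_to_baseline_noise(cell_small, len(baseline))
max_large = normalize_to_max(cell_large)
max_small = normalize_to_max(cell_small)

print("peak in noise units:", max(noise_large), max(noise_small))
print("peak after max scaling:", max(max_large), max(max_small))
print("baseline excursion after max scaling:", max_large[0], max_small[0])
```

      In noise units the two events keep their fivefold difference in size, whereas after max-amplitude scaling both peaks equal 1 and the baseline fluctuations of the small-event cell appear five times larger. This is why the two scalings suit different purposes: one for quantitative comparison, the other for visualizing temporal structure.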

      x axes with no units: Figures 2B2, 2E1, 3B2, 3C2, 5B2, 5C2, 5D2.

      No colorbar units: 5A3 (and should be shown in real not normalized units).

      No y axis units: 5D1.

      No x axis label or units: 5E1.

      5E3 says "stim/baseline" for the y-axis units and then the first-panel title says "absolute frequencies" meaning it’s *not* normalized and needs a separate (accurate) y-axis with units.

      Illegibly tiny fonts: 2E1, 3E1, etc.

      We will correct all of these in the revised manuscript. Thank you for the careful reading.

    1. eLife Assessment

      This useful study presents findings on the developmental roles of Nup107, a key nucleoporin, in regulating the larval-to-pupal transition in Drosophila melanogaster through its involvement in ecdysone signaling. The evidence supporting the authors' claims is solid, with robust experimental approaches including RNAi knockdown and rescue experiments. The findings highlight Nup107's function in regulating ecdysone biosynthesis, specifically through the regulation of EcR levels and Halloween gene expression in the prothoracic gland; additionally, rescue experiments suggest that the RTK PTTH/Torso signaling pathway is disrupted upon Nup107 depletion, further emphasizing its role in ecdysone regulation. However, finding a mechanism, addressing potential off-target effects of RNAi, and exploring alternative mutant models would strengthen the findings as the currently proposed mechanism is not fully supported by the data.

    2. Reviewer #1 (Public review):

      This study provides a thorough analysis of Nup107's role in Drosophila metamorphosis, demonstrating that its depletion leads to developmental arrest at the third larval instar stage due to disruptions in ecdysone biosynthesis and EcR signaling. Importantly, the authors establish a novel connection between Nup107 and Torso receptor expression, linking it to the hormonal cascade regulating pupariation.

      However, some contradictory results weaken the conclusions of the study. The authors claim that Nup107 is involved in the translocation of EcR from the cytoplasm to the nucleus. However, the evidence provided in the paper suggests it more likely regulates EcR expression positively, as EcR is undetectable in Nup107-depleted animals, even below background levels. Additionally, the link between Nup107 and Torso is not fully substantiated. While overexpression of Torso appears to rescue the lack of 20E production in the prothoracic gland, the distinct phenotypes of Torso and Nup107 depletion (developmental delay in the former versus complete larval arrest in the latter) complicate understanding of Nup107's precise role.

      To clarify these discrepancies, further investigation into whether Nup107 interacts with other critical signaling pathways related to the regulation of ecdysone biosynthesis, such as EGFR or TGF-β, would be beneficial and could strengthen the findings.

      In summary, although the study presents some intriguing observations, several conclusions are not well-supported by the experimental data.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Kawadkar et al investigates the role of Nup107 in developmental progression via the regulation of ecdysone signaling. The authors identify an interesting phenotype of Nup107 whole-body RNAi depletion in Drosophila development - developmental arrest at the late larval stage. Nup107-depleted larvae exhibit mislocalization of the Ecdysone receptor (EcR) from the nucleus to the cytoplasm and reduced expression of EcR target genes in salivary glands, indicative of compromised ecdysone signaling. This mis-localization of EcR in salivary glands was phenocopied when Nup107 was depleted only in the prothoracic gland (PG), suggesting that it is not nuclear transport of EcR but the presence of ecdysone (normally secreted from PG) that is affected. Consistently, whole-body levels of ecdysone were shown to be reduced in Nup107 KD, particularly at the late third instar stage when a spike in ecdysone normally occurs. Importantly, the authors could rescue the developmental arrest and EcR mislocalization phenotypes of Nup107 KD by adding exogenous ecdysone, supporting the notion that Nup107 depletion disrupts biosynthesis of ecdysone, which arrests normal development. Additionally, they found that rescue of the Nup107 KD phenotype can also be achieved by over-expression of the receptor tyrosine kinase torso, which is thought to be the upstream regulator of ecdysone synthesis in the PG. Transcript levels of the torso are also shown to be downregulated in the Nup107KD, as are transcript levels of multiple ecdysone biosynthesis genes. Together, these experiments reveal a new role of Nup107 or nuclear pore levels in hormone-driven developmental progression, likely via regulation of levels of torso and torso-stimulated ecdysone biosynthesis.

      Strengths:

      The developmental phenotypes of an NPC component presented in the manuscript are striking and novel, and the data appears to be of high quality. The rescue experiments are particularly significant, providing strong evidence that Nup107 functions upstream of torso and ecdysone levels in the regulation of developmental timing and progression.

      Weaknesses:

      The underlying mechanism is however not clear, and any insight into how Nup107 may regulate these pathways would greatly strengthen the manuscript. Some suggestions to address this are detailed below.

      Major questions:

      (1) Determining how specific this phenotype is to Nup107 vs. to reduced NPC levels overall would give some mechanistic insight. Does knocking down other components of the Nup107 subcomplex (the Y-complex) lead to similar phenotypes? Given the published gene regulatory function of Nup107, do other gene regulatory Nups such as Nup98 or Nup153 produce these phenotypes?

      (2) In a related issue, does this level of Nup107 KD produce lower NPC levels? It is expected to, but actual quantification of nuclear pores in Nup107-depleted tissues should be added. These and the above experiments would help address a key mechanistic question - is this phenotype the result of lower numbers of nuclear pores or specifically of Nup107?

      (3) Additional experiments on how Nup107 regulates the torso would provide further insight. Does Nup107 regulate transcription of the torso or perhaps its mRNA export? Looking at nascent levels of the torso transcript and the localization of its mRNA can help answer this question. Or alternatively, does Nup107 physically bind the torso?

      (4) The depletion level of Nup107 RNAi specifically in the salivary gland vs. the prothoracic gland should be compared by RT-qPCR or western blotting.

      (5) The UAS-torso rescue experiment should also include the control of an additional UAS construct - so Nup107; UAS-control vs Nup107; UAS-torso should be compared in the context of rescue to make sure the Gal4 driver is functioning at similar levels in the rescue experiment.

      Minor:

      (6) Figures and figure legends can stand to be more explicit and detailed, respectively.

    4. Reviewer #3 (Public review):

      Summary:

      In this study by Kawadkar et al, the authors investigate the developmental role of Nup107, a nucleoporin, in regulating the larval-to-pupal transition in Drosophila through RNAi knockdown and CRISPR-Cas9-mediated gene editing. They demonstrate that Nup107, an essential component of the nuclear pore complex (NPC), is crucial for regulating ecdysone signaling during developmental transitions. The authors show that the depletion of Nup107 disrupts these processes, offering valuable insights into its role in development.

      Specifically, they find that:

      (1) Nup107 depletion impairs pupariation during the larval-to-pupal transition.

      (2) RNAi knockdown of Nup107 results in defects in EcR nuclear translocation, a key regulator of ecdysone signaling.

      (3) Exogenous 20-hydroxyecdysone (20E) rescues pupariation blocks, but rescued pupae fail to close.

      (4) Nup107 RNAi-induced defects can be rescued by activation of the MAP kinase pathway.

      Strengths:

      The manuscript provides strong evidence that Nup107, a component of the nuclear pore complex (NPC), plays a crucial role in regulating the larval-to-pupal transition in Drosophila, particularly in ecdysone signaling.

      The authors employ a combination of RNAi knockdown, CRISPR-Cas9 gene editing, and rescue experiments, offering a comprehensive approach to studying Nup107's developmental function.

      The study effectively connects Nup107 to ecdysone signaling, a key regulator of developmental transitions, offering novel insights into the molecular mechanisms controlling metamorphosis.

      The use of exogenous 20-hydroxyecdysone (20E) and activation of the MAP kinase pathway provides a strong mechanistic perspective, suggesting that Nup107 may influence EcR signaling and ecdysone biosynthesis.

      Weaknesses:

      The authors do not sufficiently address the potential off-target effects of RNAi, which could impact the validity of their findings. Alternative approaches, such as heterozygous or clonal studies, could help confirm the specificity of the observed phenotypes.

      NPC Complex Specificity: While the authors focus on Nup107, it remains unclear whether the observed defects are specific to this nucleoporin or if other NPC components also contribute to similar defects. Demonstrating similar results with other NPC components would strengthen their claims.

      Although the authors show that Nup107 depletion disrupts EcR signaling, the precise molecular mechanism by which Nup107 influences this process is not fully explored. Further investigation into how Nup107 regulates EcR nuclear translocation or ecdysone biosynthesis would improve the clarity of the findings.

      There are some typographical errors and overly strong phrases, such as "unequivocally demonstrate," which could be softened. Additionally, the presentation of redundant data in different tissues could be streamlined to enhance clarity and flow.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      This study provides a thorough analysis of Nup107's role in Drosophila metamorphosis, demonstrating that its depletion leads to developmental arrest at the third larval instar stage due to disruptions in ecdysone biosynthesis and EcR signaling. Importantly, the authors establish a novel connection between Nup107 and Torso receptor expression, linking it to the hormonal cascade regulating pupariation.

      However, some contradictory results weaken the conclusions of the study. The authors claim that Nup107 is involved in the translocation of EcR from the cytoplasm to the nucleus. However, the evidence provided in the paper suggests it more likely regulates EcR expression positively, as EcR is undetectable in Nup107-depleted animals, even below background levels.

      We appreciate the concern raised in this public review. However, we must clarify that we do not claim that Nup107 regulates the translocation of EcR from the cytoplasm. We posed this only as a hypothesis, asking whether Nup107 regulates EcR nuclear translocation (9th line of the 2nd paragraph on page 6). We have spelled this out more clearly in the title of the 3rd sub-section of the Results section and in the discussion (8th line of the 2nd paragraph on page 11). Overall, we have expressed surprise that Nup107 is not directly involved in the nuclear translocation of EcR.

      Ecdysone acts through EcR to induce transcription of EcR itself, creating a positive autoregulatory loop that enhances EcR levels through ecdysone signaling (1). Since Nup107 depletion leads to a reduction in ecdysone levels, it disrupts this transcriptional autoregulatory loop of EcR expression. This can contribute to the reduced EcR levels seen in Nup107-depleted animals.

      Additionally, the link between Nup107 and Torso is not fully substantiated. While overexpression of Torso appears to rescue the lack of 20E production in the prothoracic gland, the distinct phenotypes of Torso and Nup107 depletion (developmental delay in the former versus complete larval arrest in the latter) complicate understanding of Nup107's precise role.

      We understand that the developmental delays differ when Torso and Nup107 depletions are compared. However, the two molecules being compared here are very different, and the extent of Torso depletion is not evident in other studies (2). Even if the extent of depletion of Torso and Nup107 is similar, we believe that Nup107, being a more widely expressed protein, induces stronger defects owing to its importance in cellular physiology. We think that RNAi-mediated depletion of Nup107 causes a defect in 20E biosynthesis through the Halloween genes, inducing a developmental arrest.

      To clarify these discrepancies, further investigation into whether Nup107 interacts with other critical signaling pathways related to the regulation of ecdysone biosynthesis, such as EGFR or TGF-β, would be beneficial and could strengthen the findings.

      In summary, although the study presents some intriguing observations, several conclusions are not well-supported by the experimental data.

      We agree with the reviewer’s suggestion. As noted in the literature, five RTKs (torso, InR, EGFR, Alk, and Pvr) stimulate the PI3K/Akt pathway, which plays a crucial role in PG function and in controlling pupariation and body size (3). We have examined torso and EGFR signaling. Torso overexpression rescued the Nup107 defects; however, constitutively active EGFR (BL-59843) did not rescue the phenotype (data not shown). Nonetheless, we plan to examine EGFR pathway activation by measuring pERK levels in Nup107-depleted PGs.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by Kawadkar et al investigates the role of Nup107 in developmental progression via the regulation of ecdysone signaling. The authors identify an interesting phenotype of Nup107 whole-body RNAi depletion in Drosophila development - developmental arrest at the late larval stage. Nup107-depleted larvae exhibit mis-localization of the Ecdysone receptor (EcR) from the nucleus to the cytoplasm and reduced expression of EcR target genes in salivary glands, indicative of compromised ecdysone signaling. This mis-localization of EcR in salivary glands was phenocopied when Nup107 was depleted only in the prothoracic gland (PG), suggesting that it is not nuclear transport of EcR but the presence of ecdysone (normally secreted from PG) that is affected. Consistently, whole-body levels of ecdysone were shown to be reduced in Nup107 KD, particularly at the late third instar stage when a spike in ecdysone normally occurs. Importantly, the authors could rescue the developmental arrest and EcR mislocalization phenotypes of Nup107 KD by adding exogenous ecdysone, supporting the notion that Nup107 depletion disrupts biosynthesis of ecdysone, which arrests normal development. Additionally, they found that rescue of the Nup107 KD phenotype can also be achieved by over-expression of the receptor tyrosine kinase torso, which is thought to be the upstream regulator of ecdysone synthesis in the PG. Transcript levels of the torso are also shown to be downregulated in the Nup107KD, as are transcript levels of multiple ecdysone biosynthesis genes. Together, these experiments reveal a new role of Nup107 or nuclear pore levels in hormone-driven developmental progression, likely via regulation of levels of torso and torso-stimulated ecdysone biosynthesis.

      Strengths:

      The developmental phenotypes of an NPC component presented in the manuscript are striking and novel, and the data appears to be of high quality. The rescue experiments are particularly significant, providing strong evidence that Nup107 functions upstream of torso and ecdysone levels in the regulation of developmental timing and progression.

      Weaknesses:

      The underlying mechanism is however not clear, and any insight into how Nup107 may regulate these pathways would greatly strengthen the manuscript. Some suggestions to address this are detailed below.

      Major questions:

      (1) Determining how specific this phenotype is to Nup107 vs. to reduced NPC levels overall would give some mechanistic insight. Does knocking down other components of the Nup107 subcomplex (the Y-complex) lead to similar phenotypes? Given the published gene regulatory function of Nup107, do other gene regulatory Nups such as Nup98 or Nup153 produce these phenotypes?

      We thank the reviewer for raising this concern. When working with a Nup complex such as the Nup107 complex, this concern is anticipated but difficult to address, as many Nups function beyond their complex identity. Our observations with all other members of the Nup107 complex, including dELYS, suggest that, except for Nup107, none of the Nup107-complex members could induce larval developmental arrest.

      In this study, we primarily focused on the Nup107 complex (outer ring complex) of the NPC. We have not examined other nucleoporins outside of this complex, such as Nup98 and Nup153. However, previous studies have reported that Nup98 and Nup153 interact with chromatin, with these investigations conducted in Drosophila S2 cells (4, 5, 6). In the future, we may check whether Nup98 and Nup153 depletion can produce the arrest phenotype.

      (2) In a related issue, does this level of Nup107 KD produce lower NPC levels? It is expected to, but actual quantification of nuclear pores in Nup107-depleted tissues should be added. These and the above experiments would help address a key mechanistic question - is this phenotype the result of lower numbers of nuclear pores or specifically of Nup107?

      We agree with the concern raised here, and we plan to assess nucleoporin intensity in the Nup107-depletion background using the mAb414 antibody, which exclusively recognizes FG-repeat Nups. Our past observations suggest that Nup107 depletion does not affect overall nuclear pore complex assembly in Drosophila salivary glands (data not shown).

      (3) Additional experiments on how Nup107 regulates the torso would provide further insight. Does Nup107 regulate transcription of the torso or perhaps its mRNA export? Looking at nascent levels of the torso transcript and the localization of its mRNA can help answer this question. Or alternatively, does Nup107 physically bind the torso?

      While the concern regarding torso transcript levels is genuine, we have already reported in the manuscript that Nup107 levels directly regulate torso expression. When Nup107 is depleted, torso levels go down, which in turn controls ecdysone production and subsequent EcR signaling (Figure 6B of the manuscript). However, the exact nature of Nup107's regulation of torso expression is still unclear. Since Nup107 is known to interact with chromatin (7), it may affect torso transcription. A physiologically relevant interaction between Nup107 and torso in a cellular context is unlikely owing to their distinct sub-cellular localizations. Investigating this further would require a significant amount of time to obtain reagents and perform experiments, and it currently lies beyond the scope of this manuscript.

      (4) The depletion level of Nup107 RNAi specifically in the salivary gland vs. the prothoracic gland should be compared by RT-qPCR or western blotting.

      Although we know that the Nup107 protein signal is reduced in SG upon knockdown (Figure 3B), we have not compared the Nup107 transcript level in these two tissues (SG and PG). As suggested here, we will knock down Nup107 using SG and PG-specific drivers and quantify the Nup107 depletion level by RT-qPCR.

      (5) The UAS-torso rescue experiment should also include the control of an additional UAS construct - so Nup107; UAS-control vs Nup107; UAS-torso should be compared in the context of rescue to make sure the Gal4 driver is functioning at similar levels in the rescue experiment.

      This is a very valid point, and we took this into account while planning the experiment. To control for GAL4 function, we used Nup107<sup>KK</sup>;UAS-GFP as a control alongside Nup107<sup>KK</sup>;UAS-torso. This approach ensures that GAL4 dilution does not affect the observations made in the experiments. In Figure S7, the presence of GFP signal in the prothoracic glands, together with their reduced size, indicates that genes downstream of both UAS sequences are transcribed and that GAL4 dilution does not play a role here.

      Minor:

      (6) Figures and figure legends can stand to be more explicit and detailed, respectively.

      We will revisit all figures and their corresponding legends to ensure appropriate and explicit details are provided.

      Reviewer #3 (Public review):

      Summary:

      In this study by Kawadkar et al, the authors investigate the developmental role of Nup107, a nucleoporin, in regulating the larval-to-pupal transition in Drosophila through RNAi knockdown and CRISPR-Cas9-mediated gene editing. They demonstrate that Nup107, an essential component of the nuclear pore complex (NPC), is crucial for regulating ecdysone signaling during developmental transitions. The authors show that the depletion of Nup107 disrupts these processes, offering valuable insights into its role in development.

      Specifically, they find that:

      (1) Nup107 depletion impairs pupariation during the larval-to-pupal transition.

      (2) RNAi knockdown of Nup107 results in defects in EcR nuclear translocation, a key regulator of ecdysone signaling.

      (3) Exogenous 20-hydroxyecdysone (20E) rescues pupariation blocks, but rescued pupae fail to close.

      (4) Nup107 RNAi-induced defects can be rescued by activation of the MAP kinase pathway.

      Strengths:

      The manuscript provides strong evidence that Nup107, a component of the nuclear pore complex (NPC), plays a crucial role in regulating the larval-to-pupal transition in Drosophila, particularly in ecdysone signaling.

      The authors employ a combination of RNAi knockdown, CRISPR-Cas9 gene editing, and rescue experiments, offering a comprehensive approach to studying Nup107's developmental function.

      The study effectively connects Nup107 to ecdysone signaling, a key regulator of developmental transitions, offering novel insights into the molecular mechanisms controlling metamorphosis.

      The use of exogenous 20-hydroxyecdysone (20E) and activation of the MAP kinase pathway provides a strong mechanistic perspective, suggesting that Nup107 may influence EcR signaling and ecdysone biosynthesis.

      Weaknesses:

      The authors do not sufficiently address the potential off-target effects of RNAi, which could impact the validity of their findings. Alternative approaches, such as heterozygous or clonal studies, could help confirm the specificity of the observed phenotypes.

      This is a very valid point, and we are aware of the consequences of RNAi off-target effects. To confirm that the phenotypes reflect authentic RNAi effects and to reduce off-target effects, we used two RNAi lines (Nup107<sup>GD</sup> and Nup107<sup>KK</sup>) against Nup107. Both RNAi lines induced comparable levels of Nup107 reduction, and ubiquitous and PG-specific knockdown with these lines produced similar phenotypes. Although the Nup107<sup>GD</sup> line exhibited a relatively stronger knockdown than the Nup107<sup>KK</sup> line, we preferentially used the Nup107<sup>KK</sup> line because the Nup107<sup>GD</sup> line is based on a P-element insertion whose exact landing site is unknown. Furthermore, there is a predicted off-target for the Nup107<sup>GD</sup> line, where a 19 bp sequence aligns with the bifocal (bif) sequence. The bif-encoded protein is involved in axon guidance and regulation of axon extension. In contrast, the Nup107<sup>KK</sup> line has no predicted off-target, and we know its precise landing site on the second chromosome. Thus, the Nup107<sup>KK</sup> line was ultimately used in experimentation for its clearer and more reliable genetic background.

      We are also investigating Nup107 knockdown in the prothoracic gland, which exhibits polyteny. Additionally, the number of cells in the prothoracic gland is quite limited, approximately 50-60 cells (8). Given this, there is a possibility that a clonal study may not yield the phenotype. However, we will consider moving forward with this approach also.

      NPC Complex Specificity: While the authors focus on Nup107, it remains unclear whether the observed defects are specific to this nucleoporin or if other NPC components also contribute to similar defects. Demonstrating similar results with other NPC components would strengthen their claims.

      We thank the reviewer for raising this concern. When working with a Nup complex such as the Nup107 complex, this concern is anticipated but difficult to address, as many Nups function beyond their complex identity. Our observations with all other members of the Nup107 complex, including dELYS, suggest that, except for Nup107, none of the Nup107-complex members could induce larval developmental arrest. Since the study is primarily focused on the Nup107 complex (outer ring complex) of the NPC, we have not examined other nucleoporins outside of this complex.

      Although the authors show that Nup107 depletion disrupts EcR signaling, the precise molecular mechanism by which Nup107 influences this process is not fully explored. Further investigation into how Nup107 regulates EcR nuclear translocation or ecdysone biosynthesis would improve the clarity of the findings.

      We appreciate the concern raised. Based on our observations, we have proposed an upstream effect of Nup107 on the PTTH-torso-20E-EcR axis regulating developmental transitions. We know that Nup107 regulates torso levels, but we do not know if Nup107 directly interacts with torso. We would also like to address whether Nup107 exerts control over PTTH levels.

      We must emphasize that Nup107 does not directly regulate the translocation of EcR. On the contrary, we have demonstrated that EcR translocation is 20E dependent and Nup107 independent. Through our observations, we have argued that Nup107 regulates the expression of Halloween genes required for ecdysone biosynthesis. We are interested in identifying if Nup107 associates directly or through some protein to chromatin to bring about the changes in gene expression required for normal development.

      There are some typographical errors and overly strong phrases, such as "unequivocally demonstrate," which could be softened. Additionally, the presentation of redundant data in different tissues could be streamlined to enhance clarity and flow.

      We thank the reviewer for this observation. We will correct all typographical errors and soften overly strong statements so that they reflect our conclusions.

      References:

      (1) Varghese, Jishy, and Stephen M Cohen. “microRNA miR-14 acts to modulate a positive autoregulatory loop controlling steroid hormone signaling in Drosophila.” Genes & development vol. 21,18 (2007): 2277-82. doi:10.1101/gad.439807

      (2) Rewitz, Kim F et al. “The insect neuropeptide PTTH activates receptor tyrosine kinase torso to initiate metamorphosis.” Science (New York, N.Y.) vol. 326,5958 (2009): 1403-5. doi:10.1126/science.1176450

      (3) Pan, Xueyang, and Michael B O'Connor. “Coordination among multiple receptor tyrosine kinase signals controls Drosophila developmental timing and body size.” Cell reports vol. 36,9 (2021): 109644. doi:10.1016/j.celrep.2021.109644

      (4) Pascual-Garcia, Pau et al. “Metazoan Nuclear Pores Provide a Scaffold for Poised Genes and Mediate Induced Enhancer-Promoter Contacts.” Molecular cell vol. 66,1 (2017): 63-76.e6. doi:10.1016/j.molcel.2017.02.020

      (5) Pascual-Garcia, Pau et al. “Nup98-dependent transcriptional memory is established independently of transcription.” eLife vol. 11 e63404. 15 Mar. 2022, doi:10.7554/eLife.63404

      (6) Kadota, Shinichi et al. “Nucleoporin 153 links nuclear pore complex to chromatin architecture by mediating CTCF and cohesin binding.” Nature communications vol. 11,1 2606. 25 May. 2020, doi:10.1038/s41467-020-16394-3

      (7) Gozalo, Alejandro et al. “Core Components of the Nuclear Pore Bind Distinct States of Chromatin and Contribute to Polycomb Repression.” Molecular cell vol. 77,1 (2020): 67-81.e7. doi:10.1016/j.molcel.2019.10.017

      (8) Shimell, MaryJane, and Michael B O'Connor. “Endoreplication in the Drosophila melanogaster prothoracic gland is dispensable for the critical weight checkpoint.” microPublication biology vol. 2023. 21 Feb. 2023, doi:10.17912/micropub.biology.000741

    1. eLife Assessment

      This study investigates trial-by-trial inter-areal interactions in the visual cortex of the mouse and the monkey by analyzing two previously published datasets. The authors find that activity in one layer (in mice) or one area (in monkeys) can partially predict neural activity in another layer or area on the single-trial level in different experimental contexts. This valuable finding expands previously known contributions of stimulus-independent downstream activity to neural responses in the visual cortex by demonstrating how these change under varying visual stimuli as well as in the absence of visual stimulation. While the methodology is solid, the analysis for the monkey data is incomplete and would benefit from including a second animal.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors propose a "unifying method to evaluate inter-areal interactions in different types of neuronal recordings, timescales, and species". The method consists of computing the variance explained by a linear decoder that attempts to predict individual neural responses (firing rates) in one area based on neural responses in another area.

      The authors apply the method to previously published calcium imaging data from layer 4 and layers 2/3 of 4 mice over 7 days, and simultaneously recorded Utah array spiking data from areas V1 and V4 of 1 monkey over 5 days of recording. They report distributions over "variance explained" numbers for several combinations: from mouse V1 L4 to mouse V1 L2/3, from L2/3 to L4, from monkey V1 to monkey V4, and from V4 to V1. For their monkey data, they also report the corresponding results for different temporal shifts. Overall, they find the expected results: responses in each of the two neural populations are predictive of responses in the other, more so when the stimulus is not controlled than when it is, and with sometimes different results for different stimulus classes (e.g., gratings vs. natural images).
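      The kind of analysis summarized above can be sketched as follows. This is a minimal illustration on simulated data, not the authors' pipeline: the ridge penalty, the train/test split, the shuffled control, and all function names (`simulate`, `ridge_fit`, `held_out_ev`) are assumptions made for this example. A "source" population and a "target" unit share one trial-to-trial fluctuation, and a linear decoder fit on some trials is scored by the variance it explains on held-out trials.

```python
import random


def simulate(n_trials, n_source, seed):
    """Simulate zero-mean responses driven by one shared trial-to-trial signal."""
    rng = random.Random(seed)
    source, target = [], []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # fluctuation common to both populations
        source.append([shared + rng.gauss(0.0, 1.0) for _ in range(n_source)])
        target.append(shared + rng.gauss(0.0, 0.5))
    return source, target


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Ridge weights from the normal equations (X'X + lam*I) w = X'y."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve(A, b)


def explained_variance(y_true, y_pred):
    """1 - SS_res / SS_tot; can be negative for a poor predictor."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def held_out_ev(X, y, n_train, lam=1.0):
    """Fit on the first n_train trials, score on the remaining trials."""
    w = ridge_fit(X[:n_train], y[:n_train], lam)
    pred = [sum(wi * xi for wi, xi in zip(w, row)) for row in X[n_train:]]
    return explained_variance(y[n_train:], pred)


X, y = simulate(n_trials=400, n_source=10, seed=0)
ev = held_out_ev(X, y, n_train=300)

# Control: shuffling training targets across trials breaks the trial-wise link.
rng = random.Random(1)
y_shuffled = y[:300]
rng.shuffle(y_shuffled)
ev_shuffled = held_out_ev(X, y_shuffled + y[300:], n_train=300)

print(f"held-out EV: {ev:.2f}, shuffled control: {ev_shuffled:.2f}")
```

      Because the simulated responses are zero-mean, no intercept is fit; real data would need centering or an explicit intercept term. The shuffled control gives a rough floor against which the held-out explained variance can be read.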

      Strengths:

      (1) Use of existing data.

      (2) Addresses an interesting question.

      Weaknesses:

      Unfortunately, the method falls short of the state of the art: both generalized linear models (GLMs), which have been used in similar contexts for at least 20 years (see the many papers, both theoretical and applied to neural population data, by e.g. Simoncelli, Paninski, Pillow, Schwartz, and many colleagues dating back to 2004), and the extension of Granger causality to point processes (e.g. Kim et al. PLoS CB 2011). Both approaches are substantially superior to what is proposed in the manuscript, since they enforce non-negativity for spike rates (the importance of which can be seen in Figure 2AB), and do not require unnecessary coarse-graining of the data by binning spikes (the 200 ms time bins are very long compared to the time scale on which communication between closely connected neuronal populations within an area, or between related areas, takes place).
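      The non-negativity point can be illustrated with a small sketch on simulated spike counts. Everything here is an assumption made for illustration (the single predictor, the true coefficients 0.5 and 1.0, the Newton solver, and the function names); it is not taken from the manuscript or from the cited GLM papers, and it does not address the binning issue. A straight-line fit to count data can predict negative rates, whereas a Poisson GLM with a log link cannot.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson count with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def fit_linear(x, y):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def fit_poisson_glm(x, y, n_iter=25):
    """Poisson regression, rate = exp(a + b*x), fit by Newton's method."""
    a, b = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(a + b * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Fisher information (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Simulated data: one predictor on a grid, counts drawn from exp(0.5 + 1.0*x).
rng = random.Random(0)
x = [-2.0 + 4.0 * i / 399 for i in range(400)]
y = [poisson_sample(rng, math.exp(0.5 + 1.0 * xi)) for xi in x]

a_lin, b_lin = fit_linear(x, y)
a_glm, b_glm = fit_poisson_glm(x, y)

min_linear_rate = min(a_lin + b_lin * xi for xi in x)
min_glm_rate = min(math.exp(a_glm + b_glm * xi) for xi in x)
print(f"lowest predicted rate, linear fit: {min_linear_rate:.2f}")
print(f"lowest predicted rate, Poisson GLM: {min_glm_rate:.2f}")
```

      The exponential link guarantees a positive predicted rate for any input, which is the property the linear decoder lacks.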

      In terms of analysis results, the work in the manuscript presents some expected and some less expected results. However, because the monkey data are based on only one monkey (misleadingly, the manuscript consistently uses the plural "monkeys"), none of the results specific to that monkey, nor the comparison of that one monkey to mice, are supported by robust data. One of the main results for mice (bimodality of explained variance values, mentioned in the abstract) does not appear to be quantified or supported by a statistical test and is only present in two out of three mice. Moreover, the two data sets differ in too many aspects to allow for any conclusions about whether the comparisons reflect differences in species (mouse vs. monkey), anatomy (L2/3-L4 vs. V1-V4), or recording technique (calcium imaging vs. extracellular spiking).

    3. Reviewer #2 (Public review):

      Summary:

      In this work, the authors investigated the extent of shared variability in cortical population activity in the visual cortex in mice and macaques under conditions of spontaneous activity and visual stimulation. They argue that by studying the average response to repeated presentations of sensory stimuli, investigators are discounting the contribution of variable population responses that can have a significant impact at the single trial level. They hypothesized that, because these fluctuations are to some degree shared across cortical populations depending on the sources of these fluctuations and the relative connectivity between cortical populations within a network, one should be able to predict the response in one cortical population given the response of another cortical population on a single trial, and the degree of predictability should vary with factors such as retinotopic overlap, visual stimulation, and the directionality of canonical cortical circuits.

      To test this, the authors analyzed previously collected and publicly available datasets. These include calcium imaging of the primary visual cortex in mice and electrophysiology recordings in V1 and V4 of macaques under different conditions of visual stimulation. The strength of this data is that it includes simultaneous recordings of hundreds of neurons across cortical layers or areas. However, both recording modalities have weaknesses: calcium dynamics (which have lower temporal resolution and miss some non-linear dynamics in cortical activity) and multi-unit envelope activity (which reflects fluctuations in population activity rather than the variance in individual unit spike trains) underestimate the variability of individual neurons. The authors deploy a regression model that is appropriate for addressing their hypothesis, and their analytic approach appears rigorous and well-controlled.

      From their analysis, they found that there was significant predictability of activity between layer II/III and layer IV responses in mice and V1 and V4 activity in macaques, although the specific degree of predictability varied somewhat with the condition of the comparison with some minor differences between the datasets. The authors deployed a variety of analytic controls and explored a variety of comparisons that are both appropriate and convincing that there is a significant degree of predictability in population responses at the single trial level consistent with their hypothesis. This demonstrates that a significant fraction of cortical responses to stimuli is not due solely to the feedforward response to sensory input, and if we are to understand the computations that take place in the cortex, we must also understand how sensory responses interact with other sources of activity in cortical networks. However, the source of these predictive signals and their impact on function is only explored in a limited fashion, largely due to limitations in the datasets. Overall, this work highlights that, beyond the traditionally studied average evoked responses considered in systems neuroscience, there is a significant contribution of shared variability in cortical populations that may contextualize sensory representations depending on a host of factors that may be independent of the sensory signals being studied.

      Strengths:

      This work considers a variety of conditions that may influence the relative predictability between cortical populations, including receptive field overlap, latency that may reflect feed-forward or feedback delays, and stimulus type and sensory condition. Their analytic approach is well-designed and statistically rigorous. They acknowledge the limitations of the data and do not over-interpret their findings.

      Weaknesses:

      The different recording modalities and comparisons (within vs. across cortical areas) limit the interpretability of the inter-species comparisons. The mechanistic contribution of known sources or correlates of shared variability (eye movements, pupil fluctuations, locomotion, whisking behaviors) were not considered, and these could be driving or a reflection of much of the predictability observed and explain differences in spontaneous and visual activity predictions. Previous work has explored correlations in activity between areas on various timescales, but this work only considered a narrow scope of timescales. The observation that there is some degree of predictability is not surprising, and it is unclear whether changes in observed predictability with analysis conditions are informative of a particular mechanism or just due to differences in the variance of activity under those conditions. Some of these issues could be addressed with further analysis, but some may be due to limitations in the experimental scope of the datasets and would require new experiments to resolve.

    4. Reviewer #3 (Public review):

      Neural activity in the visual cortex has primarily been studied in terms of responses to external visual stimuli. While the noisiness of inputs to a visual area is known to also influence visual responses, the contribution of this noisy component to overall visual responses has not been well characterized.

      In this study, the authors reanalyze two previously published datasets - a Ca++ imaging study from mouse V1 and a large-scale electrophysiological study from monkey V1-V4. Using regression models, they examine how neural activity in one layer (in mice) or one cortical area (in monkeys) predicts activity in another layer or area. Their main finding is that significant predictions are possible even in the absence of visual input, highlighting the influence of non-stimulus-related downstream activity on neural responses. These findings can inform future modeling work of neural responses in the visual cortex to account for such non-visual influences.

      A major weakness of the study is that the analysis includes data from only a single monkey. This makes it hard to interpret the data as the results could be due to experimental conditions specific to this monkey, such as the relative placement of electrode arrays in V1 and V4. The authors perform a thorough analysis comparing regression-based predictions for a wide variety of combinations of stimulus conditions and directions of influence. However, the comparison of stimulus types (Figure 4) raises a potential concern. It is not clear if the differences reported reflect an actual change in predictive influence across the two conditions or if they stem from fundamental differences in the responses of the predictor population, which could in turn affect the ability to measure predictive relationships. The authors do control for some potential confounds such as the number of neurons and self-consistency of the predictor population. However, the predictability seems to closely track the responsiveness of neurons to a particular stimulus. For instance, in the monkey data, the V1 neuronal population will likely be more responsive to checkerboards than to single bars. Moreover, neurons that don't have the bars in their RFs may remain largely silent. Could the difference in predictability be just due to this? Controlling for overall neuronal responsiveness across the two conditions would make this comparison more interpretable.

    5. Author response:

      Reviewer #1:

      Summary:

      In this study, the authors propose a "unifying method to evaluate inter-areal interactions in different types of neuronal recordings, timescales, and species". The method consists of computing the variance explained by a linear decoder that attempts to predict individual neural responses (firing rates) in one area based on neural responses in another area.

      The authors apply the method to previously published calcium imaging data from layer 4 and layers 2/3 of 4 mice over 7 days, and simultaneously recorded Utah array spiking data from areas V1 and V4 of 1 monkey over 5 days of recording. They report distributions over "variance explained" numbers for several combinations: from mouse V1 L4 to mouse V1 L2/3, from L2/3 to L4, from monkey V1 to monkey V4, and from V4 to V1. For their monkey data, they also report the corresponding results for different temporal shifts. Overall, they find the expected results: responses in each of the two neural populations are predictive of responses in the other, more so when the stimulus is not controlled than when it is, and with sometimes different results for different stimulus classes (e.g., gratings vs. natural images).

      Strengths:

      (1) Use of existing data.

      (2) Addresses an interesting question.

      Unfortunately, the method falls short of the state of the art: both generalized linear models (GLMs), which have been used in similar contexts for at least 20 years (see the many papers, both theoretical and applied to neural population data, by e.g. Simoncelli, Paninski, Pillow, Schwartz, and many colleagues dating back to 2004), and the extension of Granger causality to point processes (e.g. Kim et al. PLoS CB 2011). Both approaches are substantially superior to what is proposed in the manuscript, since they enforce non-negativity for spike rates (the importance of which can be seen in Figure 2AB), and do not require unnecessary coarse-graining of the data by binning spikes (the 200 ms time bins are very long compared to the time scale on which communication between closely connected neuronal populations within an area, or between related areas, takes place).

      We thank the reviewer for this suggestion. Our goal was to use a simple and unified linear ridge regression framework that can be applied to both calcium imaging (mouse) and MUAe (monkey) data.
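
      For readers unfamiliar with the approach, the following is a minimal sketch of what a cross-validated linear ridge regression between two simultaneously recorded populations could look like. It is an illustration under stated assumptions, not the authors' implementation: the function names, the regularization strength `alpha`, the number of folds, and the synthetic data are all hypothetical.

```python
import numpy as np

def ridge_fit(X, Y, alpha):
    """Closed-form ridge weights for centered X (samples x predictors)
    and centered Y (samples x targets)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def cv_explained_variance(X, Y, alpha=1.0, n_folds=5, seed=0):
    """Cross-validated explained variance, one value per target unit.
    alpha, n_folds and seed are illustrative choices (assumptions)."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    Y_pred = np.zeros(Y.shape, dtype=float)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n_samples), test_idx)
        # Center with training-set means only, to avoid leaking test data
        x_mean = X[train_idx].mean(axis=0)
        y_mean = Y[train_idx].mean(axis=0)
        W = ridge_fit(X[train_idx] - x_mean, Y[train_idx] - y_mean, alpha)
        Y_pred[test_idx] = (X[test_idx] - x_mean) @ W + y_mean
    residual = ((Y - Y_pred) ** 2).sum(axis=0)
    total = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - residual / total

# Synthetic demo: two populations driven by three shared latent signals
rng = np.random.default_rng(1)
shared = rng.normal(size=(500, 3))
source = shared @ rng.normal(size=(3, 40)) + 0.5 * rng.normal(size=(500, 40))
target = shared @ rng.normal(size=(3, 10)) + 0.5 * rng.normal(size=(500, 10))
ev = cv_explained_variance(source, target, alpha=10.0)
```

      Here `ev` holds one explained-variance value per target unit; shuffling the rows of `target` relative to `source` gives a simple chance-level control.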

      We will perform a GLM-based analysis enforcing non-negativity as suggested, including in the GLM any additional available variables that may contribute to the neuronal responses.

      We also would like to note that:

      ● Macaque data: Our MUAe data are binned at 25 ms, not 200 ms. We used the envelope of multi-unit activity as reported in the original study [1]. We did not perform spike sorting on these data and therefore, strictly speaking, this is not a point process and methods developed for point processes are not directly applicable.

      ● Mouse data: The Stringer et al. dataset [2,3] uses two-photon calcium imaging sampled at 2.5 or 3 Hz. Additionally, responses were computed by averaging two frames per stimulus (yielding an effective bin size of 666 ms or 800 ms), dictated by acquisition constraints. We will emphasize the low temporal resolution of these signals as a limitation in the discussion section, but we cannot improve the temporal resolution with our analyses. These signals are not point processes either (although there is a correlation between two-photon calcium signals and spike rates).

      Regardless of these considerations, the reviewer’s points are well taken, and we will conduct additional analyses as described above.

      In terms of analysis results, the work in the manuscript presents some expected and some less expected results. However, because the monkey data are based on only one monkey (misleadingly, the manuscript consistently uses the plural ‘monkeys’), none of the results specific to that monkey, nor the comparison of that one monkey to mice, are supported by robust data.

      We will add data from at least two more monkeys, as suggested by the reviewer:

      ● First, we will include a second monkey from the same dataset [1]. The reason this monkey was not included in the original submission is that the dataset for this second monkey consisted of much less data than the original. For example, for the lights-off condition, the number of V4 channels with signal-to-noise ratio greater than 2 (the electrodes recommended for use by the dataset authors) is 9-12 in this second monkey, compared to 68-74 in the first monkey [1]. However, we will still add results for this second monkey.

      ● Additionally, we will include data from a new monkey by collaborating with the Ponce lab who will collect new data for this study.

      One of the main results for mice (bimodality of explained variance values, mentioned in the abstract) does not appear to be quantified or supported by a statistical test.

      We appreciate this point. We will conduct statistical tests to quantify the degree of bimodality and clarify these findings in the results.
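
      The response does not say which test will be used. As one hedged illustration, Sarle's bimodality coefficient is a simple descriptive index computed from skewness and kurtosis; values above 5/9 (the value for a uniform distribution) are conventionally taken to suggest bimodality. The sketch below uses plain (biased) sample moments and synthetic values, both of which are assumptions; it is a heuristic rather than a formal test such as Hartigan's dip test.

```python
import numpy as np

def bimodality_coefficient(values):
    """Sarle's bimodality coefficient from biased sample moments.
    Values above 5/9 are conventionally read as suggestive of bimodality."""
    x = np.asarray(values, dtype=float)
    n = x.size
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    skew = np.mean(centered ** 3) / m2 ** 1.5
    excess_kurtosis = np.mean(centered ** 4) / m2 ** 2 - 3.0
    # Finite-sample correction term; tends to 3 for large n
    correction = 3.0 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return (skew ** 2 + 1.0) / (excess_kurtosis + correction)

# Synthetic demo: hypothetical explained-variance values
rng = np.random.default_rng(0)
unimodal = rng.normal(0.3, 0.05, size=2000)
bimodal = np.concatenate([rng.normal(0.1, 0.03, size=1000),
                          rng.normal(0.6, 0.05, size=1000)])
bc_unimodal = bimodality_coefficient(unimodal)
bc_bimodal = bimodality_coefficient(bimodal)
```

      Because the coefficient is also sensitive to skew and heavy tails, it is best read alongside the histogram rather than as a standalone decision rule.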

      Moreover, the two data sets differ in too many aspects to allow for any conclusions about whether the comparisons reflect differences in species (mouse vs. monkey), anatomy (L2/3-L4 vs. V1-V4), or recording technique (calcium imaging vs. extracellular spiking).

      We agree that the methodological and anatomical differences between the mouse and monkey datasets make any direct cross-species comparisons hard to interpret. We explicitly discuss this point in the Discussion section. We will add a section within the Discussion entitled “Limitations of this study”. We will further emphasize that our goal is not to attempt a direct quantitative comparison across species. We will further emphasize that the two experiments differ in terms of: (i) differences in recording modalities (calcium vs. electrophysiology) and associated differences in temporal resolution, neuronal types, and SNR, (ii) cortical targets (layers vs. areas), (iii) sample size, (iv) stimuli, (v) task conditions. In the revised manuscript, we will further highlight that our primary aim is to investigate inter-areal interactions within each species rather than to draw comparisons across species.

      Reviewer #2:

      Summary:

      In this work, the authors investigated the extent of shared variability in cortical population activity in the visual cortex in mice and macaques under conditions of spontaneous activity and visual stimulation. They argue that by studying the average response to repeated presentations of sensory stimuli, investigators are discounting the contribution of variable population responses that can have a significant impact at the single trial level. They hypothesized that, because these fluctuations are to some degree shared across cortical populations depending on the sources of these fluctuations and the relative connectivity between cortical populations within a network, one should be able to predict the response in one cortical population given the response of another cortical population on a single trial, and the degree of predictability should vary with factors such as retinotopic overlap, visual stimulation, and the directionality of canonical cortical circuits.

      To test this, the authors analyzed previously collected and publicly available datasets. These include calcium imaging of the primary visual cortex in mice and electrophysiology recordings in V1 and V4 of macaques under different conditions of visual stimulation. The strength of this data is that it includes simultaneous recordings of hundreds of neurons across cortical layers or areas. However, both calcium imaging (which has lower temporal resolution and misses some non-linear dynamics in cortical activity) and multi-unit envelope activity (which reflects fluctuations in population activity rather than the variance in individual unit spike trains) underestimate the variability of individual neurons. The authors deploy a regression model that is appropriate for addressing their hypothesis, and their analytic approach appears rigorous and well-controlled.

      We agree that both calcium imaging and multi-unit envelope recordings have inherent limitations in capturing the variability of individual neuron spiking. Among other factors, the slower temporal resolution of calcium signals can blur fast spiking events, and multi-unit envelopes can mask single-unit heterogeneity. In the Discussion, we will explicitly mention these modality-specific caveats and note that our approach is meant to capture shared variability at the population level rather than the fine temporal structure of individual neurons and individual spikes.

      From their analysis, they found that there was significant predictability of activity between layer II/III and layer IV responses in mice and V1 and V4 activity in macaques, although the specific degree of predictability varied somewhat with the condition of the comparison with some minor differences between the datasets. The authors deployed a variety of analytic controls and explored a variety of comparisons that are both appropriate and convincing that there is a significant degree of predictability in population responses at the single trial level consistent with their hypothesis. This demonstrates that a significant fraction of cortical responses to stimuli is not due solely to the feedforward response to sensory input, and if we are to understand the computations that take place in the cortex, we must also understand how sensory responses interact with other sources of activity in cortical networks. However, the source of these predictive signals and their impact on function is only explored in a limited fashion, largely due to limitations in the datasets. Overall, this work highlights that, beyond the traditionally studied average evoked responses considered in systems neuroscience, there is a significant contribution of shared variability in cortical populations that may contextualize sensory representations depending on a host of factors that may be independent of the sensory signals being studied.

      We will include a section within the Discussion to emphasize the limitations in the datasets used in this study. We also agree with and appreciate the reviewer’s description and will borrow some of the reviewer’s terminology to provide context in the Discussion section.

      The different recording modalities and comparisons (within vs. across cortical areas) limit the interpretability of the inter-species comparisons.

      We agree that the methodological and anatomical differences between the mouse and monkey datasets make any direct cross-species comparisons hard to interpret. We explicitly discuss this point in the Discussion section. We will add a section within the Discussion entitled “Limitations of this study”. We will further emphasize that our goal is not to attempt a direct quantitative comparison across species. We will further emphasize that the two experiments differ in terms of: (i) differences in recording modalities (calcium vs. electrophysiology) and associated differences in temporal resolution, neuronal types, and SNR, (ii) cortical targets (layers vs. areas), (iii) sample size, (iv) stimuli, (v) task conditions. In the revised manuscript, we will further highlight that our primary aim is to investigate inter-areal interactions within each species rather than to draw comparisons across species.

      Strengths:

      This work considers a variety of conditions that may influence the relative predictability between cortical populations, including receptive field overlap, latency that may reflect feed-forward or feedback delays, and stimulus type and sensory condition. Their analytic approach is well-designed and statistically rigorous. They acknowledge the limitations of the data and do not over-interpret their findings.

      Weaknesses:

      The different recording modalities and comparisons (within vs. across cortical areas) limit the interpretability of the inter-species comparisons. The mechanistic contribution of known sources or correlates of shared variability (eye movements, pupil fluctuations, locomotion, whisking behaviors) were not considered, and these could be driving or a reflection of much of the predictability observed and explain differences in spontaneous and visual activity predictions.

      We also appreciate this important point. We agree that multiple behavioral factors may significantly contribute to shared variability. In our analyses of the mouse data, we addressed non-visual influences by projecting out “non-visual ongoing neuronal activity” (as shown in Figure 6C, following the approach in Stringer et al. 2019). Additionally, we will further evaluate the contribution of behavioral measures available in the open dataset (such as running speed, whisking, pupil area, and “eigenface” components) to the predictivity of neuronal responses.

      For the macaque data, the head-fixed and eye-fixation conditions help minimize some of these other potential behavioral contributions. Moreover, we have performed comparisons of eyes-open versus eyes-closed conditions (see Figure 5D). We will also analyze pupil size specifically for the lights-off condition. We do not have access to any other behavioral data from monkeys.

      Previous work has explored correlations in activity between areas on various timescales, but this work only considered a narrow scope of timescales.

      We appreciate this suggestion. We will perform additional analyses to evaluate predictivity at different temporal scales, as suggested.
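
      One simple way to probe different temporal scales, sketched here purely as an assumption about how such an analysis might be set up, is to average consecutive time bins before fitting the regression. The function name `rebin` and the toy array are hypothetical; the 25 ms starting resolution is the one stated above for the macaque MUAe data.

```python
import numpy as np

def rebin(activity, factor):
    """Average every `factor` consecutive time bins.
    activity: array of shape (time bins, channels)."""
    n_bins = (activity.shape[0] // factor) * factor  # drop incomplete trailing bin
    trimmed = activity[:n_bins]
    return trimmed.reshape(-1, factor, activity.shape[1]).mean(axis=1)

# Toy example: 12 bins of 25 ms on 2 channels, coarsened to 100 ms bins
activity_25ms = np.arange(24, dtype=float).reshape(12, 2)
activity_100ms = rebin(activity_25ms, 4)
```

      Repeating the prediction analysis on arrays rebinned with several factors would give predictivity as a function of bin width.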

      The observation that there is some degree of predictability is not surprising, and it is unclear whether changes in observed predictability with analysis conditions are informative of a particular mechanism or just due to differences in the variance of activity under those conditions. Some of these issues could be addressed with further analysis, but some may be due to limitations in the experimental scope of the datasets and would require new experiments to resolve.

      Our initial analyses in Fig. 6A examined the effect of variance in activity and predictability in mice. As the reviewer intuited, there is a correlation between variance and predictability, at least when presenting a stimulus. Importantly, however, this is not the case when predicting activity in the absence of any stimulus. In the macaque, we cannot compute the variance across stimuli in the checkerboard case (single stimulus), but we will compute it for the conditions of the 4 moving bars. In addition, inspired by the reviewer’s question, we will perform an analysis where we further normalize the variance in activity.

      We would like to note that our key contribution is not to merely show that some degree of predictability is possible (which we agree is not surprising) but rather: (i) to use a simple approach to quantify this predictability, (ii) to assess directional differences in predictability, (iii) to evaluate how this predictability depends on neuronal properties and receptive field overlap, (iv) how it depends on the stimuli, and, importantly, (v) to compare predictability during visual stimulation versus absence of visual input.

      We agree with the limitations in the datasets. We will include a section within the Discussion to emphasize these limitations.

      Reviewer #3:

      Neural activity in the visual cortex has primarily been studied in terms of responses to external visual stimuli. While the noisiness of inputs to a visual area is known to also influence visual responses, the contribution of this noisy component to overall visual responses has not been well characterized.

      In this study, the authors reanalyze two previously published datasets - a Ca++ imaging study from mouse V1 and a large-scale electrophysiological study from monkey V1-V4. Using regression models, they examine how neural activity in one layer (in mice) or one cortical area (in monkeys) predicts activity in another layer or area. Their main finding is that significant predictions are possible even in the absence of visual input, highlighting the influence of non-stimulus-related downstream activity on neural responses. These findings can inform future modeling work of neural responses in the visual cortex to account for such non-visual influences.

      A major weakness of the study is that the analysis includes data from only a single monkey. This makes it hard to interpret the data as the results could be due to experimental conditions specific to this monkey, such as the relative placement of electrode arrays in V1 and V4.

      We will add data from at least two more monkeys, as suggested by the reviewer:

      ● First, we will include a second monkey from the same dataset [1]. The reason this monkey was not included in the original submission is that the dataset for this second monkey consisted of much less data than the original. For example, for the lights-off condition, the number of V4 channels with signal-to-noise ratio greater than 2 (the electrodes recommended for use by the dataset authors) is 9-12 in this second monkey, compared to 68-74 in the first monkey [1]. However, we will still add results for this second monkey.

      ● Additionally, we will include data from a new monkey by collaborating with the Ponce lab who will collect new data for this study.

      The authors perform a thorough analysis comparing regression-based predictions for a wide variety of combinations of stimulus conditions and directions of influence. However, the comparison of stimulus types (Figure 4) raises a potential concern. It is not clear if the differences reported reflect an actual change in predictive influence across the two conditions or if they stem from fundamental differences in the responses of the predictor population, which could in turn affect the ability to measure predictive relationships. The authors do control for some potential confounds such as the number of neurons and self-consistency of the predictor population. However, the predictability seems to closely track the responsiveness of neurons to a particular stimulus. For instance, in the monkey data, the V1 neuronal population will likely be more responsive to checkerboards than to single bars. Moreover, neurons that don't have the bars in their RFs may remain largely silent. Could the difference in predictability be just due to this? Controlling for overall neuronal responsiveness across the two conditions would make this comparison more interpretable.

      This is also a valid concern. As the reviewer noted, we controlled for the number of neurons and degree of self-consistency (Fig. 3A, 3C), and this was always done within their respective stimulus type.

      As the reviewer intuits, in Fig. 6A in mice, we show that predictability correlates with neuronal responsiveness. This observation only held during the stimulus condition and not during the gray screen condition. We also showed correlations with self-consistency metrics as a proxy for responsiveness in Fig. 6A and 6C. However, we will directly assess the impact of responsiveness in two ways: (i) by correlating predictability directly with neuronal responsiveness and (ii) by following the same subsampling approach in Fig. 3 to normalize the degree of responsiveness and recompute the predictability metrics.

      REFERENCES

      (1) Chen, X., Morales-Gregorio, A., Sprenger, J., Kleinjohann, A., Sridhar, S., van Albada, S.J., Grün, S., and Roelfsema, P.R. (2022). 1024-channel electrophysiological recordings in macaque V1 and V4 during resting state. Sci Data 9, 77. https://doi.org/10.1038/s41597-022-01180-1.

      (2) Stringer, C., Pachitariu, M., Steinmetz, N., Carandini, M., and Harris, K.D. (2019). High-dimensional geometry of population responses in visual cortex. Nature 571, 361–365. https://doi.org/10.1038/s41586-019-1346-5.

      (3) Stringer, C., Pachitariu, M., Carandini, M., and Harris, K. (2018). Recordings of 10,000 neurons in visual cortex in response to 2,800 natural images. (Janelia Research Campus). https://doi.org/10.25378/janelia.6845348.v4.

    1. eLife Assessment

      This important study offers a molecular characterization of neurons and glia in the adult nervous system of the fruit fly Drosophila melanogaster. The study focuses on the progeny of a specific set of neural stem cells, called Type II neuroblasts that contribute to the central complex, a conserved brain region that plays key roles in sensorimotor integration. The data are convincing and collected using validated methodology, generating an invaluable resource for future studies. The study will be of interest to developmental neurobiologists.

    2. Reviewer #1 (Public review):

      Summary:

      Epiney et al. use single-nuclei RNA sequencing (snRNA-seq) to characterize the lineage of Type-2 (T2) neuroblasts (NBs) in the adult Drosophila brain. To isolate cells born from T2 NBs, the authors used a genetic tool that specifically allows the permanent labeling of T2-derived cell types, which are then FAC-sorted for snRNA-seq. This effective labeling approach also allows them to compare the isolated T2 lineage cells with T1-derived cell types by a simple exclusion method. The authors begin by describing a transcriptomic atlas for all T1 and T2-derived neuronal and glia clusters, reporting that the T2-derived lineage comprises 161 neuronal clusters, in contrast to the T1 lineage which comprises 114 of them. The authors then use the expression of VAChT, VGlut, Gad1, Tbh, Ple, SerT, and Tdc2 to show that T2 neuroblasts generate all major neuron classes of fast-acting neurotransmitters. Strikingly, they show that a subset of glia and neuronal clusters have disproportionate enrichment in males or females, suggesting that T2 neuroblasts generate sex-biased cell types. The authors then proceed to characterize neuropeptide expression across T2-derived neuronal clusters and argue that the same neuropeptide can be expressed across different cell types, while similar cell types can express distinct neuropeptides. The functional implication of both observations, however, remains to be tested. Furthermore, the authors describe combinatorial transcription factor (TF) codes that are correlated with neuropeptide expression for T2-derived neurons along with an overall TF code for all T2-derived cell types, both of which will serve as an important starting point for future investigations. Finally, the authors map well-studied neuronal types of the central complex to the clusters of their T2-derived snRNA-seq dataset. They use known marker combinations, bulk RNA-seq data and highly specific split-GAL4 driver lines to annotate their T2-derived atlas, establishing a comprehensive transcriptomic atlas that would guide future studies in this field.

      Strengths:

      This study provides an in-depth transcriptomic characterization of neurons and glia derived from Type-2 neuroblast lineages. The results of this manuscript offer several future directions to investigate the mechanisms of diversifying neuronal identity. The datasets of T1-derived and T2-derived cells will pave the way for studies focused on the functional analysis of combinatorial TF codes specifying cell identity, sex-based differences in neurogenesis and gliogenesis, the relationship between neuropeptide (co)expression and cell identity, and the differential contributions of distinct progenitor populations to the same cell type.

      Weaknesses:

      The study presents several important observations based on the characterization of Type II neuroblast-derived lineages. However, a mechanistic insight is missing for most observations. The idea that there is a sex-specific bias to certain T2-derived neurons and glial clusters is quite interesting, however, the functional significance of this observation is not tested or discussed extensively. Finally, the authors do not show whether the combinatorial TF code is indeed necessary for neuropeptide expression or if this is just a correlation due to cell identity being defined by TFs. Functional knockdown of some candidate TFs for a subset of neuropeptide-expressing cells would have been helpful in this case.

    3. Reviewer #2 (Public review):

      In this manuscript, Epiney et al., present a single-nucleus sequencing analysis of Drosophila adult central brain neurons and glia. By employing an ingenious permanent labeling technique, they trace the progeny of T2 neuroblasts, which play a key role in the formation of the central complex. This transcriptomic dataset is poised to become a valuable resource for future research on neurogenesis, neuron morphology, and behavior.

      The authors further delve into this dataset with several analyses, including the characterization of neurotransmitter expression profiles in T2-derived neurons. While some of the bioinformatic analyses are preliminary, they would benefit from additional experimental validation in future studies.

    1. eLife Assessment

      The paper addresses the question of gene epistasis and asks what is the correct null model for which we should declare no epistasis. By reanalyzing synthetic gene array datasets regarding single and double-knockout yeast mutants, and considering two theoretical models of cell growth, the authors reach the valuable conclusion that the product function is a good null model. The analysis is still incomplete, as some assumptions and hypotheses are not fully justified. However, once verified, the results have the potential to be of value to the field of gene epistasis.

    2. Reviewer #1 (Public review):

      Summary:

      Detecting unexpected epistatic interactions among multiple mutations requires a robust null expectation - or neutral function - that predicts the combined effects of multiple mutations on phenotype, based on the effects of individual mutations. This study assessed the validity of the product neutrality function, where the fitness of double mutants is represented as the multiplicative combination of the fitness of single mutants, in the absence of epistatic interactions. The authors utilized a comprehensive dataset on fitness, specifically measuring yeast colony size, to analyze epistatic interactions.

      The study confirmed that the product function outperformed other neutral functions in predicting the fitness of double mutants, showing no bias between negative and positive epistatic interactions. Additionally, in the theoretical portion of the study, the authors applied a well-established theoretical model of bacterial cell growth to simulate the growth rates of both single and double mutants under various parameters. The simulations further demonstrated that the product function was superior to other functions in predicting the fitness of hypothetical double mutants. Based on these findings, the authors concluded that the product function is a robust tool for analyzing epistatic interactions in growth fitness and effectively reflects how growth rates depend on the combination of multiple biochemical pathways.
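
      The neutrality functions compared in the summary above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, fitness values are assumed to be normalized so that wild type equals 1, and the additive and minimum forms are the conventional definitions rather than ones quoted from the paper.

```python
import numpy as np

def expected_double_mutant_fitness(w_a, w_b, model="product"):
    """Null expectation for double-mutant fitness from two single-mutant fitnesses.

    Assumes fitness is normalized so that wild type has fitness 1.
    """
    w_a = np.asarray(w_a, dtype=float)
    w_b = np.asarray(w_b, dtype=float)
    if model == "product":
        return w_a * w_b            # multiplicative null
    if model == "additive":
        return w_a + w_b - 1.0      # deficits relative to wild type add up
    if model == "min":
        return np.minimum(w_a, w_b) # the more severe mutation dominates
    raise ValueError(f"unknown model: {model}")

def epistasis(w_ab, w_a, w_b, model="product"):
    """Epistasis score: observed double-mutant fitness minus the null expectation."""
    expected = expected_double_mutant_fitness(w_a, w_b, model)
    return np.asarray(w_ab, dtype=float) - expected

# Hypothetical fitness values, chosen only for illustration.
w_a, w_b, w_ab = 0.8, 0.5, 0.4
scores = {m: float(epistasis(w_ab, w_a, w_b, m)) for m in ("product", "additive", "min")}
print(scores)
```

      With these illustrative numbers the same double mutant is scored as neutral under the product function, positive under the additive function, and negative under the minimum function, which is why the choice of null model matters.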

      Strengths:

      By leveraging a previously published extensive dataset of yeast colony sizes for single- and double-knockout mutants, this study validated the relevance of the product function, commonly used in genetics to analyze epistatic interactions. The finding that the product function provides a more reliable prediction of double-mutant fitness compared to other neutral functions offers significant value for researchers studying epistatic interactions, particularly those using the same dataset.

      Notably, this dataset has previously been employed in studies investigating epistatic interactions using the product neutrality function. The current study's findings affirm the validity of the product function, potentially enhancing confidence in the conclusions drawn from those earlier studies. Consequently, both researchers utilizing this dataset and readers of previous research will benefit from the confirmation provided by this study's results.

      Weaknesses:

      This study exhibits several significant logical flaws, primarily arising from the following issues: a failure to differentiate between distinct phenotypes, instead treating them as identical; an oversight of the substantial differences in the mechanisms regulating cell growth between prokaryotes and eukaryotes; and the adoption of an overly specific and unrealistic set of assumptions in the mutation model. Additionally, the study fails to clearly address its stated objective: investigating the mechanistic origin of the multiplicative model. Although it discusses conditions under which deviations occur, it falls short of achieving its primary goal. Moreover, the paper includes misleading descriptions and unsubstantiated reasoning, presented without proper citations, as if they were widely accepted facts. Readers should consider these issues when evaluating this paper. Further details are discussed below.

      (1) Misrepresentation of the dataset and phenotypes

      The authors analyze a dataset on the fitness of yeast mutants, describing it as representative of the Malthusian parameter of an exponential growth model. However, they provide no evidence to support this claim. They assert that the growth of colony size in the dataset adheres to exponential growth kinetics; in contrast, it is known to exhibit linear growth over time, as indicated in [Supplementary Note 1 of https://doi.org/10.1038/nmeth.1534]. Consequently, fitness derived from colony size should be recognized as a different metric and phenotype from the Malthusian parameter. Equating these distinct phenotypes and fitness measures constitutes a fundamental error, which significantly compromises the theoretical discussions based on the Malthusian parameter in the study.

      (2) Misapplication of prokaryotic growth models

      The study attempts to explain the mechanistic origin of the multiplicative model observed in yeast colony fitness using a bacterial cell growth model, particularly the Scott-Hwa model. However, the application of this bacterial model to yeast systems lacks valid justification. The Scott-Hwa model is heavily dependent on specific molecular mechanisms such as ppGpp-mediated regulation, which plays a crucial role in adjusting ribosome expression and activity during translation. This mechanism is pivotal for ensuring the growth-dependency of the ribosome fraction in the proteome, as described in [https://doi.org/10.1073/pnas.2201585119]. Unlike bacteria, yeast cells do not possess this regulatory mechanism, rendering the direct application of bacterial growth models to yeast inappropriate and potentially misleading. This fundamental difference in regulatory mechanisms undermines the relevance and accuracy of using bacterial models to infer yeast colony growth dynamics.

      If the authors intend to apply a growth model with macroscopic variables to yeast double-mutant experimental data, they should avoid simply repurposing a bacterial growth model. Instead, they should develop and rigorously validate a yeast-specific growth model before incorporating it into their study.

      (3) Overly specific assumptions in the theoretical model

      The theoretical model in question assumes that two mutations affect only independent parameters of specific biochemical processes, an overly restrictive premise that undermines its ability to broadly explain the occurrence of the multiplicative model in mutations. Additionally, experimental evidence highlights significant limitations to this approach. For example, in most viable yeast deletion mutants with reduced growth rates, the expression of ribosomal proteins remains largely unchanged, in direct contradiction to the predictions of the Scott-Hwa model, as indicated in [https://doi.org/10.7554/eLife.28034]. This discrepancy emphasizes that the Scott-Hwa model and its derivatives do not reliably explain the growth rates of mutants based on current experimental data, suggesting that these models may need to be reevaluated or alternative theories developed to more accurately reflect the complex dynamics of mutant growth.

      (4) Lack of clarity on the mechanistic origin of the multiplicative model

      The study falls short of providing a definitive explanation for its primary objective: elucidating the "mechanistic origin" of the multiplicative model. Notably, even in the simplest case involving the Scott-Hwa model, the underlying mechanistic basis remains unexplained, leaving the central research question unresolved. Furthermore, the study does not clearly specify what types of data or models would be required to advance the understanding of the mechanistic origin of the multiplicative model. This omission limits the study's contribution to uncovering the biological principles underlying the observed fitness patterns.

    3. Reviewer #2 (Public review):

      The paper deals with the important question of gene epistasis, focusing on asking what is the correct null model for which we should declare no epistasis.

      In the first part, they use the Synthetic Genetic Array dataset to claim that the effects of a double mutation on growth rate are well predicted by the product of the individual effects (much more than e.g. the additive model). The second (main) part shows this is also the prediction of two simple, coarse-grained models for cell growth.

      I find the topic interesting, the paper well-written, and the approach innovative.

      One concern I have with the first part is that they claim that: "In these experiments, the colony area on the plate, a proxy for colony size, followed exponential growth kinetics. The fitness of a mutant strain was determined as the rate of exponential growth normalized to the rate in wild type cells."

      There are many works on "range expansions" showing that colonies expand at a constant velocity, the speed of which scales as the square root of the growth rate (these are called "Fisher waves", predicted in the 1940s, and there are many experimental works on them, e.g. https://www.pnas.org/doi/epdf/10.1073/pnas.0710150104). If that's the case, the area of the colony should be proportional to growth_rate × time^2, rather than exp(growth_rate*time), so the fitness they might be using here could be the log(growth_rate) rather than growth_rate itself? That could potentially have a big effect on the results.
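
      The scaling argument in the preceding paragraph can be made concrete with a short sketch. It assumes the classical Fisher-KPP front speed v = 2*sqrt(D*r) for a circular colony that starts from a negligible radius; the function names, the diffusion constant D, and the growth rates are illustrative values and are not taken from the paper or the dataset.

```python
import math

def fisher_front_speed(growth_rate, diffusion):
    """Asymptotic Fisher-KPP front speed, v = 2 * sqrt(D * r)."""
    return 2.0 * math.sqrt(diffusion * growth_rate)

def colony_area_range_expansion(growth_rate, diffusion, t):
    """Area of a circular colony whose radius grows at the constant front speed."""
    radius = fisher_front_speed(growth_rate, diffusion) * t
    return math.pi * radius ** 2

def colony_area_exponential(growth_rate, area0, t):
    """Area under the exponential-growth assumption quoted from the paper."""
    return area0 * math.exp(growth_rate * t)

# Illustrative parameters (assumptions): a mutant growing at half the wild-type rate.
r_wt, r_mut = 1.0, 0.5
D, t = 0.1, 10.0

ratio_front = (colony_area_range_expansion(r_mut, D, t)
               / colony_area_range_expansion(r_wt, D, t))
ratio_exp = (colony_area_exponential(r_mut, 1.0, t)
             / colony_area_exponential(r_wt, 1.0, t))
print(ratio_front, ratio_exp)
```

      Under the range-expansion picture the area ratio at a fixed time equals the ratio of growth rates, whereas under exponential growth it shrinks as exp((r_mut - r_wt) * t). A fitness value read off from colony area therefore depends strongly on which growth law is assumed.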

      Additional comments/questions:

      (1) What is the motivation for the model where the effect of two genes is the minimum of the two?

      (2) How seriously should we take the Scott-Hwa model? Should we view it as a toy model to explain the phenomenon or more than that? If the latter, then since the number of categories in the GO analysis is much more than two (47?), in many cases the analysis of the experimental data would take pairs of genes that both affect one process in the Scott-Hwa model, and then the product prediction should presumably fail? The same comment applies to the other coarse-grained model.

      (3) There are many works in the literature discussing additive fitness contributions, including Kauffman's famous NK model as well as spin-glass-type models (e.g. Guo and Amir, Science Advances 2019, Reddy and Desai, eLife 2021, Boffi et al., eLife 2023). These should be addressed in this context.

      (4) The experimental data is for deletions, but it would be interesting to know the theoretical model's prediction for the expected effects of beneficial mutations and how they interact since that's relevant (as mentioned in the paper) for evolutionary experiments. Perhaps in this case the question of additive vs. multiplicative matters less since the fitness effects are much smaller.

    1. eLife Assessment

      This study presents a valuable finding of novel markers that may potentially identify resident tendon stem/progenitor cells (TSPCs). The study also presents a comprehensive single-cell transcriptional dataset that will be of value to the field. The evidence supporting the identification of novel markers of a TSPC is incomplete, requiring clarification of current analyses, additional analyses between ages, and additional validation experiments to demonstrate that these markers are indeed specific and these cells are indeed TSPCs. This work will be of interest to biologists and engineers focused on tendons and ligaments.

    2. Reviewer #1 (Public review):

      This study is focused on identifying unique, innovative surface markers for mature Achilles tendons by combining the latest multi-omics approaches and in vitro evaluation, which would address the knowledge gap of the controversial identity of TPSCs with unspecific surface markers. The use of multi-omics technologies, in vivo characterization, in vitro standard assays of stem cells, and in vitro tissue formation is a strength of this work and could be applied for other stem cell quantification in musculoskeletal research. The evaluation and identification of Cd55 and Cd248 in TPSCs have not been conducted in tendons, which is considered innovative. Additionally, the study provided solid sequencing data to confirm co-expressions of Cd55 and Cd248 with other well-described surface markers such as Ly6a, Tppp3, Pdgfra, and Cd34. Generally, the data shown in the manuscript support the claims that the identified surface antigens mark TPSCs in juvenile tendons.

      However, there are missing links between the scientific questions raised in the Introduction and the Methodology/Results. If the study focuses on unsatisfactory healing responses of mature tendons and understanding of mature TPSCs, at least mature Achilles tendons from more than 12-week-old mice and their comparison with tendons from juvenile/neonatal mice should be conducted. However, neither the 2-week nor the 6-week-old mice used for characterization here are skeletally mature. Additionally, there is a lack of complete comparison of TPSCs between 2-week and 6-week-old mice at the transcriptional and epigenetic levels.

      In order to distinguish TPSCs and characterize their epigenetic activities, the authors used scRNA-seq, snRNA-seq, and snATAC-seq approaches. The integration, analysis, and comparison of sequencing data across assays and/or time points is confusing and incomplete. For example, it would be more comprehensive to integrate both scRNA-seq and snRNA-seq data (if not, why were both assays used for Achilles tendons at both the 2-week and 6-week timepoints?). snRNA-seq and snATAC-seq data of 6-week-old mice were separately analyzed. No comparison of the differences and similarities between TPSCs of 2-week and 6-week-old mice was conducted.

      Given the goal of this work to identify specific TPSC markers, the specificity of Cd55 and Cd248 for TPSCs is not clear. First, based on the data shown here, Cd55 and Cd248 mark the same cell population which is identified by Ly6a, TPPP3, and Pdgfra. Although, for instance, Cd34 is expressed by other tissues as discussed here, no data/evidence is provided by this work showing that Cd55 and Cd248 are not expressed by other musculoskeletal tissues/cells. Second, the immunostaining of Cd55 and Cd248 doesn't support their specificity. What is the advantage of using Cd55 and Cd248 for TPSCs compared to using other markers?

    3. Reviewer #2 (Public review):

      Summary:

      The molecular signature of tendon stem cells is not fully identified. The endogenous location of tendon stem cells within the native tendon is also not fully elucidated. Several molecular markers have been identified to isolate tendon stem cells but they lack tendon specificity. Using the declining tendon repair capacity of mature mice, the authors compared the transcriptome landscape and activity of juvenile (2 weeks) and mature (6 weeks) tendon cells of mouse Achilles tendons and identified CD55 and CD248 as novel surface markers for tendon stem cells. CD55+ CD248+ FACS-sorted cells display a preferential tendency to differentiate into tendon cells compared to CD55neg CD248neg cells.

      Strengths:

      The authors generated a lot of data on juvenile and mature Achilles tendons, using scRNAseq, snRNAseq, and ATACseq strategies. This constitutes a resource dataset.

      Weaknesses:

      The analyses and validation of identified genes are not complete and could be pushed further. The endogenous expression of newly identified genes in native tendons would be informative. The comparison of scRNAseq and snRNAseq datasets for tendon cell populations would strengthen the identification of tendon cell populations.

    4. Reviewer #3 (Public review):

      Summary:

      In their report, Tsutsumi et al. use single nucleus transcriptional and chromatin accessibility analyses of mouse Achilles tendon in an attempt to uncover new markers of tendon stem/progenitor cells. They propose CD55 and CD248 as novel markers of tendon stem/progenitor cells.

      Strengths:

      This is an interesting and important research area. The paper is overall well written.

      Weaknesses:

      Major problems:

      (1) It is not clear what tissue exactly is being analyzed. The authors build a story on tendons, but there is little description of the dissection. The authors claim to detect MTJ and cartilage cells, but not bone or muscle cells. The tendon sheath is known to express CD55, so the population of "progenitors" may not be of tendon origin.

      (2) Cluster annotations are seemingly done with a single gene. Names are given to cells without functional or spatial validation. For example, MTJ cells are annotated based on Postn, but it is never shown that Postn is only expressed at the MTJ, and not in other anatomical locations in the tendon.

      (3) The authors compare their data to public data based on interrogating single genes in their dataset. It is now standard practice to integrate datasets (eg, using harmony), or at a minimum using gene signatures built into Seurat (eg AddModuleScore).

      (4) Progenitor populations (SP1, SP2). The authors claim these are progenitors but show very clearly that they express macrophage genes. What are they, macrophages or fibroblasts?

      (5) All omics analysis is done on single data points (from many mice pooled). The authors make many claims on n=1 per group for readouts dependent on sample number (eg frequency of clusters).

      (6) The scRNAseq atlas in Figure 1 is made by analyzing 2W and 6W tendons at the same time. The snRNAseq and ATACseq atlas are built first on 2W data, after which the 6W data is compared. Why use the 2W data as a reference? Why not analyze the two-time points together as done with the scRNAseq?

      (7) Figure 5: The authors should show the gating strategy for FACS. Were non-fibroblasts excluded (eg, immune cells, endothelia, etc.)? Was a dead cell marker used? If not, it is not surprising that fibroblasts form colonies and express fibroblast genes when compared to CD55-CD248- immune cells, dead cells, or debris. Can control genes such as Ptprc or Pecam1 be tested to rule out contamination with other cell types?

      Minor problems:

      (1) Report the important tissue processing details: the type of collagenase used and the cell viability before loading into the 10x machine.

    1. eLife Assessment

      This manuscript presents an interesting new framework (VARX) for simultaneously quantifying effective connectivity in brain activity during sensory stimulation and how that brain activity is being driven by that sensory stimulation. The reviewers thought the model was original and its conclusion that intrinsic connectivity is largely unaltered during sensory stimulation is very interesting, but that future use of the model could potentially be affected by false positive conclusions. Overall, this work is important with solid evidence for its conclusions - it will be of interest to neuroscientists working on brain connectivity and dynamics.

    2. Reviewer #1 (Public review):

      This manuscript presents an interesting new framework (VARX) for simultaneously quantifying effective connectivity in brain activity during sensory stimulation and how that brain activity is being driven by that sensory stimulation. The core idea is to combine the Vector Autoregressive model that is often used to infer Granger-causal connectivity in brain data with an encoding model that maps the features of a sensory stimulus to that brain data. The authors do a nice job of explaining the framework. And then they demonstrate its utility through some simulations and some analysis of real intracranial EEG data recorded from subjects as they watched movies. They infer from their analyses that the functional connectivity in these brain recordings is essentially unaltered during movie watching, that accounting for the driving movie stimulus can protect one against misidentifying brain responses to the stimulus as functional connectivity, and that recurrent brain activity enhances and prolongs the putative neural responses to a stimulus.

      Overall, I thought this was an interesting manuscript with some rich and intriguing ideas. That said, I had some concerns also - one potentially major - with the inferences drawn by the authors on the analyses that they carried out.

      Main comments:

      (1) My primary concern with the way the manuscript is written right now relates to the inferences that can be drawn from the framework. In particular, the authors want to assert that, by incorporating an encoding model into their framework, they can do a better job of accounting for correlated stimulus-driven activity in different brain regions, allowing them to get a clearer view of the underlying innate functional connectivity of the brain. Indeed, the authors say that they want to ask "whether, after removing stimulus-induced correlations, the intrinsic dynamic itself is preserved". This seems a very attractive idea indeed. However, it seems to hinge critically on the idea of fitting an encoding model that fully explains all of the stimulus-driven activity. In other words, if one fits an encoding model that only explains some of the stimulus-driven response, then the rest of the stimulus-driven response still remains in the data and will be correlated across brain regions and will appear as functional connectivity in the ongoing brain dynamics - according to this framework. This residual activity would thus be misinterpreted. In the present work, the authors parameterize their stimulus using fixation onsets, film cuts, and the audio envelope. All of these features seem reasonable and valid. However, they surely do not come close to capturing the full richness of the stimuli, and, as such, there is surely a substantial amount of stimulus-driven brain activity that is not being accounted for by their "B" model and that is being absorbed into their "A" model and misinterpreted as intrinsic connectivity. This seems to me to be a major limitation of the framework. Indeed, the authors flag this concern themselves by (briefly) raising the issue in the first paragraph of their caveats section. But I think it warrants much more attention and discussion.

      (2) Related to the previous comment, the authors make what seems to me to be a complex and important point on page 6 (of the pdf). Specifically, they say "Note that the extrinsic effects captured with filters B are specific (every stimulus dimension has a specific effect on each brain area), whereas the endogenous dynamic propagates this initial effect to all connected brain areas via matrix A, effectively mixing and adding the responses of all stimulus dimensions. Therefore, this factorization separates stimulus-specific effects from the shared endogenous dynamic." It seems to me that the interpretation of the filter B (which is analogous to the "TRF") for the envelope, say, will be affected by the fact that the matrix A is likely going to be influenced by all sorts of other stimulus features that are not included in the model. In other words, residual stimulus-driven correlations that are captured in A might also distort what is going on in B, perhaps. So, again, I worry about interpreting the framework unless one can guarantee a near-perfect encoding model that can fully account for the stimulus-driven activity. I'd love to hear the authors' thoughts on this. (On this issue - the word "dominates" on page 12 seems very strong.)

      (3) Regarding the interpretation of the analysis of connectivity between movies and rest... that concludes that the intrinsic connectivity pattern doesn't really differ. This is interesting. But it seems worth flagging that this analysis doesn't really account for the specific dynamics in the network that could differ quite substantially between movie watching and rest, right? At the moment, it is all correlational. But the dynamics within the network could be very different between stimulation and rest I would have thought.

      (4) I didn't really understand the point of comparing the VARX connectivity estimate with the sparse-inverse covariance method (Figure 2D). What was the point of this? What is a reader supposed to appreciate from it about the validity or otherwise of the VARX approach?

      (5) I think the VARX model section could have benefitted a bit from putting some dimensions on some of the variables. In particular, I struggled a little to appreciate the dimensionality of A. I am assuming it has to involve both time lags AND electrode channels so that you can infer Granger causality (by including time) between channels. Including a bit more detail on the dimensionality and shape of A might be helpful for others who want to implement the VARX model.

      (6) A second issue I had with the inferences drawn by the authors was a difficulty in reconciling certain statements in the manuscript. For example, in the abstract, the authors write "We find that the recurrent connectivity during rest is largely unaltered during movie watching." And they also write that "Failing to account for ... exogenous inputs, leads to spurious connections in the intrinsic 'connectivity'."
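
      To illustrate the dimensionality question raised in comment (5), the sketch below simulates and refits a small VARX-style model with NumPy. It is a hedged illustration and not the authors' implementation: the lag convention (A acting on lags 1..p of the recordings, B acting on lags 0..q-1 of the stimulus), the array names, and all sizes and coefficient values are assumptions chosen for clarity, and plain least squares stands in for whatever estimator the paper uses.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_inputs = 3, 2      # recording channels, stimulus features (assumed sizes)
n_lags_a, n_lags_b = 2, 3        # lags in the recurrent (A) and input (B) filters
n_samples = 5000

# A has one (channel x channel) matrix per time lag: shape (n_lags_a, n_channels, n_channels).
A = np.zeros((n_lags_a, n_channels, n_channels))
A[0] = 0.3 * np.eye(n_channels)
A[0, 1, 0] = 0.4                 # channel 0 drives channel 1 at lag 1 (directed link)
A[1] = 0.1 * np.eye(n_channels)
# B has one (channel x input) matrix per time lag: shape (n_lags_b, n_channels, n_inputs).
B = 0.5 * rng.standard_normal((n_lags_b, n_channels, n_inputs))

# Simulate y[t] = sum_k A[k] y[t-k] + sum_k B[k] x[t-k] + noise.
x = rng.standard_normal((n_samples, n_inputs))
y = np.zeros((n_samples, n_channels))
start = max(n_lags_a, n_lags_b - 1)
for t in range(start, n_samples):
    y[t] = 0.1 * rng.standard_normal(n_channels)
    for k in range(1, n_lags_a + 1):
        y[t] += A[k - 1] @ y[t - k]
    for k in range(n_lags_b):
        y[t] += B[k] @ x[t - k]

# Stack lagged recordings and lagged stimulus into one design matrix and solve jointly.
design = np.array([
    np.concatenate([y[t - k] for k in range(1, n_lags_a + 1)]
                   + [x[t - k] for k in range(n_lags_b)])
    for t in range(start, n_samples)
])
coef, *_ = np.linalg.lstsq(design, y[start:], rcond=None)

# Unpack the stacked coefficients back into lag-indexed filters.
n_a = n_lags_a * n_channels
A_hat = coef[:n_a].reshape(n_lags_a, n_channels, n_channels).transpose(0, 2, 1)
B_hat = coef[n_a:].reshape(n_lags_b, n_inputs, n_channels).transpose(0, 2, 1)
print(A_hat.shape, B_hat.shape)
```

      In this layout, entry A[k, i, j] is the influence of channel j at lag k+1 on channel i, so a Granger-style test for the link j to i would ask whether A[:, i, j] can be set to zero across all lags without degrading the prediction of channel i.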

    3. Reviewer #2 (Public review):

      Summary:

      The authors apply the recently developed VARX model, which explicitly models intrinsic dynamics and the effect of extrinsic inputs, to simulated data and intracranial EEG recordings. This method provides a directed measure of 'intrinsic connectivity'. They argue this model is better suited to the analysis of task neuroimaging data because it separates the intrinsic and extrinsic activity. They show that intrinsic connectivity is largely unaltered during a movie-watching task compared to eyes-open rest; that intrinsic noise is reduced in the task; and that there is intrinsic directed connectivity from sensory to higher-order brain areas.

      Strengths:

      (1) The paper tackles an important issue with an appropriate method.

      (2) The authors validated their method on data simulated with a neural mass model.

      (3) They use intracranial EEG, which provides a direct measure of neuronal activity.

      (4) Code is made publicly available and the paper is written well.

      Weaknesses:

      It is unclear whether a linear model is adequate to describe brain data. To the authors' credit, they discuss this in the manuscript. Also, the model presented still provides a useful and computationally efficient method for studying brain data - no model is 'the truth'.

      Appraisal of whether the authors achieve their aims:

      As a methodological advancement highlighting a limitation of existing approaches and presenting a new model to overcome it, the authors achieve their aim. Generally, the claims/conclusions are supported by the results.

      The wider neuroscience claims regarding the role of intrinsic dynamics and external inputs in affecting brain data could benefit from further replication with another independent dataset and in a variety of tasks - but I understand if the authors wanted to focus on the method rather than the neuroscientific claims in this manuscript.

      Impact:

      The authors propose a useful new approach that solves an important problem in the analysis of task neuroimaging data. I believe the work can have a significant impact on the field.

    1. eLife Assessment

      This study presents useful findings on the differences between male and hermaphrodite C. elegans connectomes and how they may result in changes in locomotory behavioural outputs. However, the study appears incomplete with respect to the relationship between sex-specific AVA wiring and male mate-finding. Another area of concern is that the analysis does not consider animal-to-animal variability in the wiring when attempting to identify significant differences between the male and hermaphrodite.

    2. Reviewer #1 (Public review):

      Summary:

      This work seeks to predict differences in neural function and behavior between male and hermaphrodite C. elegans by comparing their nervous system maps of synaptic wiring. The authors then seek to validate some of their predictions by measuring differences in neural activity or behavior, including in response to neuron-specific genetic manipulations. In particular, the authors focus on the role of neuron AVA which has notable differences in its connectivity between the male and hermaphrodite, and they use this and behavior measurements to argue for a role of AVA in mate-searching behavior in males.

      Strengths:

      A major strength of this work is its approach to investigating differences in wiring between males and hermaphrodites in a systematic and quantitative way. The work laudably takes advantage of recently available comprehensive connectomes, including across sexes of the same species, and applies concepts from network science to mining their differences. Another strength of the work is that it supplements network analysis with measurements of behavior, including with cell-specific genetic manipulations. The measurements and analysis will be of value to the scientific community.

      Weaknesses:

      The evidence to support conclusions about the special relationship between differences in AVA's wiring and male mate-finding appears incomplete. The authors selected AVA based on changes in wiring and then observed a decrease in male chemotaxis towards hermaphrodites for animals in which neuron AVA is inhibited. This is presented as evidence that specifically AVA is important for mate-finding, and therefore that changes in wiring inform changes in function. But given AVA's known role in all reversal-related locomotion, it is important to more forcefully rule out an alternative hypothesis that the observed deficits in mate-finding could be explained by any reversal circuitry motor defect (including those without wiring differences), rather than specifically attributed to AVA and its wiring. Similarly, more evidence is needed to show that deficits in reversal circuitry preferentially affect mate-seeking compared to other goal-directed navigation behaviors.

      There are some areas where methods would benefit from further justification or clarification. For example, the work would benefit from better justification for selecting sub-networks to study, or for combining bilaterally symmetric neurons. More details are also needed to better interpret calcium imaging studies, such as details about the indicator and illumination wavelength and intensity.

      Finally, there are some weaknesses inherent to the entire field of connectomic analysis that are necessarily also present here. For example, it is unclear how to weight the relative contributions of chemical synapses versus electrical gap junctions when performing analyses of the wiring diagram, and the choice could potentially influence results. The wiring diagram also lacks information about timescales of neural dynamics or the role of neuromodulators or other molecular details that may influence the strength or function of various connections, and this poses a major challenge for predicting neural dynamics from neural wiring. For example, in their neural dynamics simulation, the authors assume that all neurons have the same conductance and reversal potentials - a standard practice - despite known diversity among neurons that limits the usefulness of this approach. It will be helpful to further acknowledge these limitations of the broader field.

    3. Reviewer #2 (Public review):

      Summary:

      In their study, Wang and co-workers aimed to identify sexual dimorphisms in the connectomes of male and hermaphrodite C. elegans, and link these to sex-related behaviors. To this end they analyzed and compared various network properties of simplified male and hermaphrodite connectome datasets, and then focused on the AVA premotor neurons, linking their distinctive connectivity with their differential influence on reversing behaviors between the two sexes.

      Strengths:

      The study employs a range of basic methods from network and computational neuroscience and provides experimental testing of one of the predictions of the analysis.

      Weaknesses:

      Various aspects of sexual dimorphism in the nervous system of C. elegans have already been described and discussed (reviewed, for example, in Emmons 2018, Walsh et al. 2021). In particular, Cook et al. (2019), who mapped the male connectome (which serves as the key data in the current study), included in their work an analysis of connectome-level differences between males and hermaphrodites. Unfortunately, the foundations of the current study are somewhat problematic, and the results it provides are rather rudimentary and do not provide substantial new insight.

      My critique of the study can be organized around several major issues.

      (1) Source data

      A large portion of the work is based on the analysis of a single male and a single hermaphrodite connectome dataset from Cook et al. 2019. These original connectomes were simplified in the current study, merging most individual neurons into neuron class nodes. As a measure of edge weight, the authors used the number of synaptic contacts between each two nodes. Cook et al. 2019 estimated this number to be of high variance, and even when considering unweighted connectivity (whether two nodes are at all connected or not) substantial variability exists between independent connectome datasets (e.g., Birari and Rabinowitch, 2024). Therefore, basing the analysis on synaptic weights from a single connectome (for each sex) may be somewhat unreliable.

      On top of this, a huge gap may exist between connectome structure and function, especially when overlooking: (1) the sign of the synapses (excitatory vs. inhibitory), (2) synaptic efficiency (a single strong synapse may be more efficient than multiple weak synapses), (3) the spatial distribution of the synapses (clusters of synapses, for example, may be stronger than scattered synapses). These should at the very least be acknowledged. Moreover, the pooling of electrical and chemical synapses done by the authors is problematic, as is assuming all electrical synapses are bidirectional. These and other factors may undermine the results of the analysis, and, again, at the very least should be considered and discussed.

      A minimal validation of the analysis could be achieved by sensitivity analyses. For example, studying how consistent the results are when: separately analyzing the chemical and electrical networks; binarizing synaptic contacts to existing vs. non-existing connections regardless of weight; and comparing with additional connectome datasets (at least for hermaphrodites).
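
      As an illustration of the binarization check suggested above, the following is a minimal Python sketch, not the authors' pipeline. It reduces weighted edges to present/absent connections and measures how much two datasets agree using the Jaccard index, which is an assumed choice of overlap measure. The neuron pairs and synapse counts are hypothetical toy values, not data from Cook et al. 2019.

```python
def binarize(weights, threshold=1):
    """Return the set of (pre, post) edges whose synapse count is >= threshold."""
    return {edge for edge, count in weights.items() if count >= threshold}


def jaccard(edges_a, edges_b):
    """Fraction of edges shared by both sets, out of all edges seen in either."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 1.0


# Hypothetical toy synapse counts (assumption: NOT real connectome data).
dataset_a = {("RIC", "AVA"): 12, ("AIY", "RIA"): 5, ("DVC", "AVA"): 3}
dataset_b = {("RIC", "AVA"): 7, ("AIY", "RIA"): 1, ("AFD", "AIY"): 4}

overlap = jaccard(binarize(dataset_a), binarize(dataset_b))
print(f"Edge overlap (Jaccard): {overlap:.2f}")
```

      Raising `threshold` shows how sensitive the overlap is to weak connections, which is one simple way to probe the robustness of a weight-based result.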

      Another important approach for validation would be synaptic labeling of key pathways, in order to establish the extent to which they maintain sexual dimorphism across the population (as performed, for example, by Cook et al., 2019; Pechuk et al. 2022).

      (2) Statistical analysis

      Comparing any two connectomes will show differences in connectivity and other network properties. The question is to what degree the differences found in the current study between two particular male and hermaphrodite connectomes transcend such basic inconsistencies. This fundamental question is not addressed in the manuscript.

      A second major concern is that a considerable portion of the results are based on improper comparisons between male and hermaphrodite connectome measures.

      In Figure 1D,I,M,V, Figure 2D,H,L, Figure 4E,I there is no sense in statistically testing the differences between hermaphrodite sex-specific (N=2) and shared nodes. The sample size is way too small. Corresponding conclusions about male-specific neurons being different from hermaphrodite-specific neurons in terms of connectivity are thus improperly founded. Similarly, the analyses in Figure 1P,S, 2O,R contain more data points, because of connectivity, but could still be misleading, since all the edges there contain either HSN or VC (just two nodes).

      More so, any claim comparing the differences between two measures in males vs. hermaphrodites should be based on a 2X2 (or 3X2) design (e.g., tested using 2-way ANOVA with an interaction term). It is erroneous to interpret comparisons between two effects without directly comparing them (Makin et al., 2019).
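
      To make the point about direct comparison concrete, here is a minimal Python sketch of an interaction contrast (the difference between two differences) with a percentile bootstrap interval. This is a swapped-in, simpler stand-in for the 2-way ANOVA interaction term mentioned above, not an equivalent test. The group labels and values are hypothetical, and a bootstrap is no more trustworthy than an ANOVA when a cell contains only a couple of observations.

```python
import random
from statistics import mean


def interaction_contrast(cells):
    """(specific - shared) in males minus (specific - shared) in hermaphrodites."""
    male_effect = mean(cells["male", "specific"]) - mean(cells["male", "shared"])
    herm_effect = mean(cells["herm", "specific"]) - mean(cells["herm", "shared"])
    return male_effect - herm_effect


def bootstrap_ci(cells, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling within each of the four cells."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        resampled = {
            key: rng.choices(values, k=len(values)) for key, values in cells.items()
        }
        stats.append(interaction_contrast(resampled))
    stats.sort()
    lower = stats[int(n_boot * alpha / 2)]
    upper = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical toy measurements (assumption: NOT values from the manuscript).
cells = {
    ("male", "specific"): [10, 12, 11, 13],
    ("male", "shared"): [5, 6, 5, 7],
    ("herm", "specific"): [6, 7, 6, 8],
    ("herm", "shared"): [5, 6, 5, 6],
}

contrast = interaction_contrast(cells)
lower, upper = bootstrap_ci(cells)
print(f"interaction contrast = {contrast:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

      The claim "the effect differs between sexes" rests on this single contrast, not on one within-sex test being significant while the other is not.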

      When more than one comparison is performed, a one-way ANOVA should precede post hoc analyses, and corrections for multiple comparisons should be carried out and reported.

      The plots in Figure 1E,W and Figure 4F,J are illustrative but do not contain any statistical test to support the claims about which functions are emphasized in which sex. They also rely on a very superficial categorization of individual neuron class function, whereas in reality, in C. elegans many neurons serve multiple functions.

      In Figures 5-7 individual data points should be plotted, and the error bars and boxes should be defined (in all figures).

      Finally, Figure 3C,F,I,L,N,P and Figure 5A-C lack statistical analysis (e.g., via bootstrapping). In addition, the term 'significantly' in the text should be reserved for statistical significance.

      (3) Testing network predictions

      A key emphasis of the network analysis concerns the AVA premotor neurons. It is well established that reversing behavior is controlled by premotor neurons such as AVA (e.g., Maricq et al. 1995) and that AVA activity is spontaneous and coupled to reversing (e.g., Chronis et al. 2007). More so, it has already been shown that male reversal frequency is higher than that of hermaphrodites (e.g., Mah et al. 1992; Zhao et al. 2003). Similar findings in the current study are thus not very surprising. The current study does add some new detail. Namely, the higher frequency of AVA activity in adult males compared to hermaphrodites, and the presumably sex-specific roles of RIC and DVC as well as several AVA glutamate receptors, in modulating reversing. At the same time, PQR, for example, showed no such role, contrary to the predictions.

      Incidentally, AVA is not a commander neuron, but rather a command or, preferably, a premotor neuron. Altogether, the major specific focus of the analysis, predicting a sexually dimorphic role for AVA, is not very novel.

      (4) Further predictions

      The discussion section presents several additional predictions stemming from the analysis. However, to me, they seem almost arbitrary.

      The statement claiming that the authors found the male pharyngeal connectome to be more strongly wired to the main connectome as opposed to previous findings, is unclear. Sex-specific differences in connectivity between the pharyngeal and somatic networks are immediately evident from the connectomes and do not require graph theoretical tools to be discovered (page 4 and discussion of Figure 3N).

      The prediction that the AIY→RIA→RMD_DV circuit may facilitate pheromone-guided olfactory steering behavior in males is not very strong. On the one hand, it is known that males respond to sex pheromones (notably, however, if these pheromone receptors are ectopically expressed in hermaphrodites then hermaphrodites also respond to the pheromones [Wan et al. 2019]). Since these pheromone-sensing neurons are also involved in other sensory processes, it is quite trivial that the circuits involved in general sensory-based steering should be shared with specific pheromone-based steering. The fact that the interneurons in the circuit may be more strongly connected (excitatory, inhibitory, electrical?) in males could imply many things but does not add much to the picture.

      The authors also mention AFD as having more synaptic contacts with AIY in males, and link this somehow to the dimorphic expression of insulin-like peptides in AFD. However, neuropeptide-based transmission is largely independent of synaptic connections, so I don't see the relevance.

      (5) Methods

      The example provided in the Methods section for calculating graph measures is very helpful. I am not sure, however, why the length of a path was defined as the reciprocal sum of the edge weights of the connections within the path. Why the reciprocal? Is it the sum of the reciprocals? Do more synaptic contacts imply a shorter path?
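
      To show what one reading of this definition implies, here is a minimal Python sketch that assumes the path length is the sum of the reciprocals of the synapse counts along the path; whether the manuscript uses this or another definition is exactly the open question. The node names and counts are hypothetical.

```python
import heapq


def shortest_path_length(synapse_counts, source, target):
    """Dijkstra over directed edges, with edge length assumed to be 1 / synapse count."""
    graph = {}
    for (pre, post), count in synapse_counts.items():
        graph.setdefault(pre, []).append((post, 1.0 / count))

    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for nxt, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(queue, (candidate, nxt))
    return float("inf")  # target not reachable from source


# Hypothetical toy network (assumption: counts are positive integers).
toy = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 2}
length = shortest_path_length(toy, "A", "C")
print(f"A -> C shortest length: {length:.2f}")
```

      Under this assumption the two-step route A→B→C (0.1 + 0.1) counts as shorter than the direct A→C connection (0.5), so more synaptic contacts do imply a shorter path.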

      The description in the text (as opposed to the Methods section) of node strength is not very clear: "The node strength measures how strongly a node directly possesses with other nodes in the network" - This should be clarified.

      For the RC simulation, I assume the sodium and potassium conductances are fixed. If so, they are leak currents themselves. What does the extra leak current represent? Obviously the simulation includes multiple arbitrary assumptions and parameter values. It would be useful to discuss at least the considerations for choosing the model design and parameters. I also assume that the delayed responses in the bottom neurons in Figure 4A (that still respond) are due to indirect synaptic connections (path lengths > 1)?

    1. eLife Assessment

      This study reports that activation of TFEB promotes lysosomal exocytosis and clearance of cholesterol from lysosomes, the strength of evidence for which is convincing with appropriate and validated methodology in line with current state-of-the-art. The significance of the findings is important in the context of Niemann-Pick Disease Type C as well as other subfields.

    2. Reviewer #2 (Public review):

      Summary:

      This study presents an important finding that the activation of TFEB by sulforaphane (SFN) could promote lysosomal exocytosis and biogenesis in NPC, suggesting a potential mechanism by SFN for the removal of cholesterol accumulation, which may contribute to the development of new therapeutic approaches for NPC treatment.

      Strengths:

      The cell-based assays are convincing, utilizing appropriate and validated methodologies to support the conclusion that SFN facilitates the removal of lysosomal cholesterol via TFEB activation.

      Comments on revisions:

      The authors have addressed most of my questions. I have only one minor technical point to emphasize, which does not affect the overall strength of the evidence for this project.

      The pKa values of pHrodo Green (P35368, pKa=6.757) and pHrodo Red-Dex (P10361, pKa=6.816) are very similar. Prof. Xu's article, cited in the response letter (Hu, Li et al. 2022), is an excellent example of lysosomal pH measurement. He used LysoTracker Red DND-99 for a rough estimation of lysosomal acidity, and for accurate monitoring of lysosomal pH, he employed the ratiometric OG488-dex (pKa 4.6).

    1. eLife Assessment

      This revision of important work is a versatile addition to the chemical protein modifications and bioconjugation toolbox in synthetic biology. The technology developed cleverly uses Connectase to irreversibly fuse proteins of interest together so they can be studied in their native context, with compelling well-controlled data showing the technique works for various protein partners. This work will help multiple fields to explore multi-function constructs in basic synthetic biology. This work will also be of interest to those studying fusion oncoproteins commonly expressed in various human pathologies.

    2. Reviewer #1 (Public review):

      Fuchs describes a novel method of enzymatic protein-protein conjugation using the enzyme Connectase. The author is able to make this process irreversible by screening different Connectase recognition sites to find an alternative sequence that is also accepted by the enzyme. They are then able to selectively render the byproduct of the reaction inactive, preventing the reverse reaction, and add the desired conjugate with the alternative recognition sequence to achieve near-complete conversion. I agree with the authors that this novel enzymatic protein fusion method has several applications in the field of bioconjugation, ranging from biophysical assay conduction to therapeutic development. Previously the author has published on the discovery of the Connectase enzymes and has shown its utility in tagging proteins and detecting them by in-gel fluorescence. They now extend their work to include the application of Connectase in creating protein-protein fusions, antibody-protein conjugates, and cyclic/polymerized proteins. As mentioned by the author, enzymatic protein conjugation methods can provide several benefits over other non-specific and click chemistry labeling methods. Connectase specifically can provide some benefits over the more widely used Sortase, depending on the nature of the species that is desired to be conjugated. Overall, this method provides a novel, reproducible way to enzymatically create protein-protein conjugates.

      The manuscript is well-written and will be of interest to those who are specifically working on chemical protein modifications and bioconjugation.

      Comments on revisions:

      The authors have improved the manuscript significantly by clarifying the questions raised adding new text, providing additional references and/or adding additional data. The thorough study and efficiency of the method for enzymatic protein-protein conjugation using the enzyme Connectase warrants publication of this manuscript in its current form.

    3. Reviewer #2 (Public review):

      Summary:

      Unlike previous traditional protein fusion protocols, the author claims their proposed new method is fast, simple, specific, reversible, and results in a complete 1:1 fusion. A multi-disciplinary approach from cloning and purification, biochemical analyses, and proteomic mass spec confirmation revealed fusion products were achieved.

      Strengths:

      The author provides convincing evidence that an alternative to traditional protein fusion synthesis is more efficient with 100% yields using connectase. The author optimized the protocol's efficiency with assays replacing a single amino acid and identification of a proline aminopeptidase, Bacillus coagulans (BcPAP), as a usable enzyme in the fusion reaction. Multiple examples including Ubiquitin, GST, and antibody fusion/conjugations reveal how this method can be applied to a diverse range of biological processes.

      Weaknesses:

      Though the ~100% ligation efficiency is an advancement, the long recognition linker may be the biggest drawback. For large native proteins that are challenging/cannot be synthesized and require multiple connectase ligation reactions to yield a complete continuous product, the multiple interruptions with long linkers will likely interfere with protein folding, resulting in non-native protein structures. This method will be a good alternative to traditional approaches as the author mentioned but limited to generating epitope/peptide/protein tagged proteins, and not for synthetic protein biology aimed at examining native/endogenous protein function in vitro.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Fuchs describes a novel method of enzymatic protein-protein conjugation using the enzyme Connectase. The author is able to make this process irreversible by screening different Connectase recognition sites to find an alternative sequence that is also accepted by the enzyme. They are then able to selectively render the byproduct of the reaction inactive, preventing the reverse reaction, and add the desired conjugate with the alternative recognition sequence to achieve near-complete conversion. I agree with the authors that this novel enzymatic protein fusion method has several applications in the field of bioconjugation, ranging from biophysical assay conduction to therapeutic development. Previously the author has published on the discovery of the Connectase enzymes and has shown its utility in tagging proteins and detecting them by in-gel fluorescence. They now extend their work to include the application of Connectase in creating protein-protein fusions, antibody-protein conjugates, and cyclic/polymerized proteins. As mentioned by the author, enzymatic protein conjugation methods can provide several benefits over other non-specific and click chemistry labeling methods. Connectase specifically can provide some benefits over the more widely used Sortase, depending on the nature of the species that is desired to be conjugated. However, due to a similar lengthy sequence between conjugation partners, the method described in this paper does not provide clear benefits over the existing SpyTag-SpyCatcher conjugation system.  Additionally, specific disadvantages of the method described are not thoroughly investigated, such as difficulty in purifying and separating the desired product from the multiple proteins used. Overall, this method provides a novel, reproducible way to enzymatically create protein-protein conjugates.

      The manuscript is well-written and will be of interest to those who are specifically working on chemical protein modifications and bioconjugation.

      I'd like to comment on two points.

      (1) The benefits over the SpyTag-SpyCatcher system. Here, the conjugation partners are fused via the 12.3 kDa SpyCatcher protein, which is considerably larger than the Connectase fusion sequence (19 aa). This is mentioned in the introduction (p. 1 ln 24-26). Furthermore, SpyTag-SpyCatcher fusions are truly irreversible, while Connectase/BcPAP fusions may be reversed (p. 8, ln 265-273). For example, target proteins (e.g., AGAFDADPLVVEI-Protein) may be covalently fused to functionalized magnetic beads (e.g., Bead-ELASKDPGAFDADPLVVEI) in order to perform a pulldown assay. After the assay, the target protein and any bound interactors could be released from the beads by the addition of a Connectase / peptide (AGAFDAPLVVEI) mixture.

      In a related technology, the SpyTag-SpyCatcher system was split into three components, SpyLigase, SpyTag and KTag (Fierer et al., PNAS 2014). The resulting method introduces a sequence between the fusion partners (SpyTag (13aa) + KTag (10aa)), which is similar in length to the Connectase fusion sequence (p. 8, ln 297 - 298). Compared to the original method, however, this approach seems to require longer incubation times, while yielding less fusion product (Fierer et al., Figure 2).

      (2) Purification of the fusion product. The method is actually advantageous in this respect, as described in the discussion (p. 8, ln 258-264). Examples are now provided in Figure 6.

      Reviewer #2 (Public review):

      Summary:

      Unlike previous traditional protein fusion protocols, the author claims their proposed new method is fast, simple, specific, reversible, and results in a complete 1:1 fusion. A multi-disciplinary approach from cloning and purification, biochemical analyses, and proteomic mass spec confirmation revealed fusion products were achieved.

      Strengths:

      The author provides convincing evidence that an alternative to traditional protein fusion synthesis is more efficient with 100% yields using connectase. The author optimized the protocol's efficiency with assays replacing a single amino acid and identification of a proline aminopeptidase, Bacillus coagulans (BcPAP), as a usable enzyme in the fusion reaction. Multiple examples including Ubiquitin, GST, and antibody fusion/conjugations reveal how this method can be applied to a diverse range of biological processes.

      Weaknesses:

      Though the ~100% ligation efficiency is an advancement, the long recognition linker may be the biggest drawback. For large native proteins that are challenging/cannot be synthesized and require multiple connectase ligation reactions to yield a complete continuous product, the multiple interruptions with long linkers will likely interfere with protein folding, resulting in non-native protein structures. This method will be a good alternative to traditional approaches as the author mentioned but limited to generating epitope/peptide/protein tagged proteins, and not for synthetic protein biology aimed at examining native/endogenous protein function in vitro.

      The assessment is fair, and I have no further comments to add.

      Reviewer #1 (Recommendations for the authors):

      Major/Experimental Suggestions:

      (1) Throughout the paper only one reaction shown via gels had 100% conversion to desired product (Figure 3C). It is misleading to title a paper with absolutes such as "100% product yield", when the majority of reactions show >95% product yield, without any purification. Please change the title of the manuscript to something along the lines of "Novel Irreversible Enzymatic Protein Fusions with Near-Complete Product Yield".

      The conjugation reaction is thermodynamically favored. It is driven by the hydrolysis of a peptide bond (P|GADFDADPLVVEI), which typically releases 8 - 16 kJ/mol energy. This should result in a >99.99% complete reaction (ΔG° = -RT ln (Product/Educt)). In line with this, 99% - 100% of the less abundant educts (LysS, Figure 3A; MBP, Figure 3B; Ub-Strep, Figure 3C) are converted in the time courses (Figure 3D-F show different reaction conditions, which slow down conjugate formation). 100% conversion is also shown in Figure 5, Figure 6, and Figure S4. Likewise, an LC-MS analysis (Figure S2) showed 99.6% relative fusion product signal intensity after 4 h reaction time (0.13% and 0.25% educts). In this experiment, the proline had been removed from 99.8% of the peptide byproducts (P|GADFDADPLVVEI). It is clear that this reaction is still ongoing and that >99.99% of the prolines will be removed from the peptides in time. These findings suggest that the conjugation reaction gradually slows down as less educt remains available, but eventually reaches completion.

      For some experiments, lower product yields (e.g. 97% in Figure 3B) are reported in the paper. These were calculated with Yield = 100% x Product / (Educt1 + Educt2 + Product). With this formula, 100% conjugation can only be achieved with exactly equimolar educt quantities, because both educt 1 and educt 2 need to be converted entirely. If educt 1 is available in excess, for example because of protein concentration measurement inaccuracies or pipetting errors, some of it will be left without a fusion partner. In the case of Figure 3B, the mixture seems to have contained 3% more GST. These are methodological inaccuracies.
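
      The effect of a small educt excess on this yield formula can be sketched in a few lines of Python. This is a minimal illustration that assumes the limiting educt is converted completely; the function name and the input quantities are hypothetical, chosen to mirror the 3% excess mentioned for Figure 3B.

```python
def max_yield(educt1, educt2):
    """Yield = 100% x Product / (Educt1 + Educt2 + Product) at complete conversion.

    Assumption: the limiting educt is fully consumed in a 1:1 fusion, so only
    the excess of the other educt remains.
    """
    product = min(educt1, educt2)
    leftover1 = educt1 - product
    leftover2 = educt2 - product
    return 100.0 * product / (leftover1 + leftover2 + product)


# Hypothetical relative amounts: 3% excess of one educt.
print(f"{max_yield(1.03, 1.00):.1f}%")
# Exactly equimolar educts.
print(f"{max_yield(1.00, 1.00):.1f}%")
```

      With a 3% excess of one educt, the formula caps at roughly 97% even when the reaction itself has gone to completion, which is the scenario described for Figure 3B.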

      (2) Please provide at least one example of a purified desired product, and mention the difficulties involved as a disadvantage to this particular method. Separating BcPAP, Connectase, and the desired protein-protein conjugate may prove to be quite difficult, especially when Connectase cleaves off affinity tags.

      Examples are now provided in Figure 6. As described in the discussion (p. 8, ln 258-264), the simple product purification is one of the advantages of the method.

      (3) For the antibody conjugate, please provide an example of conjugating an edduct that would prove to be more useful in the context of antibodies. For example, as you mention in the introduction, conjugation of fluorophores, immobilization tags such as biotin, and small molecule linker/drugs are useful bioconjugates to antibodies.

      Antibody-biotinylation is now shown in Figure S6; Antibody-fluorophore conjugates are part of Figures S5 and S7.

      (4) Please assess the stability of these protein-protein conjugates under various conditions (temperature, pH, time) to ensure that the ligation via Connectase is stable over a broad array of conditions. In particular, a relevant antibody-conjugate stability assay should be done over the period of 1-week in both buffer and plasma to show applicability for potential therapeutics.

      The stability of an antibody-biotin conjugate in blood plasma over 7 days at different temperatures is now shown in Figure S7.

      Generally, Connectase introduces a regular peptide bond (Asp-Ala) with a high chemical and physical stability (e.g. 10 min incubation at 95°C in SDS-PAGE loading buffer; H2O-formic acid / acetonitrile gradients for LC-MS). The sequence may be susceptible to proteases, although this is not the case in HEK293 cells (antibody expression), E. coli, or blood plasma (Figure S7).

      (5) Please conduct functional assays with the antibody-protein/peptide conjugates to show that the antibody retains binding capabilities to the HER-2 antigen and the modification was site-selective, not interfering with the binding paratope or binding ability of the antibody in any way. This can be done through bio-layer interferometry, surface plasmon resonance, ELISA, etc.

      We plan the immobilization of the HER2 antibody on microplates and its use in an ELISA. However, this experiment requires significant testing and optimizations. It will be part of a future paper on the use of Connectase for protein immobilization.

      For now, the mass spectrometry data provide clear evidence of a single site-selective conjugation, as the C-terminal ELASKDPGAFDADPLVVEI-Strep sequence is replaced by ELASKDAGAFDADPLVVEI(-Ub). Given that the conjugation sites at the C-termini are far from the antigen binding sites, and have already been used in a number of other approaches (e.g., SpyTag, SnapTag, Sortase), it appears unlikely that these conjugations interfere with antigen binding.

      (6) Please include gels of all proteins used in ligation reactions after purification steps in the SI to show that each species was pure.

      The pure proteins are now shown in Figure S9.

      (7) Please provide the figures (not just tables) of LC/MS deconvoluted mass spectra graphs for all conjugates, either in the main text or the SI.

      Please specify which spectra you are missing. I believe all relevant spectra are shown in Figures 4, 5, and S3. The primary data can be found in Dataset S2.

      (8) Please provide more information in the methods section on exactly how the densitometry quantification of gel bands was performed with ImageJ.

      Details on the quantification with Image Studio Lite 5.2 were added in the method section (p. 17, ln 461-463).

      Minor Suggestions:

      (1) Page 1, line 19: can include one sentence on what assays these particular bioconjugations are useful for (e.g. internalization cell studies, binding assays, etc.)

      I prefer not to provide additional details here to keep the text concise and focused.

      (2) Page 1, line 22: "three to ten equivalents" instead of 3x-10x.

      Done.

      (3) Page 1, line 23: While NHS labeling is widely considered non-specific, maleimide conjugation to free cysteines is generally considered specific for engineered free cysteine residues, since native proteins often do not have free cysteine residues available for conjugation. If you are referring to the potential of maleimides to label lysines as well, that should be specifically stated.

      I modified the sentence, now stating that these methods "can be" unspecific.

      As pointed out, it is possible to achieve specificity by eliminating all other free cysteines and/or engineering a cysteine in an appropriate position. In many other cases, however (e.g., natural antibodies), several cysteines are available, or the sample contains other proteins/peptides. I did not want to go into more detail here and refer to the cited review.

      (4) Page 1, line 31: "and an oligoglycine G(1-5)-B"

      Done.

      (5) Page 1, line 34: It is not clear where in the source these specific Km values are coming from, considering these are variable based on specific conditions/substrates and tend to be reaction-specific.

      I cited another review, which lists the same values, along with a few other measurements (Jacobitz et al., Adv Protein Chem Struct Biol 2017, Table 2). It is clear that each of these measurements differs somewhat, but they are generally comparable (K<sub>M</sub>(LPETG) = 5500 - 8760 µM; K<sub>M</sub>(GGGGG) = 140 - 196 µM). I chose the cited study (Frankel et al., Biochemistry 2005), because it also investigated hydrolysis rates. In this study, the measurements are derived from the plots in Figure 2.

      (6) Page 1, line 47: the comparison to western blots feels a little like apples to oranges, even though this comparison was made in previous literature. Engineering an expressed protein to have this tag and then using the tag to detect and quantify it, feels more akin to a tagging/pull down assay than a western blot in which unmodified proteins are easily detected.

      It is akin to a frequently used type of western blot with tag-specific antibodies, e.g. Anti-His<sub>6</sub>, -Streptavidin, -His<sub>6</sub>, -HA, -cMyc, -Flag. I modified the sentence to clarify this.

      (7) Page 2, line 51: "Connectase cleaves between the first D and P amino acids in the recognition sequence, resulting in an N-terminal A-ELASKD-Connectase intermediate and a C-terminal PGAFDADPLVVEI peptide."

      I prefer the current sentence, because we assume that a bond between the aspartate and Connectase is formed before PGAFDADPLVVEI is cleaved off.

      (8) Page 3, line 94: "Exact determination is not possible due to reversibility of the reaction", the way it is stated now sounds like it is a flaw in the methods. Also, update Figure 2 to read "Estimated relative ligation rate".

      Done.

      (9) Page 3, lines 101-107: This is worded in a confusing way. It can either be X<sub>1</sub> or X<sub>2</sub> that is inactivated depending on if the altered amino acid is on the original protein sequence or on the desired edduct to conjugate. You first give examples of how to render other amino acids inactive, but then ultimately state that proline is made inactive, so separate the two distinct possibilities a bit more clearly.

      The reaction requires the inactivation of X<sub>1</sub>, without affecting X<sub>2</sub> (ln 100 - 102). This is true, no matter whether it is X<sub>1</sub> = A, C, S, or P that is inactivated. I added a sentence to clarify this (ln 102 – 103).

      (10) Page 4, line 118: Give a one-sentence justification for why these proteins were chosen to work with (easy to express, stable, etc).

      Done.

      (11) Page 5, line 167: "payload molecules".

      Done.

      (12) Page 5, lines 170-173: Word this more clearly- "full conversion with many of these methods is difficult on antibodies due to each heavy and light chain being modified separately, resulting in only a total yield of 66% DAR4 even when 90% of each chain is conjugated."

      I rephrased the section.
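
      The arithmetic behind the 66% figure quoted in this comment can be checked with a short Python sketch. It assumes four conjugation sites per antibody (two heavy and two light chains) that are modified independently with the same per-chain yield; the function name is hypothetical.

```python
def fully_conjugated_fraction(per_chain_yield, n_sites=4):
    """Fraction of antibodies with all sites conjugated.

    Assumption: each of the n_sites chains is modified independently with the
    same probability, so the fractions multiply.
    """
    return per_chain_yield ** n_sites


fraction = fully_conjugated_fraction(0.90)
print(f"DAR4 fraction at 90% per-chain conjugation: {fraction:.1%}")
```

      Under the independence assumption, 90% conjugation per chain gives 0.9^4 ≈ 0.656, which matches the roughly 66% DAR4 yield stated in the comment.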

      (13) Page 8, line 290: Discuss other disadvantages of this method including difficulties purifying and in incorporating such a long sequence into proteins of interest.

      Product purification is shown in the new Figure 6. As stated above, I consider the simple purification process an advantage of the method. The genetic incorporation of the sequence into proteins is a routine process and should not pose any difficulties. The disadvantages of long linker sequences between fusion partners are now discussed (p.8 – 9, ln 300-302).

      (14) Page 10, line 341: "The experiment is described and discussed in detail in a previously published paper.31"

      Done.

      Reviewer #2 (Recommendations for the authors):

      Minor Points:

      (1) It's unclear how the author derived 100% ligation rate with X = Proline in Figure 2 when there is still residual unligated UB-Strep at 96h. Please provide an expanded explanation for those not familiar with the protocol. Is the assumption made that there will be no UB-Strep if the assay was carried out beyond 96h?

      I clarified the figure legend. The assay shows the formation of an equilibrium between educts and products. Therefore, only ~50% Ub-Strep is used with X = Proline (see p. 2, ln 79 - 81). The "relative ligation rate" refers to the relative speed with which this equilibrium is established. The highest rate is seen with X = Proline, and it is set to 100%. The other rates are given relative to the product formation with X = Proline.

      (2) Though the qualitative depiction of the data in Figure 3 is appreciated, an accompanying graphical representation of the data in the same figure will greatly enhance reception and better comprehension of several of the author's conclusions.

      Graphs are now shown in Figure S1.

      (3) Figure 3 panel E is misaligned. Please align it with panel B above it.

      Done, thank you.

      (4) The author refers to 'The resulting circular assemblies (37% UB2...)' in the text but identifies it as UB-C2 in Figure 5B. Is this a mistake or does UB2 refer to another assembly not mentioned in the Figures? Please check for inconsistencies.

      All circular assemblies are now labeled Ub-C<sub>1-6</sub>.

      (5) Finishing with a graphical schematic that depicts the entire protocol in a simple image would be much appreciated and well-received by readers. Including the scheme with A and B proteins, the recognition linkers, the addition of connectase and BcPAP, etc. to the final resulting protein with connected linker.

      A graphical summary of the reaction is now included in Figure 6.

    1. eLife Assessment

      This manuscript addresses a mechanism by which dopamine (DA) regulates synaptic plasticity. The authors build upon their previous finding that DA applied after a timing pattern that ordinarily induces long-term depression (LTD) now induces long-term potentiation (LTP). The new findings that this "DA-dependent LTP" involves de novo protein synthesis, a cyclicAMP signalling pathway, and calcium-permeable AMPA receptors (CP-AMPARs) are of valuable significance. The conclusions are convincing and largely supported by the evidence provided.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Fuchsberger et al. demonstrate a set of experiments which ultimately identifies the de novo synthesis of GluA1-, but not GluA2-containing Ca2+ permeable AMPA receptors as a key driver of dopamine-dependent LTP (DA-LTP) during conventional post-before-pre spike-timing dependent (t-LTD) induction. The authors further identify adenylate cyclase 1/8, cAMP, and PKA as the crucial mediators of these actions. While some comments have been identified below, the experiments presented are thorough and address the aims of the manuscript, figures are presented clearly (with minor comments), and experimental sample sizes and statistical analyses are suitable. Suitable controls have been utilized to confirm the role of Ca2+ permeable AMPAR. This work provides a valuable step forward built on convincing data towards understanding the underlying mechanisms of spike-timing dependent plasticity and dopamine.

      Strengths:

      Appropriate controls were used.

      The flow of data presented is logical and easy to follow.

      The quality of the data is solid.

      Weaknesses:

      Our concerns raised within the first round of review have been appropriately addressed by the authors.

    3. Reviewer #2 (Public review):

      Summary:

      The aim was to identify the mechanisms that underlie a form of long-term potentiation (LTP) that requires activation of dopamine (DA).

      Strengths:

      The authors have provided multiple lines of evidence that support their conclusions; namely that this pathway involves activation of a cAMP / PKA pathway that leads to the insertion of calcium-permeable AMPA receptors.

      Weaknesses:

      Some of the experiments could have been conducted in a more convincing manner.

    4. Reviewer #3 (Public review):

      The manuscript of Fuchsberger et al. investigates the cellular mechanisms underlying dopamine-dependent long-term potentiation (DA-LTP) in mouse hippocampal CA1 neurons. The authors conducted a series of experiments to measure the effect of dopamine on the protein synthesis rate in hippocampal neurons and its role in enabling DA-LTP. The key results indicate that protein synthesis is increased in response to dopamine and neuronal activity in the pyramidal neurons of the CA1 hippocampal area, mediated via the activation of adenylate cyclases subtypes 1 and 8 (AC1/8) and the cAMP-dependent protein kinase (PKA) pathway. Additionally, the authors show that postsynaptic DA-induced increases in protein synthesis are required to express DA-LTP, while not required for conventional t-LTP.

      The increased expression of the newly synthesized GluA1 receptor subunit in response to DA supports the formation of homomeric calcium-permeable AMPA receptors (CP-AMPARs). This evidence aligns well with data showing that DA-LTP expression requires the GluA1 AMPA subunit and CP-AMPARs, as DA-LTP is absent in the hippocampus of a GluA1 genetic knock-out mouse model.

      Comments on revisions:

      The authors adequately addressed all my comments.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Fuchsberger et al. demonstrate a set of experiments that ultimately identifies the de novo synthesis of GluA1-, but not GluA2-containing Ca2+ permeable AMPA receptors as a key driver of dopamine-dependent LTP (DA-LTP) during conventional post-before-pre spike-timing dependent (t-LTD) induction. The authors further identify adenylate cyclase 1/8, cAMP, and PKA as the crucial mediators of these actions. While some comments have been identified below, the experiments presented are thorough and address the aims of the manuscript, figures are presented clearly (with minor comments), and experimental sample sizes and statistical analyses are suitable. Suitable controls have been utilized to confirm the role of Ca2+ permeable AMPAR. This work provides a valuable step forward built on convincing data toward understanding the underlying mechanisms of spike-timing-dependent plasticity and dopamine.

      Strengths:

      Appropriate controls were used.

      The flow of data presented is logical and easy to follow.

      The quality of the data, except for a few minor issues, is solid.

      Weaknesses:

      The drug treatment duration of anisomycin is longer than the standard 30-45 minute duration typically used in the field, and the concentration is also higher (500 µM vs 40 µM). Given the long-term toxicity of these kinds of drugs, it's unclear why the authors used such a long and intense drug treatment.

      In an initial set of control experiments (Figure S1C-D) we wanted to ensure that protein synthesis was definitely blocked and therefore used a relatively high concentration of anisomycin and a relatively long pre-incubation period. We agree with the Reviewer that we cannot exclude the possibility that this treatment could compromise cell health in addition to the protein synthesis block. Therefore, we carried out an additional experiment with an alternative protein synthesis inhibitor, cycloheximide, at a lower standard concentration (10 µM), which confirmed a significant reduction in the puromycin signal (Figure S1A-B). Together these results support the conclusion that puromycin signal is specific to protein synthesis in our labelling assay.

      Furthermore, in the electrophysiology experiments, we used 500 μM anisomycin in the patch pipette solution. Under these conditions, we recorded a stable EPSP baseline for 60 minutes, indicating that the treatment did not cause toxic effects to the cell (Figure S1F). This high concentration would ensure an effective block of local translation at dendritic sites. Nevertheless, we also carried out this experiment with cycloheximide at a lower standard concentration (10 µM) and observed a similar result with both protein synthesis inhibitors (Figure 1F).

      With some of the normalizations (such as those in S1) there are dramatic differences in the baseline "untreated" puromycin intensities - raising some questions about the overall health of slices used in the experiments.

      We agree with the Reviewer that there is a large variability in the normalised puromycin signal which might be due to variability in the health of slices. However, we assume that the same variability would be present in the treated slices, which showed, despite the variability, a significant inhibition of protein synthesis. To avoid any bias by excluding slices with low puromycin signal in the control condition, we present the full dataset.

      The large set of electrophysiology experiments carried out in our study (all recorded cells were evaluated for healthy resting membrane potential, action potential firing, and synaptic responses) confirmed that, generally, the vast majority of our slices were indeed healthy. 

      Reviewer #2 (Public Review):

      Summary:

      The aim was to identify the mechanisms that underlie a form of long-term potentiation (LTP) that requires the activation of dopamine (DA).

      Strengths:

      The authors have provided multiple lines of evidence that support their conclusions; namely that this pathway involves the activation of a cAMP / PKA pathway that leads to the insertion of calcium-permeable AMPA receptors.

      Weaknesses:

      Some of the experiments could have been conducted in a more convincing manner.

      We carried out additional control experiments and analyses to address the specific points that were raised.

      Reviewer #3 (Public Review):

      The manuscript of Fuchsberger et al. investigates the cellular mechanisms underlying dopamine-dependent long-term potentiation (DA-LTP) in mouse hippocampal CA1 neurons. The authors conducted a series of experiments to measure the effect of dopamine on the protein synthesis rate in hippocampal neurons and its role in enabling DA-LTP. The key results indicate that protein synthesis is increased in response to dopamine and neuronal activity in the pyramidal neurons of the CA1 hippocampal area, mediated via the activation of adenylate cyclases subtypes 1 and 8 (AC1/8) and the cAMP-dependent protein kinase (PKA) pathway. Additionally, the authors show that postsynaptic DA-induced increases in protein synthesis are required to express DA-LTP, while not required for conventional t-LTP.

      The increased expression of the newly synthesized GluA1 receptor subunit in response to DA supports the formation of homomeric calcium-permeable AMPA receptors (CP-AMPARs). This evidence aligns well with data showing that DA-LTP expression requires the GluA1 AMPA subunit and CP-AMPARs, as DA-LTP is absent in the hippocampus of a GluA1 genetic knock-out mouse model. Overall, the study is solid, and the evidence provided is compelling. The authors clearly and concisely explain the research objectives, methodologies, and findings. The study is scientifically robust, and the writing is engaging. The authors' conclusions and interpretation of the results are insightful and align well with the literature. The discussion effectively places the findings in a meaningful context, highlighting a possible mechanism for dopamine's role in the modulation of protein-synthesis-dependent hippocampal synaptic plasticity and its implications for the field. Although the study expands on previous works from the same laboratory, the findings are novel and provide valuable insights into the dynamics governing hippocampal synaptic plasticity.

      The claim that GluA1 homomeric CP-AMPA receptors mediate the expression of DA-LTP is fascinating, and although the electrophysiology data on GluA1 knock-out mice are convincing, more evidence is needed to support this hypothesis. Western blotting provides useful information on the expression level of GluA1, which is not necessarily associated with cell surface expression of GluA1 and therefore CP-AMPARs. Validating this hypothesis by localizing the protein using immunofluorescence and confocal microscopy detection could strengthen the claim. The authors should briefly discuss the limitations of the study.

      Although it would be possible to quantify the surface expression of GluA1 using immunofluorescence, it would not be possible to distinguish between GluA1 homomers and GluA1-containing heteromers. It would therefore not be informative as to whether these are indeed CP-AMPARs. This is an interesting problem, which we have briefly discussed in the Discussion section.

      Additional comments to address:

      (1) In Figure 2A, the representative image with PMY alone shows a very weak PMY signal. Consequently, the image with TTX alone seems to potentiate the PMY signal, suggesting a counterintuitive increase in protein synthesis.

      We agree with the Reviewer that the original image was not representative and have replaced it with a more representative image.

      (2) In Figures 3A-B, the Western blotting representative images have poor quality, especially regarding GluA1 and α-actin in Figure 3A. The quantification graph (Figure 3B) raises some concerns about a potential outlier in both the DA alone and DA+CHX groups. The authors should consider running a statistical test to detect outlier data. Full blot images, including ladder lines, should be added to the supplementary data.

      We have replaced the western blot image in Figure 3A and have also presented full blot images including ladder lines in supplementary Figure S3.

      Using the ROUT method (Q=1%) we identified one outlier in the DA+CHX group of the western blot quantification. The quantification for this blot was then removed from the dataset and the experiment was repeated to ensure a sufficient number of repeats.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) As performed by the authors, the experiments with puromycin are puromycylation experiments, not SUnSET. The SUnSET protocol (surface sensing of translation) specifically refers to the detection of newly synthesized proteins externally at the plasma membrane. I'd advise updating the terminology used.

      We thank the Reviewer for pointing this out. We have updated this to ‘puromycin-based labelling assay’.

      (2) The legend presented in Figure 2F suggests WT is green and ACKO is orange, however, in Figure 2G the WT LTP trace is orange, consider changing this to green for consistency.

      We thank the Reviewer for this suggestion and agree that a matching colour scheme makes the Figure clearer. This has been updated.

      (3) In the results section, it is recommended to include units for the values presented at the first instance and only again when the units change thereafter.

      The units of the electrophysiology data were [%]; this is included in the Results section. Results of western blots and IHC images were presented as [a.u.]. While we included this in the Figures, we have not specifically added this to the text of individual results.

      (4) Two hours of pre-treatment with anisomycin vs 30 minutes of pre-treatment with cycloheximide seem hard to compare directly, as the pharmacokinetics of translational inhibition should be similar for both drugs. What was the rationale for the extremely long anisomycin pre-treatment? What controls were taken to assess slice health either prior to or following fixation? This is relevant to the below point (5).

      In an initial set of control experiments (Figure S1C-D) we wanted to ensure that protein synthesis was definitely blocked and therefore used a relatively high concentration of anisomycin and a relatively long pre-incubation period. We agree with the Reviewer that we cannot exclude the possibility that this treatment could compromise cell health in addition to the protein synthesis block. Therefore, we carried out an additional experiment with an alternative protein synthesis inhibitor, cycloheximide, at a lower standard concentration (10 µM), which confirmed a significant reduction in the puromycin signal (Figure S1A-B). Together these results support the conclusion that puromycin signal is specific to protein synthesis in our labelling assay.

      IHC slices were visually assessed for health. The large set of electrophysiology experiments carried out in our study (all recorded cells were evaluated for healthy resting membrane potential, action potential firing, and synaptic responses) also confirmed that, generally, the vast majority of our slices were indeed healthy. 

      (5) In Supplementary Figure 1, there is a dramatic difference in the a.u. intensities across CHX (B) and AM (D), please explain the reason for this. It is understood these are normalised values to nuclear staining, please clarify if this is a nuclear area.

      We agree with the Reviewer that there is a large variability in normalised puromycin signal which may be due to variability in the health of the slices. However, we assume that the same variability would be present in the treated slices, which showed, despite the variability, a significant effect of protein synthesis inhibition. To prevent introducing bias by excluding slices with low puromycin signal in the control condition, we present the full dataset.

      The CA1 region of the hippocampus contains a dense layer of neuronal somata (pyramidal cell layer). We normalized against the nuclear area as it provides a reliable estimate of the number of neurons present in the image. This approach minimizes bias by accounting for variation in the number of neurons within the visual field, ensuring consistency and accuracy in our analysis.

      (6) Please clarify the decision to average both the last 5 minutes of baseline recordings and the last 5 minutes of the recording for the normalisation of EPSP slopes.

      The baseline usually stabilises after a few minutes of recording, thus the last 5 minutes were used for baseline measurement, which are the most relevant datapoints to compare synaptic weight change to. After induction of STDP, potentiation or depression of synaptic weights develops gradually. Based on previous results, evaluating the EPSP slopes at 30-40 minutes after the induction protocol gives a reliable estimate of the amount of plasticity.
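
      The normalisation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the one-datapoint-per-minute sampling, and the example slope values are all assumptions made for the sketch. Only the two windows (last 5 minutes of baseline, 30-40 minutes after induction) come from the response.

```python
from statistics import mean


def percent_of_baseline(slopes, times_min,
                        baseline_window=(-5, 0), readout_window=(30, 40)):
    """Mean EPSP slope in the readout window as a percentage of baseline.

    times_min: minutes relative to the end of the induction protocol
               (negative values are baseline). Hypothetical convention.
    baseline_window: half-open [start, end) window before induction.
    readout_window: closed [start, end] window after induction.
    """
    base = [s for s, t in zip(slopes, times_min)
            if baseline_window[0] <= t < baseline_window[1]]
    late = [s for s, t in zip(slopes, times_min)
            if readout_window[0] <= t <= readout_window[1]]
    if not base or not late:
        raise ValueError("recording does not cover both windows")
    return 100.0 * mean(late) / mean(base)


# Hypothetical recording: one datapoint per minute from -10 to +40 min.
times = list(range(-10, 41))
slopes = []
for t in times:
    if t < -5:
        slopes.append(0.8)                           # early, unsettled baseline
    elif t < 0:
        slopes.append(1.0)                           # stable last 5 min of baseline
    else:
        slopes.append(1.0 + 0.5 * min(t, 30) / 30)   # gradual potentiation

result = percent_of_baseline(slopes, times)
print(f"EPSP slope at 30-40 min: {result:.1f}% of baseline")
```

      In this toy recording the early, unsettled baseline points are ignored, and the gradual build-up after induction does not dilute the readout, which is the rationale given for choosing these two windows.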

      Reviewer #2 (Recommendations For The Authors):

      The concentration of anisomycin used (0.5 mM) is very high.

      As described above, in an initial set of control experiments (Figure S1C-D) we wanted to ensure that protein synthesis was definitely blocked and therefore used a relatively high concentration of anisomycin and a relatively long pre-incubation period. We agree with the Reviewer that this is higher than the standard concentration used for this drug and we cannot exclude the possibility that this treatment could compromise cell health in addition to the protein synthesis block. Therefore, we carried out an additional experiment with an alternative protein synthesis inhibitor, cycloheximide, at a lower standard concentration (10 µM), which confirmed a significant reduction in the puromycin signal (Figure S1A-B). Together these results support the conclusion that puromycin signal is specific to protein synthesis in our labelling assay.

      Furthermore, in the electrophysiology experiments, we also used 500 µM anisomycin in the patch pipette solution. Under these conditions, we recorded a stable EPSP baseline for 60 minutes, indicating that the treatment did not cause toxic effects to the cell (Figure S1F). This high concentration would ensure an effective block of local translation at dendritic sites. Nevertheless, we also carried out this experiment with cycloheximide at a lower standard concentration (10 µM) and observed a similar result with both protein synthesis inhibitors (Figure 1F).

      The authors conclude that the effect of DA is mediated via D1/5 receptors, which based on previous work seems likely. But they cannot conclude this from their current study which used a combination of a D1/D5 and a D2 antagonist.

      We thank the Reviewer for pointing this out. We agree and have updated this in the Discussion section to ‘dopamine receptors’, without specifying subtypes.

      There is no mention that I can see that the KO experiments were conducted in a blinded manner (which I believe should be standard practice). Did they verify the KOs using Westerns?

      Only a subset of the experiments was conducted in a blinded manner. However, the results were collected by two independent experimenters, who both observed significant effects in KO mice compared to WTs (TF and ZB).

      We received the DKO mice from a former collaborator, who verified expression levels of the KO mice (Wang et al., 2003). We verified DKO upon arrival in our facility using genotyping.

      Maybe I'm misunderstanding but it appears to me that in Figure 1F there is LTP prior to the addition of DA. (The first point after pairing is already elevated). I think the control of pairing without DA should be added.

      We thank the Reviewer for pointing this out. Based on previous results (Brzosko et al., 2015) we would expect potentiation to develop over time once DA is added after pairing; however, it indeed appears in the Figure here as if there was an immediate increase in synaptic weights after pairing. It should be noted, however, that when comparing the first 5 minutes after pairing to the baseline, this increase was not significant (t(9) = 1.810, p = 0.1037). Nevertheless, we rechecked our data and noticed that this initial potentiation was biased by one cell with an increasing baseline, which had both the test and control pathway strongly elevated. We had mistakenly included this cell in the dataset, despite the unstable conditions (as stated in the Methods section, the unpaired control pathway served as a stability control). We apologise for the error and this has now been corrected (Figure 1F). In addition, we present the control pathway in Figure S1G and I.
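
      As a quick consistency check on the reported statistic, the p value can be recovered from the t statistic and its degrees of freedom. The sketch below assumes the test was two-tailed (the response does not say so explicitly) and uses only the Python standard library, integrating the Student t density numerically rather than calling a statistics package. The function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, n_steps=10_000):
    """Two-tailed p value via composite Simpson integration of the density.

    The area between -|t| and +|t| is twice the area from 0 to |t|,
    so p = 1 - 2 * integral_0^|t| pdf(x) dx.
    """
    t = abs(t_stat)
    if n_steps % 2:          # Simpson's rule needs an even number of intervals
        n_steps += 1
    h = t / n_steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 1 - 2 * area


p = two_tailed_p(1.810, 9)
print(f"t(9) = 1.810 -> two-tailed p = {p:.4f}")
```

      Under the two-tailed assumption the result should land close to the reported p = 0.1037, above the conventional 0.05 threshold.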

      We have also now included the control for post-before-pre pairing (Δt = -20 ms) without dopamine in a supplemental figure (Figure S1E and F).

      The Westerns (Figure 3A) are fairly messy. Also, it is better to quantify with total protein. Surface biotinylation of GluA1 and GluA2 would be more informative.

      We carried out more repeats of Western blots and have exchanged blots in Figure 3A.

      Because we observed that DA increases protein synthesis, we cannot exclude the possibility that application of DA could also affect total protein levels. Thus, quantifying with total protein may not be the best choice here. Quantification with actin is standard practice.

      While we agree with the Reviewer that surface biotinylation of GluA1 and GluA2 could in principle be more informative, we do not think it would work well in our experimental setup using acute slice preparation, as it strictly requires intact cells. Slicing generates damaged cells, which would take up the surface biotin reagents. This would cause unspecific biotinylation of the damaged cells, leading to a strong background signal in the assay.

      In Figure 4 panels D and E the baselines are increasing substantially prior to induction. I appreciate that long stable baselines with timing-dependent plasticity may not be possible but it's hard to conclude what happened tens of minutes later when the baseline only appears stable for a minute or two. Panels A and B show that relatively stable baselines are achievable.

      We agree with the Reviewer that the baselines are increasing; however, when looking at the baseline for 5 minutes prior to induction (the last 5 datapoints of the baseline), which is what we used for quantification, the baselines appeared stable. Unfortunately, longer baselines are not suitable for timing-dependent plasticity. In addition, all experiments were carried out with a control pathway which showed stable conditions throughout the recording.

      In general, the discussion could be better integrated with the current literature. Their experiments are in line with a substantial body of literature that has identified two forms of LTP, based on these signalling cascades, using more conventional induction patterns.

      We thank the Reviewer for this suggestion and have added more discussion of the two forms of LTP in the Discussion section.

      It would be helpful to include the drug concentrations when first described in the results.

      Drug concentrations have now been included in the Results section.

      It is now more common to include absolute t values (not just <0.05 etc).

      While we indicate significance in Figures using asterisks when p values are below the indicated significance levels, we report absolute values of p and t values in the Results section.

      Similarly full blots should be added to an appendix / made available.

      We have now included full blot images in Supplementary Figure S3.

      A 30% tolerance for series resistance seems generous to me. (10-20% would be more typical).

      We thank the Reviewer for their suggestion, and will keep this in mind for future studies. However, the error introduced by the higher tolerance level is likely to be small and would not influence any of the qualitative conclusions of the manuscript.

      Whereas series resistance is of course extremely important in voltage-clamp experiments, changes in series resistance would be less of a concern in current-clamp recordings of synaptic events. We use the amplifier as a voltage follower, and there are two problems with changes in the electrode, or access, resistance. First, there is the voltage drop across the electrode resistance. Clearly this error is zero if no current is injected and is also negligible for the currents we use in our experiments to maintain the membrane voltage at -70 mV. For example, the voltage drop would be 0.2 mV for 20 pA current through a typical 10 MOhm electrode resistance, and a change in resistance of 30% would give less than 0.1 mV voltage change even if the resistance were not compensated. The second problem is distortion of the EPSP shape due to the low-pass filtering properties of the electrode set up by the pipette capacitance and series resistance (RC). This can be a significant problem for fast events, such as action potentials, but less of a problem for the relatively slow EPSPs recorded in pyramidal cells. Nevertheless, we take on board the advice provided by the Reviewer and will use the conventional tolerance of 20% in future experiments.
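
      The arithmetic in this argument can be reproduced in a few lines. The sketch below uses the values quoted in the response (20 pA holding current, 10 MOhm electrode resistance, 30% resistance change). The pipette capacitance is not given in the response, so the 5 pF used for the low-pass estimate is purely an assumed, hypothetical value, and the function names are illustrative.

```python
import math


def voltage_drop_mV(current_pA, resistance_MOhm):
    """Ohmic voltage drop across the electrode.

    pA * MOhm = 1e-12 A * 1e6 Ohm = 1e-6 V = 1e-3 mV.
    """
    return current_pA * resistance_MOhm * 1e-3


def lowpass_cutoff_Hz(resistance_MOhm, capacitance_pF):
    """Cutoff of a first-order RC low-pass filter, f_c = 1 / (2 * pi * R * C)."""
    r_ohm = resistance_MOhm * 1e6
    c_farad = capacitance_pF * 1e-12
    return 1.0 / (2 * math.pi * r_ohm * c_farad)


# Values quoted in the response.
base_drop = voltage_drop_mV(20, 10)                  # drop at 10 MOhm
changed_drop = voltage_drop_mV(20, 10 * 1.3)         # after a 30% increase
delta = changed_drop - base_drop

# ASSUMPTION: 5 pF pipette capacitance (hypothetical, not from the response).
assumed_c_pF = 5.0
cutoff = lowpass_cutoff_Hz(10, assumed_c_pF)

print(f"Voltage drop at 10 MOhm: {base_drop:.2f} mV")
print(f"Change after 30% resistance increase: {delta:.2f} mV")
print(f"RC cutoff with assumed {assumed_c_pF} pF: {cutoff / 1000:.1f} kHz")
```

      With these numbers the uncompensated error stays below 0.1 mV, as the response states. Under the assumed capacitance the filter cutoff is in the kilohertz range, which fits the claim that slow EPSPs are affected far less than fast events such as action potentials.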

      Reviewer #3 (Recommendations For The Authors):

      In the references, the entry for Burnashev N et al. has a different font size. Please ensure that all references are formatted consistently.

      We thank the Reviewer for spotting this and have updated the font size of this reference.

    1. eLife Assessment

      Birdsong production depends on precise neural sequences in a vocal motor nucleus HVC. In this useful biophysical model, Daou and colleagues identify specific biophysical parameters that result in sparse neural sequences observed in vivo. While the model is presently incomplete because it is overfit to produce sequences and therefore not robust to real biological variation, the model has the potential to address some outstanding issues in HVC function.

    2. Reviewer #1 (Public review):

      Summary:

      The paper presents a model for sequence generation in the zebra finch HVC, which adheres to cellular properties measured experimentally. However, the model is fine-tuned and exhibits limited robustness to noise inherent in the inhibitory interneurons within the HVC, as well as to fluctuations in connectivity between neurons. Although the proposed microcircuits are introduced as units for sub-syllabic segments (SSS), the backbone of the network remains a feedforward chain of HVC_RA neurons, similar to previous models.

      Strengths:

      The model incorporates all three of the major types of HVC neurons. The ion channels used and their kinetics are based on experimental measurements. The connection patterns of the neurons are also constrained by the experiments.

      Weaknesses:

      The model is described as consisting of micro-circuits corresponding to SSS. This presentation gives the impression that the model's structure is distinct from previous models, which connected HVC_RA neurons in feedforward chain networks (Jin et al 2007; Li & Greenside 2006; Long et al 2010; Egger et al 2020). However, the authors implement single HVC_RA neurons into chain networks within each micro-circuit and then connect the end of the chain to the start of the chain in the subsequent micro-circuit. Thus, the HVC_RA neuron in their model forms a single-neuron chain. This structure is essentially a simplified version of earlier models.

      In the model of the paper, the chain network drives the HVC_I and HVC_X neurons. The role of the micro-circuits is more significant in organizing the connections: specifically, from HVC_RA neurons to HVC_I neurons, and from HVC_I neurons to both HVC_X and HVC_RA neurons.

      How useful is this concept of micro-circuits? HVC neurons fire continuously even during the silent gaps. There are no SSS during these silent gaps.

      A significant issue of the current model is that the HVC_RA to HVC_RA connections require fine-tuning, with the network functioning only within a narrow range of g_AMPA (Figure 2B). Similarly, the connections from HVC_I neurons to HVC_RA neurons also require fine-tuning. This sensitivity arises because the somatic properties of HVC_RA neurons are insufficient to produce the stereotypical bursts of spikes observed in recordings from singing birds, as demonstrated in previous studies (Jin et al 2007; Long et al 2010). In these previous works, to address this limitation, a dendritic spike mechanism was introduced to generate an intrinsic bursting capability, which is absent in the somatic compartment of HVC_RA neurons. This dendritic mechanism significantly enhances the robustness of the chain network, eliminating the need to fine-tune any synaptic conductances, including those from HVC_I neurons (Long et al 2010).

      Why is it important that the model should NOT be sensitive to the connection strengths?

      First, the firing of HVC_I neurons is highly noisy and unreliable. HVC_I neurons fire spontaneous, random spikes under baseline conditions. During singing, their spike timing is imprecise and can vary significantly from trial to trial, with spikes appearing or disappearing across different trials. As a result, their inputs to HVC_RA neurons are inherently noisy. If the model relies on precisely tuned inputs from HVC_I neurons, the natural fluctuations in HVC_I firing would render the model non-functional. The authors should incorporate noisy HVC_I neurons into their model to evaluate whether this noise would render the model non-functional.

      Second, Kosche et al. (2015) demonstrated that reducing inhibition by suppressing HVC_I neuron activity makes HVC_RA firing less sparse but does not compromise the temporal precision of the bursts. In this experiment, the local application of gabazine should have severely disrupted HVC_I activity. However, it did not affect the timing precision of HVC_RA neuron firing, emphasizing the robustness of the HVC timing circuit. This robustness is inconsistent with the predictions of the current model, which depends on finely tuned inputs and should, therefore, be vulnerable to such disruptions.

      Third, the reliance on fine-tuning of HVC_RA connections becomes problematic if the model is scaled up to include groups of HVC_RA neurons forming a chain network, rather than the single HVC_RA neurons used in the current work. With groups of HVC_RA neurons, the summation of presynaptic inputs to each HVC_RA neuron would need to be precisely maintained for the model to function. However, experimental evidence shows that the HVC circuit remains functional despite perturbations, such as a few degrees of cooling, micro-lesions, or turnover of HVC_RA neurons. Such robustness cannot be accounted for by a model that depends on finely tuned connections, as seen in the current implementation.

      The authors examined how altering the channel properties of neurons affects the activity in their model. While this approach is valid, many of the observed effects may stem from the delicate balancing required in their model for proper function.

      In the current model, HVC_X neurons burst as a result of rebound activity driven by the I_H current. Rebound bursts mediated by the I_H current typically require a highly hyperpolarized membrane potential. However, this mechanism would fail if the reversal potential of inhibition is higher than the required level of hyperpolarization. Furthermore, Mooney (2000) demonstrated that depolarizing the membrane potential of HVC_X neurons did not prevent bursts of these neurons during forward playback of the bird's own song, suggesting that these bursts (at least under anesthesia, which may be a different state altogether) are not necessarily caused by rebound activity. This discrepancy should be addressed or considered in the model.

      Some figures contain direct copies of figures from published papers. It is perhaps a better practice to replace them with schematics if possible.

    3. Reviewer #2 (Public review):

      Summary:

      In this paper, the authors use numerical simulations to try to understand better a major experimental discovery in songbird neuroscience from 2002 by Richard Hahnloser and collaborators. The 2002 paper found that a certain class of projection neurons in the premotor nucleus HVC of adult male zebra finch songbirds, the neurons that project to another premotor nucleus RA, fired sparsely (once per song motif) and precisely (to about 1 ms accuracy) during singing.

      The experimental discovery is important to understand since it initially suggested that the sparsely firing RA-projecting neurons acted as a simple clock that was localized to HVC and that controlled all details of the temporal hierarchy of singing: notes, syllables, gaps, and motifs. Later experiments suggested that the initial interpretation might be incomplete: that the temporal structure of adult male zebra finch songs instead emerged in a more complicated and distributed way, still not well understood, from the interaction of HVC with multiple other nuclei, including auditory and brainstem areas. So at least two major questions remain unanswered more than two decades after the 2002 experiment: What is the neurobiological mechanism that produces the sparse precise bursting: is it a local circuit in HVC or is it some combination of external input to HVC and local circuitry? And how is the sparse precise bursting in HVC related to a songbird's vocalizations?

      The authors only investigate part of the first question, whether the mechanism for sparse precise bursts is local to HVC. They do so indirectly, by using conductance-based Hodgkin-Huxley-like equations to simulate the spiking dynamics of a simplified network that includes three known major classes of HVC neurons and such that all neurons within a class are assumed to be identical. A strength of the calculations is that the authors include known biophysically deduced details of the different conductances of the three major classes of HVC neurons, and they take into account what is known, based on sparse paired recordings in slices, about how the three classes connect to one another. One weakness of the paper is that the authors make arbitrary and not well-motivated assumptions about the network geometry, and they do not use the flexibility of their simulations to study how their results depend on their network assumptions. A second weakness is that they ignore many known experimental details such as projections into HVC from other nuclei, dendritic computations (the somas and dendrites are treated by the authors as point-like isopotential objects), the role of neuromodulators, and known heterogeneity of the interneurons. These weaknesses make it difficult for readers to know the relevance of the simulations for experiments and for advancing theoretical understanding.

      Strengths:

      The authors use conductance-based Hodgkin-Huxley-like equations to simulate spiking activity in a network of neurons intended to model more accurately songbird nucleus HVC of adult male zebra finches. Spiking models are much closer to experiments than models based on firing rates or on 2-state neurons.

      The authors include information deduced from modeling experimental current-clamp data such as the types and properties of conductances. They also take into account how neurons in one class connect to neurons in other classes via excitatory or inhibitory synapses, based on sparse paired recordings in slices by other researchers.

      The authors obtain some new results of modest interest such as how changes in the maximum conductances of four key channels (e.g., A-type K+ currents or Ca-dependent K+ currents) influence the structure and propagation of bursts, while simultaneously being able to mimic accurately current-clamp voltage measurements.

      Weaknesses:

      One weakness of this paper is the lack of a clearly stated, interesting, and relevant scientific question to try to answer. In the introduction, the authors do not discuss adequately which questions recent experimental and theoretical work have failed to explain adequately, concerning HVC neural dynamics and its role in producing vocalizations. The authors do not discuss adequately why they chose the approach of their paper and how their results address some of these questions.

      For example, the authors need to explain in more detail how their calculations relate to the works of Daou et al, J. Neurophys. 2013 (which already fitted spiking models to neuronal data and identified certain conductances), to Jin et al J. Comput. Neurosci. 2007 (which already discussed how to get bursts using some experimental details), and to the rather similar paper by E. Armstrong and H. Abarbanel, J. Neurophys 2016, which already postulated and studied sequences of microcircuits in HVC. This last paper is not even cited by the authors.

      The authors' main achievement is to show that simulations of a certain simplified and idealized network of spiking neurons, which includes some experimental details but ignores many others, match some experimental results like current-clamp-derived voltage time series for the three classes of HVC neurons (although this was already reported in earlier work by Daou and collaborators in 2013), and simultaneously the robust propagation of bursts with properties similar to those observed in experiments. The authors also present results about how certain neuronal details and burst propagation change when certain key maximum conductances are varied.

      However, these are weak conclusions for two reasons. First, the authors did not do enough calculations to allow the reader to understand how many parameters were needed to obtain these fits and whether simpler circuits, say with fewer parameters and simpler network topology, could do just as well. Second, many previous researchers have demonstrated robust burst propagation in a variety of feed-forward models. So what is new and important about the authors' results compared to the previous computational papers?

      Also missing is a discussion, or at least an acknowledgment, of the fact that not all of the fine experimental details of undershoots, latencies, spike structure, spike accommodation, etc may be relevant for understanding vocalization. While it is nice to know that some models can match these experimental details and produce realistic bursts, that does not mean that all of these details are relevant for the function of producing precise vocalizations. Scientific insights in biology often require exploring which of the many observed details can be ignored and especially identifying the few that are essential for answering some questions. As one example, if HVC-X neurons are completely removed from the authors' model, does one still get robust and reasonable burst propagation of HVC-RA neurons? While part of the nucleus HVC acts as a premotor circuit that drives the nucleus RA, part of HVC is also related to learning. It is not clear that HVC-X neurons, which carry out some unknown calculation and transmit information to area X in a learning pathway, are relevant for burst production and propagation of HVC-RA neurons, and so relevant for vocalization. Simulations provide a convenient and direct way to explore questions of this kind.

      One key question to answer is whether the bursting of HVC-RA projection neurons is based on a mechanism local to HVC or is some combination of external driving (say from auditory nuclei) and local circuitry. The authors do not contribute to answering this question because they ignore external driving and assume that the mechanism is some kind of intrinsic feed-forward circuit, which they put in by hand in a rather arbitrary and poorly justified way, by assuming the existence of small microcircuits consisting of a few HVC-RA, HVC-X, and HVC-I neurons that somehow correspond to "sub-syllabic segments". To my knowledge, experiments do not suggest the existence of such microcircuits nor does theory suggest the need for such microcircuits.

      Another weakness of this paper is an unsatisfactory discussion of how the model was obtained, validated, and simulated. The authors should state as clearly as possible, in one location such as an appendix, what is the total number of independent parameters for the entire network and how parameter values were deduced from data or assigned by hand. With enough parameters and variables, many details can be fit arbitrarily accurately so researchers have to be careful to avoid overfitting. If parameter values were obtained by fitting to data, the authors should state clearly what the fitting algorithm was (some iterative nonlinear method, whose results can depend on the initial choice of parameters), what the error function used for fitting (sum of least squares?) was, and what data were used for the fitting.
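
      As an illustration of the kind of fitting report requested here, the following is a minimal sketch of a sum-of-squared-errors objective minimized by a damped Gauss-Newton (Levenberg-Marquardt-style) iteration. It is not the authors' procedure: the exponential `model`, the synthetic data, and all names and parameter values are assumptions chosen only to keep the example self-contained.

```python
import math


def model(t, amp, rate):
    """Hypothetical stand-in for a model trace: a decaying exponential."""
    return amp * math.exp(-rate * t)


def sse(params, ts, ys):
    """Error function: sum of squared differences between data and model."""
    amp, rate = params
    try:
        return sum((y - model(t, amp, rate)) ** 2 for t, y in zip(ts, ys))
    except OverflowError:
        # Wildly wrong trial parameters are treated as infinitely bad.
        return float("inf")


def fit(ts, ys, init, n_iter=100, tol=1e-20):
    """Damped Gauss-Newton fit of (amp, rate), starting from `init`."""
    amp, rate = init
    err = sse((amp, rate), ts, ys)
    lam = 1e-3  # damping factor; larger values give smaller, safer steps
    for _ in range(n_iter):
        if err < tol:
            break
        # Accumulate J^T J and J^T r for the two parameters.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for t, y in zip(ts, ys):
            e = math.exp(-rate * t)
            j_amp = e                 # d(model)/d(amp)
            j_rate = -amp * t * e     # d(model)/d(rate)
            r = y - amp * e           # residual
            a11 += j_amp * j_amp
            a12 += j_amp * j_rate
            a22 += j_rate * j_rate
            b1 += j_amp * r
            b2 += j_rate * r
        # Solve the damped 2x2 normal equations (J^T J + lam I) d = J^T r.
        m11, m22 = a11 + lam, a22 + lam
        det = m11 * m22 - a12 * a12
        d_amp = (b1 * m22 - a12 * b2) / det
        d_rate = (m11 * b2 - a12 * b1) / det
        trial = (amp + d_amp, rate + d_rate)
        trial_err = sse(trial, ts, ys)
        if trial_err < err:           # accept only steps that reduce the error
            amp, rate = trial
            err = trial_err
            lam = max(lam / 10.0, 1e-12)
        else:                         # reject the step and damp more strongly
            lam = min(lam * 10.0, 1e8)
    return (amp, rate), err


if __name__ == "__main__":
    ts = [float(i) for i in range(21)]
    ys = [model(t, 10.0, 0.2) for t in ts]  # noiseless synthetic "data"
    # Repeating the fit from different initial guesses shows whether the
    # result depends on the starting point.
    for guess in [(8.0, 0.25), (1.0, 2.0)]:
        params, err = fit(ts, ys, guess)
        print(guess, "->", params, "SSE =", err)
```

      Reporting the error function, the iteration scheme, the data used, and the outcome from several initial guesses, as in the loop at the end, would address most of the points raised in this paragraph.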

      The authors should also state clearly the dynamical state of the network, the vector of quantities that evolve over time. (What is the dimension of that vector, which is also the number of ordinary differential equations that have to be integrated?) The authors do not mention what initial state was used to start the numerical integrations, whether transient dynamics were observed and what were their properties, or how the results depended on the choice of the initial state. The authors do not discuss how they determined that their model was programmed correctly (it is difficult to avoid typing errors when writing several pages or more of a code in any language) or how they determined the accuracy of the numerical integration method beyond fitting to experimental data, say by varying the time step size over some range or by comparing two different integration algorithms.
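
      The accuracy checks suggested here (varying the time step and comparing two integration algorithms) can be sketched as follows. This is a minimal example under stated assumptions: a passive membrane equation stands in for the full Hodgkin-Huxley-like network, and the constants `C_M`, `G_L`, `E_L`, `I_INJ`, and `V0` are illustrative values, not the authors' parameters.

```python
import math

# Illustrative constants (assumed for this sketch only).
C_M = 1.0      # membrane capacitance
G_L = 0.1      # leak conductance
E_L = -65.0    # leak reversal potential (mV)
I_INJ = 1.0    # constant injected current
V0 = -65.0     # initial membrane potential (mV)


def dvdt(v):
    """Passive membrane: C dV/dt = -g_L (V - E_L) + I."""
    return (-G_L * (v - E_L) + I_INJ) / C_M


def exact(t):
    """Closed-form solution, available only because the test problem is linear."""
    v_inf = E_L + I_INJ / G_L
    tau = C_M / G_L
    return v_inf + (V0 - v_inf) * math.exp(-t / tau)


def euler(v0, dt, t_end):
    """Forward Euler integration (first-order accurate)."""
    v = v0
    for _ in range(int(round(t_end / dt))):
        v += dt * dvdt(v)
    return v


def rk4(v0, dt, t_end):
    """Classical fourth-order Runge-Kutta integration."""
    v = v0
    for _ in range(int(round(t_end / dt))):
        k1 = dvdt(v)
        k2 = dvdt(v + 0.5 * dt * k1)
        k3 = dvdt(v + 0.5 * dt * k2)
        k4 = dvdt(v + dt * k3)
        v += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v


if __name__ == "__main__":
    t_end = 20.0
    for dt in (0.1, 0.05, 0.025):
        err_euler = abs(euler(V0, dt, t_end) - exact(t_end))
        err_rk4 = abs(rk4(V0, dt, t_end) - exact(t_end))
        print(f"dt={dt}: Euler error={err_euler:.3e}, RK4 error={err_rk4:.3e}")
```

      For a first-order method, halving the step should roughly halve the error, and the two algorithms should agree to within that error. For the full network no closed-form solution exists, so the same comparison would be made between runs at successive step sizes rather than against `exact`.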

      Also disappointing is that the authors do not make any predictions to test, except rather weak ones such as that varying a maximum conductance sufficiently (which might be possible by using dynamic clamps) might cause burst propagation to stop or change its properties. Based on their results, the authors do not make suggestions for further experiments or calculations, but they should.

    4. Author response:

      eLife Assessment

      Birdsong production depends on precise neural sequences in a vocal motor nucleus HVC. In this useful biophysical model, Daou and colleagues identify specific biophysical parameters that result in sparse neural sequences observed in vivo. While the model is presently incomplete because it is overfit to produce sequences and therefore not robust to real biological variation, the model has the potential to address some outstanding issues in HVC function.

      We are grateful for the extensive supportive comments from the reviewers, including broad, strong appreciation of the novel aspects of our manuscript. We believe these will be only strengthened in the next submission.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The paper presents a model for sequence generation in the zebra finch HVC, which adheres to cellular properties measured experimentally. However, the model is fine-tuned and exhibits limited robustness to noise inherent in the inhibitory interneurons within the HVC, as well as to fluctuations in connectivity between neurons. Although the proposed microcircuits are introduced as units for sub-syllabic segments (SSS), the backbone of the network remains a feedforward chain of HVC_RA neurons, similar to previous models.

      Strengths:

      The model incorporates all three of the major types of HVC neurons. The ion channels used and their kinetics are based on experimental measurements. The connection patterns of the neurons are also constrained by the experiments.

      Weaknesses:

      The model is described as consisting of micro-circuits corresponding to SSS. This presentation gives the impression that the model's structure is distinct from previous models, which connected HVC_RA neurons in feedforward chain networks (Jin et al 2007, Li & Greenside, 2006; Long et al 2010; Egger et al 2020). However, the authors implement single HVC_RA neurons into chain networks within each micro-circuit and then connect the end of the chain to the start of the chain in the subsequent micro-circuit. Thus, the HVC_RA neuron in their model forms a single-neuron chain. This structure is essentially a simplified version of earlier models.

      In the model of the paper, the chain network drives the HVC_I and HVC_X neurons. The role of the micro-circuits is more significant in organizing the connections: specifically, from HVC_RA neurons to HVC_I neurons, and from HVC_I neurons to both HVC_X and HVC_RA neurons.

      We thank Reviewer 1 for their thoughtful comments.

      While the reviewer is correct that the propagation of sequential activity in this model is primarily carried by HVC<sub>RA</sub> neurons in a feed-forward manner, we need to emphasize that this is true only if there is no intrinsic or synaptic perturbation to the HVC network. For example, we showed in Figures 10 and 12 how altering the intrinsic properties of HVC<sub>X</sub> neurons or of interneurons disrupts sequence propagation. In other words, while HVC<sub>RA</sub> neurons are the key forces that carry the chain forward, the interplay between excitation and inhibition in our network and the intrinsic parameters of all classes of HVC neurons are equally important in carrying the chain of activity forward. Thus, the stability of activity propagation necessary for song production depends on a finely balanced network of HVC neurons, with all classes contributing to the overall dynamics. Moreover, all existing models that describe premotor sequence generation in HVC either assume a distributed model (Elmaleh et al., 2021), in which local HVC circuitry is not sufficient to advance the sequence but rather depends upon moment-to-moment feedback through Uva (Hamaguchi et al., 2016), or rely on intrinsic connections within HVC to propagate sequential activity. In the latter case, some models assume that HVC is composed of multiple discrete subnetworks that encode individual song elements (Glaze & Troyer, 2013; Long & Fee, 2008; Wang et al., 2008) but lack the local connectivity to link the subnetworks, while other models assume that HVC may have sufficient information in its intrinsic connections to form a single continuous network sequence (Long et al. 2010). The HVC model we present extends the concept of a feedforward network by incorporating additional neuronal classes that influence the propagation of activity (interneurons and HVC<sub>X</sub> neurons). We have shown that any disturbance of the intrinsic or synaptic conductances of these latter neurons will disrupt activity in the circuit even when the properties of HVC<sub>RA</sub> neurons are maintained.

      Regarding the similarities between our model and earlier models, several aspects of our model distinguish it from prior work. In short, while several models of how sequence is generated within HVC have been proposed (Cannon et al., 2015; Drew & Abbott, 2003; Egger et al., 2020; Elmaleh et al., 2021; Galvis et al., 2018; Gibb et al., 2009a, 2009b; Hamaguchi et al., 2016; Jin, 2009; Long & Fee, 2008; Markowitz et al., 2015), all of them rely on intrinsic HVC circuitry to propagate sequential activity, on extrinsic feedback to advance the sequence, or on both. These models do not capture the complex details of spike morphology, do not include the right ionic currents, do not incorporate all classes of HVC neurons, or do not generate realistic firing patterns as seen in vivo. Our model is the first biophysically realistic model that incorporates all classes of HVC neurons and their intrinsic properties. We tuned the intrinsic and synaptic properties based on the traces collected by Daou et al. (2013) and Mooney and Prather (2005), as shown in Figure 3. The three classes of model neurons incorporated into our network, as well as the synaptic currents that connect them, are based on Hodgkin-Huxley formalisms that contain ion channels and synaptic currents which had been pharmacologically identified. This is an advancement over prior models that primarily focused on the role of synaptic interactions or external inputs. The model is based on a feedforward chain of microcircuits that encode the different sub-syllabic segments and that interact with each other through structured feedback inhibition, defining an ordered sequence of cell firing. Moreover, while several models highlight the critical role of inhibitory interneurons in shaping the timing and propagation of bursts of activity in HVC<sub>RA</sub> neurons, our work offers an intricate and comprehensive model that helps to explain this critical role played by inhibition in shaping song dynamics and ensuring sequence propagation.

      How useful is this concept of micro-circuits? HVC neurons fire continuously even during the silent gaps. There are no SSS during these silent gaps.

      Regarding the concern about the usefulness of the 'microcircuit' concept in our study, we appreciate the comment and are glad to clarify its relevance in our network. While we acknowledge that HVC<sub>RA</sub> neurons interconnect microcircuits, our model's dynamics are still best described within the framework of microcircuitry, particularly because of the firing behavior of HVC<sub>X</sub> neurons and interneurons. Here, we are referring to microcircuits in a functional sense, rather than as rigid, isolated spatial divisions (Cannon et al. 2015). A microcircuit in our model reflects the local rules that govern the interaction between all HVC neuron classes within the broader network and that are essential for proper activity propagation. For example, HVC<sub>INT</sub> neurons belonging to any microcircuit burst densely and at times other than the moments when the corresponding encoded SSS is being “sung”. What makes a particular interneuron belong to one microcircuit rather than another is merely the fact that it cannot inhibit HVC<sub>RA</sub> neurons that are housed in the microcircuit it belongs to. In particular, if HVC<sub>INT</sub> inhibits HVC<sub>RA</sub> in the same microcircuit, some of the HVC<sub>RA</sub> bursts in the microcircuit might be silenced by the dense and strong HVC<sub>INT</sub> inhibition, breaking the chain of activity. Similarly, HVC<sub>X</sub> neurons were housed within microcircuits for the following reason: if an HVC<sub>X</sub> neuron belonging to microcircuit i sends excitatory input to an HVC<sub>INT</sub> neuron in microcircuit j, and that interneuron happens to select an HVC<sub>RA</sub> neuron from microcircuit i, then the propagation of sequential activity will halt, and we’ll be in a scenario similar to the one described above for HVC<sub>INT</sub> neurons inhibiting HVC<sub>RA</sub> neurons in the same microcircuit.

      We agree that there are no sub-syllabic segments described during the silent gaps, and we thank the reviewer for pointing this out. Although silent gaps are integral to the overall process of song production, we have not elaborated on them in this model due to the lack of a clear, biophysically grounded representation for the gaps themselves at the level of HVC. Our primary focus has been on modeling the active, syllable-producing phases of the song, where the HVC network’s sequential dynamics are critical for song. However, one can think of silent gaps as being encoded by mechanisms similar to those that encode SSSs, where each gap is encoded by similar microcircuits comprised of the three classes of HVC neurons (let’s call them GAP rather than SSS) that are active only during the silent gaps. In this case, the propagation of sequential activity is carried through the GAPs from the last SSS of the previous syllable to the first SSS of the subsequent syllable. We’ll make sure to emphasize this mechanism more in the revised version of the manuscript.

      A significant issue of the current model is that the HVC_RA to HVC_RA connections require fine-tuning, with the network functioning only within a narrow range of g_AMPA (Figure 2B). Similarly, the connections from HVC_I neurons to HVC_RA neurons also require fine-tuning. This sensitivity arises because the somatic properties of HVC_RA neurons are insufficient to produce the stereotypical bursts of spikes observed in recordings from singing birds, as demonstrated in previous studies (Jin et al 2007; Long et al 2010). In these previous works, to address this limitation, a dendritic spike mechanism was introduced to generate an intrinsic bursting capability, which is absent in the somatic compartment of HVC_RA neurons. This dendritic mechanism significantly enhances the robustness of the chain network, eliminating the need to fine-tune any synaptic conductances, including those from HVC_I neurons (Long et al 2010).

      Why is it important that the model should NOT be sensitive to the connection strengths?

      We thank the reviewer for the comment. While mathematical models designed for highly complex nonlinear biological processes only approximate biological realism, the current network is the first sufficiently biologically realistic network model of HVC that explains sequence propagation. We do not include dendritic processes in our network, although doing so would increase realism, for several reasons. 1) The ion channels we integrated into the somatic compartment are known pharmacologically (Daou et al. 2013), but we do not know the intrinsic properties of the dendritic compartment of HVC neurons or the cocktail of ion channels expressed there. 2) We are able to generate realistic bursting in HVC<sub>RA</sub> neurons despite the single compartment, and the main emphasis in this network is on the interactions between excitation and inhibition, the effects of ion channels in modulating sequence propagation, etc. 3) The network model already incorporates thousands of ODEs that govern the dynamics of each of the HVC neurons, so we did not want to add more complexity to the network, especially since we do not know the biophysical properties of the dendritic compartments.

      Therefore, our present focus is on somatic dynamics and the interaction between HVC<sub>RA</sub> and HVC<sub>INT</sub> neurons, but we acknowledge the importance of these processes in enhancing network resiliency. Although we agree that adding dendritic processes improves robustness, we still think that somatic processes alone can offer insightful information on the sequential dynamics of the HVC network. While the network should be robust across a wide range of parameters, it is also essential that certain parameters are designed to filter out weaker signals, ensuring that only reliable, precise patterns of activity propagate. Hence, we specifically chose to make the HVC<sub>RA</sub>-to-HVC<sub>RA</sub> excitatory connections more sensitive (narrow range of values) such that only strong, precise and meaningful stimuli can propagate through the network representing the high stereotypy and precision seen in song production.

      First, the firing of HVC_I neurons is highly noisy and unreliable. HVC_I neurons fire spontaneous, random spikes under baseline conditions. During singing, their spike timing is imprecise and can vary significantly from trial to trial, with spikes appearing or disappearing across different trials. As a result, their inputs to HVC_RA neurons are inherently noisy. If the model relies on precisely tuned inputs from HVC_I neurons, the natural fluctuations in HVC_I firing would render the model non-functional. The authors should incorporate noisy HVC_I neurons into their model to evaluate whether this noise would render the model non-functional.

      We acknowledge that under baseline and singing conditions, interneurons fire in an extremely noisy and imprecise manner, although they exhibit time-locked episodes in their activity (Hahnloser et al 2002, Kozhevnikov and Fee 2007). In order to mimic the biological variability of these neurons, our model does, in fact, include a stochastic current that reflects the intrinsic noise and random variations in interneuron firing shown in vivo (and we highlight this in the Methods). If necessary, and to make sure the network is resilient to this randomness in interneuron firing, we will investigate different approaches to enhance the noise representation even further and check its effect on sequence propagation.

      Second, Kosche et al. (2015) demonstrated that reducing inhibition by suppressing HVC_I neuron activity makes HVC_RA firing less sparse but does not compromise the temporal precision of the bursts. In this experiment, the local application of gabazine should have severely disrupted HVC_I activity. However, it did not affect the timing precision of HVC_RA neuron firing, emphasizing the robustness of the HVC timing circuit. This robustness is inconsistent with the predictions of the current model, which depends on finely tuned inputs and should, therefore, be vulnerable to such disruptions.

      We thank the reviewer for the comment. The differences between the findings of Kosche et al. (2015) and the predictions of our model arise from differences in the aspect of HVC function we are modeling. Our model is more sensitive to inhibition, which is a designed mechanism for achieving precise song patterning. This is a modeling simplification we adopted to capture specific characteristics of HVC function. Hence, the findings of Kosche et al. (2015) do not invalidate the approach of our model, but highlight that HVC likely operates with several redundant mechanisms that together ensure temporal precision. Nevertheless, we will investigate further the effects of the degree of inhibition on song patterning.

      Third, the reliance on fine-tuning of HVC_RA connections becomes problematic if the model is scaled up to include groups of HVC_RA neurons forming a chain network, rather than the single HVC_RA neurons used in the current work. With groups of HVC_RA neurons, the summation of presynaptic inputs to each HVC_RA neuron would need to be precisely maintained for the model to function. However, experimental evidence shows that the HVC circuit remains functional despite perturbations, such as a few degrees of cooling, micro-lesions, or turnover of HVC_RA neurons. Such robustness cannot be accounted for by a model that depends on finely tuned connections, as seen in the current implementation.

      As stated previously, our model of individual HVC<sub>RA</sub> neurons is a reductive model that focuses on understanding the mechanisms that govern sequential neural activity. We agree that scaling the model to include many HVC<sub>RA</sub> neurons poses challenges, specifically concerning the summation of presynaptic inputs. However, our model can still be adapted to a larger network without requiring the level of fine-tuning currently needed. In fact, the current fine-tuning of synaptic connections in the model is a reflection of fundamental network mechanisms rather than a limitation when scaling to a larger network. Besides, one important feature of this neural network is redundancy. Even if some neurons or synaptic connections are impaired, other neurons or pathways can compensate for these changes, allowing the activity propagation to remain intact.

      The authors examined how altering the channel properties of neurons affects the activity in their model. While this approach is valid, many of the observed effects may stem from the delicate balancing required in their model for proper function.

      In the current model, HVC_X neurons burst as a result of rebound activity driven by the I_H current. Rebound bursts mediated by the I_H current typically require a highly hyperpolarized membrane potential. However, this mechanism would fail if the reversal potential of inhibition is higher than the required level of hyperpolarization. Furthermore, Mooney (2000) demonstrated that depolarizing the membrane potential of HVC_X neurons did not prevent bursts of these neurons during forward playback of the bird's own song, suggesting that these bursts (at least under anesthesia, which may be a different state altogether) are not necessarily caused by rebound activity. This discrepancy should be addressed or considered in the model.

      In our HVC network model, one goal with HVC<sub>X</sub> neurons is to generate bursts in their underlying neuron population. Since HVC<sub>X</sub> neurons in our model receive only inhibitory inputs from interneurons, we rely on inhibition followed by rebound bursts orchestrated by the I<sub>H</sub> and the I<sub>CaT</sub> currents to achieve this goal. The interplay between the T-type Ca<sup>++</sup> current and the H current in our model is fundamental to generating these bursts, as the two currents are sufficient to produce the desired behavior in the network. Due to this interplay, we do not need significant inhibition to generate rebound bursts, because the conductance of the T-type Ca<sup>++</sup> current can be made stronger, leading to robust rebound bursting even when the degree of inhibition is not very strong. We will highlight this with more clarity in the revised version.

      Some figures contain direct copies of figures from published papers. It is perhaps a better practice to replace them with schematics if possible.

      We will replace the relevant figures with schematic representations where possible.

      Reviewer #2 (Public review):

      Summary:

      In this paper, the authors use numerical simulations to try to understand better a major experimental discovery in songbird neuroscience from 2002 by Richard Hahnloser and collaborators. The 2002 paper found that a certain class of projection neurons in the premotor nucleus HVC of adult male zebra finch songbirds, the neurons that project to another premotor nucleus RA, fired sparsely (once per song motif) and precisely (to about 1 ms accuracy) during singing.

      The experimental discovery is important to understand since it initially suggested that the sparsely firing RA-projecting neurons acted as a simple clock that was localized to HVC and that controlled all details of the temporal hierarchy of singing: notes, syllables, gaps, and motifs. Later experiments suggested that the initial interpretation might be incomplete: that the temporal structure of adult male zebra finch songs instead emerged in a more complicated and distributed way, still not well understood, from the interaction of HVC with multiple other nuclei, including auditory and brainstem areas. So at least two major questions remain unanswered more than two decades after the 2002 experiment: What is the neurobiological mechanism that produces the sparse precise bursting: is it a local circuit in HVC or is it some combination of external input to HVC and local circuitry? And how is the sparse precise bursting in HVC related to a songbird's vocalizations?

      The authors only investigate part of the first question, whether the mechanism for sparse precise bursts is local to HVC. They do so indirectly, by using conductance-based Hodgkin-Huxley-like equations to simulate the spiking dynamics of a simplified network that includes three known major classes of HVC neurons and such that all neurons within a class are assumed to be identical. A strength of the calculations is that the authors include known biophysically deduced details of the different conductances of the three major classes of HVC neurons, and they take into account what is known, based on sparse paired recordings in slices, about how the three classes connect to one another. One weakness of the paper is that the authors make arbitrary and not well-motivated assumptions about the network geometry, and they do not use the flexibility of their simulations to study how their results depend on their network assumptions. A second weakness is that they ignore many known experimental details such as projections into HVC from other nuclei, dendritic computations (the somas and dendrites are treated by the authors as point-like isopotential objects), the role of neuromodulators, and known heterogeneity of the interneurons. These weaknesses make it difficult for readers to know the relevance of the simulations for experiments and for advancing theoretical understanding.

      Strengths:

      The authors use conductance-based Hodgkin-Huxley-like equations to simulate spiking activity in a network of neurons intended to model more accurately songbird nucleus HVC of adult male zebra finches. Spiking models are much closer to experiments than models based on firing rates or on 2-state neurons.

      The authors include information deduced from modeling experimental current-clamp data such as the types and properties of conductances. They also take into account how neurons in one class connect to neurons in other classes via excitatory or inhibitory synapses, based on sparse paired recordings in slices by other researchers.

      The authors obtain some new results of modest interest such as how changes in the maximum conductances of four key channels (e.g., A-type K<sup>+</sup> currents or Ca-dependent K<sup>+</sup> currents) influence the structure and propagation of bursts, while simultaneously being able to mimic accurately current-clamp voltage measurements.

      Weaknesses:

      One weakness of this paper is the lack of a clearly stated, interesting, and relevant scientific question to try to answer. In the introduction, the authors do not discuss adequately which questions recent experimental and theoretical work have failed to explain adequately, concerning HVC neural dynamics and its role in producing vocalizations. The authors do not discuss adequately why they chose the approach of their paper and how their results address some of these questions.

      For example, the authors need to explain in more detail how their calculations relate to the works of Daou et al, J. Neurophys. 2013 (which already fitted spiking models to neuronal data and identified certain conductances), to Jin et al J. Comput. Neurosci. 2007 (which already discussed how to get bursts using some experimental details), and to the rather similar paper by E. Armstrong and H. Abarbanel, J. Neurophys 2016, which already postulated and studied sequences of microcircuits in HVC. This last paper is not even cited by the authors.

      We thank the reviewer for this valuable comment, and we agree that we did not clarify enough throughout the paper the utility of our model or how it advanced our understanding of the HVC dynamics and circuitry. To that end, we will revise several places of the manuscript and make sure to cite and highlight the relevance and relatedness of the mentioned papers.

      In short, and as mentioned to Reviewer 1, while several models of how sequence is generated within HVC have been proposed (Cannon et al., 2015; Drew & Abbott, 2003; Egger et al., 2020; Elmaleh et al., 2021; Galvis et al., 2018; Gibb et al., 2009a, 2009b; Hamaguchi et al., 2016; Jin, 2009; Long & Fee, 2008; Markowitz et al., 2015; Jin et al., 2007), all the models proposed either rely on intrinsic HVC circuitry to propagate sequential activity, rely on extrinsic feedback to advance the sequence or rely on both. These models do not capture the complex details of spike morphology, do not include the right ionic currents, do not incorporate all classes of HVC neurons, or do not generate realistic firing patterns as seen in vivo. Our model is the first biophysically realistic model that incorporates all classes of HVC neurons and their intrinsic properties.

      No existing hypothesis was challenged by our model; rather, our model is a distillation of the various models that have been proposed for the HVC network. We go over this in detail in the Discussion. We believe that the network model we developed provides a step forward in describing the biophysics of HVC circuitry, and may shed new light on certain dynamics in the mammalian brain, particularly in the motor cortex and the hippocampus, where precisely timed sequential activity is crucial. We suggest that temporally precise sequential activity may be a manifestation of neural networks composed of a chain of microcircuits, each containing pools of excitatory and inhibitory neurons, with local interplay among neurons of the same microcircuit and global interplay across the various microcircuits, and with structured inhibition as well as intrinsic properties synchronizing the neuronal pools and stabilizing timing within a firing sequence.

      The authors' main achievement is to show that simulations of a certain simplified and idealized network of spiking neurons, which includes some experimental details but ignores many others, match some experimental results like current-clamp-derived voltage time series for the three classes of HVC neurons (although this was already reported in earlier work by Daou and collaborators in 2013), and simultaneously the robust propagation of bursts with properties similar to those observed in experiments. The authors also present results about how certain neuronal details and burst propagation change when certain key maximum conductances are varied.

      However, these are weak conclusions for two reasons. First, the authors did not do enough calculations to allow the reader to understand how many parameters were needed to obtain these fits and whether simpler circuits, say with fewer parameters and simpler network topology, could do just as well. Second, many previous researchers have demonstrated robust burst propagation in a variety of feed-forward models. So what is new and important about the authors' results compared to the previous computational papers?

      A major novelty of our work is the incorporation of experimental data into detailed network models. While earlier works have established robust burst propagation, our model uses realistic ion channel kinetics and feedback inhibition not only to reproduce experimental neural activity patterns but also to suggest prospective mechanisms for song sequence production in the most biophysical way possible. This aspect distinguishes our work from other feed-forward models. We go over this in detail in the Discussion. However, the reviewer is right regarding the details of the calculations conducted for the fits; we will make sure to describe these in more detail in the Methods and throughout the manuscript.

      We believe that the network model we developed provides a step forward in describing the biophysics of HVC circuitry, and may shed new light on certain dynamics in the mammalian brain, particularly in the motor cortex and the hippocampus, where precisely timed sequential activity is crucial. We suggest that temporally precise sequential activity may be a manifestation of neural networks composed of a chain of microcircuits, each containing pools of excitatory and inhibitory neurons, with local interplay among neurons of the same microcircuit and global interplay across the various microcircuits, and with structured inhibition as well as intrinsic properties synchronizing the neuronal pools and stabilizing timing within a firing sequence.

      Also missing is a discussion, or at least an acknowledgment, of the fact that not all of the fine experimental details of undershoots, latencies, spike structure, spike accommodation, etc may be relevant for understanding vocalization. While it is nice to know that some models can match these experimental details and produce realistic bursts, that does not mean that all of these details are relevant for the function of producing precise vocalizations. Scientific insights in biology often require exploring which of the many observed details can be ignored and especially identifying the few that are essential for answering some questions. As one example, if HVC-X neurons are completely removed from the authors' model, does one still get robust and reasonable burst propagation of HVC-RA neurons? While part of the nucleus HVC acts as a premotor circuit that drives the nucleus RA, part of HVC is also related to learning. It is not clear that HVC-X neurons, which carry out some unknown calculation and transmit information to area X in a learning pathway, are relevant for burst production and propagation of HVC<sub>RA</sub> neurons, and so relevant for vocalization. Simulations provide a convenient and direct way to explore questions of this kind.

      One key question to answer is whether the bursting of HVC-RA projection neurons is based on a mechanism local to HVC or is some combination of external driving (say from auditory nuclei) and local circuitry. The authors do not contribute to answering this question because they ignore external driving and assume that the mechanism is some kind of intrinsic feed-forward circuit, which they put in by hand in a rather arbitrary and poorly justified way, by assuming the existence of small microcircuits consisting of a few HVC-RA, HVC-X, and HVC-I neurons that somehow correspond to "sub-syllabic segments". To my knowledge, experiments do not suggest the existence of such microcircuits nor does theory suggest the need for such microcircuits.

      Recent results showed a tight correlation between the intrinsic properties of neurons and features of song (Daou and Margoliash 2020, Medina and Margoliash 2024), where adult birds that exhibit similar songs tend to have similar intrinsic properties. While this is relevant, we acknowledge that not all details may be necessary for every aspect of vocalization, and future models could simply concentrate on core dynamics and exclude certain features while still providing insights into the primary mechanisms.

      On the question of whether HVC<sub>X</sub> neurons are relevant for burst propagation, given that our model includes these neurons as part of the network for completeness, the reviewer is correct: the propagation of sequential activity in this model is primarily carried by HVC<sub>RA</sub> neurons in a feed-forward manner, but only if there is no perturbation to the HVC network. For example, we have shown how altering the intrinsic properties of HVC<sub>X</sub> neurons or of interneurons disrupts sequence propagation. In other words, while HVC<sub>RA</sub> neurons are the key forces carrying the chain forward, the interplay between excitation and inhibition in our network, as well as the intrinsic parameters of all classes of HVC neurons, are equally important in carrying the chain of activity forward. Thus, the stability of activity propagation necessary for song production depends on a finely balanced network of HVC neurons, with all classes contributing to the overall dynamics.

      We agree with the reviewer, however, that a potential drawback of our model is that its sole focus is on local excitatory connectivity within HVC (Kornfeld et al., 2017; Long et al., 2010), while HVC neurons receive afferent excitatory connections (Akutagawa & Konishi, 2010; Nottebohm et al., 1982) that play significant roles in their local dynamics. For example, the excitatory inputs that HVC neurons receive from Uvaeformis may be crucial in initiating (Andalman et al., 2011; Danish et al., 2017; Galvis et al., 2018) or sustaining (Hamaguchi et al., 2016) the sequential activity. While we acknowledge this limitation, our main contribution in this work is the biophysical insight into how the patterned activity in HVC is largely shaped by the intrinsic properties of the individual neurons as well as by the synaptic properties, where excitation and inhibition play a major role in enabling neurons to generate their characteristic bursts during singing. This holds irrespective of whether an external drive is injected onto the microcircuits or not. We will, however, elaborate on and investigate this further in the next submission.

      Another weakness of this paper is an unsatisfactory discussion of how the model was obtained, validated, and simulated. The authors should state as clearly as possible, in one location such as an appendix, what is the total number of independent parameters for the entire network and how parameter values were deduced from data or assigned by hand. With enough parameters and variables, many details can be fit arbitrarily accurately so researchers have to be careful to avoid overfitting. If parameter values were obtained by fitting to data, the authors should state clearly what the fitting algorithm was (some iterative nonlinear method, whose results can depend on the initial choice of parameters), what the error function used for fitting (sum of least squares?) was, and what data were used for the fitting.

      The authors should also state clearly the dynamical state of the network, the vector of quantities that evolve over time. (What is the dimension of that vector, which is also the number of ordinary differential equations that have to be integrated?) The authors do not mention what initial state was used to start the numerical integrations, whether transient dynamics were observed and what were their properties, or how the results depended on the choice of the initial state. The authors do not discuss how they determined that their model was programmed correctly (it is difficult to avoid typing errors when writing several pages or more of a code in any language) or how they determined the accuracy of the numerical integration method beyond fitting to experimental data, say by varying the time step size over some range or by comparing two different integration algorithms.
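      The two accuracy checks named above (varying the time step over some range, and comparing two different integration algorithms) can be sketched on a toy problem. The following is a minimal illustration only, assuming a single passive membrane equation with hypothetical parameter values; it is not the authors' network model, and the names `euler`, `rk4`, and `convergence_table` are introduced here purely for illustration. Because this toy equation has a closed-form solution, the error of each integrator can be measured directly as the step size is halved.

```python
import math

# Hypothetical toy parameters (NOT taken from the manuscript under review).
C_M = 1.0     # membrane capacitance, uF/cm^2
G_L = 0.1     # leak conductance, mS/cm^2
E_L = -65.0   # leak reversal potential, mV
I_APP = 1.0   # constant applied current, uA/cm^2
V0 = -65.0    # initial membrane voltage, mV
T_END = 50.0  # total simulated time, ms


def dvdt(v):
    """Right-hand side of C dV/dt = -g_L (V - E_L) + I_app."""
    return (-G_L * (v - E_L) + I_APP) / C_M


def exact(t):
    """Closed-form solution of the passive membrane equation."""
    v_inf = E_L + I_APP / G_L
    tau = C_M / G_L
    return v_inf + (V0 - v_inf) * math.exp(-t / tau)


def euler(dt):
    """Forward Euler integration up to T_END; returns the final voltage."""
    v = V0
    for _ in range(round(T_END / dt)):
        v += dt * dvdt(v)
    return v


def rk4(dt):
    """Classical fourth-order Runge-Kutta up to T_END; returns the final voltage."""
    v = V0
    for _ in range(round(T_END / dt)):
        k1 = dvdt(v)
        k2 = dvdt(v + 0.5 * dt * k1)
        k3 = dvdt(v + 0.5 * dt * k2)
        k4 = dvdt(v + dt * k3)
        v += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v


def convergence_table(step_sizes=(0.4, 0.2, 0.1, 0.05)):
    """Absolute error at T_END for each step size: (dt, euler_error, rk4_error)."""
    v_ref = exact(T_END)
    return [(dt, abs(euler(dt) - v_ref), abs(rk4(dt) - v_ref))
            for dt in step_sizes]


if __name__ == "__main__":
    for dt, euler_err, rk4_err in convergence_table():
        print(f"dt={dt:5.2f} ms  Euler error={euler_err:.3e}  RK4 error={rk4_err:.3e}")
```

      For a first-order method such as forward Euler, halving the step should roughly halve the error, and a higher-order method should agree with it to within that error. For a full conductance-based network no closed-form solution exists, so the same comparison would presumably be made against a run with a much smaller step rather than against an exact reference.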

      We thank the reviewer again. The fitting process in our model occurred only at the first stage, where the synaptic parameters were fit to the Mooney and Prather as well as the Kosche results. No data were shared; we merely looked at the figures in those papers, checked the amplitude of the elicited currents, the magnitudes of DC-evoked excitations, etc., and replicated these in our model. While this is suboptimal, it was better for us to start with it rather than simply taking equations for synaptic currents from the literature for other types of neurons (that are not even HVC neurons or from songbirds) and integrating them into our network model. However, we will certainly highlight the details of this fitting process in the new submission. We will also give more technical details in the Methods regarding the exact number of ODEs, the initial conditions used to run them, etc.

      Also disappointing is that the authors do not make any predictions to test, except rather weak ones such as that varying a maximum conductance sufficiently (which might be possible by using dynamic clamps) might cause burst propagation to stop or change its properties. Based on their results, the authors do not make suggestions for further experiments or calculations, but they should.

      We agree that making experimentally testable predictions is crucial for the advancement of the model. Our predictions include testing whether eradication of a class of neurons such as HVC<sub>X</sub> neurons disrupts activity propagation, which can be done through targeted neuron elimination. This can also be done by preventing rebound bursting in HVC<sub>X</sub> neurons through pharmacological blockade of the I<sub>h</sub> channels. Others include downregulation of certain ion channels (done pharmacologically through ion channel blockers) and testing which current is fundamental for song production (and there are plenty of tests based on our results, such as the SK current, the T-type Ca<sup>++</sup> current, the A-type K<sup>+</sup> current, etc.). We will incorporate these into the revised manuscript to better demonstrate the model's applicability and to guide future research directions.

    1. eLife Assessment

      This manuscript presents important findings on how structural color can be manipulated through a specific single-gene mutation in the motile bacterium Flavobacterium IR1. It provides a promising model to identify genes and molecular mechanisms supporting this widespread optical phenomenon. The story relies on convincing data with proteomic analysis and well-designed experiments, although it remains rather descriptive. This work will be of interest to biophysicists and microbiologists working on structural colors and Flavobacterium.

    2. Reviewer #1 (Public review):

      Summary:

      Structural colors (SC) are based on nanostructures reflecting and scattering light and producing optical wave interference. All kinds of living organisms exhibit SC. However, understanding the molecular mechanisms and genes involved may be complicated due to the complexity of these organisms. Hence, bacteria that exhibit SC in colonies, such as Flavobacterium IR1, can be good models.

      Based on previous genomic mining and co-occurrence with SC in flavobacterial strains, this article focuses on the role of a specific gene, moeA, in SC of Flavobacterium IR1 strain colonies on an agar plate. moeA is involved in the synthesis of the molybdenum cofactor, which is necessary for the activity of key metabolic enzymes in diverse pathways.

      The authors clearly showed that the absence of moeA shifts SC properties in a way that depends on the nutritional conditions. They further bring evidence that this effect was related to several properties of the colony, all impacted by the moeA mutant: cell-cell organization, cell motility and colony spreading, and metabolism of complex carbohydrates. Hence, by linking SC to a single gene in appearance, this work points to cellular organization (as a result of cell-cell arrangement and motility) and metabolism of polysaccharides as key factors for SC in a gliding bacterium. This may prove useful for designing molecular strategies to control SC in bacterial-based biomaterials.

      Strengths:

      The topic is very interesting from a fundamental viewpoint and has great potential in the field of biomaterials.

      The article is easy to read. It builds on previous studies with already established tools to characterize SC at the level of the flavobacterial colony. Experiments are well described and well executed. In addition, the SIBR-Cas method for chromosome engineering in Flavobacteria is the most recent and is a leap forward for future studies in this model, even beyond SC.

      Weaknesses:

      The paper appears a bit too descriptive and could be better organized. Some of the results, in particular the proteomic comparison, are not well exploited (not explored experimentally). In my opinion, the problem originates from the difficulty in explaining the link between the absence of moeA and the alterations observed at the level of colony spreading and polysaccharide utilization, and the variation in proteomic content.

      First, the effect of moeA deletion on molybdenum cofactor synthesis should be addressed.

      Second, as I was reading the entire manuscript, I kept asking myself if moeA (and by extension molybdenum cofactor) was really involved in SC or if it was an indirect effect. For example, what if the absence of moeA alters the cell envelope because the synthesis of its building blocks is perturbed, and then subsequently perturbs all related processes, including gliding motility and protein secretion? It would help to know if the effects on colony spreading and polysaccharide metabolism can be uncoupled. I don't think the authors discussed that clearly.

    3. Reviewer #2 (Public review):

      Summary:

      The authors constructed an in-frame deletion of the moeA gene, which is involved in molybdopterin cofactor (MoCo) biosynthesis, and investigated its role in structural colors in Flavobacterium IR1. The deletion of moeA shifted colony color from green to blue, reduced colony spreading, and increased starch degradation, which was attributed to the upregulation of various proteins in polysaccharide utilization loci. This study lays the groundwork for developing new colorants by modifying genes involved in structural colors.

      Major strengths and weaknesses:

      The authors conducted well-designed experiments with appropriate controls and the results in the paper are presented in a logical manner, which supports their conclusions. Using statistical tests to compare the differences between the wild type and moeA mutant, and adding a significance bar in Figure 4B, would strengthen their claims regarding differences in cell motility. Additionally, in the result section (Figure 6), the authors suggest that the shift in blue color is "caused by cells which are still highly ordered but narrower", which to my knowledge is not backed up by any experimental evidence.

      Overall, this is a well-written paper in which the authors effectively address their research questions through proper experimentation. This work will help us understand the genetic basis of structural colors in Flavobacterium and open new avenues to study the roles of additional genes and proteins in structural colors.

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Structural colors (SC) are based on nanostructures reflecting and scattering light and producing optical wave interference. All kinds of living organisms exhibit SC. However, understanding the molecular mechanisms and genes involved may be complicated due to the complexity of these organisms. Hence, bacteria that exhibit SC in colonies, such as Flavobacterium IR1, can be good models.

      Based on previous genomic mining and co-occurrence with SC in flavobacterial strains, this article focuses on the role of a specific gene, moeA, in SC of Flavobacterium IR1 strain colonies on an agar plate. moeA is involved in the synthesis of the molybdenum cofactor, which is necessary for the activity of key metabolic enzymes in diverse pathways.

      The authors clearly showed that the absence of moeA shifts SC properties in a way that depends on the nutritional conditions. They further bring evidence that this effect was related to several properties of the colony, all impacted by the moeA mutant: cell-cell organization, cell motility and colony spreading, and metabolism of complex carbohydrates. Hence, by linking SC to a single gene in appearance, this work points to cellular organization (as a result of cell-cell arrangement and motility) and metabolism of polysaccharides as key factors for SC in a gliding bacterium. This may prove useful for designing molecular strategies to control SC in bacterial-based biomaterials.

      Strengths:

      The topic is very interesting from a fundamental viewpoint and has great potential in the field of biomaterials.

      Thank you for your comments.

      The article is easy to read. It builds on previous studies with already established tools to characterize SC at the level of the flavobacterial colony. Experiments are well described and well executed. In addition, the SIBR-Cas method for chromosome engineering in Flavobacteria is the most recent and is a leap forward for future studies in this model, even beyond SC.

      We appreciate these comments.

      Weaknesses:

      The paper appears a bit too descriptive and could be better organized. Some of the results, in particular the proteomic comparison, are not well exploited (not explored experimentally). In my opinion, the problem originates from the difficulty in explaining the link between the absence of moeA and the alterations observed at the level of colony spreading and polysaccharide utilization, and the variation in proteomic content.

      We will look at the organisation of the manuscript carefully in the coming, detailed revision, as suggested. In terms of the proteomics, there are clearly a large number of proteins affected by the moeA deletion. In terms of experimental exploration, we chose spreading, structural colour formation and starch degradation to test phenotypically, as the most relevant. For example, in L615-617, we discuss the downregulation of GldL (which is known to be involved in flavobacterial gliding motility [Shrivastava et al., 2013]) in the moeA KO as a possible explanation for the reduced colony spreading of the moeA mutant. Changes in polysaccharide (starch) utilization were seen on solid medium, as well as in the proteomic profile, where we observed the upregulation of carbohydrate metabolism proteins linked to PUL (polysaccharide utilisation locus) operons (Terrapon et al., 2015), such as PAM95095-90 (Figure 8), and other carbohydrate metabolism-related proteins, including a pectate lyase (Table S7) which is involved in starch degradation (Aspeborg et al., 2012). As noted in L555-566 and Figure 9, starch metabolism was also tested experimentally.

      First, the effect of moeA deletion on molybdenum cofactor synthesis should be addressed.

      MoeA is the last enzyme in the MoCo synthesis pathway; thus, if only MoeA is absent, the cell would accumulate MPT-AMP (molybdopterin-adenosine monophosphate) (Iobbi-Nivol & Leimkühler, 2013), and the expressed molybdoenzymes would not be functional. In L582-585, we commented on how the lack of molybdenum cofactor may affect the synthesis of molybdoenzymes. However, if you meant analysing the presence of the small molecules (the cofactors) involved in these pathways, that was an assay we were not able to perform. Moreover, in L585-587, we addressed how the deletion of moeA affected the proteins encoded by the rest of the genes in the operon.

      Second, as I was reading the entire manuscript, I kept asking myself if moeA (and by extension molybdenum cofactor) was really involved in SC or if it was an indirect effect. For example, what if the absence of moeA alters the cell envelope because the synthesis of its building blocks is perturbed, and then subsequently perturbs all related processes, including gliding motility and protein secretion? It would help to know if the effects on colony spreading and polysaccharide metabolism can be uncoupled. I don't think the authors discussed that clearly.

      The message of the paper is that the moeA gene, as predicted from a previous genomics analysis, is important in SC. This is based on the representation of the moeA gene in genomes of bacteria that display SC. This analysis does not predict the mechanism. When the gene was knocked out, a significant change in structural colour occurred, supporting this hypothesis. Whether this effect is direct or indirect is difficult to assess, as this referee rightly suggests. In order to follow up this central result, we performed proteomics (both intra- and extracellular). As we observed, the deletion of a single gene generated many changes in the proteomic profile, and thus in the biological processes. Based on the known functions of molybdenum cofactor, we could only hypothesize that pterin metabolism is important for SC, not exactly how.

      We intend to discuss the links between gliding/spreading and polysaccharide metabolism more clearly, with reference to the literature, as quite a bit is known here including possible links to SC.

      Reviewer #2 (Public review):

      Summary:

      The authors constructed an in-frame deletion of the moeA gene, which is involved in molybdopterin cofactor (MoCo) biosynthesis, and investigated its role in structural colors in Flavobacterium IR1. The deletion of moeA shifted colony color from green to blue, reduced colony spreading, and increased starch degradation, which was attributed to the upregulation of various proteins in polysaccharide utilization loci. This study lays the groundwork for developing new colorants by modifying genes involved in structural colors.

      Major strengths and weaknesses:

      The authors conducted well-designed experiments with appropriate controls and the results in the paper are presented in a logical manner, which supports their conclusions.

      We appreciate your comment.

      Using statistical tests to compare the differences between the wild type and moeA mutant, and adding a significance bar in Figure 4B, would strengthen their claims regarding differences in cell motility.

      Thank you. Figure 4B contains the significance bars that represent the standard deviation of the mean value of the three replicates, but we will modify it to make them more clear.

      Additionally, in the result section (Figure 6), the authors suggest that the shift in blue color is "caused by cells which are still highly ordered but narrower", which to my knowledge is not backed up by any experimental evidence.

      Thanks. We mentioned that the mutant cells are narrower than the wild type based on the estimated periodicity resulting from the goniometry analysis (L427-430). We will now say “likely to be narrower based on the estimated periodicity from the optical analysis” rather than just “narrower” in the revision.

      Overall, this is a well-written paper in which the authors effectively address their research questions through proper experimentation. This work will help us understand the genetic basis of structural colors in Flavobacterium and open new avenues to study the roles of additional genes and proteins in structural colors.

      Much appreciated.

      REFERENCES

      Aspeborg, Henrik, Pedro M. Coutinho, Yang Wang, Harry Brumer, and Bernard Henrissat. "Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5)." BMC evolutionary biology 12 (2012): 1-16.

      Iobbi-Nivol, Chantal, and Silke Leimkühler. "Molybdenum enzymes, their maturation and molybdenum cofactor biosynthesis in Escherichia coli." Biochimica et Biophysica Acta (BBA)-Bioenergetics 1827, no. 8-9 (2013): 1086-1101.

      Shrivastava, Abhishek, Joseph J. Johnston, Jessica M. Van Baaren, and Mark J. McBride. "Flavobacterium johnsoniae GldK, GldL, GldM, and SprA are required for secretion of the cell surface gliding motility adhesins SprB and RemA." Journal of bacteriology 195, no. 14 (2013): 3201-3212.

      Terrapon, Nicolas, Vincent Lombard, Harry J. Gilbert, and Bernard Henrissat. "Automatic prediction of polysaccharide utilization loci in Bacteroidetes species." Bioinformatics 31, no. 5 (2015): 647-655.

    1. eLife Assessment

      This fundamental research conducted a molecular comparison between smooth muscle cells and adjacent fibroblast cells within lung blood vessels affected by pulmonary arterial hypertension. The study identified distinct disease-related states in each cell type and provided deeper insights into their interactions and communication. While certain conclusions should be interpreted with caution due to inherent methodological limitations, the study's findings remain convincing and robust. This is supported by the use of advanced and complementary techniques, as well as the rare isolation of diseased lung blood vessel cells from the same donor, enabling direct comparison.

    2. Reviewer #1 (Public review):

      Summary:

      The authors isolated and cultured pulmonary artery smooth muscle cells (PASMC) and pulmonary artery adventitial fibroblasts (PAAF) of the lung samples derived from the patients with idiopathic pulmonary arterial hypertension (PAH) and the healthy volunteers. They performed RNA-seq and proteomics analyses to detail the cellular communication between PASMC and PAAF, which are the main target cells of pulmonary vascular remodeling during the pathogenesis of PAH. The authors revealed that PASMC and PAAF retained their original cellular identity and acquired different states associated with the pathogenesis of PAH, respectively.

      Strengths:

      Although previous studies have shown that PASMC and PAAF cells each have an important role in the pathogenesis of PAH, there have been scarce reports focusing on the interactions between PASMC and PAAF. These findings may provide valuable information for elucidating the pathogenesis of pulmonary arterial hypertension.

      Comments on revisions:

      The authors adequately responded to my concerns and revised their manuscript to elaborate on the new data from new experiments and address my queries. Although some of the issues I initially raised could not be fully resolved, the revised manuscript has been significantly improved. This manuscript provides essential insights into the communications across the PASMCs and PAAFs in PAH. This would greatly interest various researchers in both basic and clinical fields.

    3. Reviewer #2 (Public review):

      Summary:

      Utilizing a combination of transcriptomic and proteomic profiling as well as cellular phenotyping from source-matched PASMC and PAAFs in IPAH, this study sought to explore a molecular comparison of these cells in order to track distinct cell fate trajectories and acquisition of their IPAH-associated cellular states. The authors also aimed to identify cell-cell communication axes in order to infer mechanisms by which these two cells interact and depend upon external cues. This study will be of interest to the scientific and clinical communities of those interested in pulmonary vascular biology and disease. It also will appeal to those interested in lung and vascular development as well as multi-omic analytic procedures.

      Strengths:

      (1) This is one of the first studies using orthogonal sequencing and phenotyping for characterization of source-matched neighboring mesenchymal PASMC and PAAF cells in healthy and diseased IPAH patients. This is a major strength which allows for direct comparison of neighboring cell types and the ability to address an unanswered question regarding the nature of these mesenchymal "mural" cells at a precise molecular level.

      (2) Unlike a number of multi-omic sequencing papers that read more as an atlas of findings without structure, the inherent comparative organization of the study and presentation of the data were valuable in aiding the reader in understanding how to discern the distinct IPAH-associated cell states. As a result, the reader not only gleans greater insight into these two interacting cell types in disease but also now can leverage these datasets more easily for future research questions in this space.

      (3) There are interesting and surprising findings in the cellular characterizations, including the low proliferative state of IPAH-PASMCs as compared to the hyperproliferative state in IPAH-PAAFs. Furthermore, the cell-cell communication axes involving ECM components and soluble ligands provided by PAAFs that direct cell state dynamics of PASMCs offer some of the first and foundational descriptions of what are likely complex cellular interactions that await discovery.

      (4) Technical rigor is quite high in the -omics methodology and in vitro phenotyping tools used.

      Weaknesses:

      There are some weaknesses in the methodology that should temper the conclusions:

      (1) The number of donors sampled for PAAF/PASMCs was relatively small for both healthy controls and IPAH patients. Thus, while the level of detail of -omics profiling was quite deep, the generalizability of their findings to all IPAH patients or Group 1 PAH patients is limited. In the revised manuscript, the authors addressed this concern with important text changes and additional data.

      (2) While the study utilized early passage cells, these cells nonetheless were still cultured outside the in vivo milieu prior to analysis. Thus, while there is an assumption that these cells do not change fundamental behavior outside the body, that is not entirely proven for all transcriptional and proteomic signatures. As such, the major alterations that are noted would be more compelling if validated from tissue or cells derived directly from in vivo sources. Without such validation, the major limitation of the impact and conclusions of the paper is that the full extent of the relevance of these findings to human disease is not known. The authors addressed this concern appropriately with significant text changes to clarify these limitations for the reader.

      (3) While the presentation of most of the manuscript was quite clear and convincing, the terminology and conclusions regarding "cell fate trajectories" throughout the manuscript did not seem to be fully justified. That is, all of the analyses were derived from cells originating from end-stage IPAH, and otherwise, the authors were not lineage tracing across disease initiation or development (which would be impossible currently in humans). So, while the description of distinct "IPAH-associated states" makes sense, any true cell fate trajectory was not clearly defined. The revised manuscript has removed this terminology and replaced it with more precise language.

      Comments on revisions:

      The authors were quite responsive to all of my concerns, offering both important revisions to the presentation of the work as well as new data. While some of the limitations were not fully resolved (and the authors provide appropriate justification for this), the revised manuscript is much improved. It will be of great interest to both the scientific and clinical communities.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study explored a molecular comparison of smooth muscle and neighboring fibroblast cells found in lung blood vessels afflicted by a disease called pulmonary arterial hypertension. In doing so, the authors described distinct disease-associated states of each of these cell types with further insights into the cellular communication and crosstalk between them. The strength of evidence was convincing through the use of complementary and sophisticated tools, accompanied by rare isolation of human diseased lung blood vessel cells that were source-matched to the same donor for direct comparison.

      We thank the editors and reviewers for their highly positive and encouraging assessment of our manuscript detailing the cell state changes of arterial smooth muscle cells and fibroblasts in the pulmonary bed. We addressed the reviewers’ major comments in the revised manuscript by providing validation of key in vitro findings, such as preserved marker localization and increased GAG deposition in IPAH pulmonary arteries. We additionally provide a comparison of transcriptomic profiles spanning fresh, very early and late passage cells. Finally, we present expanded experimental data in support of cellular crosstalk, including testing of additional PAAF ligands on donor PASMC and the influence of PTX3/HGF on IPAH PASMC.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors isolated and cultured pulmonary artery smooth muscle cells (PASMC) and pulmonary artery adventitial fibroblasts (PAAF) from lung samples derived from patients with idiopathic pulmonary arterial hypertension (PAH) and from healthy volunteers. They performed RNA-seq and proteomics analyses to detail the cellular communication between PASMC and PAAF, which are the main target cells of pulmonary vascular remodeling during the pathogenesis of PAH. The authors revealed that PASMC and PAAF retained their original cellular identity and acquired different states associated with the pathogenesis of PAH, respectively.

      Strengths:

      Although previous studies have shown that PASMC and PAAF cells each have an important role in the pathogenesis of PAH, there have been scarce reports focusing on the interactions between PASMC and PAAF. These findings may provide valuable information for elucidating the pathogenesis of pulmonary arterial hypertension.

      We appreciate the reviewer’s positive view of our study.

      Weaknesses:

      The results of proteome analysis using primary culture cells in this paper seem a bit insufficient to draw conclusions. In particular, the authors described "We elucidated the involvement of cellular crosstalk in regulating cell state dynamics and identified pentraxin-3 and hepatocyte growth factor as modulators of PASMC phenotypic transition orchestrated by PAAF." However, the presented data are considered limited and insufficient.

      We thank the reviewer for drawing our attention to this point and have accordingly modified the conclusion section to read: “We investigated the involvement of cellular crosstalk….” Moreover, we provide further experimental evidence demonstrating the effect of both PTX3 and HGF on cell state marker expression in IPAH-PASMC cells (Figure 7H). In addition, we clarify the selection strategy applied to investigate particular PAAF-secreted ligands and test three additional ligands on donor PASMC (Figure S8), supporting the original focus on PTX3 and HGF.

      Reviewer #2 (Public Review):

      Summary:

      Utilizing a combination of transcriptomic and proteomic profiling as well as cellular phenotyping from source-matched PASMC and PAAFs in IPAH, this study sought to explore a molecular comparison of these cells in order to track distinct cell fate trajectories and acquisition of their IPAH-associated cellular states. The authors also aimed to identify cell-cell communication axes in order to infer mechanisms by which these two cells interact and depend upon external cues. This study will be of interest to the scientific and clinical communities of those interested in pulmonary vascular biology and disease. It also will appeal to those interested in lung and vascular development as well as multi-omic analytic procedures.

      We thank the reviewer for the overall highly positive assessment of our study.

      Strengths:

      (1) This is one of the first studies using orthogonal sequencing and phenotyping for the characterization of source-matched neighboring mesenchymal PASMC and PAAF cells in healthy and diseased IPAH patients. This is a major strength that allows for direct comparison of neighboring cell types and the ability to address an unanswered question regarding the nature of these mesenchymal "mural" cells at a precise molecular level.

      We value the reviewer’s kind and objective summary of our study.

      (2) Unlike a number of multi-omic sequencing papers that read more as an atlas of findings without structure, the inherent comparative organization of the study and presentation of the data were valuable in aiding the reader in understanding how to discern the distinct IPAH-associated cell states. As a result, the reader not only gleans greater insight into these two interacting cell types in disease but also now can leverage these datasets more easily for future research questions in this space.

      We thank the reviewer for this highly positive comment.

      (3) There are interesting and surprising findings in the cellular characterizations, including the low proliferative state of IPAH-PASMCs as compared to the hyperproliferative state in IPAH-PAAFs. Furthermore, the cell-cell communication axes involving ECM components and soluble ligands provided by PAAFs that direct cell state dynamics of PASMCs offer some of the first and foundational descriptions of what are likely complex cellular interactions that await discovery.

      We agree with the reviewer’s assessment that some of the novel data in our study help to formulate testable hypotheses that can be pursued in more focused follow-up research.

      (4) Technical rigor is quite high in the -omics methodology and in vitro phenotyping tools used.

      We are grateful for the reviewer’s assessment of our work and positive recognition.

      Weaknesses:

      There are some weaknesses in the methodology that should temper the conclusions:

      (1) The number of donors sampled for PAAF/PASMCs was small for both healthy controls and IPAH patients. Thus, while the level of detail of -omics profiling was quite deep, the generalizability of their findings to all IPAH patients or Group 1 PAH patients is limited.

      We appreciate the reviewer’s concerns regarding the generalizability of the findings and have acknowledged this as a study limitation in the discussion: “A low case number and end-stage disease samples used for omics characterization represents a study limitation that has to be taken into account before assuming similar findings would be evident in the entire PAH patient population over the course of the disease development and progression”. We have addressed this issue by validating key in vitro findings using fresh cells or by assessing FFPE lung material from additional independent samples in the revised manuscript (Figures 2D, 3D, 3H, 4H). For transparency, we provide the number of biological samples in the results section of the revised manuscript.

      (2) While the study utilized early passage cells, these cells nonetheless were still cultured outside the in vivo milieu prior to analysis. Thus, while there is an assumption that these cells do not change fundamental behavior outside the body, that is not entirely proven for all transcriptional and proteomic signatures. As such, the major alterations that are noted would be more compelling if validated from tissue or cells derived directly from in vivo sources. Without such validation, the major limitation of the impact and conclusions of the paper is that the full extent of the relevance of these findings to human disease is not known.

      We thank the reviewer for this constructive and excellent suggestion. The comparison of fresh and cultured cells revealed a strong and early divergence of differentially regulated pathways for PAAF, and a more gradual transition for PASMC. The results of this analysis are included in the new Figures 2D, 3D, 3H, and 4H. Implications are discussed in the revised manuscript: “However, the same mechanism renders cells susceptible to phenotypic change induced simply by extended in vitro culturing, testified by broad expression profile differences between fresh and cultured cells. This is a common caveat in cell biology research and represents a technical and practical tradeoff that requires cross validation of key findings. Using a combination of archived lung tissue and available single cell RNA sequencing dataset of human pulmonary arteries, we show that some of the key defining phenotypic features of diseased cells, such as altered proliferation rate and ECM production, are preserved and gradually lost upon prolonged culturing”.

      (3) While the presentation of most of the manuscript was quite clear and convincing, the terminology and conclusions regarding "cell fate trajectories" throughout the manuscript did not seem to be fully justified. That is, all of the analyses were derived from cells originating from end-stage IPAH, and otherwise, the authors were not lineage tracing across disease initiation or development (which would be impossible currently in humans). So, while the description of distinct "IPAH-associated states" makes sense, any true cell fate trajectory was not clearly defined.

      In accordance with the reviewer’s comment, we have modified the wording to exclude the “cell fate trajectory” phrase and replace it with “acquisition of disease cell state”.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      (1) In Figure 1, PASMC and PAAF were collected from the lungs of healthy donors and analyzed for transcriptomics and proteomics; in Figure 1A, it can be taken as if both cells from IPAH patients were also analyzed, but this is not reflected in the results. In Figure 1D, immunostaining of normal lungs confirms the localization of PASMC and PAAF markers found by transcriptomics. The authors describe a strong, but not perfect, correlation between the transcriptomics and proteomics data from Figure S1, but the gene names of each cellular marker they found should also be listed. In addition, the authors have observed the expression of markers characteristic of PASMC and PAAF in pulmonary vessels of healthy subjects by IH, but is there any novelty in these markers? Furthermore, are the expression sites of these markers altered in IPAH patients?

      In the revised manuscript we have adjusted the schematic to reflect the fact that only donor cells are compared in Figure 1. We additionally provide a correlation of cell type markers between proteomic and transcriptomic data sets for those molecules that are detected in both datasets (Figure S1B).

      We provide clarification on the novelty aspect in the result section: “Some of the molecules were previously associated with predominant SMC, such as RGS5 and CSPR1 (Crnkovic et al., 2022; Snider et al., 2008), or adventitial fibroblast, such as SCARA5, CFD and MGST1 (Crnkovic et al., 2022; Sikkema et al., 2023) expression”. Except for RGS5, expression and localization of other markers in IPAH was previously unknown.

      The conservation of expression sites for reported markers was validated in IPAH in the revised manuscript (Figure 2D), with IGFBP5 showing dual localization in both cell types. Moreover, results in Figure 1D, 1E and 2D support the validity of omics findings and preservation of key markers during passaging.

      (2) In Figure 2, the authors compare PASMC and PAAF derived from IPAH patients and donors. The results show that transcriptomics and proteomics changes are clearly differentiated by cell type and not by pathological state. In the pathological state, transcriptional changes are more pronounced. The GO analysis of the factors that showed significant changes in each cell type is shown in Figure 2E, but the differences between the GO analysis of the transcriptomics and proteomics results are not clearly shown. The reviewer believes that the advantages of a combined analysis of both should be indicated. Also, in Figure 2G, the GAG content in PA appears to be elevated in only 3 cases, while the other 5 cases appear to be at the same level as the donor; is there a characteristic change in these 3 cases? Figure 2I shows that the phenotype of PAAF changes with cell passages. Since this phenomenon would be interesting and useful to the reader, additional discussion regarding the mechanism would be desired.

      We have integrated both data sets in order to achieve a stronger and more meaningful analysis, given the weaker and incomplete correlation between the transcriptomic and proteomic datasets, as indicated in the results section: “Comparative analysis of transcriptomic and proteomic data sets revealed a strong, but not complete level of linear correlation between the gene and protein expression profiles (Figure S1B, C). We therefore decided to use an integrative dataset and analyzed all significantly enriched genes and proteins (-log10(P)>1.3) between both cell types to achieve stronger and more robust analysis”. In general, the proteomic profile showed fewer significant differences and a smaller extent of change than the transcriptomic profile, likely due to technical limitations and the sensitivity of the method, as testified by the complete lack of the top transcriptomic molecules (RGS5, ADH1C, IGFBP5, CFD, SCARA5) in the protein dataset.
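
      For readers unfamiliar with the notation, the cutoff -log10(P)>1.3 quoted above corresponds approximately to P<0.05. The following minimal Python sketch illustrates the conversion only; the gene names and P values are hypothetical placeholders and are not taken from the study's datasets.

```python
import math

# -log10(0.05) is about 1.301, so a cutoff of 1.3 is roughly P < 0.05.
THRESHOLD_FOR_ALPHA_005 = -math.log10(0.05)


def is_significant(p_value, threshold=1.3):
    """Return True when -log10(P) exceeds the chosen cutoff."""
    return -math.log10(p_value) > threshold


# Hypothetical P values for illustration only (not data from the study).
hypothetical_p_values = {"GENE_A": 0.001, "GENE_B": 0.049, "GENE_C": 0.2}

# Keep only the entries that pass the cutoff, preserving insertion order.
significant = [name for name, p in hypothetical_p_values.items() if is_significant(p)]
```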

      To strengthen the findings of increased GAG in IPAH pulmonary arteries, we have performed compartment-specific, quantitative image analysis of Alcian blue staining on additional donor and patient samples (n=10 for each condition). The new analysis totaling around 40 PA confirmed significantly increased deposition of GAG in IPAH pulmonary arteries.

      We have addressed the issue of phenotypic change with prolonged cell culture in the revised manuscript by systematically comparing enrichment for biological processes between fresh (Crnkovic et al., 2022: GSE210248), very early (this study: GSE255669) and later passage cells (Chelladurai et al., 2022: GSE144932; Gorr et al., 2020: GSE144274). We observed cell type differences in the rate of change of phenotypic features, with PAAF showing faster shift early on during culturing that could for some of the features be due to isolation from immunomodulatory environment or presence of hydrocortisone supplement in the PAAF cell media. These points have been described in the revised results section and mentioned in the discussion.

      (3) The authors claim that one feature of this paper is the use of "very early passage (p1)" of pulmonary artery smooth muscle cells (PASMC). Since there are other existing (previously reported) data that are publicly available, such as RNA-seq data using cells with 2-4 cell passages, it may be possible to show that fewer passages are better in primary culture by comparing the data presented in this paper.

      Following the reviewers’ comments, we have performed a systematic comparison of fresh (Crnkovic et al., 2022: GSE210248), very early (this study: GSE255669) and later passage cells (Chelladurai et al., 2022: GSE144932; Gorr et al., 2020: GSE144274) in the revised manuscript in order to comprehensively address the issue and define changes occurring as a result of prolonged in vitro conditions (Figure 3H). The results showed that the expression profile of early passage cells retains some of the key phenotypic features displayed by cells in their native environment, with PASMC displaying a more gradual loss of phenotypic characteristics compared to PAAF. Interestingly, PAAF displayed a striking inverse enrichment for inflammatory/NF-kB signaling between fresh and cultured cells, which could potentially be caused by the hydrocortisone supplement in the PAAF cell media or by isolation from their highly immunomodulatory environment. These points have been described in the revised results section and mentioned in the discussion.

      (4) The authors describe a study characterized by decreased expression of "cytoskeletal contractile elements" in pulmonary artery smooth muscle cells (PASMC) derived from patients with IPAH. What are the implications of this result, and does it arise from the use of smooth muscle in patients resistant to pulmonary artery smooth muscle dilating agents? A discussion on this issue needs to be made in a way that is easy for the reader to understand.

      The reviewer raises an interesting point regarding the loss of contractile markers and the response to vasodilating therapy. We would speculate that an isolated decrease in contractile machinery, without concomitant change in ECM and other PASMC features, would dampen both the contraction and relaxation properties of the single PASMC, affecting not only its response to dilating agents, but also to vasoconstrictors. Clinical consequences and responsiveness to dilating agents are more difficult to predict, since the vasoactive response would additionally depend on mechanical properties of the pulmonary artery defined by cellular and ECM composition. Nevertheless, we believe that decreased expression of contractile machinery reflects an intrinsic, “programmed” response of SMC to remodeling, rather than vasodilator therapy-induced selection pressure, since a similar phenotypic change is observed in SMC from the systemic circulation and in various animal models without exposure to PAH medication. These considerations have been included in the revised discussion section.

      (5) There are a lot of secreted proteins that increase or decrease in Figure 6G, but there is scant reason to focus on PTX3 and HGF among them. The authors need to elaborate on the above issue.

      We regret the lack of clarity and provide an improved explanation of the ligand selection strategy in the revised manuscript. In order to prioritize the potential hits, we first used hierarchical clustering to group co-regulated ligands into a smaller number of groups. We then prioritized ligands for which information with respect to IPAH was lacking or limited. Based on these results, we analyzed the effect of three additional ligands on PASMC cell state marker expression (Figure S8). These additional data supported the initial focus on PTX3 and HGF.

      Minor comments:

      (1) Regarding the number of specimens used in the Result, it would be more helpful to the reader if the number of samples were also mentioned in the text.

      We have included the number of samples used in the manuscript text.

      (2) There is no explanation of what R2Y represents in Figure 2B. This reviewer is not able to understand the statistical analysis of Figure 2H. The detailed results should be explained.

      We apologize for the oversight in labeling of Figure 2B and modify the figure legend: “Orthogonal projection to latent structures-discriminant analysis (OPLS-DA) T score plots separating predictive variability (x-axis), attributed to biological grouping, and non-predictive variability (technical/inter-individual, y-axis). Monofactorial OPLS-DA model for separation according to cell type or disease. C) Bifactorial OPLS-DA model considering cell type and disease simultaneously. Ellipse depicting the 95% confidence region, Q2 denoting model’s predictive power (significance: Q2>50%) and R2Y representing proportion of variance in the response variable explained by the model (higher values indicating better fit)”.

      We also modified figure legend wording for the analysis in Figure 2H (new Figure 3E) to clarify the independent factors whose interaction was investigated using 3-way ANOVA: “Interaction effects of stimulation, cell type, and disease state on cellular proliferation were analyzed by 3-way ANOVA. Significant interaction effects are indicated as follows: * for stimulation × cell type interactions and # for cell type × disease state interactions (both *, # p<0.05)”.

      (3) In Figure 3, the authors examined whether there were molecular abnormalities common to IPAH-PASMC and IPAH-PAAF and found that the number of commonly regulated genes and proteins was limited to 47. Further analysis of these regulators by STRING analysis revealed that factors related to the regulation of apoptosis are commonly altered in both cells. On the other hand, the authors focused on mitochondria, as SOD2 is downregulated, and found an increase in ROS production specific to PASMC, indicating that mitochondrial dysfunction is common to PASMC and PAAF in IPAH, but downstream phenomena are different between cell types. Factors associated with apoptosis regulation have been found to be both upward and downward regulated, but the actual occurrence of apoptosis in both cell types has not been addressed.

      We have performed TUNEL staining on FFPE lung tissue from donors and IPAH patients that revealed apoptosis as a rare event in both conditions in PASMC and PAAF. Therefore, no meaningful quantification could be conducted. An example of pulmonary artery where rare positive signal in either PAAF or PASMC could be found is provided in Figure 4H.

      Unfortunately, the association of a particular gene with a pathway is by default arbitrary and potentially ambiguous. In particular, factors identified as associated with apoptosis are also involved in the regulation of inflammatory signaling (BIRC3, DDIT3) and amino acid metabolism (SHMT1). Nevertheless, mitochondria represent a crucial cellular hub for apoptosis regulation and, as shown in the current study, display significant functional alterations in IPAH in both cell types, aligning with reduced mitochondrial superoxide dismutase (SOD2) expression.

      (4) The meaning of the gray circle in Figure 3C should be clarified. Similarly, the meaning of the color in Fig. 3D should be clearly explained. In Figure 3E-G, each cell is significantly different from 18-61 cells, and the number of each cell and the reason should be described.

      We regret the confusion and provide better explanation of the figure legend: “gray nodes representing their putative upstream regulators”, “with color coding reflecting the IPAH dependent regulation”. In the revised Figure panels 4E-G (old 3E-G) we provide the exact number of cells measured in each condition. Although we tried to have comparable cell confluency at the time of measurement, different proliferation rates between cells from different cell type and condition led to different number of measured cells per donor/patient.

      (5) In Figure 4, the authors focus on factors that vary in different directions between cells, revealing fingerprints of molecular changes that differ between cell types: IPAH-PASMC acquires a synthetic phenotype with enhanced regulation of chemotaxis elements, whereas IPAH-PAAF acquires fast-cycling cell characteristics. Next, focusing on the ECM components that were specifically altered in IPAH-PASMC, Nichenet analysis in Figure 5 suggested that ligands from PAAF may act on PASMC, and the authors focused on integrin signaling to examine ECM contact and changes in cell function. The results indicate that adhesion to laminin is poor in PASMC. Although no difference was observed between donor and IPAH PASMCs, a discussion of the reasons for this would be desired and helpful to the readers.

      Both donor and IPAH PASMCs respond similarly to laminin. However, our key finding is the downregulation of laminin in IPAH PAAF, which likely leads to a skewed laminin-to-collagen ratio and altered ECM composition in remodeled arteries. This shift in the ECM class results in altered PASMC behavior, affecting both donor and IPAH cells similarly. In the revised manuscript, we demonstrate that PASMC largely retain the expression pattern of integrin subunits that serve as high-affinity collagen and laminin receptors, with higher levels compared to PAAF (Figure 6F, G). Furthermore, we speculate that the distinct cellular phenotypic responses to collagen versus laminin coatings may arise from different downstream signaling pathways activated by the various integrin subunits (Nguyen et al., 2000). These considerations have been included in the revised discussion: “The comparable responses of donor and IPAH PASMC likely result from their shared integrin receptor expression profiles. Meanwhile, ECM class switching engages different high-affinity integrin receptors, which activate alternative downstream signaling pathways (Nguyen et al., 2000) and lead to differential responses to collagen and laminin matrices. We thus propose a model in which laminins and collagens act as PAAF-secreted ligands, regulating PASMC behavior through their ECM-sensing integrin receptors.”

      (6) Since Figure 3B and Figure 4A seem to show the same results, why not combine them into one?

      Indeed, these figure panels show the same results, but the focus of the investigations in each Figure is different. We therefore opted to keep the panels separate for better clarity and a logical link to the other panels in the same figure.

      (7) In Figure 6, the interaction analysis of scRNAseq data with respect to signaling between PASMC and PAAF was performed using Nichenet and CellChat, showing that signaling from PAAF to PASMC is biased toward secreted ligands and that a functionally relevant set of soluble ligands is impaired in the IPAH state. From there, they proceeded with co-culture experiments and showed that co-culturing healthy PASMC with PAAF of IPAH patients abolished PASMC markers in the healthy state. Furthermore, the authors attempted to identify ligands that induce functional changes in PASMCs produced from IPAH PAAFs and found that HGF is a factor that downregulates the expression of contractile markers in PASMCs. Further insights may be gained by co-culturing IPAH-derived cells in co-culture experiments. Also, no beneficial effect of pentraxin3 was found in Figure 6H. The authors should examine the effect of pentraxin3 on PASMC cells derived from IPAH patients, rather than healthy donors.

      We tested the influence of IPAH-PASMC on donor-PAAF and found no effect on the expression of the selected markers. We thank the reviewer for the suggestion to conduct the experiments on IPAH-PASMC. The new data show that both PTX3 and HGF have a significant but differential effect on IPAH-PASMC as compared to donor-PASMC. Whereas PTX3 lacks an effect on donor PASMC, it leads to downregulation of some of the contractile markers in IPAH PASMC, while HGF upregulates the synthetic marker VCAN in IPAH PASMC. These results are now included in Figure 7H.

      Reviewer #2 (Recommendations For The Authors):

      The authors should double-check for grammar and typos in the manuscript. I caught a few such as "therefor" and others, but there could be more.

      We thank the reviewer for the effort and time in reading and evaluating the manuscript. To the best of our knowledge, we have corrected the grammatical errors in the revised manuscript.

    1. eLife Assessment

      The paper presents a valuable theoretical treatment of the role of the passage of time in optimal decision strategies in pursuit-based tasks. The computational evidence and methodologies employed are novel, and the authors offer solid evidence for the majority of the claims.

    2. Reviewer #2 (Public review):

      Summary:

      This paper from Sutlief et al. focuses on an apparent contradiction observed in experimental data from two related types of pursuit-based decision tasks. In "forgo" decisions, where the subject is asked to choose whether or not to accept a presented pursuit, after which they are placed into a common inter-trial interval, subjects have been shown to be nearly optimal in maximizing their overall rate of reward. However, in "choice" decisions, where the subject is asked which of two mutually-exclusive pursuits they will take, before again entering a common inter-trial interval, subjects exhibit behavior that is believed to be sub-optimal. To investigate this contradiction, the authors derive a consistent reward-maximizing strategy for both tasks using a novel and intuitive geometric approach that treats every phase of a decision (pursuit choice and inter-trial interval) as vectors. From this approach, the authors are able to show that previously-reported examples of sub-optimal behavior in choice decisions are in fact consistent with a reward-maximizing strategy. Additionally, the authors are able to use their framework to deconstruct the different ways the passage of time impacts decisions, demonstrating the time cost contains both an opportunity cost and an apportionment cost, as well as examine how a subject's misestimation of task parameters impacts behavior.

      Strengths:

      The main strength of the paper lies in the authors' geometric approach to studying the problem. The authors chose to simplify the decision process by removing the highly technical and often cumbersome details of evidence accumulation that are common in most of the decision-making literature. In doing so, the authors were able to utilize a highly accessible approach that is still able to provide interesting insights into decision behavior and the different components of optimal decision strategies.

      Weaknesses:

      The authors have made great improvements to the strength of their evidence through revision, especially concerning their treatment of apportionment cost. However, I am concerned that the story this paper tells is far from concise, and that this weakness may limit the paper's audience and overall impact. I would strongly suggest making an effort to tighten up the language and structure of the paper to improve its readability and accessibility.

    3. Reviewer #3 (Public review):

      Summary:

      The goal of the paper is to examine the objective function of total reward rate in an environment to understand behavior of humans and animals in two types of decision-making tasks: 1) stay/forgo decisions and 2) simultaneous choice decisions. The main aims are to reframe the equation of optimizing this normative objective into forms that are used by other models in the literature like subjective value and temporally discounted reward. One important contribution of the paper is the use of this theoretical analysis to explain apparent behavioral inconsistencies between forgo and choice decisions observed in the literature.

      Strengths:

      The paper provides a nice way to mathematically derive different theories of human and animal behavior from a normative objective of global reward rate optimization. As such, this work has value in trying to provide a unifying framework for seemingly contradictory empirical observations in the literature, such as differentially optimal behaviors in stay-forgo vs. choice decision tasks. The section about temporal discounting is particularly well motivated as it serves as another plank in the bridge between ecological and economic theories of decision-making. The derivation of the temporal discounting function from subjective reward rate is much appreciated as it provides further evidence for potential equivalence between reward rate optimization and hyperbolic discounting, which is known to explain a slew of decision-making behaviors in the economics literature.

      Weaknesses:

      (1) Readability and organization:
      While I appreciate the detailed analysis and authors' attempts to provide as many details as possible, the paper would have benefitted from a little selectivity on behalf of the authors so that the main contributions aren't buried by the extensive mathematical detail provided.
      For instance, in Figure 5, the authors could have kept the most important figures (A, B and G) to highlight the most relevant terms in the subjective value instead of providing all possible forms of the equation.

      Further, in subfigure 5E, is there a reason that the outside reward r_out is shown to be zero? The text referencing 5E is also very unclear: "In so downscaling, the subjective value of a considered pursuit (green) is to the time it would take to traverse the world were the pursuit not taken, 𝑡_out, as its opportunity cost subtracted reward (cyan) is to the time to traverse the world were it to be taken (𝑡_in+ 𝑡_out) (Figure 5E)."

      In the abstract, the malapportionment of time is mentioned as a possible explanation for reconciling observed empirical results between simultaneous and sequential decision-making. However, perhaps due to the density of mathematical detail presented, the discussion of the malapportionment hypothesis is pushed all the way to the end of the discussion section.

      (2) Apportionment Cost definition and interpretation
      This additional cost arises in their analyses from redefining the opportunity cost in terms of just "outside" rewards so that the subjective value of the current pursuit and the opportunity cost are independent of each other. However, in doing so, an additional term arises in defining the subjective value of a pursuit, named here the "apportionment cost". The authors have worked hard to provide a definition to conceptualize the apportionment cost, though it remains hard to intuit, especially in comparison to the opportunity cost. The additive form of apportionment cost (Equation 9) doesn't add much in the way of intuition or to their later analyses for the malapportionment hypothesis. It appears that the most important term is the apportionment scaling term, so just focusing on this term will help the reader through the subsequent analyses.

      (3) Malapportionment Hypothesis: From where does this malapportionment arise?
      The authors identify the range of values for t_in and t_out in Figure 18, the terms comprising the apportionment scaling term, that lead to optimal forgo behaviors despite suboptimally rejecting the larger-later (LL) choice in choice decisions. They therefore attribute this pattern to a lower apportionment scale, which arises from underestimating the time required outside the pursuit (t_out) or overestimating the time required at the current pursuit (t_in). What is not discussed, though, is whether and how the underestimation of t_out and overestimation of t_in can be dissociated, though it is understood that empirical demonstration of this dissociation is outside the scope of this work.
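
      The role of the apportionment scaling term raised in points (2) and (3) can be made concrete with a small numerical sketch. It assumes the proportion quoted above for Figure 5E, sv / t_out = (r_in - rho_out * t_in) / (t_in + t_out), where rho_out = r_out / t_out is the outside reward rate. The function names, the reward and time values, and the choice to hold rho_out fixed while t_out is misestimated are all hypothetical and serve only as an illustration; none of them are taken from the paper.

```python
def apportionment_scale(t_in: float, t_out: float) -> float:
    """Share of one traversal of the world that is spent outside the pursuit."""
    return t_out / (t_in + t_out)


def subjective_value(r_in: float, t_in: float, rho_out: float, t_out: float) -> float:
    """Opportunity-cost-subtracted reward, downscaled by the apportionment scale."""
    return (r_in - rho_out * t_in) * apportionment_scale(t_in, t_out)


def accept(r_in: float, t_in: float, rho_out: float, t_out: float) -> bool:
    """Forgo decision: take the pursuit when its subjective value is positive."""
    return subjective_value(r_in, t_in, rho_out, t_out) > 0


def choose(options: dict, rho_out: float, t_out: float) -> str:
    """Choice decision: name of the option with the highest subjective value."""
    return max(options, key=lambda name: subjective_value(*options[name], rho_out, t_out))


# Hypothetical (reward, time) pairs for a smaller-sooner and a larger-later pursuit.
OPTIONS = {"SS": (4.0, 2.0), "LL": (10.0, 10.0)}
RHO_OUT = 0.2                 # assumed outside reward rate, held fixed
TRUE_T_OUT = 20.0             # assumed true time spent outside the pursuit
UNDERESTIMATED_T_OUT = 2.0    # assumed misestimate that lowers the apportionment scale

if __name__ == "__main__":
    for label, t_out in (("true", TRUE_T_OUT), ("underestimated", UNDERESTIMATED_T_OUT)):
        forgo = {name: accept(*OPTIONS[name], RHO_OUT, t_out) for name in OPTIONS}
        print(label, "choice:", choose(OPTIONS, RHO_OUT, t_out), "forgo:", forgo)
```

      Under these assumed numbers, both pursuits are still accepted in a forgo decision whichever value of t_out is used, because the sign of the subjective value depends only on r_in - rho_out * t_in. The preferred option in the choice decision, however, switches from LL to SS when t_out is underestimated, since the lower apportionment scale penalizes the longer pursuit more strongly.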

    1. eLife Assessment

      This useful study by Gao et al. identifies Hspa2 as a heterogeneous transcript in the early embryo and proposes a plausible mechanism showing interactions with Carm1. The authors propose that variability in HSPA2 levels among blastomeres at the 4-cell stage skews their relative contribution to the embryonic lineage. Given that only 4 other heterogeneous transcripts/non-coding RNAs have been proposed to act similarly at or before the 4-cell stage, this would be a key addition to our understanding of how the first cell fate decision is made. While this is a solid study, further data are needed to fully support the conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigate the role of HSPA2 during mouse preimplantation development. Knocking down HSPA2 in zygotes, the authors describe lower chances of developing into blastocysts, which show a reduced number of inner cell mass cells. They find that HSPA2 mRNA and protein levels show some heterogeneity among blastomeres at the 4-cell stage and propose that HSPA2 could contribute to skewing their relative contribution to embryonic lineages. To test this, the authors try to reduce HSPA2 expression in one of the 2-cell stage blastomeres and propose that it biases their contribution towards extra-embryonic lineages. To explain this, the authors propose that HSPA2 would interact with CARM1, which controls chromatin accessibility around genes regulating differentiation into the embryonic lineage.

      Strengths:

      (1) The study offers simple and straightforward experiments with large sample sizes.

      (2) Unlike most studies in the field, this research often relies on both mRNA and protein levels to analyse gene expression and differentiation.

      Weaknesses:

      (1) Image and statistical analyses are not well described.

      (2) The functionality of the overexpression construct is not fully validated.

      (3) Tracking of KD cells in embryos injected at the 2-cell stage with GFP is unclear.

      (4) A key rationale of the study relies on measuring small differences in the levels of mRNA and proteins using semi-quantitative methods to compare blastomeres. As such, it is not possible to know whether those subtle differences are biologically meaningful. For example, the lowest HSPA2 level of the embryo with the highest level is much higher than the top cell from the embryo with the lowest level. What does this level mean then? Does this mean that some blastomeres grafted from strong embryos would systematically outcompete all other blastomeres from weaker embryos? That would be very surprising. I think the authors should be more careful and consider the lack of quantitative power of their approach before reaching firm conclusions. Although to be fair, the authors only follow a long trend of studies with the same intrinsic flaw of this approach.

      (5) Some of the analyses on immunostaining do not take into account that this technique only allows for semi-quantitative measurements and comparisons.
      a) Some of the microscopy images are shown with an incorrect look-up table.
      b) Some of the schematics are incorrect and misleading.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, Gao et al. use RNA-seq to identify Hspa2 as one of the earliest transcripts heterogeneously distributed between blastomeres. Functional studies are performed using siRNA knockdown showing Hspa2 may bias cells toward the ICM lineage via interaction with the known methyltransferase CARM1.

      Strengths:

      This study tackles an important question regarding the origins of the first cell fate decision in the preimplantation embryo. It provides novelty in its identification of Hspa2 as a heterogeneous transcript in the early embryo and proposes a plausible mechanism showing interactions with Carm1. Multiple approaches are used to validate their functional studies (FISH, WB, development rates, proteomics). Given only 4 other transcripts/RNA have been identified at or before the 4-cell stage (LincGET, CARM1, PRDM14, HMGA1), this would be an important addition to our understanding of how TE vs ICM fate is established.

      Weaknesses:

      The RNA-seq results leading the authors to focus on Hspa2 are not included in the manuscript. This dataset would serve as an important resource but is neither included nor discussed. Nor is it mentioned whether Hspa2 was identified in prior embryo RNA-seq studies (for example, Deng Science 2014).

      Furthermore, the authors show that Hspa2 knockdown at the 1-cell stage lowers total Carm1 levels at the 4-cell stage. However, it is unclear how total abundance within the embryo alters lineage specification within blastomeres. The authors go on to propose a plausible mechanism involving Hspa2 and Carm1 interaction, but do not discuss how expression levels may be involved.

    1. eLife Assessment

      This important work addresses the relationship between the transdiagnostic compulsivity dimension and confidence as well as confidence-related behaviours like reminder setting. The relationship between confidence and compulsive disorders has recently received a lot of attention and has been considered to be a key cognitive change. The authors paired an elegant experimental design and pre-registration to give convincing evidence of the relationship between compulsivity, reminder setting, and confidence. In the revised version they thoroughly addressed the reviewer's comments, in particular adding new analyses clarifying how their findings relate to prediction error based learning as well as presenting additional recovery analyses and psychometric curves further strengthening the manuscript.

    2. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1:

      (1) To improve the clarity of the work, I suggest a final note to the authors to say more explicitly that objective accuracy has a finer resolution *due to the number of "special circles" per trial* in their task. This task detail got lost in my read of the manuscript, and confused me with respect to the resolution of each accuracy measure.

      We agree with the reviewer that this would be a useful clarification and have therefore added the following statement to the Methods section on p. 20:

      “It should be noted that the OIP has a slightly finer resolution due to the number of special circles per trial.”

      (2) Similarly, for clarification, they could point out that their exclusion criteria remove subjects that have lower OIP than their AIP analysis allows (which is good for comparison between OIP and AIP). Thus, it removes the possibility that very poorly performing subjects (OIP) are forced to have a higher-than-actual AIP due to the range.

      We agree this would be a useful statement to add and have included the following sentence in the Supplement on p. 8:

      “Such a restriction of the threshold parameter was intended to increase the comparability between AIP and OIP, and hence improved the calculation of the reminder bias.”


      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      (1) Upon reading their response to the question I had regarding AIP and OIP, a few more questions came up regarding OIP, AIP, how their calculations differ, and how the latter was computed in R. I hope these help readers to clarify how to interpret these key measures, and the hypotheses that rely upon them.

      Regarding fitting, and in relation to power, is 16 queries adequate to estimate an AIP using R's quickpsy? That is, assuming some noise in the choice process, how recoverable is a true indifference point from 16 trials? A parameter recovery analysis (i.e., generating choices via the fitted parameters, which will have built-in stochasticity, and seeing how well you recover the parameters) would be helpful. It may help to characterize why the present study might differ from prior studies (maybe a power issue here).

      The reviewer is absolutely correct that we should have provided more detail when describing our fitting procedure for the psychometric curves. We have now addressed this by adding the following statements to the Methods section and Supplement:

      Page 20 in the main manuscript: “Fitting was done using the quickpsy package in R and more detail is given in the Supplement.”

      Pages 8 and 9 in the Supplement: 

      “Psychometric curve fitting

      We used the quickpsy package in R to fit psychometric curves to each participant’s choice data to derive their actual indifference point (AIP), which was operationalised as the threshold parameter when predicting reminder choices from target values. We restricted the possible parameter ranges from 2 to 9 for the threshold parameter and from 1 to 500 for the slope parameter, based on the task’s properties and pilot data. Apart from those parameter ranges, we used only default settings of the quickpsy() function.

      Each participant has only 16 trials (2 for each target value) contribute to the curve fitting. To understand the robustness of the AIP based on such limited data, we conducted a parameter recovery analysis. We simulated 16 trials based on each psychometric function and re-ran the curve fitting based on those simulated choices. There was close correspondence between the actual and recovered threshold parameters (or AIPs) with a correlation of r = 0.97, p < 0.001 (see also Figure S1). In contrast, the slope parameter—which was not central to any of our analyses—exhibited greater variability during the initial fitting. This increased uncertainty likely contributed to its poor recovery in the simulation, as evidenced by a near-zero correlation (r = −0.01, p = 0.82).”

      (2) Along these lines, it would be helpful for the reader to actually see the individual psychometric curves, to know how quickpsy was used (did you fit left and right asymptotes?), etc., to understand how that fitting procedure works and how the assumptions of the fitting procedure compare to what can be gleaned through seeing the choice curves plotted.

      As stated above, we used default settings of the quickpsy() function and hence assumed symmetric asymptotes at 0 and 1. However, the reviewer mentions “left and right asymptotes”, so maybe this question is about restricting the possible parameter range for the threshold, which we restricted to values from 2 to 9, as described above.

      Regarding the individual curves, we have now included the following statement on page 9 in the Supplement: “Figures S2 to S31 show the individual psychometric curves that were estimated for each participant.” Please refer to the Supplement for the added figures.

      (3) A fuller explanation of quickpsy, its parameters, and how choice curves look might also generate interesting further questions to think about with respect to biases and compulsivity. Two individuals might have similar indifference points, but an asymptote might reflect a bias to always have some percent chance of, for example, taking the reminders even at the lowest offer available to them.

      We agree that this is an interesting focus which we will keep in mind for future studies.

      (4) Regarding comparing OIP to AIP: 

      For OIP, as far as I can understand, its resolution is decreased compared to AIP. Accuracies for OIP can only be 0/4, 1/4, 2/4, 3/4, or 4/4. Yet, the resolution for AIP is the full range of offers (2 to 9) with respect to the parameter of interest (the indifference point). Could this bias the estimation of OIP (for instance, someone who scored 25% might actually be much closer to either 50% or 0%, but we can't tell due to resolution)?

      As mentioned in response to comment (1), we restricted the parameter range for the thresholds to 2 to 9 to increase comparability. The reviewer is right to point out that the OIP still has lower resolution than the AIP, which is one of the downsides of having a shortened paradigm (cf. the longer version in Gilbert et al., 2019), which is optimised for online testing, especially if used in combination with additional questionnaires. We have no reason to believe though that this could have led to any bias, especially none that would contribute to the individual differences which are the main focus of our study.

      Gilbert, S. J., Bird, A., Carpenter, J. M., Fleming, S. M., Sachdeva, C., & Tsai, P.-C. (2020). Optimal use of reminders: Metacognition, effort, and cognitive offloading. Journal of Experimental Psychology: General, 149(3), 501–517. https://doi.org/10.1037/xge0000652

      (5) Additionally, it seems like the upper and lower bounds of OIP (0 and 10) differ from AIP (2 and 9). Could this also introduce bias (for example, if someone had terrible performance, the mean would artificially be higher under AIP than OIP because the smallest indifference point is 2 under AIP, but could be 0 under OIP)?

      See our response to comment (1): we fixed the range to 2 to 9 (which was the range of target values used in our study).

      (6) Finally seeing how CIT actually corresponds to accuracy overall (not a relative measure like AIP compared to OIP) I think would also be helpful as this is related to most points noted above.

      We included the suggested test as an exploratory analysis on pages 42-43 in the Supplement: “Third, we were interested in how the transdiagnostic phenotypes would correspond to performance. We therefore fitted a model which predicted internal accuracy (that is, unaided task performance on trials where no reminders could be used) from AD, CIT, and the other covariates (age, education and gender). We found that neither AD, β = -0.02, SE = 0.05, t = 0.44, p = 0.658, nor CIT, β = -0.03, SE = 0.05, t = -0.66, p = 0.510, predicted internal accuracy.

      The full results can be found in Table S13 as well as in Figure S32.”
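
      The parameter recovery analysis described in the response to comment (1) can be illustrated with a minimal sketch. It is not the authors' code and does not use quickpsy: it assumes a cumulative-normal psychometric function, swaps in a grid-search maximum-likelihood fit for the package's optimiser, and uses hypothetical ground-truth thresholds, a hypothetical grid of spread values, and hypothetical function names. Only the 16-trial design (2 choices for each target value from 2 to 9) and the 2-to-9 bound on the threshold are taken from the text above.

```python
import math
import random

TARGET_VALUES = range(2, 10)                      # offers 2..9, as in the task
TRIALS_PER_VALUE = 2                              # 16 choices per participant
THRESHOLD_GRID = [2 + i / 10 for i in range(71)]  # candidate AIPs from 2.0 to 9.0
SPREAD_GRID = [0.25, 0.5, 1.0, 2.0, 4.0]          # hypothetical candidate spreads


def p_reminder(value, threshold, spread):
    """Cumulative-normal probability of choosing the reminder at a given offer."""
    return 0.5 * (1.0 + math.erf((value - threshold) / (spread * math.sqrt(2.0))))


def simulate(threshold, spread, rng):
    """Draw 2 binary choices per target value from the psychometric function."""
    return [(v, rng.random() < p_reminder(v, threshold, spread))
            for v in TARGET_VALUES for _ in range(TRIALS_PER_VALUE)]


def fit(choices):
    """Grid-search maximum likelihood; returns (threshold, spread)."""
    def neg_log_lik(threshold, spread):
        total = 0.0
        for value, chose in choices:
            # Clip probabilities so that log() stays finite.
            p = min(max(p_reminder(value, threshold, spread), 1e-6), 1.0 - 1e-6)
            total -= math.log(p if chose else 1.0 - p)
        return total
    return min(((t, s) for t in THRESHOLD_GRID for s in SPREAD_GRID),
               key=lambda ts: neg_log_lik(*ts))


def pearson(u, v):
    """Pearson correlation between two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den


def recovery(n_participants=100, seed=1):
    """Fit each simulated participant, simulate from the fit, and refit."""
    rng = random.Random(seed)
    fitted, recovered = [], []
    for _ in range(n_participants):
        true_threshold = rng.uniform(3.0, 8.0)           # hypothetical ground truth
        first = fit(simulate(true_threshold, 1.0, rng))  # stands in for the original fit
        second = fit(simulate(*first, rng))              # refit on choices simulated from it
        fitted.append(first[0])
        recovered.append(second[0])
    return fitted, recovered, pearson(fitted, recovered)
```

      Calling `recovery()` returns the fitted thresholds, the recovered thresholds, and their correlation, which is the quantity the response reports for the real data. The value obtained from this sketch depends entirely on the assumed settings and says nothing about the reported r = 0.97.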

    3. Reviewer #1 (Public review):

      Summary:

      Boldt et al test several possible relationships between transdiagnostically-defined compulsivity and cognitive offloading in a large online sample. To do so, they develop a new and useful cognitive task to jointly estimate biases in confidence and reminder-setting. In doing so, they find that over-confidence is related to less utilization of reminder-setting, which partially mediates the negative relationship between compulsivity and reminder-setting. The paper thus establishes that, contrary to the over-use of checking behaviors in patients with OCD, greater levels of transdiagnostically-defined compulsivity predict less deployment of cognitive offloading. The authors offer speculative reasons as to why (perhaps it's perfectionism in less clinically-severe presentations that lowers the cost of expending memory resources), and set an agenda to understand the divergence in cognitive offloading between clinical and nonclinical samples. Because only a partial mediation had robust evidence, multiple effects may be at play, whereby compulsivity impacts cognitive offloading via overconfidence and also by other causal pathways.

      Strengths:

      The study develops an easy-to-implement task to jointly measure confidence and replicates several major findings on confidence and cognitive offloading. The study uses a useful measure of cognitive offloading - the tendency to set reminders to augment accuracy in the presence of experimentally manipulated costs. Moreover, the study utilizes multiple measures of presumed biases -- overall tendency to set reminders, the empirically estimated indifference point at which people engage reminders, and a bias measure that compares optimal indifference points to engage reminders relative to the empirically observed indifference points. That the study observes convergence along all these measures strengthens the inferences made relating compulsivity to the under-use of reminder-setting. Lastly, the study does find evidence for one of several a priori hypotheses and sets a compelling agenda to try to explain why such a finding diverges from an ostensibly opposing finding in clinical OCD samples and the over-use of cognitive offloading.

      Weaknesses:

      Although I think this design and study are very helpful for the field, I felt that a feature of the design might reduce the task's sensitivity to measuring dispositional tendencies to engage cognitive offloading. In particular, the design introduces prediction errors that could induce learning and interfere with natural tendencies to deploy reminder-setting behavior. These PEs comprise whether a given selected strategy will or will not be allowed to be engaged. We know individuals with compulsivity can learn even when instructed not to learn (e.g., Sharp, Dolan and Eldar, 2021, Psychological Medicine), and that more generally, they have trouble with structure knowledge (e.g., Seow et al; Fradkin et al), and thus might be sensitive to these PEs. Thus, a dispositional tendency to set reminders might be differentially impacted for those with compulsivity after an NPE, where they want to set a reminder, but aren't allowed to. After such an NPE, they may be even more inclined to avoid setting reminders. Those with compulsivity likely have superstitious beliefs about how checking behaviors lead to a resolution of catastrophes, that might in part originate from inferring structure in the presence of noise or from purely irrelevant sources of information for a given decision problem.
      It would be good to know if such learning effects exist, if they're modulated by PE (you can imagine PEs are higher if you are more incentivized - e.g., 9 points as opposed to only 3 points - to use reminders, and you are told you cannot use them), and if this learning effect confounds the relationship between compulsivity and reminder-setting.

      A more subtle point: I think this study is better described as an exploration than as a deductive test of a particular model -> hypothesis -> experiment. Typically, when we test a hypothesis, we contrast it with competing models. Here, the tests were two-sided because multiple models, with mutually exclusive predictions (over-use or under-use of reminders), were tested. Moreover, it's unclear exactly how to make sense of what is called the direct mechanism, which is supported by the partial (as opposed to complete) mediation.
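
      The distinction between partial and complete mediation raised here can be illustrated with a minimal sketch of a single-mediator model fitted by ordinary least squares. This is not the authors' analysis: the data are simulated, the path coefficients (0.3, -0.2, -0.15) and variable names are hypothetical, and covariates are omitted. The sketch only shows how a total effect splits into an indirect path through the mediator and a remaining direct path.

```python
import random


def cross_products(u, v):
    """Sum of products of deviations from the two means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def mediation(x, m, y):
    """OLS estimates for x -> m -> y with a direct x -> y path."""
    sxx, smm = cross_products(x, x), cross_products(m, m)
    sxm, sxy, smy = cross_products(x, m), cross_products(x, y), cross_products(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                            # effect of x on the mediator
    b = (smy * sxx - sxy * sxm) / det        # effect of the mediator on y, controlling for x
    direct = (sxy * smm - smy * sxm) / det   # effect of x on y, controlling for the mediator
    total = sxy / sxx                        # effect of x on y with the mediator omitted
    return {"total": total, "direct": direct, "indirect": a * b}


# Simulated, purely illustrative data with hypothetical effect sizes.
rng = random.Random(0)
compulsivity = [rng.gauss(0, 1) for _ in range(500)]
overconfidence = [0.3 * c + rng.gauss(0, 1) for c in compulsivity]
reminder_bias = [-0.2 * o - 0.15 * c + rng.gauss(0, 1)
                 for o, c in zip(overconfidence, compulsivity)]

effects = mediation(compulsivity, overconfidence, reminder_bias)

if __name__ == "__main__":
    print(effects)
```

      For ordinary least squares on a single sample, the total effect equals the direct effect plus the indirect effect. Partial mediation corresponds to both components being reliably non-zero, whereas complete mediation would leave the direct effect indistinguishable from zero; the "direct mechanism" is whatever remains after the indirect path is removed.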

      Comments on revisions:

      I have the following final comments for your manuscript revisions:

      To improve the clarity of the work, I suggest a final note to the authors to say more explicitly that objective accuracy has a finer resolution *due to the number of "special circles" per trial* in their task. This task detail got lost in my read of the manuscript, and confused me with respect to the resolution of each accuracy measure. Similarly, for clarification, they could point out that their exclusion criteria remove subjects that have lower OIP than their AIP analysis allows (which is good for comparison between OIP and AIP). Thus, it removes the possibility that very poorly performing subjects (OIP) are forced to have a higher-than-actual AIP due to the range.

    1. eLife Assessment

      You et al. present an important study that applied a high-resolution transposon-based barcoding system to show the clonal contribution of hematopoietic stem and progenitor cells during aging, after 5-FU treatment, and upon transplantation. The results are convincing and show that there are different categories of multipotent progenitors that are either active or indolent, and that long-term fates are dominated by clones that either favor differentiation or self-renewal. This study will be of broad interest to stem-cell biologists and could reach an even wider audience with a clearer and more concise presentation and discussion of the results.

    2. Reviewer #1 (Public review):

      Summary:

      You, Zhang et al. comprehensively characterize the long-term fates of mouse HSCs in the unperturbed setting using transposon-based lineage tracing for up to 2 years post-labeling. Their analyses reveal a complex heterogeneity of long-term fates, dominated by two behaviors: i) long-lived differentiation-biased clones, and ii) self-renewal & platelet-biased clones. They further identify two categories of multipotent progenitor clones, with one group showing a markedly reduced differentiation activity.

      Strengths:

      You et al. present a very comprehensive and high-resolution characterization of mouse hematopoietic clonal dynamics, with robust replicates, and technical prowess. The manuscript is beautifully written, with in-depth and clear explanations of the logic behind experimental design choices, and very well-thought-out interpretations of results.

      Some of the results integrate well with past observations in the field, whereas many of them are quite unique and novel.

      This will surely be a highly impactful study in the field of hematopoiesis and stem cell biology.

      Weaknesses:

      The authors trace hematopoiesis in situ, in a fully unbiased way, for almost 2 years. They compare this time course with the last few years of Cre-LoxP-based tracing studies and they make an assumption that most hematopoiesis will be derived from some type of HSC at that point in time. They then use this assumption to support that what is being measured in their model are the long-term fates of HSCs (or at least cells that were HSC at the point of labeling). While this is a generally valid assumption, the short-lived nature of certain populations (myeloid cells, megakaryocytes) means that these cells are being produced in the context of a relatively aged environment by the time of sampling, which might change the properties of the system. In other words, the "steady-state" is always changing. It is important to read and interpret this manuscript with this in consideration.

    3. Reviewer #2 (Public review):

      Summary:

      The work from You et al. elucidates the clonal contribution of ageing stem and progenitor cells to both native and perturbed hematopoiesis. The authors use a previously published in vivo lineage tracing system (Patel et al., 2022) that relies on the random integration of a transposon element in the mouse genome. They barcode all mouse cells and then look at lineage relationships between HSPC and mature populations after ~90 weeks.

      Strengths:

      This work offers very interesting insights into the clonal behaviour of HSPC in the native and perturbed setting during ageing. Experiments are well-planned and well-executed. Understanding the clonal output of HSPCs in aged mice in a native setting, after 5-FU treatment, and upon transplantation are important findings for the field.

      Weaknesses:

      We found appraising the graphs, interpreting the findings, and understanding those findings in the main text very difficult to follow. While we have made some suggestions below, we encourage the authors to think carefully about what the core messages are, and how best to visualise those, both in terms of data viz and in a schematic to summarise the key findings, and to use plain language in the text.

    4. Author response:

      We genuinely appreciate the reviewers' interest and recognition of our work. The comments and suggestions on the results presentation and interpretation are well taken. We plan to revise the manuscript based on the reviewers' recommendations in the following aspects.

      (1) We fully agree with the reviewer that the aged environment indeed would affect the myeloid and megakaryocyte differentiation behaviors of HSC. As a result, the clonal behaviors of HSCs presented in the current manuscript could be different from how HSCs differentiate in young mice. This point will be discussed in the revised manuscript.

      (2) We agree with the reviewer that the manuscript was not as easy to follow as many other papers in experimental hematology, primarily because the analyses presented in the current manuscript were not frequently used in previous studies. To address this, we will try to revise the manuscript using plain language to describe the results and conclusions. We will also provide graphical summary schematics where appropriate to present the findings better. We will further discuss our results in the context of previous findings to better illustrate the novelty of the current work.

      (3) We will provide more technical details of our analysis in the revised manuscript for readers to better understand how results are obtained and data analyses are performed in the current manuscript.

    1. eLife Assessment

      This study reports important advances in understanding how pyrazinamide, a first-line antibiotic for tuberculosis treatment, is effective in vivo. The experimental design and data provide solid evidence that the production of reactive oxygen species by host cells contributes to how pyrazinamide is more potent in the host than in culture conditions; however, additional experiments and controls would strengthen these conclusions. This work is of interest to the antibiotic drug development field.

    2. Reviewer #1 (Public review):

      Summary

      Pyrazinamide (PZA) is a key drug in the anti-TB arsenal, yet despite over 50 years of clinical use, its precise mechanism of action remains unclear. This study offers valuable insights into the in vitro potentiating effect of PZA when used with exogenous oxidative agents. The authors suggest that oxidative stress, specifically thiol oxidation, may be a primary driver of PZA/POA's bactericidal activity. Although the work is substantial, conceptually innovative, and timely, the evidence supporting the authors' conclusions requires further investigation with additional controls and experiments to fully validate the proposed mechanism of action. Once revised, this work will undoubtedly be of significant interest to the TB drug discovery community and researchers focusing on mycobacterial diseases.

      Strengths

      The authors have long-standing experience in the field of PZA mode of action, with several publications that have been highly relevant to the field. They are particularly well aware of the literature, and this is clearly visible in the introduction of the manuscript which is beautifully articulated. The biological question(s) and their hypotheses are also well-formulated in the introduction section.

      Understanding the PZA mode of action is a long-standing question in the TB community; therefore, studies reporting well-conducted research that aims to decipher the mechanism underlying PZA's peculiar activity are always appreciated. Since PZA/POA are poorly active in conventional 7H9 media but very potent in cellulo or in vivo, looking at host-mediated stresses that can eventually lead to increased vulnerability is extremely relevant. In that context, most of the work has focused on host-cell endolysosomal pH, but very little information is available on other stresses. Thus, investigating the contribution of oxidative stress and ROS as specific host environments that might contribute to PZA/POA activity is overall novel and conceptually very interesting.

      To address this question, the authors combine multiple approaches including conventional antimicrobial susceptibility profiling, CFU-based counting, and checkerboard assays to report the potentiating effect of PZA pre-treatment on hydrogen peroxide- and diamide-mediated antibacterial action. The use of multiple reference strains including Mtb H37Ra, Mtb H37Rv, M. bovis BCG, and M. bovis BCG::pncA is a great asset of the manuscript, even though they might have been more appropriately used to get further mechanistic insights into the proposed mode of action. The findings are reported in 4 major figures that are clear and in an order that appears logical for the understanding of the story.
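      Checkerboard results of the kind mentioned above are commonly summarised with the fractional inhibitory concentration index (FICI). The sketch below is only an illustration of that general calculation, not the authors' analysis: the function names and the example concentrations are hypothetical, and the cut-offs (synergy at FICI ≤ 0.5, antagonism at FICI > 4) are the widely used convention rather than values taken from the manuscript.

```python
def fic_index(mic_a_alone, mic_b_alone, mic_a_combo, mic_b_combo):
    """Fractional inhibitory concentration index for a two-agent checkerboard.

    FICI = (MIC of A in combination / MIC of A alone)
         + (MIC of B in combination / MIC of B alone)
    All concentrations must be positive and in consistent units per agent.
    """
    values = (mic_a_alone, mic_b_alone, mic_a_combo, mic_b_combo)
    if any(v <= 0 for v in values):
        raise ValueError("all MIC values must be positive")
    return mic_a_combo / mic_a_alone + mic_b_combo / mic_b_alone


def interpret_fici(fici):
    """Classify a FICI value using the commonly used convention (assumed here)."""
    if fici <= 0.5:
        return "synergy"
    if fici > 4.0:
        return "antagonism"
    return "no interaction"


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not data from the study.
    fici = fic_index(mic_a_alone=100.0, mic_b_alone=8.0,
                     mic_a_combo=25.0, mic_b_combo=1.0)
    print(fici, interpret_fici(fici))
```

      Note that a FICI describes a joint effect at fixed exposure and does not by itself say which agent sensitises the bacteria to the other, which is the distinction both reviewers raise below.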

      Weaknesses

      Although the manuscript is conceptually very interesting and contains intriguing results, it sometimes fails to fully convince, and some additional controls/experiments might help to better back up the authors' claims and really strengthen the study. Indeed, some conclusions seem premature and therefore lead to molecular assumptions about a potential mode of action that are not fully supported by the presented data.

      The rationale behind some of the experiments is not always clearly explained, which makes it difficult to follow the authors' ideas, the biological hypothesis/model that they test, and therefore the overall scientific story.

      The authors conclude their study by proposing a mechanism by which the active form of the drug POA acts in concert with exogenous ROS to promote cellular oxidative damage. This is tested within two models of macrophage infection where they propose that IFN-γ mediated ROS production is essential for PZA activity. Unfortunately, the in cellulo part presents some weaknesses and inconsistencies that the authors need to carefully address.

      Finally, the in vitro experiments performed in this manuscript mainly report that PZA pre-treatment increases H2O2-mediated killing or inhibition. There is no direct evidence that clearly shows that oxidative stress drives the potent bactericidal activity of PZA. In these settings, the oxidative stress is always applied after PZA pre-treatment and is therefore likely displaying the major lethal effect.

    3. Reviewer #2 (Public review):

      Summary:

      The authors tested how ROS and PZA affected Mycobacterium survival to determine if ROS could have a role in the remarkable in vivo efficacy of PZA.

      Strengths:

      This is a well-written and clear manuscript convincingly demonstrating the synergy between PZA and reactive oxygen species in the inhibition of growth and survival of Mycobacterium tuberculosis.

      Weaknesses:

      The manuscript would benefit from a clear statement of the rationale for the protocols used to examine the synergy of PZA with ROS, the possible models their protocols could be testing, and then how their data supports or disproves the models being tested. The manuscript appears to propose, as stated in the title, that "Oxidative stress drives potent bactericidal activity of pyrazinamide...". However, their experimental design more likely tests the effect of PZA on ROS sensitivity. Indeed, by the last figure, the authors begin to present their data as PZA sensitizing the bacteria to ROS. More clarity on these possible models and the different interpretations of the data should be considered.

      Impact:

      The data provide important insight to expand our understanding of the in vivo efficacy of PZA in the treatment of tuberculosis.

    4. Author response:

      We thank the reviewers for their thoughtful and constructive assessment of our manuscript. We agree that additional clarity on some key points in the manuscript will be a valuable addition to this work. Both reviewers expressed a related concern regarding the basis for the design and interpretation of our pyrazinamide-ROS synergy experiments.

      Reviewer 1:

      The in vitro experiments performed in this manuscript mainly report that PZA pre-treatment increases H2O2-mediated killing or inhibition. There is no direct evidence that clearly shows that oxidative stress drives the potent bactericidal activity of PZA. In these settings the oxidative stress is always applied after PZA pre-treatment and is therefore likely displaying the major lethal effect.

      Reviewer 2:

      The manuscript would benefit from a clear statement of the rationale for the protocols used to examine the synergy of PZA with ROS, the possible models their protocols could be testing, and then how their data supports or disproves the models being tested. The manuscript appears to propose, as stated in the title, that "Oxidative stress drives potent bactericidal activity of pyrazinamide...". However, their experimental design more likely tests the effect of PZA on ROS sensitivity. Indeed, by the last figure, the authors begin to present their data as PZA sensitizing the bacteria to ROS. More clarity on these possible models and the different interpretations of the data should be considered.

      We agree that the data presented in the current version of the manuscript are incomplete in supporting our assertion that oxidative stress drives the bactericidal activity of pyrazinamide. As both reviewers note, pretreatment of bacilli with pyrazinamide followed by challenge with ROS indicates that pyrazinamide enhances susceptibility to oxidative stress but does not address whether oxidative stress enhances susceptibility to pyrazinamide. Further, we neglected to explain why we chose to pretreat bacilli with pyrazinamide before ROS exposure. Over the course of our work, we had found that pyrazinoic acid, the active form of pyrazinamide, showed potent synergy with hydrogen peroxide. In contrast to the time-dependent synergy that we observed between pyrazinamide and peroxide, synergy between pyrazinoic acid and peroxide did not require pretreatment. We will revise our manuscript to include results that address these key issues and we will carefully consider revising our interpretations accordingly.

    1. eLife Assessment

      This study demonstrated that the conditional knockout of afadin disrupts retinal laminar organization and reduces the number of photoreceptors while preserving some of the structure and light responsiveness of retinal ganglion cells. These findings are solid and useful for understanding afadin's role in retinal cell generation, lamination, and functional organization. However, the study provides limited new insights into the relationship between retinal lamination defects and overall retinal function.

    2. Reviewer #1 (Public review):

      Summary:

      The question of how central nervous system (CNS) lamination defects affect functional integrity is an interesting topic, though it remains a subject of debate. The authors focused on the retina, which is a relatively simple yet well-laminated tissue, to investigate the impact of afadin, a key component of adherens junctions, on retinal structure and function. Their findings show that the loss of afadin leads to significant disruptions in outer retinal lamination, affecting the morphology and localization of photoreceptors and their synapses, as illustrated by high-quality images. Despite these severe changes, the study found that some functions of the retinal circuits, such as the ability to process light stimuli, could still be partially preserved. This research offers new insights into the relationship between retinal lamination and neural circuit function, suggesting that altered retinal morphology does not completely eliminate the capacity for visual information processing.

      Strengths:

      The retina serves as an excellent model for investigating lamination defects and functional integrity due to its relatively simple yet well-organized structure, along with the ease of analyzing visual function. The images depicting outer retinal lamination, as well as the morphology and localization of photoreceptors and their synapses, are clear and well-described. The paper is logically organized, progressing from structural defects to functional analysis. Additionally, the manuscript includes a comprehensive discussion of the findings and their implications.

      Weaknesses:

      While this work presents a wealth of descriptive data, it lacks quantification, which would help readers fully understand the findings and compare results with those from other studies. Furthermore, the molecular mechanisms underlying the defects caused by afadin deletion were not explored, leaving the role of afadin and its intracellular signaling pathways in retinal cells unclear. Finally, the study relied solely on electrophysiological recordings to demonstrate RGC function, which may not be robust enough to support the conclusions. Incorporating additional experiments, such as visual behavior tests, would strengthen the overall conclusions.

    3. Reviewer #2 (Public review):

      Summary:

      Ueno et al. described substantial changes in the afadin knockout retina. These changes include decreased numbers of rods and cones, an increased number of bipolar cells, and disrupted somatic and synaptic organization of the outer limiting membrane, outer nuclear layer, and outer plexiform layer. In contrast, the number and organization of amacrine cells and retinal ganglion cells remain relatively intact. They also observed changes in ERG responses and RGC receptive fields and functions using MEA recordings.

      Strengths:

      The morphological characterization of retinal cell types and laminations is detailed and relatively comprehensive.

      Weaknesses:

      (1) The major weakness of this study, perhaps, is that its findings are predominantly descriptive and lack any mechanistic explanation. As afadin is a key component of adherens junctions, its role in mediating retinal lamination has been reported previously (see PMCID: PMC6284407). Thus, a more detailed dissection of afadin's role in processes such as progenitor generation, cell migration, or the formation of retinal lamination would provide greater insight into the defects caused by knocking out afadin.

      (2) The authors observed striking changes in the numbers of rods, cones, and BCs, but not in ACs or RGCs. The causes of these distinct changes in specific cell classes remain unclear. Detailed characterizations, such as the expression of afadin in the early developing retina, tracing cell numbers across various early developmental time points, and staining of apoptotic markers in developing retinal cells, could help to distinguish between defects in cell generation and survival, providing a better understanding of the underlying causes of these phenotypes.

      (3) Although the total number of ACs or RGCs remains unchanged, their localizations are somewhat altered (Figures 2E and 4E). Again, the cause of the altered somatic localization in ACs and RGCs is unclear.

      (4) One conclusion that the authors emphasise is that the function of RGCs remains detectable despite a severely disrupted outer plexiform layer. However, the organization of the inner plexiform layer remains largely intact, and the axonal innervation of BCs remains unchanged. This could explain the functional integrity of RGCs. In addition, the resolution of detecting RGCs by MEA is low, as they only detected 5 clusters in heterozygous animals. This represents an incomplete clustering of RGC functional types and does not provide a full picture of how functional RGC types are altered in the afadin knockout.

      Minor Comments:

      (1) Line 56-67: "Overall, these findings provide the first evidence that retinal circuit function can be partially preserved even when there are significant disruptions in retinal lamination and photoreceptor synapses." There is existing evidence showing substantial adaptation in retinal function when retinal lamination or photoreceptor synapses are disrupted, such as PMCID: PMC10133175.

      (2) Line 114-115: "we focused on afadin, which is a scaffolding protein for nectin and has no ortholog in mice." The term "Ortholog" is misused here, as the mouse has an afadin gene. Should the intended meaning be that afadin has no other isoforms in mouse?

    4. Author response:

      Reviewer #1 (Public review):

      Summary:

      The question of how central nervous system (CNS) lamination defects affect functional integrity is an interesting topic, though it remains a subject of debate. The authors focused on the retina, which is a relatively simple yet well-laminated tissue, to investigate the impact of afadin, a key component of adherens junctions, on retinal structure and function. Their findings show that the loss of afadin leads to significant disruptions in outer retinal lamination, affecting the morphology and localization of photoreceptors and their synapses, as illustrated by high-quality images. Despite these severe changes, the study found that some functions of the retinal circuits, such as the ability to process light stimuli, could still be partially preserved. This research offers new insights into the relationship between retinal lamination and neural circuit function, suggesting that altered retinal morphology does not completely eliminate the capacity for visual information processing.

      Strengths:

      The retina serves as an excellent model for investigating lamination defects and functional integrity due to its relatively simple yet well-organized structure, along with the ease of analyzing visual function. The images depicting outer retinal lamination, as well as the morphology and localization of photoreceptors and their synapses, are clear and well-described. The paper is logically organized, progressing from structural defects to functional analysis. Additionally, the manuscript includes a comprehensive discussion of the findings and their implications.

      Weaknesses:

      While this work presents a wealth of descriptive data, it lacks quantification, which would help readers fully understand the findings and compare results with those from other studies. Furthermore, the molecular mechanisms underlying the defects caused by afadin deletion were not explored, leaving the role of afadin and its intracellular signaling pathways in retinal cells unclear. Finally, the study relied solely on electrophysiological recordings to demonstrate RGC function, which may not be robust enough to support the conclusions. Incorporating additional experiments, such as visual behavior tests, would strengthen the overall conclusions.

      Thank you very much for taking the time to provide thoughtful and valuable comments. Following your suggestions, we will quantify some of the histological data and explore the mechanisms underlying the defects in lamination and cell fate determination observed in the afadin cKO retina. We will also try to assess the vision of afadin cKO mice using visual behavior tests.

      Reviewer #2 (Public review):

      Summary:

      Ueno et al. described substantial changes in the afadin knockout retina. These changes include decreased numbers of rods and cones, an increased number of bipolar cells, and disrupted somatic and synaptic organization of the outer limiting membrane, outer nuclear layer, and outer plexiform layer. In contrast, the number and organization of amacrine cells and retinal ganglion cells remain relatively intact. They also observed changes in ERG responses and RGC receptive fields and functions using MEA recordings.

      Strengths:

      The morphological characterization of retinal cell types and laminations is detailed and relatively comprehensive.

      Weaknesses:

      (1) The major weakness of this study, perhaps, is that its findings are predominantly descriptive and lack any mechanistic explanation. As afadin is a key component of adherens junctions, its role in mediating retinal lamination has been reported previously (see PMCID: PMC6284407). Thus, a more detailed dissection of afadin's role in processes such as progenitor generation, cell migration, or the formation of retinal lamination would provide greater insight into the defects caused by knocking out afadin.

      Thank you for taking the time to provide valuable comments. Following your suggestions, we will perform experiments to evaluate the mechanisms behind the retinal lamination and cell fate determination defects observed in the afadin cKO retina. However, we would like to note that the paper cited in the comment (PMCID: PMC6284407) analyzed the function of afadin in the formation of dendrites of direction-selective RGCs in the IPL, and that the word "lamination" there refers to the layering of RGC dendrites in the IPL. Here, we analyzed the function of afadin in the laminar construction of the retina.

      (2) The authors observed striking changes in the numbers of rods, cones, and BCs, but not in ACs or RGCs. The causes of these distinct changes in specific cell classes remain unclear. Detailed characterizations, such as the expression of afadin in the early developing retina, tracing cell numbers across various early developmental time points, and staining of apoptotic markers in developing retinal cells, could help to distinguish between defects in cell generation and survival, providing a better understanding of the underlying causes of these phenotypes.

      Following your suggestion, we will perform the experiments to characterize the causes of distinct changes in the afadin cKO retina.

      (3) Although the total number of ACs or RGCs remains unchanged, their localizations are somewhat altered (Figures 2E and 4E). Again, the cause of the altered somatic localization in ACs and RGCs is unclear.

      To address the reviewer’s point, we will analyze the positions of progenitors and of these cells during the developmental stages of the afadin cKO retina.

      (4) One conclusion that the authors emphasise is that the function of RGCs remains detectable despite a severely disrupted outer plexiform layer. However, the organization of the inner plexiform layer remains largely intact, and the axonal innervation of BCs remains unchanged. This could explain the functional integrity of RGCs. In addition, the resolution of detecting RGCs by MEA is low, as they only detected 5 clusters in heterozygous animals. This represents an incomplete clustering of RGC functional types and does not provide a full picture of how functional RGC types are altered in the afadin knockout.

      We appreciate the reviewer’s insightful comments. Although our clustering of RGC subtypes in afadin cHet retinas resulted in only five clusters, the key finding of our study is the preservation of RGC receptive fields in afadin cKO retinas, despite severe photoreceptor loss (reduced to about one-third of normal) and disruption of photoreceptor-bipolar cell synapses in the OPL. This suggests that even with severe damage to the OPL, the primary photoreceptor-bipolar-RGC pathway can still function as long as the IPL remains intact. Moreover, the presence of rod-driven responses in RGCs indicates that the AII amacrine cell-mediated rod pathway may also continue to function. We agree that our functional clustering in afadin cHet retinas was incomplete. However, we suspect that the absence of RGCs with fast temporal responses in afadin cKO retinas may not simply be due to the loss of specific RGC subtypes but rather to disrupted synaptic connections between photoreceptors and fast-responding bipolar cells. Furthermore, the structural abnormalities in retinal lamination in afadin cKO retinas may alter RGC response properties, making strict functional classification less meaningful. We would like to emphasize the finding that disruption of retinal lamination in afadin cKO retinas leads to the absence of RGCs with fast temporal response properties, rather than focusing solely on the classification of RGC subtypes.

      Minor Comments:

      (1) Line 56-67: "Overall, these findings provide the first evidence that retinal circuit function can be partially preserved even when there are significant disruptions in retinal lamination and photoreceptor synapses." There is existing evidence showing substantial adaptation in retinal function when retinal lamination or photoreceptor synapses are disrupted, such as PMCID: PMC10133175.

      Thank you for your comment. The paper you mentioned is important for interpreting the results of our study. We will cite it and discuss it in the Discussion.

      (2) Line 114-115: "we focused on afadin, which is a scaffolding protein for nectin and has no ortholog in mice." The term "Ortholog" is misused here, as the mouse has an afadin gene. Should the intended meaning be that afadin has no other isoforms in mouse?

      Thank you for pointing this out. We wrote "ortholog" where we meant "paralog", and we will revise the sentence accordingly.

    1. eLife Assessment

      This useful study integrates experimental methods from materials science with psychophysical methods to investigate how frictional instabilities influence tactile surface discrimination. The authors argue that force fluctuations arising from transitions between frictional sliding conditions facilitate the discrimination of surfaces with similar friction coefficients. However, the reliance on friction data obtained from an artificial finger, together with the ambiguous correlative analyses relating these measurements to human psychophysics, renders the findings incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      In this paper, Derkaloustian et al. look at the important topic of what affects fine touch perception. The observations that there may be some level of correlation with instabilities are intriguing. They attempted to characterize different materials by counting the frequency (number of occurrences, not vibration frequency) of instabilities at various speeds and forces of a PDMS slab pulled lengthwise over the material. They then had humans make the same vertical motion to discriminate between these samples. They correlated the % correct in discrimination with differences in the frequency of steady sliding over the design space as well as with other traditional parameters such as friction coefficient and roughness. The authors pose an interesting hypothesis and make a notable observation about the occurrence of instability regimes in different materials in contact with PDMS, which is valuable for the community to see in print. It should be noted, however, that the finger is complex, and many factors may be quite oversimplified with the use of the PDMS finger; the consideration and discounting of other parameters are not fully discussed in the main text or SI. Most importantly, however, the conclusions as stated do not align with the primary summary of the data in Figure 2.

      Strengths:

      The strength of this paper is in its intriguing hypothesis and important observation that instabilities may contribute to what humans are detecting as differences in these apparently similar samples.

      Weaknesses:

      The most important weakness is that the findings do not support the statements of findings made in the abstract. Of specific note in this regard is the primary correlation in Figure 2B between SS (steady sliding) and percent correct discrimination. While the statistical test shows significance (and is interesting!), the R-squared value is 0.38, whereas the "Friction Coefficient vs. Percent Correct" plot has an R-squared of 0.6 and a p-value of < 0.01 (including Figure 2B). This suggests that the results do not support the claim in the abstract: "We found that participant accuracy in tactile discrimination was most strongly correlated with formations of steady sliding, and response times were negatively correlated with stiction spikes. Conversely, traditional metrics like surface roughness or average friction coefficient did not predict tactile discriminability." This is the most fundamental weakness of this paper.
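      The comparison the reviewer is making, ranking candidate predictors of percent correct by how much variance each explains, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the numbers are synthetic and are not the study's data, and a simple Pearson correlation is used because the review does not specify the exact regression behind Figure 2.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def rank_predictors(predictors, outcome):
    """Return (name, R^2) pairs sorted from most to least variance explained."""
    scores = [(name, pearson_r(values, outcome) ** 2)
              for name, values in predictors.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Synthetic values for illustration only; not data from the manuscript.
    percent_correct = [50, 60, 70, 80, 90]
    candidates = {
        "predictor_a": [1, 2, 3, 4, 5],
        "predictor_b": [2, 1, 4, 3, 5],
    }
    for name, r2 in rank_predictors(candidates, percent_correct):
        print(f"{name}: R^2 = {r2:.2f}")
```

      A claim that one metric is "most strongly correlated" with accuracy is only supported if that metric comes out on top of such a ranking across all candidate predictors.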

      Along the same lines, other parameters that were considered such as the "Percent Correct vs. Difference in Sp" and "Percent Correct vs. Difference in SFW" were not plotted for consideration in the SI. It would be helpful to compare these results with the other three metrics in order to fully understand the relationships. Other parameters such as stiction magnitude and differences in friction coefficient over the test space could also be important and interesting.

      Beyond this fundamental concern, there is a weakness in how well the PDMS finger, the vertical motion, and the sliding speed represent real human exploration. The real finger has multiple layers with different moduli. In fact, the stratum corneum cells, which are the outer layer at the interface and determine the friction, have a much higher modulus than PDMS. In addition, the slanted position of the finger can cause non-uniform pressures across the finger. Both can contribute to making the PDMS finger have much more stick-slip than a real finger. In fact, if you look at the regime maps, there is very little space that has steady sliding. This does not represent human exploration of surfaces well. We do not tend to use a force and velocity that will cause extensive stick-slip (frequent regions of 100% stick-slip) and, in fact, the speeds used in the study are on the slow side, which also contributes to more stick-slip. At higher speeds and lower forces, all of the materials had steady sliding regions. Further, on these very smooth surfaces, the friction and stiction are more complex, and one cannot dismiss considerations such as changes in finger material properties with sweat pore occlusion and sweat capillary forces. Also, the vertical motion of both the PDMS finger and the instructed human subjects is not the motion that humans typically use to discriminate between surfaces. Finally, fingerprints may not affect the shape and size of the contact area, but they certainly do affect the dynamic response and detection of vibrations.

      This all leads to the critical question, why are friction, normal force, and velocity not measured during the measured human exploration and in a systematic study using the real human finger? The authors posed an extremely interesting hypothesis that humans may alter their speed to feel the instability transition regions. This is something that could be measured with a real finger but is not likely to be correlated accurately enough to match regime boundaries with such a simplified artificial finger.

    3. Reviewer #2 (Public review):

      Summary:

      In this paper, the authors want to test the hypothesis that frictional instabilities rather than friction are the main drivers for discriminating flat surfaces of different sub-nanometric roughness profiles.

      They first produced flat surfaces with 6 different coatings giving them unique and various properties in terms of roughness (picometer scale), contact angles (from hydrophilic to hydrophobic), friction coefficient (as measured against a mock finger), and Hurst exponent.

      Then, they used those surfaces in two different experiments. In the first experiment, they used a mock finger (PDMS of 100 kPa molded into a fingertip shape) and slid it over the surfaces at different normal forces and speeds. They categorized the sliding behavior as steady sliding, sticking spikes, and slow frictional waves by visual inspection, and show that the surfaces have different behaviors depending on normal force and speed. In a second experiment, participants (10) were asked to discriminate pairs of those surfaces. It is found that each of those pairs could be reliably discriminated by most participants.

      Finally, the participants' discrimination performance is correlated with differences in the physical attributes observed against the mock finger. The authors found a positive correlation between participants' performances and differences in the count of steady sliding against the mock finger and a negative correlation between participants' reaction time and differences in the count of stiction spikes against the mock finger. They interpret those correlations as evidence that participants use those differences to discriminate the surfaces.

      Strengths:

      The created surfaces are very interesting as they are flat at the nanometer scale, yet have different physical attributes and can be reliably discriminated.

      Weaknesses:

      In my opinion, the data presented in the paper do not support the conclusions. The conclusions are based on a correlation between results obtained on the mock finger and results obtained with human participants but there is no evidence that the human participants' fingertips will behave similarly to the mock finger during the experiment. Figure 3 gives a hint that the 3 sliding behaviors can be observed in a real finger, but does not prove that the human finger will behave as the mock finger, i.e., there is no evidence that the phase maps in Figure 1C are similar for human fingers and across different people that can have very different stiffness and moisture levels.

      I believe that the authors collected the contact forces during the psychophysics experiments, so this shortcoming could be addressed if the authors used the actual data and showed that the participant responses can be better predicted by the occurrence of frictional instabilities than by the usual metrics on a trial-by-trial basis, or at least on a subject-by-subject basis. That is, poor performers should show fewer signs of differences in the sliding behaviors than good performers.

      The sample size (10) is very small.

    4. Reviewer #3 (Public review):

      Strengths:

      The paper describes a new perspective on friction perception, with the hypothesis that humans are sensitive to the instabilities of the surface rather than the coefficient of friction. The paper is very well written and includes a comprehensive literature survey.

      One of the central tools used by the authors to characterize the frictional behavior is the frictional instability map. With these maps, it becomes clear that two different surfaces can show both similar and different behavior depending on the normal force and the speed of exploration. This highlights that friction is a complicated phenomenon, especially for soft materials.

      The psychophysics study is centered around an odd-one-out protocol, which has the advantage of avoiding any external reference to what friction or texture would mean, for example. The comparisons are made only based on whether the texture is similar or not.

      The results show a significant relationship between the distance between frictional maps and the success rate in discriminating two kinds of surface.

      Weaknesses:

      The main weakness of the paper comes from the fact that the frictional maps and the extensive psychophysics study are not made at the same time, nor with the same finger. The frictional maps are produced with an artificial finger made out of PDMS which is a poor substitute for the complex tribological properties of skin.

      The evidence would have been much stronger if the measurement of the interaction was done during the psychophysical experiment. In addition, because of the protocol, the correlation is based on aggregates rather than on individual interactions.

      The authors compensate with a third experiment in which they used a 2AFC protocol and an online force measurement. However, the results of this third study fail to convincingly establish the relationship.

      No map of the real finger interaction is shown, casting doubt on the validity of the frictional map for something as variable as human fingers.

    1. eLife Assessment

      This valuable study offers insights into the role of Leiomodin-1 (LMOD1) in muscle stem cell biology, advancing our understanding of myogenic differentiation and indicating LMOD1 as a regulator of muscle regeneration, aging, and exercise adaptation. The integration of in vitro and in vivo approaches, complemented by proteomic and imaging methodologies, is solid. However, certain aspects require further attention to improve the clarity, impact, and overall significance of the work, particularly in substantiating the in vivo relevance. This work will provide a starting point that will be of value to medical biologists and biochemists working on LMOD and its variants in muscle biology.

    2. Reviewer #1 (Public review):

      This manuscript by Ori and colleagues investigates the role of Lmod1 in muscle stem cell activation and differentiation. The study begins with a time-course mass spectrometry analysis of primary muscle stem cells, identifying Lmod1 as a pro-myogenic candidate (Figure 1). While the initial approach is robust, the subsequent characterization lacks depth and clarity. Although the data suggest that Lmod1 promotes myogenesis, the underlying mechanisms remain vague, and key experiments are missing. Please find my comments below.

      (1) The authors mainly rely on coarse and less-established readouts such as myotube length and spherical Myh-positive cells. More comprehensive and standard analyses, such as co-staining for Pax7, MyoD, and Myogenin, would allow quantification of quiescent, activated, and differentiating stem cells in knockdown and overexpression experiments. The exact stage at which Lmod1 functions (stem cell, progenitor, or post-fusion) is unclear due to the limited depth of the analysis. Performing similar experiments on cultured single EDL fibers would add valuable insights.

      (2) In supplementary Figure 2E, the distinction between Hoechst-positive cells and total cell counts is unclear. The authors should clarify why Hoechst-positive cells increase and relabel "reserve cells," as the term is confusing without reading the legend.

      (3) The specificity of Lmod1 and Sirt1 immunostaining needs validation using siRNA-treated samples, especially as these data form the basis of the mechanistic conclusions.

      (4) The authors must test the effect of Lmod1 siRNA on Sirt1 localization, as only overexpression experiments are shown.

      (5) In Figure S3, the biotin signal in LMOD2 samples appears weak. The authors need to address whether comparing LMOD1 and LMOD2 is valid given the apparent difference in reaction efficiency. It would also help to highlight where Sirt1 falls on the volcano plot in S3B.

      (6) The immunostaining data suggest that Lmod1 remains cytoplasmic throughout differentiation, whereas Sirt1 shows transient cytoplasmic localization at day 1 of differentiation. The authors should explain why Sirt1 is not constantly sequestered if Lmod1's cytoplasmic localization is consistent. It is also unclear whether day 1 is the key time point for Lmod1 function, as its precise role during myogenesis remains ambiguous.

      (7) The introduction does not sufficiently establish the motivation or knowledge gap this work aims to address. Instead, it reads like a narration of disparate topics in a single paragraph. The authors should clarify the statement in line 150, "since this protein has been...,".

      Overall, while the identification of Lmod1 as a pro-myogenic factor is convincing, the mechanistic insights are insufficient, and the manuscript would benefit from addressing these concerns.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors identify Leiomodin-1 (LMOD1) as a key regulator of early myogenic differentiation, demonstrating its interaction with SIRT1 to influence SIRT1's cellular localization and gene expression. The authors propose that LMOD1 translocates SIRT1 from the nucleus to the cytoplasm to permit the expression of myogenic differentiating genes such as MYOD or Myogenin.

      Strengths:

      A major strength of this work lies in the robust temporal resolution achieved through a time-course mass spectrometry analysis of in vitro muscle differentiation. This provides novel insights into the dynamic process of myogenic differentiation, which is often under-explored in terms of temporal progression. The authors provide a strong mechanistic case for how LMOD1 exerts its role in muscle differentiation, which opens avenues for modulating this process.

      Weaknesses:

      One limitation of the study is the in vivo data. Although the authors do translate their findings in vivo for LMOD1 localization and expression, the cross-sectional imaging is not highly convincing. Longitudinal cuts or isolated fibers could have been more useful specimens to answer these questions. Moreover, the authors do not assess their in vitro SIRT1 findings in vivo. A few key experiments in regenerating or aged mice would strengthen the mechanistic insight of the findings.

      Discussion:

      Overall, the study emphasizes the importance of understanding the temporal dynamics of molecular players during myogenic differentiation and provides valuable proteomic data that will benefit the field. Future studies should explore whether LMOD1 modulates the nuclear-cytoplasmic shuttling of other transcription factors during muscle development and how these processes are mechanistically achieved. Investigating whether LMOD1 can be therapeutically targeted to enhance muscle regeneration in contexts such as exercise, aging, and disease will be critical for translational applications. Additionally, elucidating the interplay among LMOD1, LMOD2, and LMOD3 could uncover broader implications for actin cytoskeletal regulation in muscle biology.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the investigators identified LMOD1 as one of a subset of cytoskeletal proteins whose levels increase in the early stages of myogenic differentiation. Lmod1 is understudied in striated muscle and in particular in myogenic differentiation. Thus, this is an important study. It is also a very thorough study, with perhaps even too much data presented. Importantly, the investigators observed that LMOD1 appears to be important for skeletal muscle regeneration and myogenic differentiation, and that it interacts with SIRT1. Both primary myoblast differentiation and skeletal muscle regeneration were studied. Rescue experiments confirmed these observations: SIRT1 can rescue perturbations of myogenic differentiation as a result of LMOD1 knockdown.

      Strengths:

      Particular strengths include: important topic, the use of primary skeletal cultures, the use of both cell culture and in vivo approaches, careful biomarker analysis of primary mouse myoblast differentiation, the use of two methods to probe the function of the Lmod1/SIRT1 pathway via using depletion approaches and inhibitors, and generation of six independent myoblast cultures. Results support their conclusions.

      Weaknesses:

      (1) Figure 1. Images of cells in Figure 1A are too small to be meaningful (especially in comparison to the other data presented in this figure). Perhaps the authors could make graphs smaller?

      (2) Line 148 "We found LMOD2 to be the most abundant Lmod in whole skeletal muscle." This is confusing since most if not all prior studies have shown that Lmod3 is the predominant isoform in skeletal muscle. The two papers that are cited are incorrectly cited. Clarification to resolve this discrepancy is needed.

      (3) Figure 2. Immunofluorescence (IF) panels are too small to be meaningful. Perhaps the graphs could be made smaller and more space allocated for the IF panels? This issue is apparent for just about all IF panels: they are simply too small to be meaningful. Additionally, in many of the immunofluorescence figures, the colors that were used make it difficult to discern the stained cellular structures. For example, in Figure S1, orange and purple are used, and they do not stand out as well as other colors that are more commonly used.

      (4) There is huge variability in many experiments presented; as such, more samples appear to be required to allow for meaningful data to be obtained. For example, in Figure S2, many experimental groups have only 3 samples. This is highly problematic; I would estimate that 5-6 would be the minimum.

      (5) Ponceau S staining is often used as a loading control in this manuscript for western blots. The area/molecular weight range actually used should be specified. Not clear why in some experiments GAPDH staining is used, in other experiments Ponceau S staining is used, and in some, both are used. In some experiments, the variability of total protein loaded from lane to lane is disconcerting. For example, in Figure S4C there appears to be more than normal variability. Can the protein assay be redone and samples run again?

      (6) Figure S3 - Lmod3 is included in the figure but no mention of it occurs in the title of the figure and/or legend.

      (7) Abstract, line 25. "overexpression accelerates and improves the formation of myotubes". This is a confusing sentence. How is it improving the formation? A little more information about how they are different than developing myotubes in normal/healthy muscles would be helpful.

      (8) It is impossible from the IF figures presented to determine where Lmod1 localizes in the myocytes. Information on its subcellular localization is important. Does it localize with Lmod2 and Lmod3 at thin filament pointed ends?

    1. eLife Assessment

      ProtSSN is a valuable approach that generates protein embeddings by integrating sequence and structural information, demonstrating improved prediction of mutation effects on thermostability compared to competing models. The evidence supporting the authors' claims is compelling, with well-executed comparisons. This work will be of particular interest to researchers in bioinformatics and structural biology, especially those focused on protein function and stability.

    2. Reviewer #1 (Public review):

      Summary:

      The authors introduce a denoising-style model that incorporates both structure and primary-sequence embeddings to generate richer embeddings of peptides. My understanding is that the authors use ESM for the primary sequence embeddings, take resolved structures (or use structural predictions from AlphaFold when they're not available), then develop an architecture to combine these two with a loss that seems reminiscent of diffusion models or masked language model approaches. The embeddings can be viewed as ensemble-style embedding of the two levels of sequence information, or with AlphaFold, an ensemble of two methods (ESM+AlphaFold). The authors also gather external datasets to evaluate their approach and compare it to previous approaches. The approach seems promising and appears to out-compete previous methods at several tasks. Nonetheless, I have strong concerns about a lack of verbosity as well as exclusion of relevant methods and references.

      Advances:

      I appreciate the breadth of the analysis and comparisons to other methods. The authors separate tasks, models, and sizes of models in an intuitive, easy-to-read fashion that I find valuable for selecting a method for embedding peptides. Moreover, the authors gather two datasets for evaluating embeddings' utility for predicting thermostability. Overall, the work should be helpful for the field as more groups choose methods/pretraining strategies amenable to their goals, and can do so in an evidence-guided manner.

      Considerations:

      Primarily, a majority of the results and conclusions (e.g., Table 3) are reached using data and methods from ProteinGym, yet the best-performing methods on ProteinGym are excluded from the paper (e.g., EVE-based models and GEMME). In the ProteinGym database, these methods outperform ProtSSN models. Moreover, these models were published over a year (or even 4 years in the case of GEMME) before ProtSSN, and I do not see justification for their exclusion in the text.

      Secondly, related to comparison of other models, there is no section in the methods about how other models were used, or how their scores were computed. When comparing these models, I think it's crucial that there are explicit derivations or explanations for the exact task used for scoring each method. In other words, if the pre-training is indeed the important advance of the paper, the paper needs to show this more explicitly by explaining exactly which components of the model (and previous models) are used for evaluation. Are the authors extracting the final hidden layer representations of the model, treating these as features, then using these features in a regression task to predict fitness/thermostability/DDG etc.? How are the model embeddings of other methods being used, since, for example, many of these methods output a k-dimensional embedding of a given sequence, rather than one single score that can be correlated with some fitness/functional metric. Summarily, I think the text is lacking an explicit mention of how these embeddings are being summarized or used, as well as how this compares to the model presented.

      I think the above issues can mainly be addressed by considering and incorporating points from Li et al. 2024[1] and potentially Tang & Koo 2024[2]. Li et al.[1] make extremely explicit the use of pretraining for downstream prediction tasks. Moreover, they benchmark pretraining strategies explicitly on thermostability (one of the main considerations in the submitted manuscript), yet there is no mention of this work nor the dataset used (FLIP (Dallago et al., 2021)) in this current work. I think a reference and discussion of [1] is critical, and I would also like to see comparisons in line with [1], as [1] is very clear about what features from pretraining are used, and how. If the comparisons with previous methods were done in this fashion, this level of detail needs to be included in the text.

      To conclude, I think the manuscript would benefit substantially from a more thorough comparison of previous methods. Maybe one way of doing this is following [1] or [2], and using the final embeddings of each method for a variety of regression tasks, to really make clear where these methods are performing relative to one another. I think a more thorough methods section detailing how previous methods did their scoring is also important. Lastly, TranceptEVE (or a model comparable to it) and GEMME should also be mentioned in these results, or at the bare minimum, be given justification for their absence.

      [1] Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models, Francesca-Zhoufan Li, Ava P. Amini, Yisong Yue, Kevin K. Yang, Alex X. Lu bioRxiv 2024.02.05.578959; doi: https://doi.org/10.1101/2024.02.05.578959
      [2] Evaluating the representational power of pre-trained DNA language models for regulatory genomics, Ziqi Tang, Peter K Koo bioRxiv 2024.02.29.582810; doi: https://doi.org/10.1101/2024.02.29.582810

      Comments on revisions:

      My concerns have been addressed. What seems to remain are some semantic disagreements, and I'm not sure that these will be answered here. Do MSAs and other embedding methods lead to some notable type of data leakage? Does this leakage qualify as "x-shot" learning under current definitions?

    3. Reviewer #2 (Public review):

      Summary:

      To design proteins and predict disease, we want to predict the effects of mutations on the function of a protein. To make these predictions, biologists have long turned to statistical models that learn patterns that are conserved across evolution. There is potential, however, to improve our predictions by incorporating structure. In this paper, the authors build a denoising auto-encoder model that incorporates sequence and structure to predict mutation effects. The model is trained to predict the sequence of a protein given its perturbed sequence and structure. The authors demonstrate that this model is able to predict the effects of mutations better than sequence-only models.

      As well, the authors curate a set of assays measuring the effect of mutations on thermostability. They demonstrate their model also predicts the effects of these mutations better than previous models and make this benchmark available for the community.

      Strengths:

      The authors describe a method that makes accurate mutation effect predictions by informing its predictions with structure.

      The authors curate a new dataset of assays measuring thermostability. These can be used to validate and interpret mutation effect prediction methods in the future.

      Weaknesses:

      In the review period, the authors included a previous method, SaProt, that similarly uses protein structure to predict the effects of mutations, in their evaluations. They see that SaProt performs similarly to their method.

      ProteinGym is largely made of deep mutational scans, which measure the effect of every mutation on a protein. These new benchmarks contain on average measurements of less than a percent of all possible point mutations of their respective proteins. It is unclear what sorts of protein regions these mutations are more likely to lie in; therefore it is challenging to make conclusions about what a model has necessarily learned based on its score on this benchmark. For example, several assays in this new benchmark seem to be similar to each other, such as four assays on ubiquitin performed in pH 2.25 to pH 3.0.
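      As a rough illustration of the coverage argument above, the sketch below counts the single amino acid substitutions possible for a protein and the fraction that an assay covers. It assumes that "point mutations" means single amino acid substitutions (19 alternatives per position). The function names and the count of 14 measured variants are hypothetical and are not taken from the manuscript; ubiquitin's length of 76 residues is used only as an example.

```python
def possible_single_substitutions(protein_length: int) -> int:
    """Count single amino acid substitutions: 19 alternatives at each position."""
    return 19 * protein_length


def coverage(n_measured: int, protein_length: int) -> float:
    """Fraction of all single substitutions that an assay actually measured."""
    return n_measured / possible_single_substitutions(protein_length)


if __name__ == "__main__":
    ubiquitin_length = 76  # residues
    n_measured = 14        # hypothetical number of assayed variants
    print(possible_single_substitutions(ubiquitin_length))  # 1444
    print(f"coverage: {coverage(n_measured, ubiquitin_length):.2%}")
```

      At this level of coverage, where the measured variants fall in the protein matters more than the aggregate score, which is the concern raised above.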

      Comments on revisions:

      I think the rounds of review have improved the paper and I've raised my score.

    1. eLife Assessment

      This study presents a valuable theoretical exploration of the electrophysiological mechanisms of ionic currents via gap junctions in hippocampal CA1 pyramidal-cell models, and their potential contribution to local field potentials (LFPs) that is different from the contribution of chemical synapses. The biophysical argument regarding electric dipoles appears solid, but the evidence would be stronger if the predictions were tested against experiments. A shortage of model validation and strictly comparable parameters used in the comparisons between chemical vs. junctional inputs makes the modeling approach incomplete; once strengthened, the finding can be of broad interest to electrophysiologists, who often make recordings from regions of neurons interconnected with gap junctions.

    2. Reviewer #1 (Public review):

      This manuscript makes a significant contribution to the field by exploring the dichotomy between chemical synaptic and gap junctional contributions to extracellular potentials. While the study is comprehensive in its computational approach, adding experimental validation, network-level simulations, and expanded discussion on implications would elevate its impact further.

      Strengths:

      Novelty and Scope:
      The manuscript provides a detailed investigation into the contrasting extracellular field potential (EFP) signatures arising from chemical synapses and gap junctions, an underexplored area in neuroscience.
      It highlights the critical role of active dendritic processes in shaping EFPs, pushing forward our understanding of how electrical and chemical synapses contribute differently to extracellular signals.

      Methodological Rigor:
      The use of morphologically and biophysically realistic computational models for CA1 pyramidal neurons ensures that the findings are grounded in physiological relevance.
      Systematic analysis of various factors, including the presence of sodium, leak, and HCN channels, offers a clear dissection of how transmembrane currents shape EFPs.

      Biological Relevance:
      The findings emphasize the importance of incorporating gap junctional inputs in analyses of extracellular signals, which have traditionally focused on chemical synapses.
      The observed polarity differences and spectral characteristics provide novel insights into how neural computations may differ based on the mode of synaptic input.

      Clarity and Depth:
      The manuscript is well-structured, with a logical progression from synchronous input analyses to asynchronous and rhythmic inputs, ensuring comprehensive coverage of the topic.

      Weaknesses and Areas for Improvement:

      Generality and Validation:
      The study focuses exclusively on CA1 pyramidal neurons. Expanding the analysis to other cell types, such as interneurons or glial cells, would enhance the generalizability of the findings.
      Experimental validation of the computational predictions is entirely absent. Empirical data correlating the modeled EFPs with actual recordings would strengthen the claims.

      Role of Active Dendritic Currents:
      The paper emphasizes active dendritic currents, particularly the role of HCN channels in generating outward currents under certain conditions. However, further discussion of how this mechanism integrates into broader network dynamics is warranted.

      Analysis of Plasticity:
      While the manuscript mentions plasticity in the discussion, there are no simulations that account for activity-dependent changes in synaptic or gap junctional properties. Including such analyses could significantly enhance the relevance of the findings.

      Frequency-Dependent Effects:
      The study demonstrates that gap junctional inputs suppress high-frequency EFP power due to membrane filtering. However, it could delve deeper into the implications of this for different brain rhythms, such as gamma or ripple oscillations.

      Visualization:
      Figures are dense and could benefit from more intuitive labeling and focused presentations. For example, isolating key differences between chemical and gap junctional inputs in distinct panels would improve clarity.

      Contextual Relevance:
      The manuscript touches on how these findings relate to known physiological roles of gap junctions (e.g., in gamma rhythms) but does not explore this in depth. Stronger integration of the results into known neural network dynamics would enhance its impact.

      Suggestions for Improvement:

      Broader Application:
      Simulate EFPs in multi-neuron networks to assess how the findings extend to network-level interactions, particularly in regions with mixed synaptic connectivity.

      Experimental Correlation:
      Collaborate with experimental groups to validate the computational predictions using in vivo or in vitro recordings.

      Mechanistic Insights:
      Provide a more detailed mechanistic explanation of how specific ionic currents (e.g., HCN, sodium, leak) interact during gap junctional vs. chemical synaptic inputs.

      Implications for Neural Coding:
      Discuss how the observed differences in EFP signatures might influence neural coding, especially in circuits with heavy gap junctional connectivity.

    3. Reviewer #2 (Public review):

      Summary:

      This computational work examines whether the inputs that neurons receive through electrical synapses (gap junctions) have different signatures in the extracellular local field potential (LFP) compared to inputs via chemical synapses. The authors present the results of a series of model simulations where either electric or chemical synapses targeting a single hippocampal pyramidal neuron are activated in various spatio-temporal patterns, and the resulting LFP in the vicinity of the cell is calculated and analyzed. The authors find several notable qualitative differences between the LFP patterns evoked by gap junctions vs. chemical synapses. For some of these findings, the authors demonstrate convincingly that the observed differences are explained by the electric vs. chemical nature of the input, and these results likely generalize to other cell types. However, in other cases, it remains plausible (or even likely) that the differences are caused, at least partly, by other factors (such as different intracellular voltage responses due to, e.g., the unequal strengths of the inputs). Furthermore, it was not immediately clear to me how the results could be applied to analyze more realistic situations where neurons receive partially synchronized excitatory and inhibitory inputs via chemical and electric synapses.

      Strengths:

      The main strength of the paper is that it draws attention to the fact that inputs to a neuron via gap junctions are expected to give rise to a different extracellular electric field compared to inputs via chemical synapses, even if the intracellular effects of the two types of input are similar. This is because, unlike chemical synaptic inputs, inputs via gap junctions are not directly associated with transmembrane currents. This is a general result that holds independent of many details such as the cell types or neurotransmitters involved.

      Another strength of the article is that the authors attempt to provide intuitive, non-technical explanations of most of their findings, which should make the paper readable also for non-expert audiences (including experimentalists).

      Weaknesses:

      The most problematic aspect of the paper relates to the methodology for comparing the effects of electric vs. chemical synaptic inputs on the LFP. The authors seem to suggest that the primary cause of all the differences seen in the various simulation experiments is the different nature of the input, and particularly the difference between the transmembrane current evoked by chemical synapses and the gap junctional current that does not involve the extracellular space. However, this is clearly an oversimplification: since no real attempt is made to quantitatively match the two conditions that are compared (e.g., regarding the strength and temporal profile of the inputs), the differences seen can be due to factors other than the electric vs. chemical nature of synapses. In fact, if inputs were identical in all parameters other than the transmembrane vs. directly injected nature of the current, the intracellular voltage responses and, consequently, the currents through voltage-gated and leak channels would also be the same, and the LFPs would differ exactly by the contribution of the transmembrane current evoked by the chemical synapse. This is evidently not the case for any of the simulated comparisons presented, and the differences in the membrane potential response are rather striking in several cases (e.g., in the case of random inputs, there is only one action potential with gap junctions, but multiple action potentials with chemical synapses). Consequently, it remains unclear which observed differences are fundamental in the sense that they are directly related to the electric vs. chemical nature of the input, and which differences can be attributed to other factors such as differences in the strength and pattern of the inputs (and the resulting difference in the neuronal electric response).

      Some of the explanations offered for the effects of cellular manipulations on the LFP appear to be incomplete. More specifically, the authors observed that blocking leak channels significantly changed the shape of the LFP response to synchronous synaptic inputs - but only when electric inputs were used, and when sodium channels were intact. The authors seemed to attribute this phenomenon to a direct effect of leak currents on the extracellular potential - however, this appears unlikely both because it does not explain why blocking the leak conductance had no effect in the other cases, and because the leak current is several orders of magnitude smaller than the spike-generating currents that make the largest contributions to the LFP. An indirect effect mediated by interactions of the leak current with some voltage-gated currents appears to be the most likely explanation, but identifying the exact mechanism would require further simulation experiments and/or a detailed analysis of intracellular currents and the membrane potential in time and space.

      In every simulation experiment in this study, inputs through electric synapses are modeled as intracellular current injections of pre-determined amplitude and time course based on the sampled dendritic voltage of potential synaptic partners. This is a major simplification that may have a significant impact on the results. First, the current through gap junctions depends on the voltage difference between the two connected cellular compartments and is thus sensitive to the membrane potential of the cell that is treated as the neuron "receiving" the input in this study (although, strictly speaking, there is no pre- or postsynaptic neuron in interactions mediated by gap junctions). This dependence on the membrane potential of the target neuron is completely missing here. A related second point is that gap junctions also change the apparent membrane resistance of the neurons they connect, effectively acting as additional shunting (or leak) conductance in the relevant compartments. This effect is completely missed by treating gap junctions as pure current sources.
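      The two points above can be illustrated with a minimal sketch of a passive single compartment, which is not the authors' model. It treats the gap junction as an ohmic conductance to a partner held at a fixed voltage, and compares this with a fixed current injection sized from the resting voltage difference. All names and parameter values (10 nS leak, 1 nS junction, -65 mV rest, -50 mV partner) are illustrative assumptions.

```python
def gap_junction_current(v_partner_mv, v_target_mv, g_gj_ns):
    """Ohmic junctional current into the target compartment, in pA (nS * mV = pA)."""
    return g_gj_ns * (v_partner_mv - v_target_mv)


def steady_state_with_conductance(e_leak_mv, g_leak_ns, v_partner_mv, g_gj_ns):
    """Steady-state voltage when the junction is a conductance to a clamped partner."""
    return (g_leak_ns * e_leak_mv + g_gj_ns * v_partner_mv) / (g_leak_ns + g_gj_ns)


def steady_state_with_current_source(e_leak_mv, g_leak_ns, i_inj_pa):
    """Steady-state voltage when the junction is replaced by a fixed injected current."""
    return e_leak_mv + i_inj_pa / g_leak_ns


def input_resistance_mohm(g_leak_ns, g_gj_ns):
    """Input resistance in MOhm; the junction adds a shunt in parallel with the leak."""
    return 1000.0 / (g_leak_ns + g_gj_ns)


if __name__ == "__main__":
    e_leak, g_leak, v_partner, g_gj = -65.0, 10.0, -50.0, 1.0  # assumed values
    # Current source sized once from the resting voltage difference, then held fixed.
    i_fixed = gap_junction_current(v_partner, e_leak, g_gj)
    print("current source :", steady_state_with_current_source(e_leak, g_leak, i_fixed))
    print("conductance    :", steady_state_with_conductance(e_leak, g_leak, v_partner, g_gj))
    print("R_in uncoupled :", input_resistance_mohm(g_leak, 0.0))
    print("R_in coupled   :", input_resistance_mohm(g_leak, g_gj))
```

      In this toy case the conductance-based junction depolarizes the compartment slightly less than the fixed current does, because the driving force shrinks as the target voltage approaches the partner's, and the coupled compartment has a lower input resistance. Whether these effects are large enough to change the LFP conclusions would have to be checked in the full model.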

      One prominent claim of the article that is emphasized even in the abstract is that HCN channels mediate an outward current in certain cases. Although this statement is technically correct, there are two reasons why I do not consider this a major finding of the paper. First, as the authors acknowledge, this is a trivial consequence of the relatively slow kinetics of HCN channels: when at least some of the channels are open, any input that is sufficiently fast and strong to take the membrane potential across the reversal potential of the channel will lead to the reversal of the polarity of the current. This effect is quite generic and well-known and is by no means specific to gap junctional inputs or even HCN channels. Second, and perhaps more importantly, the functional consequence of this reversed current through HCN channels is likely to be negligible. As clearly shown in Supplementary Figure S3, the HCN current becomes outward only for an extremely short time period during the action potential, which is also a period when several other currents are also active and likely dominant due to their much higher conductances. I also note that several of these relevant facts remain hidden in Figure 3, both because of its focus on peak values, and because of the radically different units on the vertical axes of the current plots.
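      The polarity reversal described above follows directly from the ohmic form of the current, as the sketch below shows. It holds the open fraction of the channels fixed during a fast voltage excursion, which is how the slow kinetics are approximated here. The reversal potential of -30 mV, the conductance, and the open fraction are assumed illustrative values, not parameters from the manuscript.

```python
def hcn_current(v_mv, open_fraction, g_max_ns=1.0, e_h_mv=-30.0):
    """Ohmic HCN current in pA: I_h = g_max * m * (V - E_h).

    Positive values are outward, negative values are inward.
    """
    return g_max_ns * open_fraction * (v_mv - e_h_mv)


if __name__ == "__main__":
    # Slow gating: the open fraction is treated as frozen while the voltage changes quickly.
    frozen_open_fraction = 0.2  # assumed value
    for v in (-65.0, -30.0, 20.0):  # rest, reversal potential, spike peak
        i_h = hcn_current(v, frozen_open_fraction)
        direction = "outward" if i_h > 0 else "inward" if i_h < 0 else "zero"
        print(f"V = {v:6.1f} mV -> I_h is {direction}")
```

      Any fast depolarization that crosses the reversal potential while some channels are still open gives the same sign change, regardless of whether the input arrives through a gap junction or a chemical synapse.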

      Finally, I missed an appropriate validation of the neuronal model used, and also the characterization of the effects of the in silico manipulations used on the basic behavior of the model. As far as I understand, the model in its current form has not been used in other studies. If this is the case, it would be important to demonstrate convincingly through (preferably quantitative) comparisons with experimental data using different protocols that the model captures the physiological behavior of at least the relevant compartments (in this case, the dendrites and the soma) of hippocampal pyramidal neurons sufficiently well that the results of the modeling study are relevant to the real biological system. In addition, the correct interpretation of various manipulations of the model would be strongly facilitated by investigating and discussing how the physiological properties of the model neuron are affected by these alterations.

    4. Author response:

      eLife Assessment

      This study presents a valuable theoretical exploration on the electrophysiological mechanisms of ionic currents via gap junctions in hippocampal CA1 pyramidal-cell models, and their potential contribution to local field potentials (LFPs) that is different from the contribution of chemical synapses. The biophysical argument regarding electric dipoles appears solid, but the evidence can be more convincing if their predictions are tested against experiments. A shortage of model validation and strictly comparable parameters used in the comparisons between chemical vs. junctional inputs makes the modeling approach incomplete; once strengthened, the finding can be of broad interest to electrophysiologists, who often make recordings from regions of neurons interconnected with gap junctions.

      We gratefully thank the editors and the reviewers for the time and effort in rigorously assessing our manuscript, for the constructive review process, for their enthusiastic responses to our study, and for the encouraging and thoughtful comments. We especially thank you for deeming our study to be a valuable exploration on the differential contributions of active dendritic gap junctions vs. chemical synapses to local field potentials. We thank you for your appreciation of the quantitative biophysical demonstration on the differences in electric dipoles that appear in extracellular potentials with gap junctions vs. chemical synapses.

      However, we are surprised by aspects of the assessment that resulted in deeming the approach incomplete, especially given the following with specific reference to the points raised:

      (1) Testing against experiments: With specific reference to gap junctions, quantitative experimental verification becomes extremely difficult because of the well-established nonspecificities associated with gap junctional modulators (Behrens et al., 2011; Rouach et al., 2003). The non-specific actions of gap junctions are tabulated in Table 2 of (Szarka et al., 2021), reproduced below. In addition, genetic knockouts of gap junctional proteins are either lethal or involve functional compensation (Bedner et al., 2012; Lo, 1999), together making causal links to specific gap junctional contributions with currently available techniques infeasible.

      In addition, the complex interactions between co-existing chemical synaptic, gap junctional, and active dendritic contributions from several cell-types make the delineation of the contributions of specific components infeasible with experimental approaches. A computational approach is the only quantitative route to specifically delineate the contributions of individual components to extracellular potentials, as seen from studies that have addressed the question of active dendritic contributions to field potentials (Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Sinha & Narayanan, 2015, 2022) or spiking contributions to local field potentials (Buzsaki et al., 2012; Gold et al., 2006; Schomburg et al., 2012). The biophysically and morphologically realistic computational modeling route is therefore invaluable in assessing the impact of individual components to extracellular field potentials (Einevoll et al., 2019; Halnes et al., 2024).

      Together, we emphasize that the computational modeling route is currently the only quantitative methodology to delineate the contributions of gap junctions vs. chemical synapses to extracellular potentials.

      (2) Model validation: The model used in this study was adopted from a physiologically validated model from our laboratory (Roy & Narayanan, 2021). Please note that the original model was validated against several physiological measurements along the somatodendritic axis. We sincerely regret our oversight in not mentioning clearly that we have used an existing, thoroughly physiologically-validated model from our laboratory in this study.

      (3) Comparisons between chemical vs. junctional inputs: We had taken elaborate precautions in our experimental design to match the intracellular electrophysiological signatures with reference to synchronous as well as oscillatory inputs, irrespective of whether inputs arrived through gap junctions or chemical synapses.

      In a revised manuscript, we will address all the concerns raised by the reviewers in detail. We have provided point-by-point responses to reviewers’ helpful and constructive comments below. We thank the editors and the reviewers for this constructive review process, which we believe will help us in improving our manuscript with specific reference to emphasizing the novelty of our approach and conclusions.

      Reviewer #1 (Public review):

      This manuscript makes a significant contribution to the field by exploring the dichotomy between chemical synaptic and gap junctional contributions to extracellular potentials. While the study is comprehensive in its computational approach, adding experimental validation, network-level simulations, and expanded discussion on implications would elevate its impact further.

      We gratefully thank you for your time and effort in rigorously assessing our manuscript, for the enthusiastic response, and the encouraging and thoughtful comments on our study. In what follows, we have provided point-by-point responses to the specific comments.

      Strengths

      Novelty and Scope

      The manuscript provides a detailed investigation into the contrasting extracellular field potential (EFP) signatures arising from chemical synapses and gap junctions, an underexplored area in neuroscience. It highlights the critical role of active dendritic processes in shaping EFPs, pushing forward our understanding of how electrical and chemical synapses contribute differently to extracellular signals.

      We thank you for the positive comments on the novelty of our approach and how our study addresses an underexplored area in neuroscience. The assumptions about the passive nature of dendritic structures had indeed resulted in an underestimation of the contributions of gap junctions to extracellular potentials. Once the realities of active structures are accounted for, the contributions of gap junctions increase by several orders of magnitude compared to passive structures (Fig. 1D).

      Methodological Rigor

      The use of morphologically and biophysically realistic computational models for CA1 pyramidal neurons ensures that the findings are grounded in physiological relevance. Systematic analysis of various factors, including the presence of sodium, leak, and HCN channels, offers a clear dissection of how transmembrane currents shape EFPs.

      We thank you for your encouraging comments on the experimental design and methodological rigor of our approach.

      Biological Relevance

      The findings emphasize the importance of incorporating gap junctional inputs in analyses of extracellular signals, which have traditionally focused on chemical synapses. The observed polarity differences and spectral characteristics provide novel insights into how neural computations may differ based on the mode of synaptic input.

      We thank you for your positive comments on the biological relevance of our approach. We also gratefully thank you for emphasizing the two striking novelties unveiling the dichotomy between gap junctions and chemical synapses in their contributions to field potentials: polarity differences and spectral characteristics.

      Clarity and Depth

      The manuscript is well-structured, with a logical progression from synchronous input analyses to asynchronous and rhythmic inputs, ensuring comprehensive coverage of the topic.

      We sincerely thank you for the positive comments on the structure and comprehensive coverage of our manuscript encompassing different types of inputs that neurons typically receive.

      Weaknesses and Areas for Improvement

      Generality and Validation

      The study focuses exclusively on CA1 pyramidal neurons. Expanding the analysis to other cell types, such as interneurons or glial cells, would enhance the generalizability of the findings. Experimental validation of the computational predictions is entirely absent. Empirical data correlating the modeled EFPs with actual recordings would strengthen the claims.

      We thank you for raising this important point. The prime novelty and the principal conclusion of this study is that gap junctional contributions to extracellular field potentials are orders of magnitude higher when the active nature of cellular compartments is accounted for. The lacuna in the literature has been consequent to the assumption that cellular compartments are passive, resulting in the dogma that gap junctional contributions to field potentials are negligible. Despite knowledge about active dendritic structures for decades now, this assumption has kept studies from understanding or even exploring the contributions of gap junctions to field potentials. The rationale behind the choice of a computational approach to address the lacuna was as follows:

      (1) The complex interactions between co-existing chemical synaptic, gap junctional, and active dendritic contributions from several cell-types make the delineation of the contributions of specific components infeasible with experimental approaches. A computational approach is the only quantitative route to specifically delineate the contributions of individual components to extracellular potentials, as seen from studies that have addressed the question of active dendritic contributions to field potentials (Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Sinha & Narayanan, 2015, 2022) or spiking contributions to local field potentials (Buzsaki et al., 2012; Gold et al., 2006; Schomburg et al., 2012). The biophysically and morphologically realistic computational modeling route is therefore invaluable in assessing the impact of individual components to extracellular field potentials (Einevoll et al., 2019; Halnes et al., 2024).

      (2) With specific reference to gap junctions, quantitative experimental verification becomes extremely difficult because of the well-established non-specificities associated with gap junctional modulators (Behrens et al., 2011; Rouach et al., 2003). The non-specific actions of gap junctions are tabulated in Table 2 of (Szarka et al., 2021). In addition, genetic knockouts of gap junctional proteins are either lethal or involve functional compensation (Bedner et al., 2012; Lo, 1999), together making causal links to specific gap junctional contributions with currently available techniques infeasible.

      We highlight the novelty of our approach and of the conclusions about differences in extracellular signatures associated with active-dendritic chemical synapses and gap junctions, against these experimental difficulties. We emphasize that the computational modeling route is currently the only quantitative methodology to delineate the contributions of gap junctions vs. chemical synapses to extracellular potentials. Our analyses clearly demonstrate that gap junctions do contribute to extracellular potentials if the active nature of the cellular compartments is explicitly accounted for (Fig. 1D). We also show theoretically well-grounded and mechanistically elucidated differences in polarity (Figs. 1–3) as well as in spectral signatures (Figs. 5–8) of extracellular potentials associated with gap junctional vs. chemical synaptic inputs. Together, our fundamental demonstration in this study is the critical need to account for the active nature of cellular compartments in studying gap junctional contributions to extracellular potentials, with CA1 pyramidal neuronal dendrites used as an exemplar.

      In a revised version of the manuscript, we will emphasize the motivations for the approach we took, highlighting the specific novelties both in methodological and conceptual aspects, finally emphasizing the need to account for other cell types and gap junctional contributions therein. Importantly, we will emphasize the non-specificities associated with gap-junctional blockers as the reason why experimental delineation of gap junctional vs. chemical synaptic contributions to LFP becomes tedious. We hope that these points will underscore the need for the computational approach that we took to address this important question, apart from the novelties of the manuscript.

      Role of Active Dendritic Currents

      The paper emphasizes active dendritic currents, particularly the role of HCN channels in generating outward currents under certain conditions. However, further discussion of how this mechanism integrates into broader network dynamics is warranted.

      We thank you for this constructive suggestion. We agree that it is important to consider the implications for broader network dynamics of the outward HCN currents that are observed with synchronous inputs. In a revised manuscript, we will elaborate on the implications of the outward HCN current to network dynamics in detail.

      Analysis of Plasticity

      While the manuscript mentions plasticity in the discussion, there are no simulations that account for activity-dependent changes in synaptic or gap junctional properties. Including such analyses could significantly enhance the relevance of the findings.

      We thank you for this constructive suggestion. Please note that we have presented consistent results for both fewer and more gap junctions in our analyses (Figure 1 with 217 gap junctions and Supplementary Figure 1 with 99 gap junctions). Thus, our fundamentally novel result that gap junctions onto active dendrites differentially shape LFPs holds true irrespective of the relative density of gap junctions onto the neuron. These results therefore demonstrate that the conclusions about gap junctional contributions to LFPs are invariant to plasticity in the number of gap junctions.

      We had only briefly mentioned plasticity in the Introduction to highlight the different modes of synaptic transmission and to emphasize that plasticity has been studied in both chemical synapses and gap junctions, playing a role in learning and adaptation. However, if this wording inadvertently suggests that our study includes plasticity simulations, we will remove it from the Introduction in the updated manuscript to ensure clarity.

      In the ‘Limitations of analyses and future studies’ section in Discussion, we suggested investigating the impact of plasticity mechanisms—specifically, activity-dependent plasticity of ion channels—on synaptic receptors vs. gap junctions and their effects on extracellular field potentials under various input conditions and plasticity combinations across different structures. We fully agree with the reviewer that such studies would offer valuable insights and further enhance the broader relevance of our findings. However, while our study implies this direction, it was not the primary focus of our investigation.

      In the revised manuscript, we will expand on intrinsic/synaptic plasticity and how they could contribute to LFPs (Sinha & Narayanan, 2015, 2022), while also pointing to simulations with different numbers of gap junctions in this context.

      Frequency-Dependent Effects

      The study demonstrates that gap junctional inputs suppress high-frequency EFP power due to membrane filtering. However, it could delve deeper into the implications of this for different brain rhythms, such as gamma or ripple oscillations.

      We sincerely thank you for these insightful comments that we totally agree with. As it so happens, this manuscript forms the first part of a broader study where we explore the implications of gap junctions to ripple frequency oscillations. The ripple oscillations part of the work was presented as a poster in the Society for Neuroscience (SfN) annual meeting 2024 (Sirmaur & Narayanan, 2024). There, we simulate a neuropil made of hundreds of morphologically realistic neurons to assess the role of different synaptic inputs (excitatory, inhibitory, and gap junctional) and active dendrites to ripple frequency oscillations. We demonstrate there that the conclusions from single-neuron simulations in this current manuscript extend to a neuropil with several neurons, each receiving excitatory, inhibitory and gap-junctional inputs, especially with reference to high-frequency oscillations. Our network-based analyses unveiled a dominant mediatory role of patterned inhibition in ripple generation, with recurrent excitations through chemical synapses and gap junctions in conjunction with return-current contributions from active dendrites playing regulatory roles in determining ripple characteristics (Sirmaur & Narayanan, 2024).

      Our principal goal in this study, therefore, was to lay the single-neuron foundation for network analyses of the impact of gap junctions on LFPs. We are preparing the network part of the study, with a strong focus on ripple-frequency oscillations, for submission for peer review separately.

      In a revised manuscript, we will mention the results from our SfN abstract with reference to network simulations and high-frequency oscillations, while also presenting discussions from other studies on the role of gap junctions in synchrony and LFP oscillations.

      Visualization

      Figures are dense and could benefit from more intuitive labeling and focused presentations. For example, isolating key differences between chemical and gap junctional inputs in distinct panels would improve clarity.

      We thank you for this constructive suggestion. In the revised manuscript, we will enhance the visualization of the figures to ensure a clearer and more intuitive distinction between chemical synapses and gap junctions.

      Contextual Relevance

      The manuscript touches on how these findings relate to known physiological roles of gap junctions (e.g., in gamma rhythms) but does not explore this in depth. Stronger integration of the results into known neural network dynamics would enhance its impact.

      We sincerely appreciate your valuable suggestion and acknowledge the importance of integrating our results into established neural network dynamics, particularly their implications for gamma rhythms. We will address this aspect more comprehensively in the revised version of our manuscript.

      Reviewer #2 (Public review):

      This computational work examines whether the inputs that neurons receive through electrical synapses (gap junctions) have different signatures in the extracellular local field potential (LFP) compared to inputs via chemical synapses. The authors present the results of a series of model simulations where either electric or chemical synapses targeting a single hippocampal pyramidal neuron are activated in various spatio-temporal patterns, and the resulting LFP in the vicinity of the cell is calculated and analyzed. The authors find several notable qualitative differences between the LFP patterns evoked by gap junctions vs. chemical synapses. For some of these findings, the authors demonstrate convincingly that the observed differences are explained by the electric vs. chemical nature of the input, and these results likely generalize to other cell types. However, in other cases, it remains plausible (or even likely) that the differences are caused, at least partly, by other factors (such as different intracellular voltage responses due to, e.g., the unequal strengths of the inputs). Furthermore, it was not immediately clear to me how the results could be applied to analyze more realistic situations where neurons receive partially synchronized excitatory and inhibitory inputs via chemical and electric synapses.

      We gratefully thank you for your time and effort in rigorously assessing our manuscript, for the enthusiastic response, and the encouraging and thoughtful comments on our study. In what follows, we have provided point-by-point responses to the specific comments.

      Strengths

      The main strength of the paper is that it draws attention to the fact that inputs to a neuron via gap junctions are expected to give rise to a different extracellular electric field compared to inputs via chemical synapses, even if the intracellular effects of the two types of input are similar. This is because, unlike chemical synaptic inputs, inputs via gap junctions are not directly associated with transmembrane currents. This is a general result that holds independent of many details such as the cell types or neurotransmitters involved.

      We gratefully thank you for the positive comments and the encouraging words about the novel contributions of our study. We are particularly thankful to you for your comment on the generality of our conclusions that hold for different cell types and neurotransmitters involved.

      Another strength of the article is that the authors attempt to provide intuitive, non-technical explanations of most of their findings, which should make the paper readable also for non-expert audiences (including experimentalists).

      We sincerely thank you for the positive comments about the readability of the paper.

      Weaknesses

      The most problematic aspect of the paper relates to the methodology for comparing the effects of electric vs. chemical synaptic inputs on the LFP. The authors seem to suggest that the primary cause of all the differences seen in the various simulation experiments is the different nature of the input, and particularly the difference between the transmembrane current evoked by chemical synapses and the gap junctional current that does not involve the extracellular space. However, this is clearly an oversimplification: since no real attempt is made to quantitatively match the two conditions that are compared (e.g., regarding the strength and temporal profile of the inputs), the differences seen can be due to factors other than the electric vs. chemical nature of synapses. In fact, if inputs were identical in all parameters other than the transmembrane vs. directly injected nature of the current, the intracellular voltage responses and, consequently, the currents through voltage-gated and leak currents would also be the same, and the LFPs would differ exactly by the contribution of the transmembrane current evoked by the chemical synapse. This is evidently not the case for any of the simulated comparisons presented, and the differences in the membrane potential response are rather striking in several cases (e.g., in the case of random inputs, there is only one action potential with gap junctions, but multiple action potentials with chemical synapses). Consequently, it remains unclear which observed differences are fundamental in the sense that they are directly related to the electric vs. chemical nature of the input, and which differences can be attributed to other factors such as differences in the strength and pattern of the inputs (and the resulting difference in the neuronal electric response).

      We thank you for raising this important point. We would like to emphasize that our experimental design and analyses quantitatively account for the spatial distribution and temporal pattern of specific kinds of inputs that arrive through gap junctions and chemical synapses. We submit that our analyses quantitatively demonstrate that the fundamental difference between the gap junctional and chemical synaptic contributions to extracellular potentials is the absence of the direct transmembrane component from gap junctional inputs. We elucidate these points below:

      (1) Spatial distribution: The inputs were distributed randomly across the basal dendrites, irrespective of whether they were through gap junctions or chemical synapses. For both chemical synapses and gap junctions, the inputs were of the same nature: excitatory.

      (2) Different numbers of inputs: We have presented consistent results for both fewer and more gap junctions or chemical synapses in our analyses (see Figure 1 with 217 gap junctions or 245 chemical synapses and Supplementary Figure 2 with 99 gap junctions or 30 chemical synapses). Our fundamentally novel result that gap junctions onto active dendrites shape LFPs holds true irrespective of the relative density of gap junctions onto the neuron.

      (3) Synchronous inputs (Figs. 1–3): For chemical synapses, the waveforms are in the shape of postsynaptic potentials. For gap junctional inputs, the waveforms are in the shape of postsynaptic potentials or dendritic spikes (to respect the active nature of inputs from the other cell). Here, the electrical response of the postsynaptic cell is identical irrespective of whether inputs arrive through gap junctions or chemical synapses: an action potential. We quantitatively matched the strengths such that the model generated a single action potential in response to synchronous inputs, irrespective of whether they arrived through chemical synaptic or gap junctional inputs. We mechanistically analyze the contributions of different cellular components and show that the direct transmembrane current in chemical synapses is the distinguishing factor that determines the dichotomy between the contributions of gap junctions vs. chemical synapses to extracellular potentials (Figs. 2–3). In a revised manuscript, we will show the intracellular responses to demonstrate that they are electrically matched.

      (4) Random inputs (Fig. 4): For random inputs, we did not account for the number of action potentials that arrived, as the only observation we made here was with reference to the biphasic nature of the extracellular potentials with gap junctional inputs in the “No Sodium” scenario. We note that in the “No Sodium” scenario, the time-domain amplitudes were comparable for the field potentials (Fig. 4B, Fig. 4D).

      (5) Rhythmic inputs (Figs. 5–8): For rhythmic inputs, please note that the intracellular and extracellular waveforms for every frequency are provided in Supplementary Figures S5–S11. It may be noted that the intracellular responses are comparable. In simulations for assessing spike-LFP comparison, we tuned the strengths to produce a single spike per cycle, ensuring fair comparison of LFPs with gap junctions vs. chemical synapses.

      Taken together, we demonstrate through explicit sets of simulations and analyses that the differences in LFPs were not driven by the strength or patterns of the inputs but rather by the differences in direct transmembrane currents, which are subsequently reflected in the LFPs. In a revised manuscript, we will add a section to emphasize these points apart from providing intracellular traces for cases where they are not provided.

      Some of the explanations offered for the effects of cellular manipulations on the LFP appear to be incomplete. More specifically, the authors observed that blocking leak channels significantly changed the shape of the LFP response to synchronous synaptic inputs - but only when electric inputs were used, and when sodium channels were intact. The authors seemed to attribute this phenomenon to a direct effect of leak currents on the extracellular potential - however, this appears unlikely both because it does not explain why blocking the leak conductance had no effect in the other cases, and because the leak current is several orders of magnitude smaller than the spike-generating currents that make the largest contributions to the LFP. An indirect effect mediated by interactions of the leak current with some voltage-gated currents appears to be the most likely explanation, but identifying the exact mechanism would require further simulation experiments and/or a detailed analysis of intracellular currents and the membrane potential in time and space.

      We thank you for raising this important question. Leak channels were among the several contributors to the positive deflection observed in LFPs associated with gap junctions. This effect was present not only in gap junctional models with intact sodium conductance but also in the no-sodium model, where the amplitude of the positive deflection was reduced across other models as well (Fig. 2F, I). Furthermore, even in the absence of leak conductance, a small positive deflection was still observed (Fig. 2F), leading us to further investigate other transmembrane currents over time and across spatial locations, from the proximal to the distal dendritic ends relative to the soma (Fig. 3D). We had observed that the dominant contributor in the case of chemical synapses was the inward synaptic current (Fig. 3A), whereas for gap junctions, the primary contributors were leak conductance along with other outward currents, such as potassium and HCN currents (Fig. 3D). Together, the direct transmembrane component of chemical synapses provides a dominant contribution to extracellular potentials. This dominance translates to differences in the relative contributions of indirect currents (including leak currents) to extracellular potentials associated with chemical synaptic vs. gap junctional inputs. Our analyses of the exact ionic mechanisms (Fig. 3) demonstrate the involvement of several ion channels contributing to the indirect component in either scenario.

      In every simulation experiment in this study, inputs through electric synapses are modeled as intracellular current injections of pre-determined amplitude and time course based on the sampled dendritic voltage of potential synaptic partners. This is a major simplification that may have a significant impact on the results. First, the current through gap junctions depends on the voltage difference between the two connected cellular compartments and is thus sensitive to the membrane potential of the cell that is treated as the neuron "receiving" the input in this study (although, strictly speaking, there is no pre- or postsynaptic neuron in interactions mediated by gap junctions). This dependence on the membrane potential of the target neuron is completely missing here. A related second point is that gap junctions also change the apparent membrane resistance of the neurons they connect, effectively acting as additional shunting (or leak) conductance in the relevant compartments. This effect is completely missed by treating gap junctions as pure current sources.

      We thank you for raising this important point. We agree with the analyses presented by the reviewer on the importance of network simulations and bidirectional gap junctions that respect the voltages in both neurons. However, the complexities of LFP modeling preclude modeling of networks of morphologically realistic models with patterns of stimulations occurring across the dendritic tree. LFP modeling studies predominantly use “post-synaptic” currents to analyze the impact of different patterns of inputs arriving on to a neuron, even when chemical synaptic inputs are considered. Explicitly, individual neurons are separately simulated with different patterns of synaptic inputs, the transmembrane currents at different locations are recorded, and the extracellular potential is then computed using the line source approximation (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). Even in scenarios where a network is analyzed, a hybrid approach involving the outputs of a point-neuron-based network being coupled to an independent morphologically realistic neuronal model is employed (Hagen et al., 2016; Martinez-Canada et al., 2021; Mazzoni et al., 2015). Given the complexities associated with the computation of electrode potentials arising as a distance-weighted summation of several transmembrane currents, these simplifications become essential.

      Our approach models gap junctional currents in the same way that other models incorporate synaptic currents in LFP modeling (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). As gap junctions are typically implemented as resistors from the other neuronal compartment, we accounted for gap-junctional variability in our model by randomizing the scaling factors and the exact waveforms that arrive through individual gap junctions at specific locations. Thus, the inputs were not pre-determined by “pre” neurons. Instead, the recorded voltages from potential synaptic partner neurons were randomized across locations and scaled using factors at the dendrites before being injected into the target neuron (Supplementary Fig. S1). While incorporating a network of interconnected neurons is indeed important, we utilized a biophysical, morphologically realistic CA1 neuron model with different sets of input patterns to model LFPs, which were derived from the total transmembrane currents across all compartments of the multi-compartmental neuron model. Given the complexity of this approach, adding further network-level interactions or pre-post connections would have been computationally demanding.

      In a revised manuscript, we will introduce the general methodology used in LFP modeling studies to introduce synaptic currents. We will emphasize that our study extends this approach to modeling gap junctional inputs, while also highlighting randomization of locations and the scaling process in assigning gap junctional synaptic strengths.

      One prominent claim of the article that is emphasized even in the abstract is that HCN channels mediate an outward current in certain cases. Although this statement is technically correct, there are two reasons why I do not consider this a major finding of the paper. First, as the authors acknowledge, this is a trivial consequence of the relatively slow kinetics of HCN channels: when at least some of the channels are open, any input that is sufficiently fast and strong to take the membrane potential across the reversal potential of the channel will lead to the reversal of the polarity of the current. This effect is quite generic and well-known and is by no means specific to gap junctional inputs or even HCN channels. Second, and perhaps more importantly, the functional consequence of this reversed current through HCN channels is likely to be negligible. As clearly shown in Supplementary Figure S3, the HCN current becomes outward only for an extremely short time period during the action potential, which is also a period when several other currents are also active and likely dominant due to their much higher conductances. I also note that several of these relevant facts remain hidden in Figure 3, both because of its focus on peak values, and because of the radically different units on the vertical axes of the current plots.
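
The polarity reversal described in this comment follows directly from the ohmic form of the current. The sketch below is only illustrative: the function name, the reversal potential of -30 mV, and the assumption that the open conductance stays fixed during a fast voltage excursion (because HCN gating is slow) are hypothetical choices, not values taken from the manuscript.

```python
def hcn_current_pa(v_mv, g_open_ns, e_h_mv=-30.0):
    """HCN current (pA) for a fixed open conductance; positive values are outward."""
    return g_open_ns * (v_mv - e_h_mv)


# Hypothetical 1 nS of open HCN conductance, unchanged over a fast spike.
at_rest = hcn_current_pa(-65.0, 1.0)    # below the reversal potential: inward
at_spike_peak = hcn_current_pa(20.0, 1.0)  # above the reversal potential: outward
```

Any sufficiently fast input that carries the membrane potential above the reversal potential while channels remain open produces the same sign change, regardless of the input's source.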

      We thank you for raising this point and agree with you on every point. Please note that we do not assert that the outward HCN currents are exclusively associated with gap junctional inputs. Rather, our results show that synchronous inputs generate outward HCN currents in both chemical synapses (Fig. 3B; positive/outward HCN currents, except in the no sodium or leak model) and gap junctions (Fig. 3D; positive/outward HCN currents). We emphasized this in the case of gap junctions because, in the absence of inward synaptic currents, HCN (acting as outward currents with synchronous inputs) contributed to the positive deflection observed in the LFPs. While HCN would also contribute in the case of chemical synapses, its effect was negligible due to the presence of large inward synaptic currents. Since LFPs reflect the collective total transmembrane currents, the dominant contributors differ between these two scenarios, which we aimed to highlight. Since HCN exhibited outward currents in our synchronous input simulations, we have elaborated on this mechanism in the supplementary figure (Fig. S3). Our intention was not to emphasize this effect for only one synaptic mode but rather to highlight HCN's contribution to the positive deflection as one of the contributing factors.

      We agree that HCN currents are relatively small in magnitude; therefore, our conclusions were based on HCN being one of the several contributing factors. Leak conductance and other outward conductances, including HCN currents (Fig. 3D), collectively contribute to the positive deflections observed in the case of gap junctional synchronous inputs.

      We will ensure that all of these points are addressed appropriately in a revised manuscript.

      Finally, I missed an appropriate validation of the neuronal model used, and also the characterization of the effects of the in silico manipulations used on the basic behavior of the model. As far as I understand, the model in its current form has not been used in other studies. If this is the case, it would be important to demonstrate convincingly through (preferably quantitative) comparisons with experimental data using different protocols that the model captures the physiological behavior of at least the relevant compartments (in this case, the dendrites and the soma) of hippocampal pyramidal neurons sufficiently well that the results of the modeling study are relevant to the real biological system. In addition, the correct interpretation of various manipulations of the model would be strongly facilitated by investigating and discussing how the physiological properties of the model neuron are affected by these alterations.

      We thank you for raising this important point. The CA1 pyramidal neuronal model used in this study is built with ion-channel models derived from biophysical and electrophysiological recordings from these cells. As mentioned in the Methods section “Dynamics and distribution of active channels” and Supplementary Table S1, models for individual channels, their gating kinetics, and channel distributions across the somatodendritic arbor (wherever known) are all derived from their physiological equivalents. Importantly, these values were derived from previously validated models from the laboratory, which contain these very ion channel models and the exact same morphology (Roy & Narayanan, 2021). Please compare Supplementary Table S1 with Table 1 of Roy & Narayanan (2021). Please note that this model was validated against several physiological measurements along the somatodendritic axis (Fig. 1 of Roy & Narayanan, 2021).

      In a revised manuscript, we will explicitly mention this while also mentioning the different physiological properties that were used for the validation process from (Roy & Narayanan, 2021). We sincerely regret not mentioning these details in the current version of our manuscript.

      We will fix these in a revised version of the manuscript.

      References

      Bedner, P., Steinhauser, C., & Theis, M. (2012). Functional redundancy and compensation among members of gap junction protein families? Biochim Biophys Acta, 1818(8), 1971-1984. https://doi.org/10.1016/j.bbamem.2011.10.016

      Behrens, C. J., Ul Haq, R., Liotta, A., Anderson, M. L., & Heinemann, U. (2011). Nonspecific effects of the gap junction blocker mefloquine on fast hippocampal network oscillations in the adult rat in vitro. Neuroscience, 192, 11-19. https://doi.org/10.1016/j.neuroscience.2011.07.015

      Buzsaki, G., Anastassiou, C. A., & Koch, C. (2012). The origin of extracellular fields and currents--EEG, ECoG, LFP and spikes. Nat Rev Neurosci, 13(6), 407-420. https://doi.org/10.1038/nrn3241

      Einevoll, G. T., Destexhe, A., Diesmann, M., Grun, S., Jirsa, V., de Kamps, M., Migliore, M., Ness, T. V., Plesser, H. E., & Schurmann, F. (2019). The Scientific Case for Brain Simulations. Neuron, 102(4), 735-744. https://doi.org/10.1016/j.neuron.2019.03.027

      Gold, C., Henze, D. A., Koch, C., & Buzsaki, G. (2006). On the origin of the extracellular action potential waveform: A modeling study. J Neurophysiol, 95(5), 3113-3128. https://doi.org/10.1152/jn.00979.2005

      Hagen, E., Dahmen, D., Stavrinou, M. L., Linden, H., Tetzlaff, T., van Albada, S. J., Grun, S., Diesmann, M., & Einevoll, G. T. (2016). Hybrid Scheme for Modeling Local Field Potentials from Point-Neuron Networks. Cereb Cortex, 26(12), 4461-4496. https://doi.org/10.1093/cercor/bhw237

      Halnes, G., Ness, T. V., Næss, S., Hagen, E., Pettersen, K. H., & Einevoll, G. T. (2024). Electric Brain Signals: Foundations and Applications of Biophysical Modeling. Cambridge University Press. https://doi.org/10.1017/9781009039826

      Lo, C. W. (1999). Genes, gene knockouts, and mutations in the analysis of gap junctions. Dev Genet, 24(1-2), 1-4. https://doi.org/10.1002/(SICI)1520-6408(1999)24:1/2<1::AID-DVG1>3.0.CO;2-U

      Martinez-Canada, P., Ness, T. V., Einevoll, G. T., Fellin, T., & Panzeri, S. (2021). Computation of the electroencephalogram (EEG) from network models of point neurons. PLoS Comput Biol, 17(4), e1008893. https://doi.org/10.1371/journal.pcbi.1008893

      Mazzoni, A., Linden, H., Cuntz, H., Lansner, A., Panzeri, S., & Einevoll, G. T. (2015). Computing the Local Field Potential (LFP) from Integrate-and-Fire Network Models. PLoS Comput Biol, 11(12), e1004584. https://doi.org/10.1371/journal.pcbi.1004584

      Ness, T. V., Remme, M. W. H., & Einevoll, G. T. (2018). h-Type Membrane Current Shapes the Local Field Potential from Populations of Pyramidal Neurons. J Neurosci, 38(26), 6011-6024. https://doi.org/10.1523/jneurosci.3278-17.2018

      Reimann, M. W., Anastassiou, C. A., Perin, R., Hill, S. L., Markram, H., & Koch, C. (2013). A biophysically detailed model of neocortical local field potentials predicts the critical role of active membrane currents. Neuron, 79(2), 375-390. https://doi.org/10.1016/j.neuron.2013.05.023

      Rouach, N., Segal, M., Koulakoff, A., Giaume, C., & Avignone, E. (2003). Carbenoxolone blockade of neuronal network activity in culture is not mediated by an action on gap junctions. Journal of Physiology, 553(Pt 3), 729-745. https://doi.org/10.1113/jphysiol.2003.053439

      Roy, A., & Narayanan, R. (2021). Spatial information transfer in hippocampal place cells depends on trial-to-trial variability, symmetry of place-field firing, and biophysical heterogeneities. Neural Netw, 142, 636-660. https://doi.org/10.1016/j.neunet.2021.07.026

      Schomburg, E. W., Anastassiou, C. A., Buzsaki, G., & Koch, C. (2012). The spiking component of oscillatory extracellular potentials in the rat hippocampus. J Neurosci, 32(34), 11798-11811. https://doi.org/10.1523/JNEUROSCI.0656-12.2012

      Sinha, M., & Narayanan, R. (2015). HCN channels enhance spike phase coherence and regulate the phase of spikes and LFPs in the theta-frequency range. Proc Natl Acad Sci U S A, 112(17), E2207-2216. https://doi.org/10.1073/pnas.1419017112

      Sinha, M., & Narayanan, R. (2022). Active Dendrites and Local Field Potentials: Biophysical Mechanisms and Computational Explorations. Neuroscience, 489, 111-142. https://doi.org/10.1016/j.neuroscience.2021.08.035

      Sirmaur, R., & Narayanan, R. (2024). Distinct extracellular signatures of chemical and electrical synapses impinging on active dendrites differentially contribute to ripple-frequency oscillations. Society for Neuroscience annual meeting (https://www.abstractsonline.com/pp8/?_gl=1*1bxo7m*_gcl_au*MTc5MTQ0NjE0NC4xNzI3MDcwOTMw*_ga*MTMxMTE5OTcyMy4xNzI3MDcwOTMx*_ga_T09K3Q2WDN*MTcyNzA3MDkzMS4xLjEuMTcyNzA3MDkzNy41NC4wLjA.#!/20433/presentation/13949), Chicago, USA.

      Szarka, G., Balogh, M., Tengolics, A. J., Ganczer, A., Volgyi, B., & Kovacs-Oller, T. (2021). The role of gap junctions in cell death and neuromodulation in the retina. Neural Regen Res, 16(10), 1911-1920. https://doi.org/10.4103/1673-5374.308069

    1. eLife Assessment

      This study describes the impact of mycobacterial genetic diversity on host-infection phenotypes by assessing the effect of different M. tuberculosis lineages on granulomatous inflammation using a 3D in vitro granuloma model. Despite being descriptive and showing mostly correlative relationships, the findings are useful and provide some solid support regarding the functional impact of M. tuberculosis's natural diversity on host-pathogen interactions. The study will interest researchers working on mycobacteria and motivate future studies to understand how genetic diversity influences virulence and immunity outcomes.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript reports a comparison of microbial traits and host response traits in a laboratory model of infected granuloma using Mtb strains from different lineages. The authors report increased bacillary growth and granuloma formation, inversely associated with T cell activation that is characterized by CXCL9, granzyme B and TNF expression. They therefore infer that these T cell responses are likely to be host-protective and that the greater virulence of modern Mtb lineages may be driven by their ability to avoid triggering these responses.

      Strengths:

      The comparison of multiple Mtb lineages in a granuloma model that enables evaluation of the potential role of multiple host cells in Mtb control, offers a valuable experimental approach to study the biological mechanisms that underpin differential virulence of Mtb lineages that has been previously reported in clinical and epidemiological studies.

      Weaknesses:

      The study is largely limited to descriptive observations and lacks experiments to test causal relationships between host and pathogen traits. Some of the data presentation is difficult to interpret, and some conclusions are not adequately supported by the data.

      Comments on revisions:

      The authors have addressed my previous comments with appropriate revisions and explanations.

    3. Reviewer #3 (Public review):

      Arbués and colleagues describe the impact of mycobacterial genetic diversity on host-infection phenotypes. The authors evaluate Mtb infection and contextualize host-responses, bacterial growth and metabolic transitioning in vitro using their previously established model of blood-derived, primary-human-cells cultured within a collagen/fibronectin matrix. They seek to demonstrate the effectiveness of the model in determining mycobacterial strain specific granuloma-dependent host-pathogen interactions.

      Understanding the way mycobacterial genetic diversity impacts granuloma biology in tuberculosis is an important goal. One of this work's strengths is the use of primary human cells and two constituents of pulmonary extracellular matrix to model Mtb infection. The authors and others have previously shown that Mtb-infected PBMC aggregates share important characteristics with early pulmonary TB granulomas. Use of multiple genetically distinct strains of Mtb defines this work and further bolsters its potential impact. However, the study is not comprehensive, as lineages 6 and 7 are not tested. Experiments are primarily descriptive, and the methodologies are conventional. Correlative relationships are the manuscript's focus and effect sizes are generally small.

      The main aim of this work is to extend an in vitro granuloma model to the study of a large collection of well characterized, genetically diverse representatives of the Mycobacterium tuberculosis complex (MTBC). I believe that they accomplish that aim. The work does investigate MTBC infection of aggregated PBMCs using three strains each of Mtb lineages 1-5 and H37Rv, which is not a trivial undertaking. The experimental aims are to show that MTBC genetic diversity impacts growth and dormancy of granuloma-bound bacteria, and the host responses of granulomatous aggregation as well as macrophage apoptosis, lymphocyte activation and soluble mediator release within granulomas. The methodologies employed are sufficient to test most of these aims. The authors' conclusions regarding their results are mostly supported by the data. The conclusion that lineage impacts growth within granulomas is likely true and the data as presented reflect such a relationship. Their conclusions regarding lineage's impact on dormancy are partially supported, as their findings demonstrate that assays for dormancy identify strain-specific metabolic changes in the bacteria consistent with a dormancy-like state but also identify replicating bacteria as being dormant. The data strongly support the impact of mycobacterial genetic diversity on a spectrum of granulomatous responses in their model system. Those findings are a highlight of the publication. The data further support the idea that strain diversity impacts macrophage apoptosis, but a relationship of apoptosis to the granulomatous response is not effectively evaluated. The association of lymphocyte activation with reduced mycobacterial growth as an aspect of granulomas is well documented in the literature, and a negative correlation between T cell activation and growth is supported by the authors' results. Their data also support the conclusion that soluble mediator production by PBMCs is different based on the infecting strain of mycobacteria and that IL1b modulates aggregate phenotypes in their model.

      The authors contribute some valuable insights, particularly in Figure 3. Their model is higher echelon relative to others in the field, but I don't believe that it possesses all the components necessary to replicate formation of mycobacterial granulomas in vivo. That being said, their identification of donor-dependent aggregation phenotypes by mycobacterial strain has the potential to enable future investigations of human and mycobacterial genetic components that are involved in the formation of TB granulomas.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #2:

      The authors indicated that they had added coefficients of variation for within-lineage heterogeneity (line 93), but I can't seem to find this.

      The coefficients of variation were indeed included as suggested, and can be found in lines 94-96 of the current revised version of the manuscript. The sentence states: “Nevertheless, substantial intra-lineage heterogeneity could be observed, particularly within L1 and L2 (coefficients of variation 84.4% [L1] and 66.0% [L2] vs. 32.6% [L3], 34.6% [L4] and 31.9% [L5]).”
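
For readers unfamiliar with the statistic quoted above, a coefficient of variation is the standard deviation expressed as a percentage of the mean. The sketch below is a generic illustration: the function name and the input values are hypothetical, and the choice of the sample standard deviation is an assumption, since the response does not state which estimator the authors used.

```python
import statistics


def coefficient_of_variation_percent(values):
    """Return the sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical growth readouts for three strains of one lineage (arbitrary units).
example_cv = coefficient_of_variation_percent([10.0, 20.0, 30.0])
```

A larger percentage indicates more spread among strains of the same lineage relative to that lineage's average.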

      They were unable to address my question on the impact of T-cell depletion from PBMC on bacterial growth. Their discussion should include that this experimental limitation means that they are unable to test cause and effect for the relationship between T cell proliferation and bacterial growth.

      As recommended, this experimental limitation is now included in the discussion in lines 344-346.

      Reviewer #3:

      EM:

      Based on the authors' lack of resources, I don't believe that electron microscopy experiments should be required for this publication. However, it should be noted that EM is performed on fixed samples such that implementation of those protocols as it relates to bio-safety is no more demanding than the preparation of samples for other common assays performed outside of the BSL3.

      We appreciate your understanding regarding our lack of resources to carry out the EM experiments, although we recognize the possibility of them being performed on BSL3 samples.

      Granuloma score:

      From the author comments and the manuscript's text, it appears that the "granuloma score" is an attempt at quantitation of PBMC organization, where every component of the metric [(mean area / mean aspect ratio) / mean n] is a visual facet of the relative integration of PBMCs into a more organized aggregate. The area and number (n) of aggregates both address regional coalescence of the total number of PBMCs added into the matrix, whereas the aspect ratio component is an indicator of uniformity of the PBMCs that have been assigned to an individual aggregate. Perhaps another roundness estimation would have been more precise, but aspect ratio seems fine for their assay. Considering these factors and the authors' contention that the aggregates making up (n) are granulomas, the name "granuloma score" is inaccurate and a more appropriate title would be "aggregate organization score" or "aggregate organization index".

      Thank you for the suggested alternative terminology. The term “granuloma score” has been substituted with “aggregate formation score” throughout the manuscript.
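
The metric discussed in this exchange, [(mean area / mean aspect ratio) / mean n], can be written out as a short sketch. The function and argument names are hypothetical, the input values are invented, and reading "mean n" as the mean number of aggregates per imaged field is an assumption, since the exchange does not define the averaging unit.

```python
from statistics import mean


def aggregate_formation_score(areas, aspect_ratios, counts_per_field):
    """Compute (mean area / mean aspect ratio) / mean aggregate count."""
    return (mean(areas) / mean(aspect_ratios)) / mean(counts_per_field)


# Hypothetical measurements: three aggregates, counted across two fields.
example_score = aggregate_formation_score(
    areas=[100.0, 200.0, 300.0],
    aspect_ratios=[1.5, 2.0, 2.5],
    counts_per_field=[4, 6],
)
```

Under this reading, the score rises when aggregates are larger, rounder (aspect ratio closer to 1), and fewer in number, which matches the reviewer's description of coalescence into more organized aggregates.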

      Dormancy:

      In the manuscript, the authors should explicitly reference the validation studies which demonstrate induction of the DosR regulon in the model, lest their previously generated and conducted studies go unappreciated by a broader audience. In the title of that previous work (PMID: 32069329) this group used the designation "dormant-like" to describe the state observed in bacteria within their in vitro granuloma model system, as they also do in LINE 124. This term or a variation of it should be exchanged for dormant/dormancy throughout the manuscript when referring to observations in the model bacteria. It is a more precise description. Further, "dormant-like" allows the latitude to refer to actively growing bacteria in the context of dormancy without running the risk of putting forth confusing or potentially erroneous assertions.

      As recommended, the suffix “-like” has been added to the designation “dormant” when referring to the bacterial phenotype induced in the model. In addition, the induction of the DosR regulon in the model is now mentioned in line 116, and the reference to Kapoor’s work, which originally demonstrated it by qPCR, is included.

      PBMC aggregation:

      I would like to make the authors aware that in well vetted models, cell aggregation as a function of infection does not typically occur in PBMCs on tissue culture plates until day 6 post infection (PMID: 25691598, Fig. 2). Further, this group's own published protocol for the model under consideration in this manuscript (PMID: 33659472, Fig. 1) explicitly states that "Formation of granuloma like structures can be observed after 7-8 days", the implication being that prior to 7 days granuloma like structures cannot be observed reliably. Regardless, it seems evident that the authors will not be conducting additional experiments for this publication, which I find acceptable. However, a proper negative control would certainly strengthen evidence for the association of strain specific bacterial and host responses with the granulomatous response in this model.

      We had interpreted the reviewer’s previous comment regarding PBMC aggregation as referring to a different experimental model rather than a matter of timing. Since many other studies have previously assessed the impact of strain/lineage variability on macrophage responses, in this work we decided to focus on later time points, and we did include uninfected samples as a negative control. Nonetheless, we agree it would indeed be very interesting to additionally evaluate early monocyte/macrophage responses, and we will take this into account in future studies.

      Use of antiquated terminology:

      I can appreciate the desire to establish continuity between publications by using the same abbreviation for TNF but it will come at a cost. Using outdated terms in general makes people more dismissive of the work. Perhaps something to consider.

      Since this seems an important issue to the reviewer, we have replaced the term TNF-a with TNF throughout the manuscript.

    1. eLife Assessment

      The study by Chen and Phillips provides evidence for a dynamic switch in the small RNA repertoire of the Argonaute protein NRDE-3 during embryogenesis in C. elegans. The work is supported by convincing experimental data, shedding light on RNA regulation during development. While the functional relevance of this process warrants further investigation, this study provides valuable insights into small RNA pathways with broader implications for developmental biology and gene regulation in other systems.

    2. Reviewer #1 (Public review):

      Summary:

      Chen and Phillips describe the dynamic appearance of cytoplasmic granules during embryogenesis analogous to SIMR germ granules, and distinct from CSR-1-containing granules, in the C. elegans germline. They show that the nuclear Argonaute NRDE-3, when mutated to abrogate small RNA binding, or in specific genetic mutants, partially colocalizes to these granules along with other RNAi factors, such as SIMR-1, ENRI-2, RDE-3, and RRF-1. Furthermore, NRDE-3 RIP-seq analysis in early vs. late embryos is used to conclude that NRDE-3 binds CSR-1-dependent 22G RNAs in early embryos and ERGO-1-dependent 22G RNAs in late embryos. These data lead to their model that NRDE-3 undergoes small RNA substrate "switching" that occurs in these embryonic SIMR granules and functions to silence two distinct sets of target transcripts - maternal, CSR-1 targeted mRNAs in early embryos and duplicated genes and repeat elements in late embryos.

      Strengths:

      The identification and function of small RNA-related granules during embryogenesis is a poorly understood area and this study will provide the impetus for future studies on the identification and potential functional compartmentalization of small RNA pathways and machinery during embryogenesis.

      Weaknesses:

      (1) The authors acknowledge that the loss of SIMR granules has no significant impact on NRDE-3 small RNA loading, which weakens the functional relevance of these structures. However, this point is clearly discussed and, as they note in their Discussion, it is entirely possible that these embryonic granules may be "incidental condensates."

    3. Reviewer #2 (Public review):

      Summary:

      NRDE-3 is a nuclear WAGO-clade Argonaute that, in somatic cells, binds small RNAs amplified in response to the ERGO-class 26G RNAs that target repetitive sequences. This manuscript reports that, in the germline and early embryos, NRDE-3 interacts with a different set of small RNAs that target mRNAs. This class of small RNAs was previously shown to bind to a different WAGO-clade Argonaute called CSR-1, which is cytoplasmic, unlike nuclear NRDE-3. The switch in NRDE-3 specificity parallels recent findings in Ascaris, where the Ascaris NRDE homolog was shown to switch from sRNAs that target repetitive sequences to CSR-class sRNAs that target mRNAs.

      The manuscript also correlates the change in NRDE-3 specificity with the appearance in embryos of cytoplasmic condensates that accumulate SIMR-1, a scaffolding protein that the authors previously implicated in sRNA loading for a different nuclear Argonaute, HRDE-1. By analogy, and through a set of correlative evidence, the authors argue that SIMR foci arise in embryogenesis to facilitate the change in NRDE-3 small RNA repertoire. The paper presents a large body of data that beautifully documents the appearance and composition of the embryonic SIMR-1 foci, including evidence that a mutated NRDE-3 that cannot bind sRNAs accumulates in SIMR-1 foci in a SIMR-1-dependent fashion.

    4. Reviewer #3 (Public review):

      Summary:

      Chen and Phillips present intriguing work that significantly extends our view of the C. elegans small RNA network. While the precise findings are rather C. elegans specific, there are also messages for the broader field, most notably the switching of small RNA populations bound to an Argonaute, and RNA granule behavior depending on developmental stage. The work also starts to shed more light on the still poorly understood role of the CSR-1 Argonaute protein and supports its role in the decay of maternal transcripts. Overall, the work is of excellent quality, and the messages have a significant impact.

      Strengths:

      Compelling evidence for major shift in activities of an argonaute protein during development, and implications for how small RNAs affect early development. Very balanced and thoughtful discussion.

      Weaknesses:

      The switch between maternal and zygotic NRDE-3 remains unaddressed.

    1. eLife Assessment

      This useful study presents findings on the efficacy and mechanisms of linalool protection against Saprolegnia parasitica oomycetes in the grass carp model. The evidence presented is solid since the methods, data and analyses broadly support the claims with only minor weaknesses. This work will be of great interest to scientists within the fields of aquaculture, ichthyology, microbiology, and drug discovery.

    2. Reviewer #1 (Public review):

      Summary:

      The work seeks to investigate the efficacy of linalool as a natural alternative for combating Saprolegnia parasitica infections, which would provide great benefit to aquaculture. This paper shows the effect of linalool in vitro using a variety of techniques, including changes in S. parasitica membrane integrity following linalool exposure and alterations in cell metabolism and ribosome function. Additionally, this work goes on to show that prophylactic and concurrent treatment with linalool at the time of S. parasitica infection can improve survival and reduce tissue damage in vivo in their grass carp infection model. The conclusions of the paper are partially supported by the data, with the corrections made by the authors improving clarity such that I believe there is merit in the work.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, the authors aimed to delineate the antimicrobial activity of linalool and to investigate the mode of action of linalool against S. parasitica infection. One of the main focuses of this work was to identify the in vitro and in vivo mechanisms associated with the protective role of linalool against S. parasitica infection.

      Strengths:

      (1) The authors have used a variety of techniques to test their hypothesis.

      (2) An adequate number of replicates was used in their studies.

      (3) Their findings showed a protective role of linalool against oomycetes and make it an attractive future antibiotic in the aquaculture industry.

      Weaknesses: The revised version of the manuscript is more thoroughly written with clearer explanations; however, a few weaknesses remain.

      (1) Although the introduction section was rewritten with a rationale, it is still lengthy and not very much to the point.

      (2) The claim that linalool regulates the gut microbiota is based on correlation analysis only. It is not very convincing and requires experimental validation to strengthen the claim.

      Overall, the conclusions drawn by the authors are justified by the data. Importantly, this paper has discovered the novelty of the compound linalool as a potent antimicrobial agent and might open up future possibilities to use this compound in the aquaculture industry.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) Adding microscopy of the untreated group to compare Figure 2A with would further strengthen the findings here.

      First of all, we would like to thank Reviewer #1 for their comments and efforts on our manuscript. We have carefully revised it. We used a time-lapse method to capture images at 0 minutes, before any drugs were added. We will change '0 min' to 'untreated,' which will further strengthen the findings.

      (2) Quantification of immune infiltration and histological scoring of kidney, liver, and spleen in the various treatment groups would increase the impact of Figure 4.

      Thank you very much to Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. We conducted quantitative analysis of immune infiltration in the kidney, liver, and spleen across different treatment groups. However, due to the extremely low number of abnormal cells in the negative control, treatment, and prophylactic groups, neither the instrument nor manual methods could reliably gate the cells. Consequently, quantification of immune infiltration and histological scoring were not performed.

      (3) The data in Figure 6 I is not sufficiently convincing as being significant.

      Thank you very much to Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. Previous research has shown that antibiotics and other drugs can cause alterations in gut microbiota. Therefore, we plan to study the effects of antibiotics on gut microbiota. To conduct this research, we need to isolate these microbes from the gut. Although this process is challenging, we still aim to explore the gut microbiota. If possible, we will continue to investigate how antibiotics affect gut microbiota in future studies.

      (4) Comparisons of the global transcriptomic analysis of the untreated group to the PC, LP, and LT groups would strengthen the author's claims about the immunological and transcriptomic changes caused by linalool and provide a true baseline.

      Thanks so much for Reviewer #1 comments and efforts for our manuscript. We have revised it carefully. Due to the initial research design and data analysis strategy, we have focused on comparisons among the PC, LP, and LT groups to more directly explore the differences under various treatment conditions. Specifically, while the transcriptomic data from the untreated group could provide a basic reference, it has shown limited relevance to the core hypotheses of our study. Our research has aimed to investigate the immunological and transcriptomic changes among the treatment groups rather than comparing treated and untreated states. We believe that the current experimental design and data analysis have effectively revealed the mechanisms of linalool and that the additional comparisons among the treatment groups have further supported our conclusions. We hope the reviewer understands the rationale behind our experimental design. If there are additional suggestions, we are more than willing to further optimize the content of our manuscript.

      Reviewer #2 (Public review):

      (1) The authors have taken for granted that the readers already know the experiments/assays used in the manuscript. There was not enough explanation for the figures as well as figure legends.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We will provide more detailed explanations of the experiments and assays used in the manuscript, as well as enhance the descriptions in the figure legends, to ensure that readers have a clear understanding of the figures and their context.

      (2) The authors missed adding the serial numbers to the references.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We will add serial numbers to the references to ensure proper citation and improve the clarity of our manuscript.

      (3) The introduction section does not provide adequate rationale for their work, rather it is focused more on the assays done.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We will add a section to the introduction that provides a rationale for our work, specifically focusing on the impact of plant extract on immunoregulation.

      (4) Full forms are missing in many places (both in the text and figure legends), also the resolution of the figures is not good. In some figures, the font size is too small.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We will ensure that all abbreviations are expanded where necessary, both in the text and figure legends. Additionally, we will improve the resolution of the figures and increase the font size where needed to enhance clarity.

      (5) There is much mislabeling of the figure panels in the main text. A detailed explanation of why and how they did the experiments and how the results were interpreted is missing.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We will improve the labeling of the figure panels, provide detailed explanations of the experimental methods, including their rationale and interpretation, and clarify the connections between the methods.

      (6) There is not enough experimental data to support their hypothesis on the mechanism of action of linalool. Most of the data comes from pathway analysis, and experimental validation is missing.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. The transcriptomic data in our manuscript do not stand alone: we carried out many experiments to substantiate the changes inferred from them, including SEM, TEM, CLSM, molecular docking, RT-qPCR, and histopathological examinations. The details are listed below.

      As shown in Figure 2, we combined the transcriptomic data related to membranes and organelles with SEM, TEM, and CLSM images. After analyzing these data together with the images, we showed that the cell membrane may be a potential target of linalool.

      As shown in Figure 3, we carried out molecular docking to identify the specific ribosome-related protein bound by linalool; the ribosome was screened out as a potential target of linalool by the transcriptomic data.

      As shown in Figure 5, the transcriptomic data illustrated that linalool enhanced the host complement and coagulation system. To substantiate these changes, we carried out RT-qPCR to measure the expression of important immune-related genes, and found that the RT-qPCR results were consistent with the expression trends in the transcriptome analysis.

      As shown in Figures 4 and 5, the transcriptomic data revealed that linalool promoted wound healing, tissue repair, and phagocytosis (Figure 5E). To confirm these findings, we carried out histopathological examinations and found that linalool alleviated the tissue damage caused by S. parasitica infection on the dorsal surface of grass carp and enhanced its healing capacity (Figure 4G).

      Overall, we will conduct additional experiments to verify the mechanism of action of linalool in the future.

      Reviewer #1 (Recommendations for the authors):

      (1) Figure 1 Panel G is not referenced in the legend, this should be fixed

      Thank you very much to Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 1. The order of Panels F and G in Figure 1 was wrong. We have corrected the order of the panels in Figure 1.

      (2) Statistical comparisons between groups in Figure 4 Panels C-F are lacking and should be added.

      Thank you very much to Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 4C-F. We have added statistical comparisons between groups in Figure 4 Panels C-F.

      (3) Capitalize Kidney label in Figure 4G.

      Thank you very much to Reviewer #1 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 4G. We have capitalized the K of kidney.

      Reviewer #2 (Recommendations for the authors):

      (1) The authors missed adding the serial numbers to the references. I could not go through the references to cross-check if they cited the right ones because it's extremely difficult to figure out which one corresponds to which reference number.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check the references. We have added the serial numbers to the references.

      (2) In the last paragraph of the introduction section, most of the techniques in the paper were summarized which does not go with the flow of the paper. The introduction should not be focused on the different techniques used the focus should be more on the rationale of the work. It would be nice if the last paragraph could be rewritten.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 85-94. We have added a section to the introduction that provides a rationale for our work, specifically focusing on the impact of plant extract on immunoregulation.

      (3) The resolution of the figures is not good.

      Thank you for your suggestion. We have revised it carefully. Please check all the figures. We have increased the resolution and size of all the figures.

      (4) Mostly, the figure legends sound like results, with not enough explanation. Full forms are missing in many places which would make the readers go back to the text/other figures each time.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check the whole manuscript and all the figure legends. We have added full names and abbreviations to both the manuscript and all the figure legends so that readers do not have to go back to the text or other figures each time.

      (5) Figure 1:

      Figure 1A: there is not enough explanation for this panel. It's not clear from the text which other EOs than Linalool are referred to here. Which EOs were extracted from daidai flowers?

      Thanks so much for Reviewer #2 comments and efforts for our manuscript. We have revised it carefully. Please check it in the Figure 1A. Figure 1A is divided into “Essential oils (EOs)” and “The main compounds of EOs” to make it easier to distinguish.

      Figure 1B: do the three different wells of each set represent three replicates? If so, are they biological/technical replicates? Also, I'm not sure how the MFC was determined from this figure (line 116) because clearly this panel only corresponds to the determination of MICs, not MFCs.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 126-130. The three different wells of each set represent three biological replicates. After 5 μL of resazurin dye was added and the wells had turned pink, the linalool concentration in the first non-pink well was taken as the MIC. The culture liquid in the well where no mycelium growth was seen was marked onto the plate and incubated at 25°C for 7 days. The lowest linalool concentration with no mycelium growth was identified as the MFC.
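
      The read-out rule described in this response can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name, the data layout, and the example plate are hypothetical, and the sketch assumes that a concentration counts as inhibitory only when none of its replicate wells shows growth, and that every higher concentration is also clear.

```python
def lowest_clear_concentration(growth_by_conc):
    """Return the lowest concentration at which no replicate well shows growth.

    growth_by_conc maps a concentration (in %) to a list of booleans, one per
    replicate well; True means growth was seen (e.g. the well turned pink).
    Returns None if even the highest concentration shows growth.
    """
    result = None
    # Walk from the highest concentration downwards and stop at first growth.
    for conc in sorted(growth_by_conc, reverse=True):
        if any(growth_by_conc[conc]):
            break
        result = conc
    return result


# Hypothetical read-out for illustration only (not data from the manuscript).
example_plate = {
    0.4: [False, False, False],
    0.2: [False, False, False],
    0.1: [False, False, False],
    0.05: [False, False, False],
    0.025: [True, True, False],
    0.0125: [True, True, True],
}
mic = lowest_clear_concentration(example_plate)
```

      The same function could be applied to the subculture plates to read an MFC, with True meaning mycelium growth after incubation.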

      Figure 1C: the figure legend says that the effect of linalool on mycelium growth inhibition was done over a 6hr timepoint but according to the figure the timepoint was 60hr. I am also confused about the concentrations of linalool used. Although a range of concentration from 0 to 0.4% is mentioned, I only see the time vs diameter curves for 7 concentrations.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Line 983 and Figure 1C. We have changed 6 h to 60 h in the figure legend. Only 7 time vs diameter curves are visible in Figure 1C because 0.4%, 0.2% and 0.1% linalool inhibited mycelial growth to the same extent, so their curves coincide. We now show the curves for the 0.4%, 0.2% and 0.1% concentrations as three dotted lines of different colors and sizes in Figure 1C.

      Figure 1D: mislabeled as 1G in the figure panel.

      Figures 1E and 1G: Figure 1E is missing and I do not see any figure legend for Figure 1G.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 1. The order of Panels F and G in Figure 1 was wrong. We have relabeled the panels of Figure 1 as A-F; there is no longer a Panel G.

      Overall, Figure 1 is very confusing and needs rewriting. Also, there is a need to add more explanation of the figure panels in the results section.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 1. We have corrected all the problems in Figure 1. We have also added more explanation of the figure panels in the results section and strengthened the connections between the methods, to show how the experiments follow logically and how the results were interpreted. Please check Lines 126-130, 144-147, 174-179, 213-217, 343-345, and 677-682.

      (6) Figure 2:

      The authors could justify the reason for doing the experiments before moving into the results they got.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check the methods and results in the manuscript at Lines 126-130, 144-147, 174-179, 213-217, 343-345, and 677-682. We have added more explanation of the figure panels in the results section and strengthened the connections between the methods, to show how the experiments follow logically and how the results were interpreted.

      What concentration of linalool was used?

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 992-996. The mycelium treated with 6×MIC (0.3%) linalool was observed by confocal laser scanning microscopy (CLSM), and the mycelium treated with 1×MIC (0.05%) linalool was observed by scanning electron microscopy (SEM) and transmission electron microscopy (TEM).

      The full form of DEGs has been mentioned later, but it should be mentioned in the figure legend of Figure 2 as this is the first time the term was used. Also, what is the full form of DEPs?

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 168, 175, 182, 631, 998, and 1001. The term DEPs in Figure 2I was incorrect, and we have changed DEPs to DEGs.

      Is there a particular reason for looking into the cellular component rather than molecular function and biological processes in the GO analysis? (what I see is that Figure 2H indicates the prevalence of catalytic activity, binding, cellular, and metabolic processes as well). Also, there is not enough explanation of the observation from Figure 2I (both in the results section and figure legend).

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 174-179 and 998-1002 (Figure 2I). We looked at cellular components rather than molecular functions and biological processes in the GO analysis because we focused on the effects on cell membranes and cell walls. These results are closely related to, and support, our scanning electron microscopy (SEM) and transmission electron microscopy (TEM) results. We have added explanation to the results section and the figure legend to describe the observations from Figure 2I.

      (7) Figure 3:

      Figures 3A and 3B: The adjusted p value is already indicated in the figures, so there is no need to add statistical significance (asterisks) to each bar. The resolution for these panels is not good and the font is too small.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figures 3A and 3B. We have removed the statistical significance marks (asterisks) from Figures 3A and 3B. If we are lucky, we will upload the clearest figures when the manuscript is published.

      Figure 3C: the figure legend is missing (wrongly added as KEGG analysis, which should be network analysis). The numbering for the figure legends is wrong. What do the node sizes (5, 22, 40, 58) mentioned in the figure represent? Also, I wonder why ribosome biogenesis in eukaryotes has been indicated as the most enriched pathway despite its weaker connection to the other nodes.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 3C. Figure 3C is a KEGG analysis generated by software, not a network analysis. For the convenience of readers, we have made a new figure of the KEGG analysis.

      Figure 3D: KEGG enrichment and GO analysis: global/local search? Which database was used as a reference?

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 633-635. Functional enrichment analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. KEGG pathway analysis was conducted using Goatools.

      Figure 3E: why were the RNA pol structures compared? The authors did not mention anything about this panel in their results.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Line 207. We found that many DEGs related to ribosome biogenesis (Figure 3D) and RNA polymerase (Figure 3E) are downregulated. Because RNA polymerase is closely related to ribosome biogenesis, the downregulation of RNA polymerase directly affects the synthesis of ribosome-related RNAs, including rRNA, mRNA, and tRNA, thereby inhibiting ribosome production. This relationship is particularly significant in cell growth, division, and the response to external environmental changes.

      Figures 3F and 3G: please mention which model is illustrated (ribbon/sphere model).

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 1010-1015. The tertiary structure of NOP1 was displayed using a cartoon representation. For the molecular docking of linalool with NOP1, the regions binding to the NOP1 activation pocket were enlarged to show the detailed amino acid structures, which were presented using a surface model, while the small molecule was displayed with a ball-and-stick representation.

      Figure 3H: this panel needs more explanation. Why were some of the ABC transporters upregulated while some were downregulated?

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Microorganisms commonly adjust the expression of genes related to substance transport in response to different environmental stimuli to optimize their survival strategies. The expression of ATP-binding cassette (ABC) transporters can be upregulated or downregulated due to various factors, such as environmental stimuli, metabolic demands, energy consumption, species specificity, and signaling molecules. This explains why some ABC transporters are upregulated while others are downregulated.

      (8) Figure 4:

      There was no statistical significance shown in the figures (D-F) which makes me wonder how they worked out that there was any significant increase/decrease, as mentioned in the text. What are the p values? What is the number of replicates? What concentration of linalool was used?

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 4D-F. In this study, 4 groups were established: (1) Positive control (PC) group (10 fish infected with S. parasitica). (2) Linalool therapeutic (LT) group (10 fish infected with S. parasitica, soaked in 0.00039% linalool in a 20 L tank for 7 days). (3) Linalool prophylactic (LP) group (10 uninfected fish soaked in 0.00039% linalool in a 20 L tank for 2 days, followed by the addition of 1×10<sup>6</sup> spores/mL secondary zoospores). (4) Negative control (NC) group (10 uninfected fish without linalool treatment). Each group had 3 replicate tanks. In each group, 8 fish were utilized for immunological assays, and on day 7, blood samples were collected from the tail veins using heparinized syringes and left to coagulate overnight at 4°C. Kits from Nanjing Jiancheng Institute (Nanjing, China) were used to measure lysozyme (LZY) activity, superoxide dismutase (SOD) activity, and alkaline phosphatase (AKP) activity.

      (9) Figure 5:

      Again, the resolution and font size are off. Please mention the full forms of the terms used in the figure legend. The interpretation of the in vivo protective mechanism of linalool is completely based on GO enrichment and KEGG pathway analysis (also some transcriptional analysis). The only wet lab validation done was by checking the mRNA level of some cytokines but that does not necessarily validate what the authors claim.

      Thank you for your suggestion. We have revised it carefully. Please check all the figures and figure legends. We have increased the resolution and size of all the figures and used the full forms of the terms in the figure legends. If we are lucky, we will upload the clearest figures when the manuscript is published. Currently, aquaculture research faces numerous challenges in going beyond mRNA quantification compared with model organisms like mice and zebrafish, primarily due to the lack of available antibodies. For instance, antibodies for grass carp have not yet been commercialized, making protein-level studies and validations significantly more difficult. This lack of antibodies limits the progress of protein verification. However, we hope to design more experiments and validation tests in the future to gradually overcome these technical bottlenecks and provide stronger support for this research.

      (10) Figure 6:

      There is not enough explanation on why and how the experiments were done. It seems like the authors already presumed that the readers know the experiments. The interpretation of the PCA plot is not clear. Why are the quadrant sizes different? How was the heat map plotted? Also, the claim of linalool regulating the gut microbiota is only dependent on the correlation analysis and there is no wet lab validation for this. The data represented in this figure is not enough to prove their hypothesis and needs further investigation.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Figure 6. We will improve the labeling of the figure panels, provide detailed explanations of the experimental methods, including their rationale and interpretation, and clarify the connections between the methods.

      The goal of PCoA is to preserve the distance relationships between samples as much as possible through the principal coordinates, thereby revealing the differences or patterns in microbial composition among different groups. For example, in our study, PCoA analysis demonstrated that the microbial compositions of the positive control (PC), linalool prophylactic (LP), and linalool therapeutic (LT) groups showed significant differences in the reduced dimensional space, possibly indicating that these treatments had a notable impact on the microbial community.
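
      The idea of preserving distances through principal coordinates can be illustrated with a minimal sketch of classical PCoA. It is not the pipeline used in the study: the function names and the toy distance matrix are hypothetical, the sketch recovers only the first axis (by power iteration, to stay within the Python standard library), and it assumes a symmetric distance matrix that is close to Euclidean.

```python
import math


def double_center(dist):
    """Gower-center the squared distances: B = -1/2 * J * D^2 * J."""
    n = len(dist)
    sq = [[-0.5 * d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_mean) / n
    return [[sq[i][j] - row_mean[i] - col_mean[j] + total for j in range(n)]
            for i in range(n)]


def first_principal_coordinate(dist, iterations=500):
    """Return sample coordinates on the first PCoA axis (sign is arbitrary)."""
    b = double_center(dist)
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the (unit-length) vector v.
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                     for i in range(n))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [x * scale for x in v]


# Toy distances between three hypothetical samples lying on a line at 0, 1, 3.
toy_dist = [[0.0, 1.0, 3.0],
            [1.0, 0.0, 2.0],
            [3.0, 2.0, 0.0]]
coords = first_principal_coordinate(toy_dist)
```

      In this toy case the distances between the returned coordinates reproduce the input distances, which is the property a PCoA plot relies on when groups are compared in the reduced space.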

      In our study, the heatmap was generated using the Majorbio Cloud Platform. This platform visualized the preprocessed microbial community data, providing an intuitive representation of the differences in microbial composition and relative abundance among samples. The platform automatically performed steps such as data normalization, color mapping, and clustering analysis, offering convenience for data analysis and interpretation.

      Previous research has shown that antibiotics and other drugs can cause alterations in gut microbiota. Therefore, we plan to study the effects of antibiotics on gut microbiota. To conduct this research, we need to isolate these microbes from the gut. Although this process is challenging, we still aim to explore the gut microbiota. If possible, we will continue to investigate how antibiotics affect gut microbiota in future studies.

      (11) Figure 7:

      This figure does not clarify how they did the interpretation. The in vivo study does not phenocopy their in vitro studies.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We have carefully reviewed and confirmed the current experimental design and data analysis. Although we have not made any changes to Figure 7, we have further clarified the interpretation of the results in the revised manuscript, especially concerning the discrepancies between the in vivo and in vitro studies. We have added more experimental background information to help readers better understand the possible reasons for these differences. We hope the reviewer will understand our explanation, and we look forward to further feedback.

      (12) Minor comments:

      Line 61: what's meant by "et al"?

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Line 61. We have removed "et al".

      Line 87-88: please add a citation referring to the earlier studies.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Line 109.

      Line 151-152: the term "related to" has been used a couple of times. Mentioning it once in the beginning and avoiding repeating the same word might be better.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 168-171. We have rewritten this paragraph to avoid repeating the phrase “related to”.

      How did they reconstitute the EO compounds?

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Some of the EO compounds used in our experiments were extracted from essential oils in the laboratory, and the others were purchased from ThermoFisher (USA).

      Line 544: needs explanation of how there was a 2-fold dilution in the concentrations shown in the figure compared to the concentrations mentioned here.

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. We set the concentrations for the mycelium MIC assay to 0.8%, 0.4%, 0.2%, 0.1%, 0.05%, 0.025%, 0.0125%, and 0.00625%, and the concentrations for the spore MIC assay to 0.4%, 0.2%, 0.1%, 0.05%, 0.025%, 0.0125%, and 0.00625%. Figure 1B shows the MIC determination of linalool on spores, while the MIC determination on mycelium is not shown.
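
      The relation between the two concentration lists is a two-fold serial dilution with different starting points, which can be sketched as follows. The function name is hypothetical; only the starting concentrations (0.8% and 0.4%) and the number of steps come from the lists in the response.

```python
def twofold_dilution_series(top_percent, n_wells):
    """Return n_wells concentrations, each half of the previous one."""
    if top_percent <= 0 or n_wells < 1:
        raise ValueError("top_percent must be positive and n_wells at least 1")
    return [top_percent / (2 ** i) for i in range(n_wells)]


# Mycelium assay: 8 concentrations starting at 0.8%.
mycelium_series = twofold_dilution_series(0.8, 8)
# Spore assay: 7 concentrations starting at 0.4%.
spore_series = twofold_dilution_series(0.4, 7)
```

      Under this reading, the spore series is the mycelium series shifted by one dilution step, which would account for the two-fold difference the reviewer noticed between the text and Figure 1B.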

      Line 546: remove "were".

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Line 573. We have removed "were".

      Line 555: what concentration of malachite green and tween 20 was used?

      Thank you very much to Reviewer #2 for their comments and efforts on our manuscript. We have revised it carefully. Please check Lines 579-580. We used 2.5 mg/mL malachite green and 1% Tween 20.

    1. eLife Assessment

      This useful study uses a model of Streptococcus suis (a pig pathogen) infection in mice using an intranasal route, the natural route of infection ignored in most of the literature. The study aims to understand how capsular polysaccharides (CPS) contribute to neuropathology and virulence. The findings suggest that the olfactory route may lead to meningitis before bacteremia occurs and that CPS down-regulation may play a role in this process. However, the study remains incomplete as presented.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Wang et al. investigates the relationship between Streptococcus suis (S. suis) growth phases and levels of the virulence factor capsular polysaccharide (CPS) in the bacterial cell wall. They use an understudied mouse intranasal infection model to connect growth phase related CPS abundance to the pathogenicity of the bacteria in the nose, blood, and other organs. Adoptive transfer of serum against either CPS or V5 (five other virulence factors) reinforces their discovery of CPS levels on S. suis in different organs and stages of infection. Vaccination against bacterial infections can be difficult, and understanding how the serotype of a bacterial pathogen changes between infection sites and systemic disease is critical. Further, understanding host-pathogen interactions at early time points in the upper respiratory tract may have broad implications for vaccine development. While some of the results are interesting and compelling, others are not supported by the data and require further experimental work.

      Strengths:

      The model of intranasal infection is compelling to expand upon work previously done in vitro and with systemic routes of infection. The histology and fluorescent imaging of the olfactory epithelium and olfactory bulb complement work in Figure 2 about the attachment of S. suis to epithelial cells and the bacterial burden over time in different organs of Figure 3. Histology was performed at 1 hour and 9 days after intranasal infection with stationary phase S. suis and drives home that this pathogen can invade the olfactory nerve and may potentially cause bacterial meningitis seen in some infected swine.

      The adoptive transfer of either anti-CPS or anti-V5 to mice before infection at both longer (12 hr) and shorter (0.5 hr) time points is useful to demonstrate that the changes in cell wall composition between the NALT/CSF and blood compartments result in different efficacy in clearing bacteria from those locations. This is fundamental for the development of vaccines for the swine industry and begs those developing other bacterial vaccines to consider what virulence factors are the most useful as neutralizing antibody targets at the site of bacterial invasion.

      Demonstrating that the amount of CPS within the cell wall of S. suis is related to the growth phase of the bacteria is an important consideration for vaccine development. While others had previously shown that CPS levels were higher in the blood than in the CNS, and that CPS decreases the invasion of epithelial cells, the close look at the olfactory epithelium at an early time point ties together in vitro findings. The control of a CPS-negative strain was critical to understanding their findings. The location and the microbial community that bacterial pathogens live within may change the growth phase and therefore also the cell wall components.

      Weaknesses:

      The authors present compelling data that is relevant to the development of anti-bacterial vaccinations and show a relationship between CPS levels and pathogenicity. However, the study uses a laboratory murine model requiring acetic acid pre-treatment and a high i.n. dose, so the findings presented may not represent what occurs in swine. Furthermore, several conclusions are not supported by the data and require substantial new experimental support. Thus, major concerns remain that impact the validity of the findings.

      Major concerns for the manuscript:

      The intranasal infections were done with S. suis in the stationary phase, which has been shown to have less CPS on the cell wall. While this mimics the literature that shows S. suis to have less CPS in the CNS, the difference in the pathogenesis of a log phase vs. stationary phase intranasal infection would be interesting. Especially because the bacterium is a part of the natural microbial community of swine tonsils, it is curious if the change in growth phase and therefore CPS levels may be a causative reason for pathogenic invasion in some pigs. To take this line of thought a step further, the authors should consider taking the bacteria from NALT/CSF and blood and comparing the lag times that bacteria from different organs take to enter a log growth phase, to show whether the difference in CPS is because S. suis in each location is in a different growth phase. If log phase bacteria were intranasally delivered, would they adopt a stationary phase life strategy? How long would that take? Lastly, the authors should be cautious about claims about S. suis downregulating CPS in the NALT for increased invasion and upregulating CPS to survive phagocytosis in blood. While it is true that the data show that there are different levels of CPS in these locations, the regulation and mechanism of the recorded and observed cell wall difference are not investigated past the correlation to the growth phase. While mechanistic work is outside the scope of the current work, readers should keep in mind that these results may be explained in multiple ways. In addition, the mouse model is used rather than the usual host, the pig. The NALTs of conventional pigs and SPF mice certainly have unique microbial communities and this may affect the pathogenesis of S. suis in the mouse, therefore influencing the results. Because the authors show a higher infection rate in the mouse with acetic acid, they may want to consider investigating what the mouse NALT microenvironment is naturally doing to exclude more bacterial invasion in future studies. Is it simply a host mismatch or is there something about the microbiome or steady-state immune system in the nose of mice that is different from pigs?

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1:

      (1) Some conclusions are not completely supported by the present data, and at times the manuscript is disjoint and hard to follow. While the work has some interesting observations, additional experiments and controls are warranted to support the claims of the manuscript.

      Thank you for the comments. We revised some of the claims and conclusions to be more objective and better supported by the results.

      (2) While the authors present compelling data that is relevant to the development of anti-bacterial vaccinations, the data does not completely match their assertions and there are places where some further investigation would further the impact of their interesting study.

      We do not fully agree with the reviewer's comments. We have demonstrated that changes in CPS levels during infection are associated with pathogenesis, which will guide future studies on the underlying mechanisms. A significant amount of effort is required for studying mechanisms, which is beyond the scope of this research. We concur with the reviewer that assertions should be made cautiously until further studies are conducted. We have revised these assertions to align with the data and to avoid extrapolating the results (page 7, lines 126, 133-136; page 11, lines 216-218; page 13, line 264; and page 18, lines 378-383).

      (3) The difference in the pathogenesis of a log phase vs. stationary phase intranasal infection would be interesting. Especially because the bacterium is a part of the natural microbial community of swine tonsils, it is curious if the change in growth phase and therefore CPS levels may be a causative reason for pathogenic invasion in some pigs.

      S. suis is a part of the natural microbial community of swine tonsils but not mouse NALT. It would be interesting to know whether CPS levels are low in pig tonsils, since CPS is hydrophilic and not conducive to bacterial adhesion. In the study, mice were i.n. infected with a high dose of the bacteria, which could increase opportunities for dissemination (acetic acid may not be a contributor, since results were similar with or without it). S. suis getting into other body compartments from pig tonsils might be triggered by other conditions, such as viral coinfection, nasal cavity inflammation, cold weather, and decreased immunity.

      Experiments with pig blood and phagocytes have shown that genes involved in the synthesis of CPS are upregulated in pig blood. In contrast, these genes are downregulated [1]. In addition, the absence of CPS correlated with increased hydrophobicity and phagocytosis, suggesting that S. suis undergoes CPS phase variation, which could play a role in the different steps of S. suis infection [2]. We showed direct evidence of encapsulation modulation associated with S. suis pathogenesis in mice. A pig infection model is required to confirm these findings.

      (4) The authors should consider taking the bacteria from NALT/CSF and blood and comparing the lag times that bacteria from different organs take to enter a log growth phase, to show whether the difference in CPS is because S. suis in each location is in a different growth phase. If log phase bacteria were intranasally delivered, would they adopt a stationary phase life strategy? How long would that take?

      What causes CPS regulation in vivo is not known. CPS changes across culture stages, indicating that stress, such as nutrient levels, is one of the signals triggering CPS regulation. The microenvironment in the body compartments is far more complex than in vitro conditions: host cells, immune factors and other elements may affect CPS regulation, individually or collectively. The reviewer's question is important, but the suggested experiment is impracticable because few bacteria can be recovered from organs, and culturing the bacteria in vitro would obliterate their in vivo status.

      (5) Authors should be cautious about claims about S. suis downregulating CPS in the NALT for increased invasion and upregulating CPS to survive phagocytosis in blood. While it is true that the data shows that there are different levels of CPS in these locations, the regulation and mechanism of the recorded and observed cell wall difference are not investigated past the correlation to the growth phase.

      We have toned down the claim, which now reads that the data “suggest a correlation between lower CPS in the NALT and a greater capacity for cellular association, whereas elevated CPS levels in the blood are linked to improved resistance against bactericidal activity. However, the mechanisms behind these associations remain unknown.” (page 7, lines 133-136).

      (6) The mouse model used in this manuscript is useful but cannot reproduce the nasal environment of the natural pig host. It is not clear if the NALTs of pigs and mice have similar microbial communities and how this may affect the pathogenesis of S. suis in the mouse. Because the authors show a higher infection rate in the mouse with acetic acid, they may want to consider investigating what the mouse NALT microenvironment is naturally doing to exclude more bacterial invasion. Is it simply a host mismatch or is there something about the microbiome or steady-state immune system in the nose of mice that is different from pigs?

      This is a very interesting comment. The mice are SPF grade, so the microenvironment in SPF mouse NALT should differ significantly from that of conventional pig tonsils. Although NALT in mice resembles pig tonsils in function, many factors may contribute to the sensitivity to S. suis colonization in the pig nasal cavity, such as the microbiome and the local steady-state immune system. A more complex microbiota in tonsils could be one of the factors. Analyzing what makes S. suis inclined towards colonization in pig tonsils, using SPF and conventional pigs, would be an ideal experiment to answer the question.

      (7) Have some concerns regarding the images shown for neuroinvasion because I think the authors mistake several compartments of the mouse nasal cavity as well as the olfactory bulb. These issues are critical because neuroinvasion is one of the major conclusions of this work.

      Thank you for your comments. The olfactory epithelium (OE) is located directly underneath the olfactory bulb in the olfactory mucosa area and lines approximately half of the nasal cavity. The remaining surface of the nasal cavity is lined by respiratory epithelium, which lacks neurons. The olfactory receptor neurons in the OE are stained green in the images by β-tubulin III, a neuron-specific marker. The respiratory epithelium is colorless due to the absence of nerve cells. Similarly, the green color stained by β-tubulin III identifies the olfactory bulb. The accuracy of the anatomic compartments of the mouse nasal cavity has been checked and confirmed by referring to related literature [3, 4].

      References

      (1) Wu Z, Wu C, Shao J, Zhu Z, Wang W, Zhang W, Tang M, Pei N, Fan H, Li J, Yao H, Gu H, Xu X, Lu C. The Streptococcus suis transcriptional landscape reveals adaptation mechanisms in pig blood and cerebrospinal fluid. RNA. 2014 Jun;20(6):882-98.

      (2) Charland N, Harel J, Kobisch M, Lacasse S, Gottschalk M. Streptococcus suis serotype 2 mutants deficient in capsular expression. Microbiology (Reading). 1998 Feb;144(Pt 2):325-332.

      (3) Pägelow D, Chhatbar C, Beineke A, Liu X, Nerlich A, van Vorst K, Rohde M, Kalinke U, Förster R, Halle S, Valentin-Weigand P, Hornef MW, Fulde M. The olfactory epithelium as a port of entry in neonatal neurolisteriosis. Nat Commun. 2018;9(1):4269.

      (4) Sjölinder H, Jonsson AB. Olfactory nerve--a novel invasion route of Neisseria meningitidis to reach the meninges. PLoS One. 2010 Nov 18;5(11):e14034.

      Reviewer 2:

      (1) However, there are serious concerns about data collection and interpretation that require further data to provide an accurate conclusion. Some of these concerns are highlighted below:

      Both reviewers were concerned about some of the interpretations of the results. We modified the interpretations in related lines throughout the manuscript (Please see the related responses to Reviewer 1).

      (2) In figure 2, the authors conclude that high levels of CPS confer resistance to phagocytic killing in blood exposed S. suis. However, it seems equally likely that this is resistance against complement mediated killing. It would be important to compare S. suis killing in animals depleted of complement components (C3 and C5-9).

      We thank the reviewer for the comment. The experiment should be described as a bactericidal assay rather than anti-phagocytosis killing. CPS is a main inhibitor of C3b deposition [1]. It interferes with complement-mediated and receptor-mediated phagocytosis, as well as with direct killing. Data in Figure 2C are expressed as “% of bacterial survival in whole blood” for clarity (page 8, Fig. 2C and page 23, lines 489-490).

      (3) Intranasal administration of non-CPS antisera provides a nice contrast to intravenous administration, especially in light of the recently identified "blood-olfactory barrier". Can the authors provide any insight into how long and where this antibody would be located after intranasal administration? Would this be antibody mediated cellular resistance, or something akin to simple antibody "neutralization"?

      Anti-V5 may not stay long locally following intranasal administration. Efficient reduction of S. suis colonization in NALT supports that anti-V5 could recognize and neutralize the bacteria in NALT quickly, thereby reducing further dissemination in the body. Antibody-mediated phagocytosis may not play a major role because neutrophils are mainly present in the blood but not in the tissues.  

      (4) The micrographs in Figure 7 depict anatomy from the respiratory mucosa. While there is no histochemical identification of neurons, the tissues labeled OE are almost certainly not olfactory and in fact respiratory. However, more troubling is that in figures 7A,a,b,e, and f, the lateral nasal organ has been labeled as the olfactory bulb. This undermines the conclusion of CNS invasion, and also draws into question other experiments in which the brain and CSF are measured.

      We understand the significance of your concerns and appreciate your careful review of Figure 7. The olfactory epithelium (OE) is situated directly beneath the olfactory bulb in the olfactory mucosa area and covers about half of the nasal cavity. This positioning allows information transduction between the olfactory bulb and the olfactory epithelium. The remaining surface of the nasal cavity is lined with respiratory epithelium, which does not contain neurons and primarily serves as a protective barrier. In contrast, the olfactory epithelium consists of basal cells, sustentacular cells, and olfactory receptor neurons. The olfactory receptor neurons are specifically stained green in the images using β-tubulin III, a marker that is unique to neurons. The respiratory epithelium appears colorless due to the lack of nerve cells. Similarly, the green staining with β-tubulin III also highlights the olfactory bulb. The anatomical structures indicated in the images are consistent with those described in the literature [2, 3], confirming that the anatomy of the nasal cavity has been accurately identified.

      (5) Micrographs of brain tissue in 7B are taken from distal parts of the brain, whereas if olfactory neuroinvasion were occurring, the bacteria would be expected to arrive in the olfactory bulb. It's also difficult to understand how an inflammatory process would be developed to this point in the brain -even if we were looking at the appropriate region of the brain -within an hour of inoculation (is there a control for acetic acid induced brain inflammation?). Some explanations about the speed of the immune responses recorded are warranted.

      Thank you for highlighting this issue. Cerebrospinal fluid (CSF) flows into the subarachnoid space surrounding the spinal cord and the brain. There are direct connections from this subarachnoid space to lymphatic vessels that wrap around the olfactory nerves as they cross the cribriform plate towards the nasal submucosa. This connection allows for the drainage of CSF into the nasal submucosal lymphatics in mice [4, 5]. Bacteria may utilize this CSF outflow channel in the opposite direction, which explains the development of brain inflammation in the distal areas of brain tissue adjacent to the subarachnoid space. We have included additional relevant information in the revised manuscript (page 16, lines 323-325).

      (6) The detected presence of S. suis in the CSF 0.5hr following intranasal inoculation is difficult to understand from an anatomical perspective. This is especially true when the amount of S. suis is nearly the same as that found within the NALT. Even motile pathogens would need far longer than 0.5hr to get into the brain, so it's exceedingly difficult to understand how this could occur so extensively in under an hour. The authors are quantifying CSF as anything that comes out of the brain after mincing. Firstly, this should more accurately be referred to as "brain", not CSF. Secondly, is it possible that the lateral nasal organ -which is mistakenly identified as olfactory bulb in figure 7- is being included in the CNS processing? This would explain the equivalent amounts of S. suis in NALT and "CSF".

      The high dose of inoculation used in the experiment may explain the rapid presence of S. suis in the CSF. Mice exhibit low sensitivity to S. suis infection, and the range for the effective intranasal infectious dose is quite narrow. Higher doses lead to the quick death of the mice, while lower doses do not initiate an infection at all. The dose used in this study is empirical and is intended to facilitate the observation of the progression of S. suis infection in mice.

      The NALT tissue and CSF samples were collected separately. After obtaining the NALT tissue, the nasal portion was carefully separated from the rest of the head along the line of the eyeballs. The brain tissue was then extracted from the remaining part of the head to collect the CSF, and it was lacerated to expose the subarachnoid space without being minced. This procedure aims to preserve the integrity of the brain tissue as much as possible. Further details about the CSF collection process can be found in the Materials and Methods section (page 24, lines 508-512).

      (7) To support their conclusions about neuroinvasion along the olfactory route and /CSF titer the authors should provide more compelling images to support this conclusion: sections stained for neurons and S. suis, images of the actual olfactory bulb (neurons, glomerular structure etc).

      Thank you. We respectfully disagree with the reviewer. We stained neurons using a neuron-specific marker to identify the anatomical structures of the olfactory bulb and olfactory epithelium (in green). We used an S. suis-specific antibody to highlight the bacteria present in these areas (in orange and red). The images, along with the bacteria found in the cerebrospinal fluid (CSF) and the brain inflammation observed early in the infection, strongly support our conclusion regarding brain invasion through the olfactory pathway. Please see the response to question 4 for further clarification.

      References

      (1) Seitz M, Beineke A, Singpiel A, Willenborg J, Dutow P, Goethe R, Valentin-Weigand P, Klos A, Baums CG. Role of capsule and suilysin in mucosal infection of complement-deficient mice with Streptococcus suis. Infect Immun. 2014 Jun;82(6):2460-71.

      (2) Sjölinder H, Jonsson AB. Olfactory nerve--a novel invasion route of Neisseria meningitidis to reach the meninges. PLoS One. 2010 Nov 18;5(11):e14034.

      (3) Pägelow D, Chhatbar C, Beineke A, Liu X, Nerlich A, van Vorst K, Rohde M, Kalinke U, Förster R, Halle S, Valentin-Weigand P, Hornef MW, Fulde M. The olfactory epithelium as a port of entry in neonatal neurolisteriosis. Nat Commun. 2018;9(1):4269.

      (4) Yoon JH, Jin H, Kim HJ, Hong SP, Yang MJ, Ahn JH, Kim YC, Seo J, Lee Y, McDonald DM, Davis MJ, Koh GY. Nasopharyngeal lymphatic plexus is a hub for cerebrospinal fluid drainage. Nature. 2024 Jan;625(7996):768-777.

      (5) Spera I, Cousin N, Ries M, Kedracka A, Castillo A, Aleandri S, Vladymyrov M, Mapunda JA, Engelhardt B, Luciani P, Detmar M, Proulx ST. Open pathways for cerebrospinal fluid outflow at the cribriform plate along the olfactory nerves. EBioMedicine. 2023 May;91:104558.

      Response to Recommendations for the authors:

      Reviewer 1:

      Minor concerns for the manuscript:

      (1) In the introduction, please consider giving a little more background about the bacteria itself and how it causes pathogenesis.

      We appreciate your suggestion. We have included additional background on the virulence factors and the pathogenesis of the bacteria in the introduction to enhance understanding of the results (page 4, lines 63-69).

      (2) Figure 2C would be more correct to say percent survival as the CFUs before and after are what are being compared and not if the bacteria is being phagocytosed or not. Flow cytometry of the leukocytes and a fluorescent S. suis would show phagocytosis. Unless that experiment is performed, the authors cannot claim that there is a resistance to phagocytosis.

      Thank you for your feedback. We agree with the reviewer that the experiment should be described as a bactericidal assay rather than anti-phagocytosis killing. CPS interferes with complement-mediated and receptor-mediated phagocytosis, as well as with direct killing. To enhance clarity, the data in Fig. 2C have been presented as “% of bacterial survival in whole blood” (page 8).
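
      As a minimal sketch of the arithmetic behind a "% of bacterial survival" readout, the snippet below compares CFU counts before and after incubation, as the reviewer describes. The function name and the example counts are hypothetical and are not taken from the manuscript.

```python
def percent_survival(cfu_before: float, cfu_after: float) -> float:
    """Percent of the inoculum recovered after incubation (hypothetical helper)."""
    if cfu_before <= 0:
        raise ValueError("cfu_before must be positive")
    return 100.0 * cfu_after / cfu_before


# Illustrative counts only: 2e6 CFU added, 5e5 CFU recovered.
example = percent_survival(2_000_000, 500_000)
```

      A value above 100 would indicate net growth during incubation rather than killing; the readout by itself does not distinguish phagocytosis from complement-mediated killing.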

      (3) There are two different legends present for Figure 1. Please resolve.

      We apologize for the oversight. The redundant figure legend has been removed (page 6).

      (4) There are places such as in lines 194-195, that there are assertions and interpretations about the data that are not directly drawn from the data. These hypotheses are valuable, but please move them to the discussion.

      Thank you for your suggestion. The hypothesis has been moved to the Discussion section (page 19, lines 402 - 405).

      (5) In Figure 4B, higher resolution images would strengthen the ability of non-microbiologists to see the differences in CPS levels in the cell wall.

      We achieved the highest resolution possible for clearer distinctions in CPS levels. To enhance the visualization of the different CPS levels in the images, we revised the description of the CPS changes in Figure 4B within the results section (page 11, lines 208-213).

      (6) In Figure 5 there is no D. Further, the schematics throughout would be easier to parse with the text if the challenge occurred at time 0. Consider revising them for clarity.

      Thank you for highlighting the error. We have removed "i.v + i.n (Fig. 5)" from Figure 5A and made adjustments to the schematic illustrations in Figures 5 and 6 as recommended by the reviewer (page 14).

      (7) What is the control for the serum? The findings for figures 5 and 6 would be much stronger if a non- S. Suis isotype control serum was also infused.

      We used a naive serum as a control to avoid interference from a non-S. suis isotype control that targets other surface molecules of S. suis serotypes.

      (8) Figure 6 legend does not include the anti-CPS treatment.

      Thank you. We have added anti-CPS serum in the legend (page 15, line 249).

      (9) Figure 7 legend does not include the time point for panel 7A.

      Thank you. The time point is shown on Fig.7A (page 17).

      (10) Figure 7 should show OB micrographs or entire brain including the OB.

      The neuron-specific marker, β-tubulin III, identifies the neuronal cells in the olfactory bulb (OB) as shown in Fig. 7A. Unfortunately, we were unable to provide an image of the entire brain that includes the OB due to limitations in our section preparation. We apologize for the mislabeled structure in Fig. 7A, which may have caused confusion. We have corrected the labeling for consistency (see page 15, lines 257-260). Additionally, we included a drawing of the sagittal plane of the rodent's nose, depicting the compartments of the OB, olfactory epithelium (OE), nasal cavity (NC), and brain. This illustration, presented in Fig. 7B on page 17, aims to clarify the structural and functional connections between the nasopharynx and the CNS.

      (11) Some conclusions may be better drawn if figures were to be consolidated. As noted above, the data at times feels disjointed and the importance is more difficult for readers to follow because data are presented further apart. Particularly figures 5 and 6 which are similar with different time points and controls of antisera administrative routes; placing these figures together would be an example of increasing continuity throughout the paper.

      Thank you for the valuable suggestion. Figures 5 and 6, along with their related descriptions in the results section, have been combined for better cohesiveness (pages 14-15).

      Reviewer #2:

      To support their conclusions about neuroinvasion along the olfactory route and /CSF titer the authors should provide more compelling images to support this conclusion: sections stained for neurons and S. suis, images of the actual olfactory bulb (neurons, glomerular structure etc).

      Please refer to our responses to Reviewer 1's Question 7, Reviewer 2's Questions 4 and 7 in the public reviews, and Reviewer 1's Question 10 in the authors' recommendations.

    1. eLife Assessment

      This valuable study reports the link between a disruption in testicular mineral (phosphate) homeostasis, FGF23 expression, and Sertoli cell dysfunction. The data supporting the conclusion are solid. This work will be of interest to biomedical researchers working on testis biology and male infertility. The assessment is based on the editors' critical evaluation of the authors' responses.

    2. Reviewer #1 (Public review):

      The authors have strengthened their conclusions by providing additional information about the specificity of their antibodies, but at the same time the authors have revealed concerning information about the source of their antibodies.

      It appears that many of the antibodies used in this study have been discontinued because the supplier company was involved in a scandal of animal cruelty and all their goats and rabbits Ab products were sacrificed. The authors acknowledge that this is unfortunate but they also claim that the issue is out of their hands.

      The authors' statement is false; the authors ought to not use these antibodies, just as the providing company chose to discontinue them, as those antibodies are tied to animal cruelty. The issue that the authors feel OK with using them is of concern. In short, please remove any results from unethical antibodies.

      Removal of such results also best serves science. That is, any of their results using the discontinued antibodies means that the authors' results are non-reproducible and we should be striving to publish good, reproducible science.

      For the antibodies that do not have unethical origins the authors claim that their antibodies have been appropriately validated, by "testing in positive control tissue and/or Western blot or in situ hybridization". This is good but needs to be expanded upon. It is a strong selling point that the Abs are validated and I want to see additional information in their Supplementary Table 2 stating for each Ab specifically:

      (1) What +ve control tissue was used in the validation of each Ab and which species that +ve control came from. Likewise, if competition assays to confirm validity were used, please also specify.

      (2) Which assay was the Ab validated for (WB, IHC, ELISA, all etc)

      (3) For Antibodies that were validated for, or using WBs please let the reader know if there were additional bands showing.

      (4) Include references to the literature that supports these validations. That is, please make it easy for the reader to appreciate the hard work that went into the validation of the Antibodies.

      Finally, for the Abs, when the authors write that "All antibodies used have been validated by testing in positive control tissue and/or Western blot or in situ hybridization" I fail to understand what in situ hybridisation means in this context. I am under the impression that in situ hybridisation is some nucleic acid -hybridising-to-organ or tissue slice. Not polypeptide binding.

    3. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public review):

      The authors have strengthened their conclusions by providing additional information about the specificity of their antibodies, but at the same time the authors have revealed concerning information about the source of their antibodies.

      It appears that many of the antibodies used in this study have been discontinued because the supplier company was involved in a scandal of animal cruelty and all their goats and rabbits Ab products were sacrificed. The authors acknowledge that this is unfortunate but they also claim that the issue is out of their hands.

      The authors' statement is false; the authors ought to not use these antibodies, just as the providing company chose to discontinue them, as those antibodies are tied to animal cruelty. The issue that the authors feel OK with using them is of concern. In short, please remove any results from unethical antibodies.

      Removal of such results also best serves science. That is, any of their results using the discontinued antibodies means that the authors' results are non-reproducible and we should be striving to publish good, reproducible science.

      For the antibodies that do not have unethical origins the authors claim that their antibodies have been appropriately validated, by "testing in positive control tissue and/or Western blot or in situ hybridization". This is good but needs to be expanded upon. It is a strong selling point that the Abs are validated and I want to see additional information in their Supplementary Table 2 stating for each Ab specifically:

      (1) What +ve control tissue was used in the validation of each Ab and which species that +ve control came from. Likewise, if competition assays to confirm validity were used, please also specify.

      (2) Which assay was the Ab validated for (WB, IHC, ELISA, all etc)

      (3) For Antibodies that were validated for, or using WBs please let the reader know if there were additional bands showing.

      (4) Include references to the literature that supports these validations. That is, please make it easy for the reader to appreciate the hard work that went into the validation of the Antibodies.

      Finally, for the Abs, when the authors write that "All antibodies used have been validated by testing in positive control tissue and/or Western blot or in situ hybridization" I fail to understand what in situ hybridisation means in this context. I am under the impression that in situ hybridisation is some nucleic acid -hybridising-to-organ or tissue slice. Not polypeptide binding.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Remove results that have been obtained by unethically-sourced antibody reagents.

      Strengthen the readers' confidence about the appropriateness & validity of your antibodies.

      First, we want to stress that reviewer 1 has raised his critique related to the use of antibodies from Santa Cruz Biotechnology not only through the journal. The head of our department and two others were contacted by reviewer 1 directly, without going through the journal or informing/approaching the corresponding or first author. It is our opinion that this debate and critique should be handled through the journal and editorial office and not with people without actual involvement in the project.

      It is correct that we have purchased mouse, rabbit and goat antibodies from Santa Cruz Biotechnology, as stated in the correspondence with the reviewer.

      As stated in our previous rebuttal – the goat antibodies from Santa Cruz were discontinued due to inadequate treatment of goats after settling with the authorities in 2016.

      https://www.nature.com/articles/nature.2016.19411

      https://www.science.org/content/blog-post/trouble-santa-cruz-biotechnology

      We have used 11 mouse, rabbit or goat antibodies from Santa Cruz Biotechnology in the manuscript, as listed in supplementary table 2 of the manuscript. All of them have been carefully validated in other control tissues, supported by ISH and/or WB, and many of them have already been used in several publications by our group (https://pubmed.ncbi.nlm.nih.gov/34612843/, https://pubmed.ncbi.nlm.nih.gov/33893301/, https://pubmed.ncbi.nlm.nih.gov/32931047/, https://pubmed.ncbi.nlm.nih.gov/32729975/, https://pubmed.ncbi.nlm.nih.gov/30965119/, https://pubmed.ncbi.nlm.nih.gov/29029242/, https://pubmed.ncbi.nlm.nih.gov/23850520/, https://pubmed.ncbi.nlm.nih.gov/23097629/, https://pubmed.ncbi.nlm.nih.gov/22404291/, https://pubmed.ncbi.nlm.nih.gov/20362668/, https://pubmed.ncbi.nlm.nih.gov/20172873/) and by other research groups. All antibodies used in this manuscript were purchased before the mistreatment of goats, which became evident several years later, was publicly known.

      We do not support animal cruelty in any way, but the antibodies from Santa Cruz Biotechnology were purchased long before the mistreatment was reported. Moreover, antibodies from Santa Cruz Biotechnology are being used in thousands of publications annually. The company has been punished for its misconduct, and has subsequently been granted permission by the relevant authorities to produce antibodies again.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Despite the study being a collation of important results likely to have an overall positive effect on the field, methodological weaknesses and suboptimal use of statistics make it difficult to give confidence to the study's message.

      Strengths:

      Relevant human and mouse models approached with in vivo and in vitro techniques.

      Weaknesses:

      The methodology, statistics, reagents, analyses, and manuscripts' language all lack rigour.

      (1) The authors used statistics to generate P-values and Rsquare values to evaluate the strength of their findings.

      However, it is unclear how stats were used and/or whether stats were used correctly. For instance, the authors write: "Gaussian distribution of all numerical variables was evaluated by QQ plots". But why? For statistical tests that fall under the umbrella of General Linear Models (like ANOVA, t-tests, and correlations (Pearson's)), there are several assumptions that ought to be checked, including typically:

      (a) Gaussian distribution of residuals.

      (b) Homoskedasticity of the residuals.

      (c) Independence of Y, but that's assumed to be valid due to experimental design.

      So what is the point of evaluating the Gaussian distribution of the data themselves? It is not necessary. In this reviewer's opinion, it is irrelevant, not a good use of statistics, and we ought to be leading by example here.

      Additionally, it is not clear whether the homoscedasticity of the residuals was checked. Many of the data appear to have particularly heteroskedastic residuals. In many respects, homoscedasticity matters more than the normal distribution of the residuals. In GraphPad analyses, if ANOVA is used and equal variances are assumed when the variances among groups are in fact unequal, the standard deviations assigned to each group will be wrong and thus incorrect P values will be calculated.

      Based on the incomplete and/or wrong statistical analyses it is difficult to evaluate the study in greater depth.

      We agree with the reviewer that we should lead by example and improve clarity on the use of the different statistical tests and their application. In response to the reviewer’s suggestion, we have extended the statistical section, focusing on the analyses used, and we have specified the statistical test used in the legend of each figure. We did check for Gaussian distribution and homoskedasticity of residuals before conducting a general linear model test, and this has now been specified in the revised manuscript. If the assumptions were not met, we have specified which non-parametric test was used.
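      The residual checks discussed in this exchange can be illustrated with a minimal sketch. It assumes a one-way ANOVA layout, where each residual is an observation minus its group mean, and it uses made-up numbers rather than data from the manuscript. The function names are hypothetical. The Brown-Forsythe statistic (a median-centred variant of Levene's test) is computed by hand with the standard library only, so no P value is produced; in practice a statistics package would normally be used, for example `scipy.stats.levene` for equal variances and `scipy.stats.shapiro` or a QQ plot for normality of the residuals.

```python
from statistics import mean, median, variance


def anova_residuals(groups):
    """Residuals of a one-way ANOVA: each observation minus its group mean."""
    return {name: [y - mean(ys) for y in ys] for name, ys in groups.items()}


def brown_forsythe_statistic(groups):
    """Brown-Forsythe (median-centred Levene) F statistic for equal variances."""
    # Absolute deviation of every observation from its own group median.
    z = [[abs(y - median(ys)) for y in ys] for ys in groups.values()]
    k = len(z)                               # number of groups
    n = sum(len(zi) for zi in z)             # total number of observations
    grand = sum(sum(zi) for zi in z) / n     # grand mean of the deviations
    between = sum(len(zi) * (mean(zi) - grand) ** 2 for zi in z)
    within = sum((v - mean(zi)) ** 2 for zi in z for v in zi)
    return ((n - k) / (k - 1)) * between / within


# Hypothetical illustrative data: similar group sizes, very different spread.
groups = {
    "control": [1.0, 1.1, 0.9, 1.0, 1.05],
    "treated": [2.0, 4.5, 0.5, 6.0, 1.0],
}

residuals = anova_residuals(groups)
residual_variances = {name: variance(r) for name, r in residuals.items()}
variance_ratio = max(residual_variances.values()) / min(residual_variances.values())
f_stat = brown_forsythe_statistic(groups)

print("residual variances:", residual_variances)
print("largest/smallest variance ratio:", round(variance_ratio, 1))
print("Brown-Forsythe F statistic:", round(f_stat, 2))
```

      A large variance ratio or a large F statistic, as in this toy example, points to heteroskedastic residuals, in which case a test that does not assume equal variances or a non-parametric alternative is the safer choice.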

      While on the subject of stats, it is worth mentioning this misuse of statistics in Figure 3D, where the authors added the Slc34a1 transcript levels from controls in the correlation analyses, thereby driving the intercept down. Without the Control data there does not appear to be a correlation between the Slc34a1 levels and tumor size.

      We agree with the reviewer that a correlation analysis is inappropriate here and have removed this part of the figure.
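      The reviewer's concern, that pooling control samples with tumour samples can create a correlation that does not exist within the tumour samples themselves, can be sketched as follows. All numbers are made up for illustration and are not the Slc34a1 or tumour-size data from the manuscript; the direction of the pooled correlation here is arbitrary, and `pearson_r` is a hypothetical helper written with the standard library.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical tumour samples: expression level and tumour size are unrelated.
tumour_expr = [5.0, 6.0, 7.0, 8.0, 9.0]
tumour_size = [3.0, 1.0, 4.0, 1.0, 3.0]

# Hypothetical controls: a tight cluster far from the tumour samples
# (no tumour, so size is zero, and a very different expression level).
control_expr = [0.5, 0.6, 0.4, 0.5]
control_size = [0.0, 0.0, 0.0, 0.0]

r_tumours_only = pearson_r(tumour_expr, tumour_size)
r_pooled = pearson_r(tumour_expr + control_expr, tumour_size + control_size)

print("r, tumours only:", round(r_tumours_only, 3))
print("r, tumours + controls:", round(r_pooled, 3))
```

      In this toy data the correlation within the tumour group is essentially zero, while the pooled correlation is strong purely because the control cluster sits apart from the tumour samples. This is the pattern the reviewer describes.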

      There is more. The authors make statements (e.g. in the figure legends) such as: "Correlations indicated by R2." What does that mean? In a simple correlation, the P value is used to evaluate the strength of the slope being different from zero. The authors also give R2 values for the correlations but they do not provide R2 values for the other stats (like ANOVAs). Why not?

      We agree with the reviewer and have replaced the R2 values with the Pearson correlation coefficient in combination with the P value.
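      As a brief illustration of why the Pearson coefficient is more informative than R2 for a simple correlation: with one predictor, R2 equals the square of Pearson's r, so R2 loses the sign of the association. The sketch below uses made-up data and hypothetical helper names, and relies on the standard library only. It also computes the t statistic from which the P value for a non-zero slope would be obtained (with n - 2 degrees of freedom); the P value itself is not computed here because that requires a t distribution, available for instance through `scipy.stats.pearsonr`.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical data with a clear negative association.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.8, 8.1, 6.2, 3.9, 2.0]

r = pearson_r(x, y)
r2 = r_squared(x, y)
# t statistic for H0: slope = 0, with len(x) - 2 degrees of freedom.
t_stat = r * math.sqrt((len(x) - 2) / (1 - r ** 2))

print("Pearson r:", round(r, 4))
print("R squared:", round(r2, 4))
print("t statistic:", round(t_stat, 2))
```

      Reporting r together with its P value therefore conveys both the direction and the evidence for the association, whereas R2 alone conveys neither.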

      (2) The authors used antibodies for immunos and WBs. I checked those antibodies online and it was concerning:

      (a) Many are discontinued.

      Many of the antibodies we have used were from the major antibody provider Santa Cruz Biotechnology (SCBT). SCBT was involved in a scandal of animal cruelty and all their goats and rabbits were sacrificed, which explains why several antibodies were discontinued, while the mice antibodies were allowed to continue. This is unfortunate but out of our hands.

      (b) Many are not validated.

      We agree with the reviewer that antibody validation is essential. All antibodies used in this manuscript have been validated. The minimal validation has been to evaluate cellular expression in positive control tissue for instance bone, kidney, or mamma. Moreover, many of the antibodies have been used and validated in previous publications (doi: 10.1593/neo.121164, doi:10.1096/fj.202000061RR, doi: 10.1093/cvr/cvv187) including knockout models. Moreover, many antibodies but not all have been validated by western blot or in situ hybridization. We have included the following in the Materials and Methods section: “All antibodies used have been validated by testing in positive control tissue and/or Western blot or in situ hybridization”.

      (c) Many performed poorly in the Immunos, e.g. FGF23, FGFR1, and Klotho are not really convincing. PO5F1 (gene: OCT4) is the one that looks convincing as it is expressed in the correct cell types.

      We fail to understand the criticism raised by the reviewer regarding the specificity of these specific antibodies. We believe the FGF23 and Klotho antibodies are performing exceptionally well, and FGFR1 is abundantly expressed in many cell types in the testis. As illustrated in Figure 2E, the expression of Klotho, FGF23, and FGFR1 is very clear, specific, and convincing. FGF23 is not expressed in normal testis – which is in accordance with no RNA present there either. However, it is abundantly expressed in GCNIS where RNA is present. On the other hand, Klotho is abundantly expressed in germ cells from normal testis but not expressed in GCNIS.

      (d) Others like NPT2A (product of gene SLC34A1) are equally unconvincing. Shouldn't the immuno show them to be in the plasma membrane?

      If there is some brown staining, this does not mean the antibodies are working. If your antibodies are not validated then you ought to omit the immunos from the manuscript.

      We acknowledge your concerns regarding the NPT2A, NPT2B, and NPT2C staining. While the NPT2A antibody is performing well, we understand your reservations about the other antibodies. It's worth noting that NPT2A is not expressed in normal testis (no RNA either) but is expressed in GCNIS where the RNA is also present. Although it is typically present in the plasma membrane, cytoplasmic expression can be acceptable as membrane availability is crucial for regulating NPT2A function, particularly in the kidney where FGF23 controls membrane availability. We are currently involved in a comprehensive study exploring these phosphate transporters in the organs lining the male reproductive tract. In functional animal models, we have observed very specific staining with this NPT2A antibody following exposure to high phosphate or FGF23. Additionally, we are conducting Western Blot analyses with this antibody, which reinforces our belief that the antibody binds specifically.

      Reviewer #2 (Public Review):

      Summary:

      This study set out to examine microlithiasis associated with an increased risk of testicular germ cell tumors (TGCT). This reviewer considers this to be an excellent study. It raises questions regarding exactly how aberrant Sertoli cell function could induce osteogenic-like differentiation of germ cells but then all research should raise more questions than it answers.

      Strengths:

      Data showing the link between a disruption in testicular mineral (phosphate) homeostasis, FGF23 expression, and Sertoli cell dysfunction are compelling.

      Weaknesses:

      Not sure I see any weaknesses here, as this study advances this area of inquiry and ends with a hypothesis for future testing.

      We thank the reviewer for the acknowledgment and highlighting that this is an important message that addresses several ways to develop testicular microlithiasis, which indicates that it is not only due to malignant disease but also frequent in benign conditions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I applaud the authors' approach to nomenclature for rodent and human genes and proteins (italicised for genes, all caps for humans, capitalised only for rodents, etc), but the authors frequently got it wrong when referring to genes or proteins. A couple of examples include:

      (1) SLC34A1 (italics) refers to the gene (correct use by the authors), but the authors then also use e.g. SLC34A1 (not italics) to refer to the protein product of the SLC34A1 (italics) gene. In fact, the protein product of the SLC34A1 (italics) gene is called NPT2A (non-italics).

      (2) OCT4 (italics) refers to the gene (correct use by the authors), but the authors then also use e.g. OCT4 (not italics) to refer to the protein product of the OCT4 (italics) gene. In fact, the protein product of the OCT4 (italics) gene is called PO5F1 (non-italics).

      The problem with their incorrect and inconsistent nomenclature is widespread in the manuscript making further evaluation difficult.

      Please consult a reliable protein-based database like Uniprot to derive the correct protein names for the genes. You got NANOG correct though.

      We thank the reviewer for addressing this important point. We have corrected the nomenclature throughout the manuscript as suggested.

      (3) The authors use the word "may" too many times. Also often in conjunction with words like "indicates", and "suggests". Examples of phrases that reflect that the authors lack confidence in their own results, conclusions, and understanding of the literature are:

      "...which could indicate that the bone-specific RUNX2 isoform may also be expressed... "

      "...which indicates that the mature bone may have been..."

      Are we shielding ourselves from being wrong in the future because "may" also means "may not"? It is far more engaging to read statements that have a bit more tooth to them, and some assertion too. How about turning the above statements around, to :

      "...which shows that the bone-specific RUNX2 isoform is also expressed... "

      "...which reveals that the mature bone were..."

      ...then revisit ambiguous language ("may", "might" "possibly", "could", "indicate" etc.) throughout the manuscript?

      It's OK to make a statement and be found wrong in the future. Being wrong is integral to Science.

      Thank you for addressing this. We agree with the reviewer that it is fair to be more direct and have revised many of these vague phrases throughout the manuscript.

      (4) The authors use the word "transporter" which in itself is confusing. For instance, is SLC34A1 an importer or an exporter of phosphate? Or both? Do SLC34As move phosphate in or out of the cells or cellular compartments? "Transporter" sounds too vague a word.

      We understand that it might be easier for the reader with the term "importer". However, we should use the specific nomenclature or "wording" that applies to these transporters. The exact terminology is a co-transporter or sodium-dependent phosphate cotransporter as reported here (doi: 10.1152/physrev.00008.2019). Thus, we will use the terms “co-transporter” and “transporter” throughout the revised manuscript.

    1. eLife Assessment

      This study investigates a dietary intervention that employs a smartphone app to promote meal regularity, findings that have theoretical or practical implications for a subfield and may be clinically useful. The intervention to entice participants to adhere to specific meal times represents a restrictive diet (even though it does not ask to limit caloric intake) similar to a time-restricted feeding diet, while the control subjects are not experiencing or adhering to dietary restrictions. The authors report significant weight loss but did not rigorously assess caloric intake which remains a weakness of this study as food diaries are notoriously unreliable. While the concept is very interesting, the study is considered incomplete, and the rigor of the results should be strengthened in follow-up studies to add more stringent methods to assess caloric intake. Additionally, the study hypothesizes that the intervention resets the circadian clock. However, the study needs an objective method for assessing circadian rhythms, such as actigraphy, in addition to a subjective questionnaire.

    2. Reviewer #3 (Public review):

      In this study, the authors tested a dietary intervention focused on improving meal regularity. Participants first utilized a smartphone application to track their meal frequencies, and then they were asked to restrict their meal intake to times when they most often eat to enhance meal regularity for six weeks. This, supposedly, resulted in some weight loss, supposedly independent of changes in caloric intake.

      The concept is appealing, and it is interesting to use a smartphone app in participants' typical everyday environment to regularize food intake. It asks participants to stick to meal intake times that are supported in many cultures, and it asks them not to eat outside of these times, avoiding what are likely unhealthy habits such as grazing a refrigerator late at night. In essence, this is a restrictive diet, not restricting caloric intake but the timing of food intake, and it has many parallels to time-restricted feeding. It is important to note that there are many restrictive diets, and a common problem with restrictive diets is that while they allow one to lose a couple of pounds for a couple of months, just as with this diet, the long-term success is very poor because they depend on restriction. This issue is still not discussed.

      Further, it remains to be determined why the participants lose weight: whether this is indeed due to a reduction in food intake, as implied, or whether the weight loss occurred without a reduction in caloric intake, as first stated by the authors and now suggested. The food diary lacks rigor as a method to assess caloric intake; this is well established, and such diaries have been shown again and again to be misleading. Many readers without that knowledge nonetheless draw conclusions from such data, which would best have been omitted.

      The authors hypothesize that the intervention improves metabolism by improving circadian rhythmicity. That's plausible, but the study provides only a subjective questionnaire and lacks more objective measures such as actigraphy.

      While the authors now state that this is a pilot study, the study falls short of providing mechanistic insights into what underlies the weight loss, and the many correlations provided do not make up for this weakness.

      Overall, while this pilot study introduces an interesting approach to meal regularity, its limitations highlight the need for more rigorous studies to validate these findings.

      (1) Unreliable method of caloric intake

      The trial's reliance on self-reported caloric intake is problematic, as participants tend to underreport intake. As pointed out earlier by me and now cited in the revised manuscript, the NEJM paper (DOI: 10.1056/NEJM199212313272701) reported that some participants underreported caloric intake by approximately 50%, rendering such data unreliable and hence misleading. The question is, why include such unreliable data that is more misleading than informative at all? These data should have been omitted. More rigorous methods for assessing food intake should have been utilized. I understand this requires more effort, such as providing participants with meals, or using better methods that photograph and weigh the meals, etc., but it is certainly feasible. It has been done many times in other studies. Further, the control group was not asked to restrict their diet in any way, and hence, asking for a restriction in timing in the treatment group may be sufficient to reduce caloric intake and induce weight loss.

      Merely acknowledging the unreliability of self-reported caloric intake is insufficient, as it still leaves the reader with the impression that this weight loss is independent of caloric intake when, in reality, we actually have no idea if food intake contributes to it. A more robust approach to assessing food intake is imperative. Even if a decrease in caloric intake is observed through rigorous measurement, as I am convinced a more rigorous study would unveil testing this paradigm, this intervention may merely represent another restrictive diet among countless others that show that one may lose weight by going on a diet. Seemingly, any restrictive diet works for a few months. The trouble is they do not work long-term because they depend on restriction. I agree with the authors that their intervention seems common sense and has little downside, but one also needs to be realistic about the prospects of this intervention.

      (2) Lack of objective data regarding circadian rhythm

      The assessment of circadian rhythm using the MCTQ, a self-reported measure of chronotype, is subjective. More objective methods like actigraphy would have strengthened the study.

      Actigraphy is considered better than a sleep questionnaire for assessing circadian rhythms because it provides objective data on activity patterns over time, offering a more accurate picture of sleep-wake cycles compared to subjective self-reported information from a questionnaire.

      The authors' responses to my prior review are misleading.

      I understand that this is a pilot study. Is it appropriate to point out weaknesses and flaws in the conclusion drawn from a pilot study? Absolutely, that is the reviewer's job.

      I also understand that food intake can affect circadian rhythm, which was part of the rationale behind the study. Is it appropriate to criticize the study for not examining the effect of the intervention on circadian rhythm using objective measures provided by actigraphy? Yes, it is, as this would have provided mechanistic insights that are more rigorous. I understand that this was not the declared goal, but it should have been examined in a pilot study. To jump to the conclusion that based on prior studies, the intervention will improve circadian rhythms as the authors do is not rigorous and hence a weakness.

      A less rigorous method, such as a food questionnaire, to assess caloric intake can result in inadequately supported and potentially misleading conclusions. By including it, the reader may conclude that there was no change in caloric intake when indeed we do not know. I disagree with the authors that this is a minor issue. The associations and correlations the authors provide do not solve the issue. Hence, to make it very clear, it remains to be studied if this intervention reduces weight by reducing caloric intake or other mechanisms. Including this data reduces the study's rigor as it suggests that there is no difference in food intake.

      I did not suggest to only use an actimeter (which is a device); I suggested actigraphy. Actigraphy is widely recognized in the field for its utility in circadian rhythm research and provides objective data, while the questionnaire used is subjective. The authors do quote papers comparing their survey to actigraphy by correlation analysis, but the fundamental difference of the two approaches remains. Does an objective measure increase rigor compared to a subjective assessment? Yes, it does.

      Similarly, I did not state "that any form of imposed diet appears to lead to weight loss over several months." I said that many forms of restrictive diets do induce weight loss of a similar magnitude to this diet.

      The authors should have discussed the fundamental confounder of the study in that the treatment group is asked to restrict food intake to specific times while the control group is not asked to restrict in any way and the potential contribution of this to the weight loss observed.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      We would like to remind the editors and reviewers that the present project is a pilot study that does not claim to produce definitive results. Pilot studies are exploratory preliminary studies to test the validity of hypotheses, the feasibility of a study as well as the research methods and the study design. From our point of view, our hypotheses and the feasibility of the pilot study have been confirmed to such an extent that the implementation of a larger study is justified. At the same time, it became clear during the pilot that the methods and design need to be adapted in some areas in order to increase the reliability of the results - a finding that pilot studies are usually conducted to obtain. We discussed these limitations in detail in order to explain the planned changes in the follow-up study. What the reviewers and editors interpret as incompleteness is therefore due to the nature of a pilot study.  We consider it necessary that appropriate standards are taken into account in the evaluation of the present work.

      In addition, we would like to make a counterstatement as to what our main claims, which should be used to assess the strength of evidence, are - and what they are not:

      In the introduction, we describe the background that led to the formation of our hypotheses: Previous animal and human studies show that food, along with light, serves as the main Zeitgeber for circadian clocks. It has also been shown that chrononutrition can lead to weight loss and improved well-being. Based on this, we hypothesized that individualized meal timing can enhance these positive effects. This hypothesis has been validated on the basis of the available results. Contrary to what the editors and reviewers stated, the assumption that the observed beneficial effects are indeed related to an alteration or resetting of endogenous circadian rhythms was not intended to be investigated in this study and is not one of our main claims. This has already been sufficiently demonstrated and, in our view, need not and should not be repeated in every study on chrononutrition. Accordingly, this assumption was not formulated as a working hypothesis or main claim. It is described in the paper as a potential mechanism, the assumption of which is justified on the basis of previous studies. The lack of a corresponding examination and the erroneous insinuation that corresponding results were nevertheless listed by us in the paper as a main claim should therefore not be used as a criterion for downgrading the assessment of the strength of evidence.

      The main criticism of our study is the collection of data using self-reported food and food quantities. This form of data collection is indeed prone to error, as there is little control over the accuracy of the reported data. However, we believe that this problem is limited in scope.

      (1) Contrary to what the editors and reviewers claim, at no point do we write that we are convinced that food intake has not changed. On the contrary, in Figure 2 we explicitly show that there was a change in what some participants reported to us regarding their food intake. We make it clear throughout the text that we could not find any correlation between weight change and the changes in the reports of food quantities/meals. These statements are correct and only what are actual and formulated main claims should be included in the evaluation of the study.

      (2) As previously stated, we conducted analyses that suggest that an unreported reduction in food intake is unlikely to be the cause of weight loss. For the most part, participants did not change their reporting behavior during the exploration and intervention phases. That is, participants who underreported food intake reported similar amounts in both phases of the study, but lost weight only in the intervention phase. To explain their weight loss with imprecise reporting, it would have to be assumed that these participants began to eat less in the intervention phase and at the same time report more in order to achieve similar calorie counts and food composition in the evaluation. We consider such behavior to be very unlikely, especially since it would apply to numerous participants.

      (3) The editors and reviewers reduce the results to the absence of a correlation between weight loss and reported food quantity and composition. In their assessment of the significance of the findings, however, they ignore the fact that we did find a significant correlation in our analyses, namely between weight loss and an increase in the regularity of food intake. There is no correlation between an increase in regularity and a reduction in reported calories (R² = 0.01472). This is credible in our view, as it is unlikely that the more regularly participants ate, the more pronounced the error in their reports was (while in reality they ate less than before).

      (4) We also had the requirement for the study design that the participants could carry out the intervention in their normal everyday life and environment in order to test and ensure implementation in real life. We consider it unrealistic to be able to monitor food intake continuously and without interruption over a period of several weeks under these conditions. We therefore see no alternative to self-reporting. As the reviewers and editors did not suggest any alternative methods of data collection that would fulfil the requirements of our study, we assume that, despite criticism and reservations, they generally agree with our assessment and take this into account in their evaluation.

      It is still criticized that some confounding factors are present. The reviewer makes no reference to the fact that we either eliminated these in the last version submitted (age range), identified them as unproblematic (unmatched cohorts, menstrual cycle, shift work) or even deliberately used them in order to be able to test our hypothesis more validly (inclusion of individuals with normal weight, overweight, and obesity).

      Besides, the use of actimeters to determine circadian rhythms as proposed by the editors and reviewers is not valid for this study and the requirement to use them to determine a circadian reset in the eLife assessment is misleading and inappropriate. This instrument only measures physical activity, but not the physiological parameters that are relevant for an investigation in this field of research.

      For the assessment of chronotype alone, the MCTQ questionnaire is a valid instrument that has been validated several times against actimetry (e.g., DOIs: 10.1080/07420528.2022.2025821, 10.1080/07420528.2023.2202246, 10.1016/j.ijpsycho.2016.07.433, 10.1155/2018/5646848). The reviewer's statement that the MCTQ questionnaire is unreliable for determining chronotype is unsupported and incorrect.

      Equally unproven is the statement that any form of imposed diet appears to lead to weight loss over a period of several months.

      Nevertheless, in order to prevent further misunderstandings, we have revised our text in a number of places and clarified that our statements are not irrefutable assertions, but potential interpretations of the results obtained in the pilot study, which are to be analyzed in more detail with regard to the planned more comprehensive study.

    1. eLife Assessment

      This study provides a comprehensive exploration of the role of IL-1β signaling during development of lung injury induced by a combination of underlying inflammation and mechanical ventilation. The data are convincing, and while the translatability of the findings related to therapeutic hypothermia may be somewhat complicated, they have the potential to be very valuable to the field.

    2. Reviewer #1 (Public review):

      Summary:

      The authors found that IL-1b signaling is pivotal for hypoxemia development and can modulate NETs formation in LPS+HVV ALI model.

      Strengths:

      They used IL1R1 ko mice and proved that IL1R1 is involved in ALI model proving that IL1b signalling leads towards ARDS. In addition, hypothermia reduces this effect, suggesting a therapeutic option.

      Weaknesses:

      (1) IL1R1 binds IL1a and IL1b. What would be the role of IL1a in this scenario?

      (2) The authors depleted neutrophils using anti-Ly6G. What about MDSCs? Are these latter cells involved in ARDS and VILI?

      (3) The authors found that TH inhibited IL-1β release from macrophages, which led to less NETs formation and albumin leakage in the alveolar space in their lung injury model. A graphical abstract could be included suggesting a cellular mechanism.

      (4) If Macrophages are responsible for IL1b release that via IL1R1 induces NETosis, what happens if you deplete macrophages? what is the role of epithelial cells?

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Nosaka et al is a comprehensive study exploring the involvement of IL1beta signaling in a 2-hit model of lung injury + ventilation, with a focus on modulation by hypothermia.

      Strengths:

      The authors demonstrate quite convincingly that interleukin 1 beta plays a role in the development of ventilator-induced lung injury in this model, and that this role includes the regulation of neutrophil extracellular trap formation. The authors use a variety of in vivo animal-based and in vitro cell culture work, and interventions including global gene knockout, cell-targeted knockout and pharmacological inhibition, which greatly strengthen the ability to make clear biological interpretations.

      Weaknesses:

      A primary point for open discussion is the translatability of the findings to patients. The main model used, one of intratracheal LPS plus mechanical ventilation is well accepted for research exploring the pathogenesis and potential treatments for acute respiratory distress syndrome (ARDS). However, the interpretation may still be open to question - in the model here, animals were exposed to LPS to induce inflammation for only 2 hours, and seemingly displayed no signs of sickness, before the start of ventilation. This would not be typical for the majority of ARDS patients, and whether hypothermia could be effective once substantial injury is already present remains an open question. The interaction between LPS/infection and temperature is also complicated - in humans, LPS (or infection) induces a febrile, hyperthermic response, whereas in mice LPS induces hypothermia (eg. Ganeshan K, Chawla A. Nat Rev Endocrinol. 2017;13:458-465). Given this difference in physiological response, it is therefore unclear whether hypothermia in mice and hypothermia in humans are easily comparable. Finally, the use of only young, male animals such as in the current study has been typical but may be criticised as limiting translatability to people.

      Therefore while the conclusions of the paper are well supported by the data, and the biological pathways have been impressively explored, questions still remain regarding the ultimate interpretations.

    4. Author response:

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      The authors found that IL-1b signaling is pivotal for hypoxemia development and can modulate NETs formation in LPS+HVV ALI model.  

      Strengths: 

      They used IL1R1 ko mice and proved that IL1R1 is involved in ALI model proving that IL1b signalling leads towards ARDS. In addition, hypothermia reduces this effect, suggesting a therapeutic option.  

      We thank the Reviewer for recognizing the strengths of our study and their positive feedback.

      Weaknesses: 

      (1) IL1R1 binds IL1a and IL1b. What would be the role of IL1a in this scenario? 

      Thank you for asking this question. We addressed this in our previous paper (Nosaka et al. Front Immunol 2020;11; 207), where we used anti-IL-1a and IL-1a KO mice in our model and found that neither anti-IL-1a-treated mice nor IL-1a KO mice were protected. Thus, IL-1b, but not IL-1a, plays a role in inducing hypoxemia during LPS+HVV. We will now add this point to the discussion of our revised manuscript.

      (2) The authors depleted neutrophils using anti-Ly6G. What about MDSCs? Are these latter cells involved in ARDS and VILI?

      Anti-Ly6G neutrophil depletion may potentially affect G-MDSCs as well (Blood Adv 2022 Jul 29;7(1):73–86); however, we have not looked directly at G-MDSCs. If these cells were depleted, we would have expected to see an increase in inflammation, which we did not.

      Instead, anti-Ly6G-treated mice were protected. Thus, we cannot comment on any presumed role of G-MDSCs in the LPS+HVV-induced severe ALI model that we used.

      (3) The authors found that TH inhibited IL-1β release from macrophages, which led to less NETs formation and albumin leakage in the alveolar space in their lung injury model. A graphical abstract could be included suggesting a cellular mechanism.

      Thank you for summarizing our findings and for the suggestion. Unfortunately, eLife does not publish graphical abstracts. We have tried to describe this mechanism in the discussion.

      (4) If macrophages are responsible for IL1b release, which induces NETosis via IL1R1, what happens if you deplete macrophages? What is the role of epithelial cells?

      Previous studies have found that macrophage depletion is protective in several models of ALI (Eyal. Intensive Care Med. 2007;33:1212–1218; Lindauer. J Immunol. 2009;183:1419–1426), and other researchers have found that airway epithelial cells did not contribute to IL-1β secretion (Tang. PLoS ONE. 2012;7:e37689). We have previously reported that epithelial cells produce IL-18 without an LPS priming signal during LPS+HVV (Nosaka et al. Front Immunol 2020;11; 207). Thus, IL-18 is not sufficient to induce hypoxemia, as Saline+HVV treated mice do not develop hypoxemia (Nosaka et al. Front Immunol 2020;11; 207). We will now add this point to the revised discussion of the manuscript.

      Reviewer #2 (Public review): 

      Summary: 

      The manuscript by Nosaka et al is a comprehensive study exploring the involvement of IL1beta signaling in a 2-hit model of lung injury + ventilation, with a focus on modulation by hypothermia. 

      Strengths: 

      The authors demonstrate quite convincingly that interleukin 1 beta plays a role in the development of ventilator-induced lung injury in this model, and that this role includes the regulation of neutrophil extracellular trap formation. The authors use a variety of in vivo animal-based and in vitro cell culture work, and interventions including global gene knockout, cell-targeted knockout and pharmacological inhibition, which greatly strengthen the ability to make clear biological interpretations. 

      We thank the Reviewer for their positive feedback.

      Weaknesses: 

      A primary point for open discussion is the translatability of the findings to patients. The main model used, one of intratracheal LPS plus mechanical ventilation is well accepted for research exploring the pathogenesis and potential treatments for acute respiratory distress syndrome (ARDS). However, the interpretation may still be open to question - in the model here, animals were exposed to LPS to induce inflammation for only 2 hours, and seemingly displayed no signs of sickness, before the start of ventilation. This would not be typical for the majority of ARDS patients, and whether hypothermia could be effective once substantial injury is already present remains an open question. The interaction between LPS/infection and temperature is also complicated - in humans, LPS (or infection) induces a febrile, hyperthermic response, whereas in mice LPS induces hypothermia (eg. Ganeshan K, Chawla A. Nat Rev Endocrinol. 2017;13:458-465). Given this difference in physiological response, it is therefore unclear whether hypothermia in mice and hypothermia in humans are easily comparable. Finally, the use of only young, male animals such as in the current study has been typical but may be criticised as limiting translatability to people. 

      Therefore while the conclusions of the paper are well supported by the data, and the biological pathways have been impressively explored, questions still remain regarding the ultimate interpretations.  

      We agree with the reviewer that at two hours post LPS there is only minimal pulmonary inflammation (Dagvadorj et al Immunity 42, 640–653). This is a limitation of the experimental model we used in our study. Additionally, as the reviewer pointed out, LPS induces hyperthermia in humans, but it is also well established that physiological hypothermia occurs in humans with severe infections and sepsis (Baisse. Am J Emerg Med. 2023 Sep: 71: 134-138; Werner. Am J Emerg Med. 2025 Feb;88:64-78). Therefore, the difference between human and mouse responses to sepsis or infections may be more nuanced. Furthermore, it is important to distinguish between physiological hypothermia (just <36°C) and therapeutic hypothermia (typically 32-34°C). We will add to the discussion whether hypothermia serves as a protective response and whether the transition from normothermia to hyperthermia could have detrimental effects. As the Reviewer points out, we only used young male mice in our study; we will also add this point to the revised discussion as a limitation of our study.

    1. eLife Assessment

      This study highlights ITCH as a regulator of SARS-CoV-2 replication by promoting K63-linked ubiquitination of M and E proteins. While the findings are potentially useful, the approaches are overly reliant on ectopic expression models and lack direct mechanistic evidence that ubiquitination of M and E has functional relevance. Accordingly, the strength of evidence is incomplete, as further experiments are needed to validate the findings and address potential confounding factors.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigated the role of the E3 ubiquitin ligase ITCH in regulating the viral life cycle of SARS-CoV-2. The authors showed that ITCH mediates ubiquitination of the membrane (M) and envelope (E) proteins of SARS-CoV-2. Ubiquitination of E and M results in enhanced interactions between the structural proteins and redistribution of the structural proteins into autophagosomes. The authors claim that the enhanced interactions between structural proteins and trafficking of the structural proteins into autophagosomes contribute to SARS-CoV-2 replication and egress, suggesting ITCH as a potential antiviral target. ITCH also alters the cellular distribution of host proteases important for spike cleavage, which protects and stabilizes spike from cleavage. The authors also demonstrated that SARS-CoV-2 replication is augmented by ITCH, as virus replication is significantly impaired in cells lacking ITCH expression.

      Strengths:

      The authors provided high-quality data with appropriate experimental controls to justify their claims and conclusions. The mechanistic analyses are excellent and presented in a logical manner. The investigation of the role of ubiquitination in coronavirus assembly and egress is novel as most previous studies focused on its role in mediating innate immune responses.

      Weaknesses:

      Although the authors showed that ITCH ubiquitinates E and M proteins, the claim that such ubiquitination promotes virion assembly and egress is circumstantial. The enhanced interaction between the structural proteins and targeting of ubiquitinated structural proteins into autophagosomes does not necessarily result in increased virion production and release as suggested by the authors. There is a disconnect between the ubiquitination of structural proteins and the role of ITCH in augmenting virus replication as shown in Fig. 6A and B. In addition, the authors showed that the catalytic activity of ITCH is important for the localization and maturation of host proteases. However, the underlying mechanism is unknown. Also, it is unclear how protection of spike from cleavage conferred by ITCH explains its role in promoting replication, as a lack of spike cleavage would inevitably compromise entry. The major weakness of the manuscript is the lack of experimental data that explains the molecular role of ITCH in relation to the phenotype observed during SARS-CoV-2 infection.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript Qiwang Xiang et al. investigated the role of the E3 ubiquitin ligase ITCH in the life cycle of SARS-CoV-2. They claim the following:

      i) ITCH promotes virion assembly by interacting with E and M proteins and enhancing their K63-linked ubiquitination.

      ii) ITCH-mediated ubiquitination promotes autophagosome-dependent secretion of viral particles.

      iii) ITCH stabilizes the viral spike protein by impairing its processing by furin and cathepsin L proteases.

      The manuscript provides an interesting exploration of ITCH's role in the SARS-CoV-2 life cycle but requires additional work to strengthen key claims and address potential confounding factors.

      Strengths:

      The experiments are sufficiently clear in documenting that ITCH activity is critical for efficient SARS-CoV-2 replication and for K63-linked ubiquitination of the M and E proteins.

      Weaknesses:

      • The manuscript does not convincingly demonstrate how ITCH-mediated ubiquitination of E and M impacts virus assembly and release. Identifying the specific lysine residues in M and E targeted by ITCH, and generating mutant VLPs or recombinant viruses, would strengthen the conclusions.

      • Most of the conclusions rely on ITCH overexpression data, which may have off-target effects on Golgi integrity and vesicular trafficking. For instance, Figure 4F provides evidence of altered Golgi morphology and TGN46 fragmentation, raising concerns that ITCH overexpression could indirectly mislocalize furin, affecting S1/S2 cleavage of the spike protein. In addition, inhibition of furin activity may also lead to off-target effects, given its role in processing numerous host proteins.

      • Similarly, ITCH overexpression is likely to indirectly affect cathepsin-L maturation. In addition, the manuscript does not clarify how impaired cathepsin L activity would influence virus assembly or release.

      • A major concern is also the lack of quantification and statistical analysis of immunofluorescence images throughout the manuscript, which undermines the reliability of these observations.

    4. Reviewer #3 (Public review):

      Summary:

      Xiang et al. investigated the role of ubiquitin E3 ligase ITCH in SARS-CoV-2 replication. First, they described the role of ITCH on the structural proteins. Here, the ubiquitination of E and M (but not S) leads to an enhanced interaction and presumably virion assembly. In addition, E and M ubiquitination seems to be necessary for p62-guided sequestration into autophagosomes for secretion. Furthermore, ITCH regulates S proteolytic cleavage by changing furin localization and inhibiting CTSL protease maturation. In addition, SARS-CoV-2 infection upregulates ITCH phosphorylation, whereas knockout of ITCH reduces SARS-CoV-2 replication.

      Strengths:

      The proposed study is of interest to the virology community because it aims to elucidate the role of ubiquitination by ITCH in SARS-CoV-2 proteins. Understanding these mechanisms will address broadly applicable questions about coronavirus biology and enhance our knowledge of ubiquitination's diverse functions in cell biology.

      Weakness:

      The involvement of ubiquitin ligases in SARS-CoV-2 replication is not entirely new (see E3 Ubiquitin Ligase RNF5; Yuan et al., 2022; Li et al., 2023). While the data generally support the conclusions, additional work is needed to confirm the role of ITCH in SARS-CoV-2 replication in a biologically relevant context. The vast majority of data is based on transient overexpression experiments of ITCH, which ultimately leads to massive ubiquitination of several viral and host cell factors, including potentially low-affinity substrates not typically recognized under physiological conditions. In addition to that, nearly all experiments were done in cells co-overexpressing ITCH and the viral structural proteins (or cellular proteases) in HEK293T cells. Therefore, a proteomic analysis of protein ubiquitination in a) SARS-CoV-2-infected cells (ideally several cell types) and b) SARS-CoV-2-infected v2T-ITCH-KO cells would verify the ITCH-related ubiquitination of e.g., E and M and would strengthen the whole manuscript. In addition, the few key experiments using SARS-CoV-2 infected cells were performed in VeroE6 cells, which are neither human nor lung-derived. Only in one experiment were lung-derived Calu3 cells included.

      Moreover, the manuscript names ITCH as a central regulator of SARS-CoV-2 replication. If ITCH is beneficial for E and M interaction and thereby aids virion assembly, showing its effect on VLP production would be desirable. Clarifications regarding data acquisition and data analysis could strengthen the manuscript and its conclusions.

    1. eLife Assessment

      NCX1 is an important cardiac Ca2+/Na+ exchanger whose activity is tightly regulated. This manuscript describes the structural basis of activation by the lipid PIP2 and inhibition by binding of a small molecule to NCX1. These results provide key insights into NCX1 regulation and cellular Ca2+ signaling, but the evidence presented is still incomplete.

    2. Reviewer #1 (Public review):

      This study uses structural and functional approaches to investigate the regulation of the Na/Ca exchanger NCX1 by an activator, PIP2, and an inhibitor, SEA0400.

      State-of-the-art methods are employed, and the data are of high quality and presented very clearly. The manuscript combines two rather different studies (one on PIP2; and one on SEA0400) neither of which is explored in the depth one might have hoped to form robust conclusions and significantly extend knowledge in the field.

      The novel aspect of this work is the study of PIP2. Unfortunately, technical limitations precluded structural data on binding of the native PIP2, so an unnatural short-chained analog, di-C8 PIP2, was used instead. This raises the question of whether these two molecules, which have similar but distinct profiles of activation, actually share the same binding pocket and mode of action. In an effort to address this, the authors mutate key residues predicted to be important in forming the binding site for the phosphorylated head group of PIP2. However, none of these mutations prevent PIP2 activation. The only ones that have a significant effect also influence the Na-dependent inactivation process independently of PIP2, thus casting doubt on their role in PIP2 binding, and thus on the identification of the PIP2 binding site. A more extensive mutagenesis study, based on the di-C8 PIP2 binding site, would have given more depth to this work and might have been more revealing mechanistically.

      The SEA0400 aspect of the work does not integrate particularly well with the rest of the manuscript. This study confirms the previously reported structure and binding site for SEA0400 but provides no further information. While interesting speculation is presented regarding the connection between SEA0400 inhibition and Na-dependent inactivation, further experiments to test this idea are not included here.

    3. Reviewer #2 (Public review):

      The study by Xue et al. reports the structural basis for the regulation of the human cardiac sodium-calcium exchanger, NCX1, by the endogenous activator PIP2 and the small molecule inhibitor SEA0400. This well-written study contextualizes the new data within the existing literature on NCX1 and the broader NCX family. This work builds upon the authors' previous study (Xue et al., 2023), which presented the cryo-EM structures of human cardiac NCX1 in both inactivated and activated states. The 2023 study highlighted key structural differences between the active and inactive states and proposed a mechanism where the activity of NCX1 is regulated by the interactions between the ion-transporting transmembrane domain and the cytosolic regulatory domain. Specifically, in the inward-facing state and at low cytosolic calcium levels, the transmembrane (TM) and cytosolic domains form a stable interaction that results in the inactivation of the exchanger. In contrast, calcium binding to the cytosolic domain at high cytosolic calcium levels disrupts the interaction with the TM domain, leading to active ion exchange.

      In the current study, the authors present two mechanisms explaining how PIP2 stimulates NCX1 activity by destabilizing the protein's inactive state (i.e., by disrupting the interaction between the TM domain and the cytosolic domain) and how SEA0400 stabilizes this interaction, thereby acting as a specific inhibitor of the system.

      The first part of the results section addresses the effect of PIP2 and PIP2 diC8 on NCX1 activity. This is pertinent as the authors use the diC8 version of this lipid (which has a shorter acyl chain) in their subsequent cryo-EM structure due to the instability of native PIP2. I am not an electrophysiology expert; however, my main comment would be to ask whether there is sufficient data here to characterise fully the differences between PIP2 and PIP2 diC8 on NCX1 function. It appears from the text that this study is the first to report these differences, so perhaps this data needs to be more robust. The spread of the data points in Figure 1B is possibly a little unconvincing given that only six measurements were taken. Why is there one outlier in Figure 1A? Were these results taken using the same batch of oocytes? Are these technical or biological replicates? Is the convention to use statistical significance for these types of experiments?

      I am also somewhat skeptical about the modelling of the PIP2 diC8 molecule. The authors state, "The density of the IP3 head group from the bound PIP2 diC8 is well-defined in the EM map. The acyl chains, however, are flexible and could not be resolved in the structure (Fig. S2)."

      However, the density appears rather ambiguous to me, and the ligand does not fit well within the density. Specifically, there is a large extension in the volume near the phosphate at the 5' position, with no corresponding volume near the 4' phosphate. Additionally, there is no bifurcation of the volume near the lipid tails. I attempted to model cholesterol hemisuccinate (PDB: Y01) into this density, and it fits reasonably well - at least as well as PIP2 diC8. I am also concerned that if this site is specific for PIP2, then why are there no specific interactions with the lipid phosphates? How can the authors explain the difference between PIP2 and PIP2 diC8 if the acyl chains don't make any direct interactions with the TM domain? In short, the structures do not explain the functional differences presented in Figure 1.

      The side chain densities for Arg167 and Arg220 are also quite weak. While there is some density for the side chain of Lys164, it is also very weak. I would expect that if this site were truly specific for PIP2, it should exhibit greater structural rigidity - otherwise, how is this specific?

      Given this observation, have the authors considered using other PIP2 variants to determine if the specificity lies with PI4,5P2 as opposed to PI3,5P2 or PI3,4P2? A lack of specificity may explain the observed poor density.

      I also noticed many lipid-like densities in the maps for this complex. Is it possible that the authors overlooked something? For instance, there is a cholesterol-like density near Val51, as well as something intriguing near Trp763, where I could model PIP2 diC8 (though this leads to a clash with Trp763). I wonder if the authors are working with mixed populations in their dataset. The accompanying description of the structural changes is well-written (assuming it is accurate).

      I would recommend that the authors update the figures associated with this section, as they are currently somewhat difficult to interpret without prior knowledge of NCX architecture. My suggestions include:

      - Including the density for the PIP2 diC8 in Figure 2A.

      - Adding membrane boundaries (cytosolic vs. extracellular) in Figure 2B.

      - Labeling the cytosolic domains in Figure 2B.

      - Adding hydrogen bond distances in Figure 2A.

      - Detailing the domain movements in Figure 2B (what is the significance of the grey vs. blue structures?).

      The section on the mechanism of SEA0400-induced inactivation is strong. The maps are of better quality than those for the PIP2 diC8 complex, and the ligand fits well. However, I noticed a density peak below F02 on SEA0400 that lies within hydrogen bonding distance of Asp825. Is this a water molecule? If so, is this significant?

      Furthermore, there are many unmodeled regions that are likely cholesterol hemisuccinate or detergent molecules, which may warrant further investigation.

      The authors introduce SEA0400 as a selective inhibitor of NCX1; however, there is little to no comparison between the binding sites of the different NCX proteins. This section could be expanded. Perhaps Fig. 4C could include sequence conservation data.

      Additionally, is the fenestration in the membrane physiological, or is it merely a hole forced open by the binding of SEA0400? I was unclear as to whether the authors were suggesting a physiological role for this feature, similar to those observed in sodium channels.

    4. Reviewer #3 (Public review):

      NCXs are key Ca2+ transporters located on the plasma membrane, essential for maintaining cellular Ca2+ homeostasis and signaling. The activities of NCX are tightly regulated in response to cellular conditions, ensuring precise control of intracellular Ca2+ levels, with profound physiological implications. Building upon their recent breakthrough in determining the structure of human NCX1, the authors obtained cryo-EM structures of NCX1 in complex with its modulators, including the cellular activator PIP2 and the small molecule inhibitor SEA0400. Structural analyses revealed mechanistically informative conformational changes induced by PIP2 and elucidated the molecular basis of inhibition by SEA0400. These findings underscore the critical role of the interface between the transmembrane and cytosolic domains in NCX regulation and small molecule modulation. Overall, the results provide key insights into NCX regulation, with important implications for cellular Ca2+ homeostasis.

    1. eLife Assessment

      This valuable paper reports machine learning-based image analysis pipelines for the automated segmentation of micronuclei and the detection and sorting of micronuclei-containing cells. These are powerful new tools for researchers who study micronuclei and their physiologic consequences. The analysis of the new tools and their benchmarking is rigorous and convincing; applications and remaining limitations are well explained in the paper.

    2. Reviewer #1 (Public review):

      DiPeso et al. develop two tools to i) classify micronucleated (MN) cells, which they call VCS MN, and ii) segment micronuclei and nuclei with MNFinder. They then use these tools to identify transcriptional changes in MN cells.

      The strengths of this study are:

      - Developing highly specialized tools to speed up the analysis of specific cellular phenomena such as MN formation and rupture is likely valuable to the community and neglected by developers of more generalist methods.

      - A lot of work and ideas have gone into this manuscript. It is clearly a valuable contribution.

      - Combining automated analysis, single-cell labeling, and cell sorting is an exciting approach to enrich for phenotypes of interest, which the authors demonstrate here.

      The authors addressed my original concerns related to the first version of this manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      Micronuclei are aberrant nuclear structures frequently seen following the missegregation of chromosomes. The authors present two image analysis methods, one robust and another rapid, to identify micronuclei (MN) bearing cells. To analyse their software efficacy, the authors study images of cells treated with MPS1 inhibitor to induce chromosome missegregation. Next, the authors use RNA-seq to assess the outcomes of their MN-identifying methods: they do not observe a transcriptomic signature specific to MN but find changes that correlate with aneuploidy status. Overall, this work offers new tools to identify MN-presenting cells, and it sets the stage with clear benchmarks for further software development.

      Strengths:

      Currently, there are no robust MN classifiers with a clear quantification of their efficiency across cell lines (mIoU score). The software presented here tries to address this gap. GitHub material (images, ground truth labels, tools, protocols, etc.) provided is a great asset to computational biologists. The method has been tested in more than one cell line. This method can help integrate cell biology and 'omics' data, making it suitable for multimodal studies.

      Weaknesses:

      Although the classifier outperforms available tools for MN segmentation by providing mIoU, it's not yet at a point where it can be reliably applied to functional genomics assays where we expect a range of phenotypic penetrance in most cell lines (e.g., misshapen, multinucleated, and lagging DNA in addition to micronucleated cells). The discussion considers the nature and proportion of MN in RPE1 cells, and how the classifier is well-suited for RPE1 that predominantly display MN structures. Whether the classifier can rigorously assign MN-presenting cells amidst drastic nuclear aberrancies following a spindle checkpoint loss needs to be tested in the future.

    4. Reviewer #3 (Public review):

      Summary:

      The authors develop automated methods to visually identify micronuclei (MN) and MN-containing cells. The authors then use these methods to isolate MN-containing RPE-1 cells post-photoactivation and analyze transcriptional changes in cells with and without micronuclei. The authors find that RPE-1 cells with MN have similar transcriptomic changes as aneuploid cells and that MN rupture does not lead to vast changes in the transcriptome.

      Strengths:

      The authors develop a method that allows for automating measurements and analysis of micronuclei. This has been something that the field has been missing for a long time. Using such a method has the potential to greatly enhance the field's ability to analyze micronuclei and understand the downstream consequences. The authors also develop a method to identify cells with micronuclei in real-time, mark them using photoconversion, and then isolate them via cell sorting, which could change the way we isolate and study MN-containing cells, and the scale at which we do it. The authors use this method to look at the transcriptome. This method is very powerful as it can allow for the separation of a heterogenous population and subsequent analysis with a much higher sample number than previously possible.

      Weaknesses:

      The major weakness of this paper is the transcriptomic analysis of MN. There is in general large variance between replicates in experiments looking at cells with ruptured versus intact micronuclei. This limits our ability to assess whether the lack of changes reflects a true absence of differences between these populations or experimental limitations. More transcriptomic analysis will be necessary to fully understand the downstream consequences of MN rupture.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      DiPeso et al. develop two tools to (i) classify micronucleated (MN) cells, which they call VCS MN, and (ii) segment micronuclei and nuclei with MNFinder. They then use these tools to identify transcriptional changes in MN cells.

      The strengths of this study are:

      (1) Developing highly specialized tools to speed up the analysis of specific cellular phenomena such as MN formation and rupture is likely valuable to the community and neglected by developers of more generalist methods.

      (2) A lot of work and ideas have gone into this manuscript. It is clearly a valuable contribution.

      (3) Combining automated analysis, single-cell labeling, and cell sorting is an exciting approach to enrich phenotypes of interest, which the authors demonstrate here.

      Weaknesses:

      (1) Images and ground truth labels are not shared for others to develop potentially better analysis methods.

      We regret this omission and thank the reviewer for pointing it out. Both the images and ground truth labels for VCS MN and MNFinder are now available on the lab’s github page and described in the README.txt files. VCS MN: https://github.com/hatch-lab/fast-mn. MNFinder: https://github.com/hatch-lab/mnfinder.

      (2) Evaluations of the methods are often not fully explained in the text.

      The text has been extensively updated to include a full description of the methods and choices made to develop the VCS MN and MNFinder image segmentation modules.

      (3) To my mind, the various metrics used to evaluate VCS MN reveal it not to be terribly reliable. Recall and PPV hover in the 70-80% range except for the PPV for MN+. It is what it is - but do the authors think one has to spend time manually correcting the output or do they suggest one uses it as is?

      VCS MN attempts to balance precision and recall with speed to reduce the fraction of MN changing state from intact to ruptured during a single cell cycle during a live-cell isolation experiment. In addition, we chose to prioritize inclusion of small MN adjacent to the nucleus in our positive calls. This meant that there were more false positives (lower PPV) than obtained by other methods but allowed us to include this highly biologically relevant class of MN in our MN+ population. Thus, for a comprehensive understanding of the consequences of MN formation and rupture, we recommend using the finder as is. However, for other visual cell sorting applications where a small number of highly pure MN positive and negative cells is preferred, such as clonal outgrowth or metastasis assays, we would recommend using the slower, but more precise, MNFinder to get a higher precision at a cost of temporal resolution. In addition, MNFinder, with its higher flexibility and object coverage, is recommended for all fixed cell analyses.
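
      The trade-off between recall and positive predictive value (PPV) discussed above can be illustrated with a minimal sketch. The function name and the counts below are hypothetical and chosen only for illustration; they are not taken from the paper or from the VCS MN or MNFinder code.

```python
def recall_and_ppv(true_positives: int, false_positives: int,
                   false_negatives: int) -> tuple[float, float]:
    """Return (recall, PPV) computed from confusion-matrix counts.

    recall = TP / (TP + FN): fraction of real MN+ cells that were called MN+.
    PPV    = TP / (TP + FP): fraction of MN+ calls that are real MN+ cells.
    """
    actual_positive = true_positives + false_negatives
    predicted_positive = true_positives + false_positives
    recall = true_positives / actual_positive if actual_positive else 0.0
    ppv = true_positives / predicted_positive if predicted_positive else 0.0
    return recall, ppv


# Hypothetical counts (assumption, for illustration only): a classifier that
# accepts more borderline calls gains true positives but also false positives.
recall, ppv = recall_and_ppv(true_positives=80, false_positives=30,
                             false_negatives=20)
print(f"recall={recall:.2f}, PPV={ppv:.2f}")
```

      In this toy case, accepting more borderline objects as MN keeps recall high while lowering PPV, which is the direction of the trade-off described for VCS MN.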

      Reviewer #2 (Public review):

      Summary:

      Micronuclei are aberrant nuclear structures frequently seen following the missegregation of chromosomes. The authors present two image analysis methods, one robust and another rapid, to identify micronuclei (MN) bearing cells. The authors induce chromosome missegregation using an MPS1 inhibitor to check their software outcomes. In missegregation-induced cells, the authors do not distinguish cells that have MN from those that have MN with additional segregation defects. The authors use RNAseq to assess the outcomes of their MN-identifying methods: they do not observe a transcriptomic signature specific to MN but find changes that correlate with aneuploidy status. Overall, this work offers new tools to identify MN-presenting cells, and it sets the stage with clear benchmarks for further software development.

      Strengths:

      Currently, there are no robust MN classifiers with a clear quantification of their efficiency across cell lines (mIoU score). The software presented here tries to address this gap. GitHub material (tools, protocols, etc) provided is a great asset to naive and experienced computational biologists. The method has been tested in more than one cell line. This method can help integrate cell biology and 'omics' studies.

      Weaknesses:

      Although the classifier outperforms available tools for MN segmentation by providing mIoU, it's not yet at a point where it can be reliably applied to functional genomics assays where we expect a range of phenotypic penetrance.

      We agree that the MNFinder module has limitations with regard to the degree of nuclear atypia and cell density that can be tolerated. Based on the recall and PPV values and their consistency across the majority of conditions analyzed, we believe that MNFinder can provide reliable results for MN frequency, integrity, shape, and label characteristics in a functional genomics assay in many commonly used adherent cell lines. We also added a discussion of caveats for these analyses, including the facts that highly lobulated nuclei will have higher false positive rates and that high cell confluency may require additional markers to ensure highly accurate assignment of MN to nuclei.
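
      For readers unfamiliar with the mIoU score mentioned in this exchange, the following is a minimal sketch of a per-pixel mean intersection-over-union between a predicted and a ground-truth label mask. It assumes flattened masks with class 0 (background) and class 1 (MN); the exact mIoU definition used in the paper (for example per object rather than per pixel) may differ, and the function name and data are hypothetical.

```python
def mean_iou(pred, truth, classes=(0, 1)):
    """Mean intersection-over-union across classes for flattened label masks."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    ious = []
    for c in classes:
        # Pixels labeled c in both masks (intersection) or in either (union).
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy six-pixel masks (assumption, for illustration only).
truth = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
print(f"mIoU = {mean_iou(pred, truth):.2f}")
```

      A score of 1.0 means the predicted and ground-truth masks coincide exactly, so lower values indicate missed or spurious segmented pixels.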

      Spindle checkpoint loss (e.g., MPS1 inhibition) is expected to cause a variety of nuclear atypia: misshapen, multinucleated, and micronucleated cells. It may be difficult to obtain a pure MN population following MPS1 inhibitor treatment, as many cells are likely to present MN among multinucleated or misshapen nuclear compartments. Given this situation, the transcriptomic impact of MN is unlikely to be retrieved using this experimental design, but this does not negate the significance of the work. The discussion will have to consider the nature, origin, and proportion of MN/rupture-only states - for example, lagging chromatids and unaligned chromosomes can result in different states of micronuclei and also distinct cell fates.

      We appreciate the reviewer’s comments and now quantify the frequency of other nuclear atypias and MN chromosome content in RPE1 cells after 24 h Mps1 inhibition (Fig. S1). In summary, we find only small increases in nuclear atypia, including multinucleate cells, misshapen nuclei, and chromatin bridges, compared to the large increase in MN formation. This contrasts with what is observed when mitosis is delayed using nocodazole or CENPE inhibitors where nuclear atypia is much more frequent. Importantly, after Mps1 inhibition, RPE1 cells with MN were only slightly more likely to have a misshapen nucleus compared to cells without MN (Fig. S1C).

      Interestingly, this analysis showed that the VCS MN pipeline, which uses the Deep Retina segmenter to identify nuclei, has a strong bias against lobulated nuclei and frequently fails to find them (Fig. S2B). Therefore, the cell populations analyzed by RNAseq were largely depleted of highly misshapen nuclei and differences in nuclear atypia frequency between MN+ and MN- cells in the starting population were lost (Fig. S9A, compare to Fig. S1C). This strongly suggests that the transcript changes we observed reflect differences in MN frequency and aneuploidy rather than differences in nuclei morphology.

      We agree with the reviewer that MN rupture frequency and formation, and downstream effects on cell proliferation and DNA damage, are sensitive to the source of the missegregated chromatin. In the revised manuscript we make clear that we chose Mps1 inhibition because it is strongly biased towards whole chromosome MN (Fig. S1E), limiting signal from DNA damage products, including chromosome fragments and chromatin bridges. This provides a baseline to disambiguate the consequences of micronucleation and DNA damage in more complex chromosome missegregation processes, such as DNA replication disruption and irradiation.

      Reviewer #3 (Public review):

      Summary:

      The authors develop a method to visually analyze micronuclei using automated methods. The authors then use these methods to isolate MN post-photoactivation and analyze transcriptional changes in cells with and without micronuclei of RPE-1 cells. The authors observe in RPE-1 cells that MN-containing cells show similar transcriptomic changes as aneuploidy, and that MN rupture does not lead to vast changes in the transcriptome.

      Strengths:

      The authors develop a method that allows for automating measurements and analysis of micronuclei. This has been something that the field has been missing for a long time. Using such a method has the potential to advance micronuclei biology. The authors also develop a method to identify cells with micronuclei in real time and mark them using photoconversion and then isolate them via FACS. The authors use this method to study the transcriptome. This method is very powerful as it allows for the sorting of a heterogenous population and subsequent analysis with a much higher sample number than could be previously done.

      Weaknesses:

      The major weakness of this paper is that the results from the RNA-seq analysis are difficult to interpret as very few changes are found to begin with between cells with MN and cells without. The authors have to use a 1.5-fold cut-off to detect any changes in general. This is most likely due to the sequencing read depth used by the authors. Moreover, there are large variances between replicates in experiments looking at cells with ruptured versus intact micronuclei. This limits our ability to assess if the lack of changes is due to truly not having changes between these populations or experimental limitations. Moreover, the authors use RPE-1 cells which lack cGAS, which may contribute to the lack of changes observed. Thus, it is possible that these results are not consistent with what would occur in primary tissues or just in general in cells with a proficient cGAS/STING pathway.

      We agree with the reviewer’s assessment of the limitations of our RNA-Seq analysis. After additional analysis, we propose an alternative explanation for the lower expression changes we observe in the MN+ and Mps1 inhibitor RNA-Seq experiments. In summary, we find that VCS MN has a strong bias against highly lobulated nuclei that depletes this class of cells from both the bulk analysis and the micronucleated cell populations (Fig. S9A). Based on this result, we propose that our analysis reduces the contribution of nuclear atypia to these transcriptional changes and that nuclear morphology changes are likely a signaling trigger associated with aneuploidy.

      We believe that this finding strengthens our overall conclusion that MN formation and rupture do not cause transcriptional changes, as suppressing the signaling associated with nuclei atypia should increase sensitivity to changes from the MN. However, we cannot completely rule out that MN formation or rupture cause a broad low-level change in transcription that is obscured by other signals in the dataset.

      As to cGAS signaling, several follow-up papers and even the initial studies from the Greenberg lab show that MN rupture does not activate cGAS and does not cause cGAS/STING-dependent signaling in the first cell cycle (see citations and discussion in text). Therefore, we expect the absence of cGAS in RPE1 cells will have no effect in the first cell cycle, but could alter the transcriptional profile after mitosis. Although analysis of RPE1 cGAS+ cells or primary cells in these experiments will be required to definitively address this point, we believe that our interpretation of our RNAseq results is sufficiently backed up by the literature to warrant our conclusion that MN formation and rupture do not induce a transcriptional response in the first cell cycle.

      Reviewer #1 (Recommendations for the authors):

      I do not recommend additional experimental or computational work. Instead, I just recommend adapting the claims of the manuscript to what has been done. I am just asking for further clarification and minor rewriting.

      (1) The manuscript is written like a molecular biology paper with sparse explanations of the authors' reasoning, especially in the development of their algorithms. I was often lost as to why they did things in one way or another.

      The revised manuscript has thorough explanations and additional data and graphics defining how and why the VCS MN and MNFinder modules were developed. We hope that this clears up many of the questions the reviewer had and appreciate their guidance on making it more readable for scientists from different backgrounds.

      (2) Evaluations of their method are often not fully explained, for example:

      "On average, 75% of nuclei per field were correctly segmented and cropped."

      "MN segments were then assigned to 'parent' nuclei by proximity, which correctly associated 97% of MN."

      Were there ground truth images and labels created? How many? For example, I don't know how the authors could even establish a ground-truth for associating MNs to nuclei if MNs happened to be almost equidistant between two nuclei in their images.

      I suggest a separate subsection early in the Results section where the underlying imaging data + labels are presented.

      We added new sections to the text and figures at the beginning of the VCS MN and MNFinder subsections (Fig. S2 and Fig. S5) with specific information about how ground truth images and labels were generated for both modules and how these were broken up for training, validation, and testing.

      We also added information and images to explain how ground truth MN/nucleus associations were derived. In summary, we took advantage of the fact that 2xDendra-NLS is present at low levels in the cytoplasm to identify cell boundaries. This, combined with a subconfluent cell population, allowed us to unambiguously group MN and nuclei for an estimated 98% of MN. These identifications were used to generate ground truth labels and analyze how well proximity defines MN/nuclei groups (Figs. S1 and S2).

      (3) Overall, I find the sections long and more subtitles would help me better navigate the manuscript.

      Where possible, we have added subtitles.

      (4) Everything following "To train the model, H2B channel images were passed to a Deep Retina neural net ..." is fully automated, it seems to me. Thus, there seems to be no human intervention to correct the output before it is used to train the neural network. Therefore, I do not understand why a neural network was trained at all if the pipeline for creating ground truth labels worked fully automatically. At least, the explanations are insufficient.

      We apologize for the initial lack of clarity in the text and included additional details in the revision. We used the Deep Retina segmenter to crop the raw images to areas around individual nuclei to accelerate ground truth labeling of MN. A trained user went through each nucleus crop and manually labeled pixels belonging to MN to generate the ground truth dataset for training, validation, and imaging in VCS MN (Fig. S2A).

      (5) To my mind, the various metrics used to evaluate VCS MN reveal it not to be terribly reliable. Recall and PPV hover in the 70-80% range except for the PPV for MN+. It is what it is - but do the authors think one has to spend time manually correcting the output or do they suggest one uses it as is? I understand that for bulk transcriptomics, enrichment may be sufficient but for many other questions, where the wrong cell type could contaminate the population, it is not.

      Remarks in the Results section on what the various accuracies mean for different applications would be good (so one does not need to wait for the Discussion section).

      One of the strengths of the visual cell sorting system is that any image analysis pipeline can be used with it. We used VCS MN for the transcriptomics experiment, but for other applications a user could run visual cell sorting in conjunction with MNFinder for increased purity while maintaining a reasonable recall or use a pre-existing MN segmentation program that gives 100% purity but captures only a specific subgroup of micronucleated cells (e.g. PIQUE). 

      To maintain readability, especially with the expansion of the results sections, we kept the discussion of how we envision using visual cell sorting for other MN-based applications in the discussion section.

      (6) I am confused about what "cell" is referring to in much of the manuscript. Is it the nucleus + MNs only? Is it the whole cell, which one would ordinarily think it is? If so, are there additional widefield images, where one can discern cell boundaries? I found the section "MNFinder accurately ..." very hard to read and digest for this reason and other ambiguous wording. I suggest the authors take a fresh look at their manuscript and see whether the text can be improved for clarity. I did not find it an easy read overall, especially the computational part.

      After re-examining how “cell” was used, we updated the text to limit its use to the MNFinder arm tasked with identifying MN-nucleus associations where the convex hull defined by these objects is used to determine the “cell” boundary. In all other cases we have replaced cell with “nucleus” because, as the reviewer points out, that is what is being analyzed and converted. We hope this is clearer.

      (7) Post-FACS PPVs are not that great (Figure 3c). It depends on the question one wants to answer whether ~70% PPV is good enough. Again, would be good to comment on.

      We added discussion of this result to the revision. In summary, a likely reason for the reduced PPV is that, although we maintain the cells in buffer with a Cdk1 inhibitor, we know that some proportion of the cells go through mitosis post-sorting. Since MN are frequently reincorporated into the nucleus after mitosis (Hatch et al., 2013; Zhang et al., 2015), we expect this to reduce the MN+ population. Thus, we expect that the PPV in the RNAseq population is higher than what we can measure by analyzing post-sorted cells that have been plated and analyzed later.

      (8) I am thoroughly confused as to why the authors claim that their system works in the "absence of genetic perturbations" and why they emphasize the fact that their cells are non-transformed: They still needed a fluorescent label and they induce MNs with a chemical Mps1 inhibitor. (The latter is not a genetic manipulation, of course, but they still need to enrich MNs somehow. That is, their method has not been tested on a cell population in which MNs occur naturally, presumably at a very low rate, unless I missed something.) A more careful description of the benefits of their method would be good.

      We apologize for the confusion on these points and hope this is clarified in the revision. We were comparing our system, which can be made using transient transfection, if desired, to current tools that disambiguate aneuploidy and MN formation by deleting parts of chromosomes or engineering double strand breaks with CRISPR to generate single chromosome-specific missegregation events. Most of these systems require transformed cancer cells to obtain high levels of recombination. In contrast, visual cell sorting can isolate micronucleated cells from any cell line that can exogenously express a protein, including primary cells and non-transformed cells like RPE1s.

      Other minor points:

      (1) The authors should not refer to "H2B channels" but to "H2B-emiRFP703 channels". It may seem obvious to the authors but for someone reading the manuscript for the very first time, it was not. I was not sure whether there were additional imaging modalities used for H2B/nucleus/chromatin detection before I went back and read that only fluorescence images of H2B-emiRFP703 were used. To put it another way, the authors are detecting fluorescence, not histones -- unless I misunderstood something.

      To address this point, we altered the text to read “H2B-emiRFP703” when discussing images of this construct. For MNFinder some images were of cells expressing H2B-GFP, which has also been clarified.

      (2) If the level of zoom on my screen is such that I can comfortably read the text, I cannot see much in the figure panels. The features that I should be able to see are the size of a title. The image panels should be magnified.

      In the revision, the images are appended to the end at full resolution to overcome this difficulty. Thank you for your forbearance.

      Reviewer #2 (Recommendations for the authors):

      The methods are adequately explained. The Results text narrating experiments and data analysis is clear. Interpretation of a few results could be clarified and strengthened as explained below.

      (1) RNAseq experiments are a good proof of principle. To strengthen their interpretation in Figures 4 and 6, I would recommend the authors cite published work on checkpoint/MPS1 loss-induced chromosome missegregation (PMID: 18545697, PMID: 33837239, PMC9559752) and consider in their discussion the 'origin' and 'proportion' of micronucleated cells and irregularly shaped nuclei expected in RPE1 lines. This will help interpret Figure 6 findings on aneuploidy signature accurately. Not being able to see an MN-specific signature could be due to the way the biological specimen is presented with a mixture of cells with 'MN only' or 'rupture' or 'MN along with misshapen nuclei'. These features may all link to aneuploidy rather than 'MN' specifically.

      We appreciate the reviewer’s suggestion and added a new analysis of nuclear atypia after Mps1 inhibition in RPE1 cells to Fig. S1. Overall, we found that Mps1 inhibition significantly, but modestly, increased the proportion of misshapen nuclei and chromatin bridges. Multinucleate cells were so rare that instead of giving them their own category we included them in “misshapen nuclei.” These results are consistent with images of Mps1i-treated RPE1 cells from He et al. 2019 and Santaguida et al. 2017 and distinct from the stronger changes in nuclear morphology observed after delaying mitosis by nocodazole or CENPE inhibition.

      We also found that the Deep Retina segmenter used to identify nuclei in VCS MN had a significant bias against highly lobulated nuclei (Fig. S2B) that led to misshapen nuclei being largely excluded from the RNAseq analyses. As a result we found no enrichment of misshapen nuclei, chromatin bridges, or dead/mitotic nuclear morphologies in MN+ compared to MN- nuclei in our RNASeq experiments (Fig. S9A).

      (2) As the authors clarify in the response letter, one round of ML is unlikely to result in fully robust software; additional rounds of ML with other markers will make the work robust. It will be useful to indicate other ML image analysis tools that have improved through such reiterations. They could use reviews on challenges and opportunities using ML approaches to support their statement. Also in the introduction, I would recommend labelling as 'rapid' instead of 'rapid and precise' method.

      We updated the text to reference review articles that discuss the benefit of additional training for increasing ML accuracy and changed the text to “rapid.”

      (3) The lack of live-cell studies does not allow the authors to distinguish the origin of MN (lagging chromatids or unaligned chromosomes). As explained in 1, considering these aspects in discussion would strengthen their interpretation. Live-cell studies can help reduce the dependencies on proximity maps (Figure S2).

      The revised text includes new references and data (Fig. S1E) demonstrating that Mps1 inhibition strongly biases towards whole chromosome missegregation and that MN are most likely to contain a single centromere positive chromosome rather than chromatin fragments or multiple chromosomes.

      (4) Mean Intersection over Union (mIOU) is a good measure to compare outcomes against ground truth. However, the mIOU is relatively low (Figure 2D) for HeLa-based functional genomics applications. It will help to discuss mIOU for other classifiers (non-MN classifiers) so that they can be used as a benchmark (this is important since the authors state in their response that they are the first to benchmark an MN classifier). There are publications for mitochondria, cell cortex, spindle, nuclei, etc. where IOU has been discussed.

      We added references to classifiers for other small cellular structures. We also evaluated major sources of error in MNFinder and found that false negatives are enriched in very small MN (3 to 9 pixels, or about 0.4 µm² – 3 µm², Fig. S6B). A similar result was obtained for VCS MN (Fig. S3B). Because small changes in the number of pixels identified in small objects can have outsized effects on mIoU scores, we suspect that this is exerting downward pressure on the mIoU value. Based on the PPV and recall values we identified, we believe that MNFinder is robust enough to use for functional genomics and screening applications with reasonable sample sizes.
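
      To illustrate why small objects weigh on mIoU, the following minimal sketch (not the MNFinder code; the helper names and object sizes are hypothetical) computes the intersection over union for a 9-pixel object and a 900-pixel object when the prediction misses two pixels in each.

```python
def iou(truth, pred):
    """Intersection over union of two sets of (row, col) pixel coordinates."""
    truth, pred = set(truth), set(pred)
    union = truth | pred
    if not union:
        return 1.0  # two empty masks agree perfectly
    return len(truth & pred) / len(union)


def square_mask(side):
    """Hypothetical helper: a filled side x side square of pixels."""
    return {(r, c) for r in range(side) for c in range(side)}


def drop_pixels(mask, n):
    """Simulate an undersegmented prediction by removing n pixels."""
    return set(sorted(mask)[n:])


small_truth = square_mask(3)    # 9 px, the upper end of the "very small" MN range
large_truth = square_mask(30)   # 900 px, an arbitrary large object

small_iou = iou(small_truth, drop_pixels(small_truth, 2))   # 7 / 9
large_iou = iou(large_truth, drop_pixels(large_truth, 2))   # 898 / 900

print(f"small object IoU: {small_iou:.3f}")
print(f"large object IoU: {large_iou:.3f}")
```

      The same two-pixel error costs the 9-pixel object far more IoU than the 900-pixel object, so a dataset rich in very small MN would be expected to pull the mean down.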

      (5) Figure 5 figure legend title is an overinterpretation. MN and rupture-initiated transcriptional changes could not be isolated with this technique where several other missegregation phenotypes are buried (see point 1 above).

      We decided to keep the figure legend title based on our analysis of known missegregation phenotypes in Fig. S1 and S9 showing that there is no difference in major classes of nuclear atypia between MN+ and MN- populations in this analysis. Although we cannot rule out that other correlated changes exist, we believe that the title represents the most parsimonious interpretation.

      Minor comments

      (1) The sentence in the introduction needs clarification and reference. "However, these interventions cause diverse "off-target" nuclear and cellular changes, including chromatin bridges, aneuploidy, and DNA damage." Off-target may not be the correct description since inhibiting MPS1 is expected to cause a variety of problems based on its role as a master kinase in multiple steps of the chromosome segregation process. Consider one of the references in point 1 for a detailed live-cell view of MPS1 inhibitor outcomes.

      We have changed “off-target” to “additional” for clarity.

      (2) In Figure 3 or S3, did the authors notice any association between the cell cycle phase and MN or rupture presence? Is this possible to consider based on FACS outcomes or nuclear shapes?

      Previous work by our lab and others has shown that MN rupture frequency increases during the cell cycle (Hatch et al., 2013; Joo et al., 2023). Whether this is stochastic or regulated by the cell cycle may depend on what chromosome is in the MN (Mammel et al., 2021) and likely the cell line. Unfortunately, the H2B-emiRFP703 fluorescence in our population is too variable to identify cell cycle stage from FACS or nuclear fluorescence analysis.

      (3) Figure 5 - Please explain "MA plot".

      An MA plot, or log fold-change (M) versus average (A) gene expression, is a way to visualize differentially expressed genes between two conditions in an RNASeq experiment and is used as an alternative to volcano plots. We chose them for our paper because most of the expression changes we observed were small and of similar significance and the MA plot spreads out the data compared to a volcano plot and allowed a better visualization of trends across the population.
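
      As a rough illustration of the two axes, the sketch below computes M and A for a single gene from a pair of expression values. It is a generic textbook form, not the pipeline used for the manuscript; the function name and the pseudocount are assumptions, and dedicated RNASeq packages define A on normalized counts.

```python
import math


def ma_values(control, treated, pseudocount=1.0):
    """Return (M, A) for one gene.

    M is the log2 fold change (treated over control) and A is the mean of the
    two log2 expression values. The pseudocount (an assumption here) avoids
    taking the log of zero for unexpressed genes.
    """
    log_control = math.log2(control + pseudocount)
    log_treated = math.log2(treated + pseudocount)
    m = log_treated - log_control
    a = (log_treated + log_control) / 2
    return m, a


# Illustrative counts only: 7 reads in control, 15 in treated.
m, a = ma_values(7, 15)
print(f"M = {m:.2f}, A = {a:.2f}")
```

      Plotting M against A for every gene places weakly and strongly expressed genes along the horizontal axis, which is why small fold changes are easier to compare across expression levels than in a volcano plot.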

      (4) Page 7: "our results strongly suggest that protein expression changes in MN+ and rupture+ cells are driven mainly by increased aneuploidy rather than cellular sensing of MN formation and rupture.". This is an overstatement considering the mIOU limits of the software tool and the non-exclusive nature of MN in their samples.

      We agree that we cannot rule out that an unknown masking effect is inhibiting our ability to observe small broad changes in transcription after MN formation or rupture. However, we believe we have minimized the most likely sources of masking effects, including nuclear atypia and large scale aneuploidy differences, and thus our interpretation is the most likely one.

      Reviewer #3 (Recommendations for the authors):

      Overall, the authors need to explain their methods better, define some technical terms used, and more thoroughly explain the parameters and rationale used when implementing these two protocols for identifying micronuclei; primarily as this is geared toward a more general audience that does not necessarily work with machine learning algorithms.

      (1) A clearer description in the methods as to how accuracy was calculated. Were micronuclei counted by hand or another method to assess accuracy?

      We significantly expanded the section on how the machine learning models were trained and tested, including how sensitivity and specificity metrics were calculated, in both the results and the methods sections. The code used to compare ground truth labels to computed masks is also now included in the MNFinder module available on the lab GitHub page.

      (2) Define positive predictive value.

      The text now says “the positive predictive value (PPV, the proportion of true positives, i.e. specificity) and recall (the proportion of MN found by the classifier, i.e. sensitivity)…”.
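
      For readers less familiar with these metrics, a minimal sketch of how the two values are conventionally computed from counts of true positives, false positives, and false negatives follows. PPV is also commonly called precision. The function name and the counts are illustrative assumptions and are not taken from the manuscript or the MNFinder code.

```python
def ppv_and_recall(true_pos, false_pos, false_neg):
    """Return (PPV, recall) from object-level counts.

    PPV    = TP / (TP + FP): of the objects called MN, the fraction that are MN.
    recall = TP / (TP + FN): of the real MN, the fraction that were found.
    """
    called = true_pos + false_pos
    real = true_pos + false_neg
    ppv = true_pos / called if called else float("nan")
    recall = true_pos / real if real else float("nan")
    return ppv, recall


# Illustrative counts only: 70 MN found correctly, 30 false calls, 20 MN missed.
ppv, recall = ppv_and_recall(true_pos=70, false_pos=30, false_neg=20)
print(f"PPV = {ppv:.2f}, recall = {recall:.2f}")
```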

      (3) Why is it a problem to use the VCS MN at higher magnifications where undersegmentation occurs? What do the authors mean by diminished performance (what metrics are they using for this?).

      We have included a representative image and calculated mIoU and recall for 40x magnification images analyzed by MNFinder after rescaling in Fig. 2A. In summary, VCS MN only correctly labeled a few pixels in the MN, which was sufficient to call the adjacent nucleus “MN+” but not sufficient for other applications, such as quantifying MN area. In addition, VCS MN did much worse at identifying all the MN in 40x images with a recall, or sensitivity, metric of 0.36. We are not sure why. Developing MNFinder provided a module that was well suited to quantify MN characteristics in fixed cell images, an important use case in MN biology.

      (4) The authors should compare MN that are analyzed and not analyzed using these methods and define parameters. Is there a size limitation? Closeness to the main nucleus?

      We added two new figures defining what contributes to module error for both VCS MN (Fig. S3) and MNFinder (Fig. S6). For VCS MN, false negatives are enriched in very large or very small MN and tend to be dimmer and farther from the nucleus than true positives. False positives are largely misclassification of small dim objects in the image as MN. For MNFinder, the most missed class of MN are very small ones (3-9 px in area) and the majority of false positives are misclassifications of elongated nuclear blebs as MN.

      (5) Are there parameters in how confluent an image must be to correctly define that the micronucleus belongs to the correct cell? The authors discussed that this was calculated based on predicted distance. However, many factors might affect proper calling on MN. And the authors should test this by staining for a cytosolic marker and calculating accuracy.

      We updated the text with more information about how the cytoplasm was defined using leaky 2x-Dendra2-NLS signal to analyze the accuracy of MN/nucleus associations (Fig. S2G-H). In addition, we quantified cell confluency and distance to the first and second nearest neighbor for each MN in our training and testing image datasets. We found that, as anticipated, cells were imaged at subconfluent concentrations with most fields having a confluency around 30% cell coverage (Fig. S2E) and that the average difference in distance between the closest nucleus to an MN and the next closest nucleus was 3.3-fold (Fig. S2F). We edited the discussion section to state that the ability of MN/nuclear proximity to predict associations at high cell confluencies would have to be experimentally validated.
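
      The proximity rule and the nearest-neighbor comparison described above can be sketched as follows. This is a simplified illustration, not the module's implementation: it assumes distances are measured between centroids, and the function name and coordinates are hypothetical.

```python
import math


def assign_to_nearest(mn_xy, nuclei_xy):
    """Assign one MN to its nearest nucleus.

    Returns (index of the nearest nucleus, ratio of the second-nearest distance
    to the nearest distance). A ratio near 1 means the MN is almost equidistant
    between two nuclei, so proximity alone is an unreliable guide.
    """
    if not nuclei_xy:
        raise ValueError("at least one nucleus is required")
    ranked = sorted((math.dist(mn_xy, xy), i) for i, xy in enumerate(nuclei_xy))
    nearest_dist, nearest_idx = ranked[0]
    if len(ranked) < 2 or nearest_dist == 0:
        return nearest_idx, math.inf
    return nearest_idx, ranked[1][0] / nearest_dist


# Illustrative centroids in pixels: two nuclei and one MN close to the first.
nuclei = [(0.0, 0.0), (10.0, 0.0)]
parent, ratio = assign_to_nearest((2.0, 0.0), nuclei)
print(f"parent nucleus: {parent}, second/first distance ratio: {ratio:.1f}")
```

      Reporting the ratio alongside the assignment makes it possible to flag ambiguous MN in dense fields rather than silently assigning them.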

      (6) The authors measure the ratio of Dendra2(Red) v. Dendra2 (Green) in Figure 3B to demonstrate that photoconversion is stable. This measurement, to me, is confusing, as in the end, the authors need to show that they have a robust conversion signal and are able to isolate these data. The authors should directly demonstrate that the Red signal remains by analyzing the percent of the Red signal compared to time point 0 for individual cells.

      We found a bulk analysis to be more powerful than trying to reidentify individual cells due to how much RPE1 cells move during the 4 and 8 hours between image acquisitions. In addition, we sort on the ratio between red and green fluorescence per cell, rather than the absolute fluorescence, to compensate for variation in 2xDendra-NLS protein expression between cells. Therefore, demonstrating that distinct ratios remained present throughout the time course is the most relevant to the downstream analysis.
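
      A small sketch of why a ratio compensates for expression differences is given below. The intensities and the function name are illustrative assumptions, not measured values or the sorting code.

```python
def red_green_ratio(red, green):
    """Ratio of red (photoconverted) to green (unconverted) fluorescence."""
    if green <= 0:
        raise ValueError("green intensity must be positive")
    return red / green


# Two converted cells whose reporter expression differs four-fold, and one
# unconverted cell that is as bright overall as the high expresser.
low_expresser = red_green_ratio(red=200.0, green=100.0)
high_expresser = red_green_ratio(red=800.0, green=400.0)
unconverted = red_green_ratio(red=40.0, green=400.0)

print(low_expresser, high_expresser, unconverted)
```

      The two converted cells share the same ratio despite very different absolute red intensities, while the unconverted cell remains clearly separated.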

      To address the reviewer’s concern, we replotted the data in Fig. 3B to highlight changes over time in the raw levels of red and green Dendra fluorescence (Fig. S7D). As expected, we see an overall decrease in red fluorescence intensity, and complementary increase in green fluorescence intensity, over 8 hours, likely due to protein turnover. We also observe an increase in the number of nuclei lacking red fluorescence. This is expected since the well was only partially converted and we expect significant numbers of unconverted cells to move into the field between the first image and the 8 hour image.

      (7) The authors isolate and subsequently use RNA-sequencing to identify changes between Mps1i and DMSO-treated cells. One concern is that even with the less stringent cut-off of 1.5 fold there is a very small change between DMSO and MPS1i treated cells, with only 63 genes changing, none of which were affected above a 2-fold change. The authors should carefully address this, including why their dataset sees changes in many more pathways than in the He et al. and Santaguida et al. studies. Is this due to just having a decreased cut-off?

      The reviewer correctly points out that we observed an overall reduction in the strength of gene expression changes between our dataset of DMSO versus Mps1i treated RPE1 cells compared to similar studies. We suggest a couple of reasons for this. One is that the log₂ fold changes observed in the other studies are not huge and vary between 2.5 and -3.8 for He et al., 3.3 and -2.3 for Santaguida et al., and -0.8 and 1.6 for our study. This variability is within a reasonable range for different experimental conditions and library prep protocols. A second is that our protocol minimizes a potential source of transcriptional change – nuclear lobulation – that is present in the other datasets.

      For the pathway analysis we did not use a fold-change cut-off for any data set, instead opting to include all the genes found to be significantly different between control and Mps1i treated cells for all three studies. Our read-depth was higher than that of the two published experiments, which could contribute to an increased DEG number. However, we hypothesize that our identification of a broader number of altered pathways most likely arises from increased sensitivity due to the loss of covering signal from transcriptional changes associated with increased nuclear atypia. Additional visual cell sorting experiments sorting on misshapen nuclei instead of MN would allow us to determine the accuracy of this hypothesis.

      (8) Moreover, clustering (in Figure 5E) of the replicates is a bit worrisome as the variances are large and therefore it is unclear if, with such large variance and low screening depth, one can really make such a strong conclusion that there are no changes. The authors should prove that their conclusion that rupture does not lead to large transcriptional changes, is not due to the limitations of their experimental design.

      We agree with the reviewers that additional rounds of RNAseq would improve the accuracy of our transcriptomic analysis and could uncover additional DEGs. However, we believe the overall conclusion to be correct based on the results of our attempt to validate changes in gene expression by immunofluorescence. We analyzed two of the most highly upregulated genes in the ruptured MN dataset, ATF3 and EGR1. Although we saw a statistically significant increase in ATF3 intensity between cells without MN and those with ruptured MN, the fold change was so small compared to our positive control (100x less) that we believe it is more consistent with a small increase in the probability of aneuploidy rather than a specific signature of MN rupture.

      (9) The authors also need to address the fact that they are using RPE-1 cells more clearly and that the lack of effect in transcriptional changes may be simply due to the loss of cGAS-STING pathway (Mackenzie et al., 2017; Harding et al., 2017; etc.).

      As we discuss above in the public comments section, the literature is clear that MN do not activate cGAS in the first cell cycle after their formation, even upon rupture. Therefore, we do not expect any changes in our results when applied to cGAS-competent cells. However, this expectation needs to be experimentally validated, which we plan to address in upcoming work.

    1. eLife Assessment

      This valuable study introduces a new method for detecting RNA modification. Since it does not rely on chemical modification of RNA, which often results in RNA degradation and therefore loss of RNA molecules, it complements other approaches for detecting RNA modification, and it might be of particular interest for sites where modifications are found in only a minority of interrogated molecules. The information provided is incomplete, however, to allow for comparison with other methods, since there is uncertainty regarding false positive and false negative rates.

    2. Reviewer #2 (Public review):

      The fledgling field of epitranscriptomics has encountered various technical roadblocks with implications as to the validity of early epitranscriptomics mapping data. As a prime example, the low specificity of (supposedly) modification-specific antibodies for the enrichment of modified RNAs, has been ignored for quite some time and is only now recognized for its dismal reproducibility (between different labs), which necessitates the development of alternative methods for modification detection. Furthermore, early attempts to map individual epitranscriptomes using sequencing-based techniques are largely characterized by the deliberate avoidance of orthogonal approaches aimed at confirming the existence of RNA modifications that have been originally identified.

      Improved methodology, the inclusion of various controls, and better mapping algorithms as well as the application of robust statistics for the identification of false-positive RNA modification calls have allowed revisiting original (seminal) publications whose early mapping data allowed making hyperbolic claims about the number, localization and importance of RNA modifications, especially in mRNA. Besides the existence of m6A in mRNA, the detectable incidence of RNA modifications in mRNAs has drastically dropped.

      As for m5C, the subject of the manuscript submitted by Zhou et al., its identification in mRNA goes back to Squires et al., 2012, reporting on >10,000 sites in mRNA of a human cancer cell line, followed by intermittent findings reporting on pretty much every number between 0 and >100,000 m5C sites in different human cell-derived mRNA transcriptomes. The reason for such discrepancy is most likely of a technical nature. Importantly, all studies reporting on actual transcript numbers that were m5C-modified relied on RNA bisulfite sequencing, an NGS-based method that can discriminate between methylated and non-methylated Cs after chemical deamination of C but not m5C. RNA bisulfite sequencing has a notoriously high background due to deamination artifacts, which occur largely due to incomplete denaturation of double-stranded (denaturing-resistant) regions of RNA molecules. Furthermore, m5C sites in mRNAs have now been mapped to regions that have not only sequence identity but also structural features of tRNAs. Various studies revealed that the highly conserved m5C RNA methyltransferases NSUN2 and NSUN6 accept not only tRNAs but also other RNAs (including mRNAs) as methylation substrates, which in combination account for most of the RNA bisulfite-mapped m5C sites in human mRNA transcriptomes. Is m5C in mRNA only a result of the star activity of tRNA or rRNA modification enzymes, or is their low stoichiometry biologically relevant?

      In light of the short-comings of existing tools to robustly determine m5C in transcriptomes, other methods, like DRAM-seq, aiming to map m5C independently of ex situ RNA treatment with chemicals, are needed to arrive at a more solid "ground state", from which it will be possible to state and test various hypotheses as to the biological function of m5C, especially in lowly abundant RNAs such as mRNA.

      Importantly, the identification of >10,000 m5C-containing sites through DRAM-seq increases the number of potential m5C marks in human cancer cells from a couple of hundred (after rigorous post-hoc analysis of RNA bisulfite sequencing data) by orders of magnitude. This raises the question of whether the application of these editing tools results in editing artefacts that overstate the number of actual m5C sites in the human cancer transcriptome.

    1. eLife Assessment

      This important study shows how genetic variation is associated with fecundity following a period of reproductive diapause in female Drosophila. The work identifies the olfactory system as central to successful diapause with associated changes in longevity and fecundity. While the methods used are convincing, a limitation of the study, as of any other laboratory-based investigation, is the challenge of demonstrating how well measures of fitness related to diapause and its recovery correlate with realities encountered during development in the wild.

    2. Reviewer #1 (Public review):

      Summary:

      The paper begins with phenotyping the DGRP for post-diapause fecundity, which is used to map genes and variants associated with fecundity. There are overlaps with genes mapped in other studies and also functional enrichment of pathways including, most surprisingly, neuronal pathways. This somewhat explains the strong overlap with traits such as olfactory behaviors and circadian rhythm. The authors then go on to test genes by knocking them down effectively at 10 degrees. Two genes, Dip-gamma and sbb, are identified as significantly associated with post-diapause fecundity, and the authors also find the effects to be specific to neurons. They further show that the neurons in the antenna but not the arista are required for the effects of Dip-gamma and sbb. They show that removing the antenna has a diapause-specific lifespan-extending effect, which is quite interesting. Finally, ionotropic receptor neurons are shown to be required for the diapause-associated effects.

      Strengths:

      Overall I find the experiments rigorously done and interpretations sound. I have no further suggestions except an ANOVA to estimate heritability of the post-diapause fecundity trait, which is routinely done in the DGRP and offers a global parameter regarding how reliable phenotyping is.

      Weaknesses:

      A minor point is I cannot find how many DGRP lines are used.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      The paper begins with phenotyping the DGRP for post-diapause fecundity, which is used to map genes and variants associated with fecundity. There are overlaps with genes mapped in other studies and also functional enrichment of pathways including most surprisingly neuronal pathways. This somewhat explains the strong overlap with traits such as olfactory behaviors and circadian rhythm. The authors then go on to test genes by knocking them down effectively at 10 degrees. Two genes, Dip-gamma and sbb, are identified as significantly associated with post-diapause fecundity, and they also find the effects to be specific to neurons. They further show that the neurons in the antenna but not the arista are required for the effects of Dip-gamma and sbb. They show that removing the antenna has a diapause-specific lifespan-extending effect, which is quite interesting. Finally, ionotropic receptor neurons are shown to be required for the diapause-associated effects. 

      Strengths and Weaknesses: 

      Overall I find the experiments rigorously done and interpretations sound. I have no further suggestions except an ANOVA to estimate the heritability of the post-diapause fecundity trait, which is routinely done in the DGRP and offers a global parameter regarding how reliable phenotyping is. 

      We added to the Methods: “We performed a one-way ANOVA to get the mean squares for between-group and within-group variances and calculated broad-sense heritability using the formula: H<sup>2</sup> = (MS<sub>G</sub> - MS<sub>E</sub>) / (MS<sub>G</sub> + (k-1) MS<sub>E</sub>), where MS<sub>G</sub> is the mean square between groups, MS<sub>E</sub> is the mean square within groups, and k is the number of individuals per group. Using this formula, the broad-sense heritability for normalized post-diapause fecundity was found to be 0.51.” 
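
      The heritability estimate quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: it assumes a balanced design (the same number of individuals k in every DGRP line), reads the formula as a ratio of the two bracketed terms, and uses made-up toy values; the function names are hypothetical. Written this way, the expression has the same form as the one-way intraclass correlation.

```python
from statistics import mean


def one_way_anova_mean_squares(groups):
    """Return (MS between groups, MS within groups) for a balanced one-way layout."""
    k = len(groups[0])
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes the same number of individuals per group")
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = k * sum((m - grand_mean) ** 2 for m in group_means)
    # Within-group sum of squares: spread of individuals around their group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(groups) * (k - 1)
    return ss_between / df_between, ss_within / df_within


def broad_sense_heritability(ms_g, ms_e, k):
    """H^2 = (MS_G - MS_E) / (MS_G + (k - 1) * MS_E)."""
    return (ms_g - ms_e) / (ms_g + (k - 1) * ms_e)


# Toy data (hypothetical): three lines, three individuals each.
toy_lines = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
ms_g, ms_e = one_way_anova_mean_squares(toy_lines)
h2 = broad_sense_heritability(ms_g, ms_e, k=3)
print(f"MS_G={ms_g:.2f}, MS_E={ms_e:.2f}, H^2={h2:.3f}")
```

      With unequal group sizes, k would need to be replaced by an adjusted group size, which this sketch does not attempt.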

      We added to the Results: “The broad-sense heritability for normalized post-diapause fecundity was found to be 0.51 (see Methods).”

      A minor point is I cannot find how many DGRP lines are used. 

      Response: We screened 193 lines and have added that to the Results. 

      Reviewer #2 (Public Review):

      Summary

      In this study, Easwaran and Montell investigated the molecular, cellular, and genetic basis of adult reproductive diapause in Drosophila using the Drosophila Genetic Reference Panel (DGRP). Their GWAS revealed genes associated with variation in post-diapause fecundity across the DGRP and performed RNAi screens on these candidate genes. They also analyzed the functional implications of these genes, highlighting the role of genes involved in neural and germline development. In addition, in conjunction with other GWAS results, they noted the importance of the olfactory system within the nervous system, which was supported by genetic experiments. Overall, their solid research uncovered new aspects of adult diapause regulation and provided a useful reference for future studies in this field.

      Strengths:

      The authors used whole-genome sequenced DGRP to identify genes and regulatory mechanisms involved in adult diapause. The first Drosophila GWAS of diapause successfully uncovered many QTL underlying post-diapause fecundity variations across DGRP lines. Gene network analysis and comparative GWAS led them to reveal a key role for the olfactory system in diapause lifespan extension and post-diapause fecundity.

      Comments on revised version:

      While the authors have addressed many of the minor concerns raised by the reviewers, they have not fully resolved some of the key criticisms. Notably, two reviewers highlighted significant concerns regarding the phenotype and assay of post-diapause fecundity, which are critical to the study. The authors acknowledged that this assay could be confounded by the 'cold temperature endurance phenotype,' potentially altering the interpretation of their results.

      However, they responded by stating that it is not obvious how to separate these effects experimentally. This leaves the analysis in this research ambiguous, as also noted by Reviewer #3.

      We should have clarified earlier that we actually chose to measure post-diapause fecundity in order to minimize any impact of ‘cold temperature endurance.’ In fact, we chose post-diapause fecundity as the appropriate measure of successful diapause for both technical and conceptual reasons. Conceptually, the benefit of diapause is to perpetuate the species. It seems obvious to us that post-diapause fecundity is more relevant to species propagation than other measures of diapause such as how many egg chambers contain yolk or how many eggs are laid. Technically, we chose 5-week diapause and recovery based on pilot studies that showed that nearly all DGRP lines showed excellent survival at 5 weeks in diapause conditions. Therefore, our experimental design minimized as much as possible any effect of cold temperature endurance - in the sense of the ability to survive at 10°C - on our phenotype. 

      We apologize for not clarifying that point earlier and have added this text to the Results: “We chose 5 weeks based on pilot studies that showed that nearly all DGRP lines showed excellent survival at 5 weeks in diapause conditions while exhibiting sufficient variation in post-diapause fecundity to carry out GWAS. Beyond 5 weeks, fecundity was low and there was insufficient variation to conduct a GWAS.”

      Additionally, I raised concerns about the validity of prioritizing genes with multiple associated variants. Although the authors agreed with this point, they did not revise the manuscript accordingly. The statement that 'Genes with multiple SNPs are good candidates for influencing diapause traits' is not a valid argument within the context of population and quantitative genetics.

      We apologize for neglecting to revise the manuscript accordingly. We have revised Supplemental Table S4 and ranked the genes by p-value.

    1. Author Response:

      Reviewer #1 (Public Review):

      [...] Strengths: This study utilized multiple in vitro approaches, such as proteomics, siRNA, and overexpression, to demonstrate that PCBP2 is an intrinsic factor of BMSC aging.

      Weaknesses:

      This study did not perform in vivo experiments.

      Response: We will continue to conduct animal experiments in subsequent studies.

      Reviewer #2 (Public Review):

      [...] Weaknesses: It is unclear if PCBP2 can also function as an intrinsic factor for BMSC cells in female individuals. More work may be needed to further dissect the mechanism of how PCBP2 impacts FGF2 expression. Could PCBP2 impact the FGF2 expression independent of ROS?

      Response: Thank you very much for your valuable comments, which is also the focus of our follow-up work. We will sort out the data and publish the relevant research results as soon as possible.

      Additional context that would help readers interpret or understand the significance of the work: In the current work, the authors studied the aging process of BMSC cells, which are related to osteoporosis. Aging processes also impact many other cell types and their function, such as in muscle, skin, and the brain.

      Response: Thank you very much for your valuable comments, we will continue to improve the writing logic of the article to make the article more understandable.

    1. eLife Assessment

      This useful manuscript reports mechanisms behind the increase in fecundity in response to sub-lethal doses of pesticides in the crop pest, the brown plant hopper. The authors hypothesize that the pesticide works by inducing the JH titer, which through the JH signaling pathway induces egg development. Evidence for this is, however, incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      Gao et al. have demonstrated that treatment of the brown planthopper (BPH), a common agricultural pest, with the pesticide emamectin benzoate (EB) leads to increased egg laying in the insect. The authors hypothesize that EB upregulates JH titer, resulting in increased fecundity.

      Strengths:

      The finding that a class of pesticide increases fecundity of brown planthopper is interesting.

      Weaknesses:

      (1) EB is an allosteric modulator of GluCl. That means EB physically interacts with GluCl, initiating a structural change in the channel protein. Yet the authors' central hypothesis is about how EB can upregulate the mRNA of GluCl. I do not know whether there is any evidence that an allosteric modulator can function as a transcriptional activator for the same receptor protein. The basic premise of the paper sounds counterintuitive. This is a structural problem and should be addressed by the authors by giving sufficient evidence that such mechanisms have been demonstrated before.

      (2) I am surprised to see that a 4th instar larval application or treatment with EB results in upregulation of JH in the adult stages. Complicating the results further is the observation that a 4th instar EB application results in an immediate decrease in JH titer. There is a high possibility that this late JH titer increase is an indirect effect.

      (3) The writing quality of the paper needs improvement, particularly with respect to describing processes and abbreviations. In several instances the authors have not adequately described the processes they have introduced, thus confusing the readers.

      (4) In the section 'EB promotes ovarian development' the authors have shown that EB treatment results in increased detention of eggs, which contradicts their own results showing that EB promotes egg laying. Again, this is a serious contradiction that nullifies their hypothesis.

      (5) Furthermore, the results suggest that oogenesis is not affected by EB application. The authors should devote a section to discussing how they are observing increased egg numbers in EB-treated insects while oogenesis is not impacted.

      (6) Met is the receptor of JH and, to my understanding, remains mostly constant in terms of its mRNA or protein levels throughout various developmental periods in many different insects. Therefore, the presence of JH becomes the major driving factor for physiological events and not the presence of the receptor Met. Here the authors have demonstrated an increase in Met mRNA as a result of EB treatment. Their central hypothesis is that EB increases JH titer to result in enhanced fecundity. JH action will not result in the activation of Met. Although not contradictory to the hypothesis, the increase in mRNA content of Met is contrary to the findings of the JH field thus far.

      (7) As pointed out before, it is hard to rationalize how a 4th instar exposure to EB can result in upregulation of key genes involved in JH synthesis at the adult stage. The authors must consider providing a plausible explanation and discussion in this regard.

      (8) I have strong reservations against such an irrational hypothesis that Met (the receptor for JH) and the JH-Met target gene Kr-h1 regulate JH titer (Line 311, Fig 3 supplemental 2D). This would be the first report of such an event in the JH field and therefore must be analysed in depth before it may go to publication. I strongly suggest the authors remove such claims from the manuscript unless they are substantiated.

      (9) Kr-h1 is a JH/Met target gene. The authors demonstrate that silencing of Kr-h1 results in inhibition of FAMeT, which is a gene involved in JH synthesis. Such a feedback loop in JH synthesis is unreported. The authors must provide mechanistic detail of Kr-h1-mediated JH upregulation before this can be concluded. Mere qPCR experiments are not sufficient to substantiate a claim that is completely contrary to the current understanding of the JH signalling pathway.

      (10) The authors have performed knockdowns of JHAMT, Met and Kr-h1 to demonstrate the effect of these factors on fecundity in BPH. Additionally, they have performed rescue experiments with EB application on these knockdown insects (Figure 3K-M). This, I believe, is a very flawed experiment. The authors demonstrate that EB works through JHAMT in upregulating JH titer. In the absence of JHAMT, EB application is not expected to rescue the phenotype. But the authors have reported a complete rescue here. In the absence of Met, the receptor of JH, neither EB nor JH is expected to rescue the phenotype. But a complete rescue has been reported. These two experimental results contradict their own hypothesis.

      (11) A significant section of the paper deals with how EB upregulates JH titer. JH is a hormone synthesized in the corpora allata. Yet the authors have chosen to use the whole body for all of their experiments. Whole-body changes in the mRNA of the enzymes involved in JH synthesis may not reflect the situation in the corpora allata. Although working with corpora allata is challenging, discarding the abdomen and thorax region and working with the head and neck region of the insect is easily doable. Results from such sampling are always more convincing when it comes to JH synthesis studies.

      (12) The phenomenon reported was specific for BPH and not found in other insects. This limits the implications of the study.

      (13) Overall, the molecular experiments are very poorly designed and can at best be termed superficial. There are several contradictions within the paper and no discussion or explanation has been provided for that.

      Comments on revisions:

      (1) The onus of making the revisions understandable to the reviewers lies with the authors. In its current form, how the authors have approached the review is hard to follow, in my opinion. Although the authors have put a lot of effort into answering the questions posed by reviewers, parallel changes in the manuscript are not clearly mentioned. In many cases, the authors have acknowledged the criticism in response to the reviewer, but have not changed their narrative, particularly in the results section.

      (2) In the response to reviewers, the authors have mentioned line numbers in the main text where changes were made. But very frequently, those lines do not refer to the changes or mention just a subsection of the changes made. The problem runs throughout the document, making it very difficult to follow the revision and contributing to the point mentioned above.

      (3) The authors need to interpret the performed experiments rationally, without overinterpretation. Currently, many of the claims that the authors are making are unsubstantiated. As a result of the first review process, the authors have acknowledged the discrepancies, but they have failed to alter their interpretations accordingly.

      (4) I would like to point to the fact that there are significant experimental modifications added to the manuscript. The decision from the first cycle of review was given on 8th Nov 2024. The authors re-submitted the manuscript on 20th Nov 2024. It is beyond my understanding how so many experiments can be done in such a short time. The rush in resubmission is evident in the writing quality as well, which I think is now poorer than in the original version.

      (5) The writing quality is still extremely poor.

    1. eLife Assessment

      This valuable study confirms the association between the human leukocyte antigen (HLA)-II region and tuberculosis (TB) susceptibility in genetically admixed South African populations, specifically identifying a near-genome-wide significant association in the HLA-DPB1 gene, which originates from KhoeSan ancestry. The evidence supporting the association between the HLA-II region and TB susceptibility is solid, and the work will be of interest to those studying the genetic basis of tuberculosis susceptibility/infection resistance.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aimed to confirm the association between the human leukocyte antigen (HLA)-II region and tuberculosis (TB) susceptibility within admixed African populations. Building upon previous findings from the International Tuberculosis Host Genetics Consortium (ITHGC), this study sought to address the limitations of small sample size and the inclusion of admixed samples by employing the Local Ancestry Allelic Adjusted (LAAA) model, as well as identify TB susceptibility loci in an admixed South African cohort.

      Strengths:

      The major strengths of this study include the use of multiple TB case-control datasets from diverse South African populations and ADMIXTURE for global ancestry inference.

      Weaknesses:

      The major weaknesses of this study include insufficient significant novel discoveries and reliance on cross-validation. The use of existing models did not add value to this study.

      Appraisal:

      The authors achieved their aims. However, the results still need to be further validated in the future.

      Impact:

      The innovative use of the LAAA model and the comprehensive dataset in this study may make contributions to the field of genetic epidemiology.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript is about using different analytical approaches to allow ancestry adjustments to GWAS analyses amongst admixed populations. This work is a follow-on from the recently published ITHGC multi-population GWAS (https://doi.org/10.7554/eLife.84394), with the focus on the admixed South African populations. Ancestry adjustment models detected a peak of SNPs in the class II HLA DPB1, distinct from the class II HLA DQA1 loci significant in the ITHGC analysis.

      Strengths:

      Excellent demonstration of GWAS analytical pipelines in highly admixed populations. Particularly the utility of ancestry adjustment to improve study power to detect novel associations. Further confirmation of the importance of the HLA class II locus in genetic susceptibility to TB.

      Weaknesses:

      Limited novelty compared to the group's previous publications and the body of work linking HLA class II alleles with TB susceptibility in South Africa or other African populations. This work includes only ~100 new cases and controls beyond what has already been published. High-resolution HLA typing has detected significant signals in both the DQA1 and DPB1 regions identified by the larger ITHGC and in this GWAS analysis respectively (Chihab L et al. HLA. 2023 Feb; 101(2): 124-137).

      Despite the availability of strong methods for imputing HLA from GWAS data (Karnes J et al. PLoS One 2017), the authors did not confirm with HLA typing the importance of their SNP peak in the class II region. This would have supported the importance of this ancestry adjustment versus the prior ITHGC analysis.

      The populations comprise active TB cases and healthy controls (from high-burden, presumed exposed communities), and the authors do not provide QFT or other data to identify latent TB infection.

      Important methodological points for clarification and for readers to be aware of when reading this paper:

      (1) One of the reasons cited for the lack of African ancestry-specific associations or suggestive peaks in the ITHGC study was the small African sample size. The current association test includes a larger African cohort and yields a near-genome-wide significant threshold in the HLA-DPB1 gene originating from the KhoeSan ancestry. Investigation is needed as to whether the increase in power is due to increased African samples and not necessarily the use of the LAAA model as stated on lines 295 and 296?

      Authors response - The Manhattan plot in Figure 3 includes the results for all four models: the traditional GWAS model (GAO), the admixture mapping model (LAO), the ancestry plus allelic (APA) model and the LAAA model. In this figure, it is evident that only the LAAA model identified the association peak on chromosome 6, which lends support to the argument that the increase in power is due to the use of the LAAA model and not solely due to the increase in sample size.

      Reviewer comment - These data support the authors' conclusion that the increased power is related to the application of the LAAA model rather than simply to the increased sample size.

      (2) In line 256, the number of SNPs included in the LAAA analysis was 784,557 autosomal markers; the number of SNPs after quality control of the imputed dataset was 7,510,051 SNPs (line 142). It is not clear how or why ~90% of the SNPs were removed. This needs clarification.

      Authors response: In our manuscript (line 194), we mention that "...variants with minor allele frequency (MAF) < 1% were removed to improve the stability of the association tests." A large proportion of imputed variants fell below this MAF threshold and were subsequently excluded from this analysis.

      Reviewers additional comment: The authors should specify the number of SNPs in the dataset before imputation and indicate what proportion of the 784,557 remaining SNPs were imputed. Providing this information might help the reader better understand the rationale behind the imputation process.
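
      To make the scale of the filtering in point (2) concrete, the following is a minimal sketch of a MAF filter together with the retained fraction implied by the two SNP counts quoted above. It is an illustration only: the toy genotypes and function names are hypothetical, genotypes are assumed to be coded as 0/1/2 copies of the alternate allele in diploid individuals, and the real pipeline may have removed variants for reasons other than MAF.

```python
def minor_allele_frequency(genotypes):
    """MAF for one variant, given alternate-allele counts (0, 1 or 2) per individual."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(variants, threshold=0.01):
    """Keep variants whose MAF is at least `threshold` (i.e. drop MAF < 1% by default)."""
    return {
        name: genotypes
        for name, genotypes in variants.items()
        if minor_allele_frequency(genotypes) >= threshold
    }


# Toy data (hypothetical): one common and one rare variant in 200 individuals.
toy_variants = {
    "snp_common": [0, 1, 2, 1] * 50,   # alternate-allele frequency 0.5
    "snp_rare": [1] + [0] * 199,       # one heterozygote: frequency 1/400
}
kept = filter_by_maf(toy_variants)

# Counts quoted in the review: SNPs after QC of the imputed data, and SNPs analysed.
snps_after_qc = 7_510_051
snps_analysed = 784_557
fraction_retained = snps_analysed / snps_after_qc
print(sorted(kept), f"retained: {fraction_retained:.1%}")
```

      On the quoted counts, roughly one in ten post-QC SNPs entered the LAAA analysis, which is why the reviewer asks how many of those were imputed.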

      (3) The authors have used the significance threshold estimated by STEAM, p-value < 2.5 × 10<sup>-6</sup>, in the LAAA analysis. Grinde et al. (2019) implemented their significance threshold estimation approach tailored to admixture mapping (the local ancestry (LA) model), where there is a reduction in testing burden. The authors should justify why this threshold would apply to the LAAA model (a joint genotype and ancestry approach).

      Authors response: We describe in the methods (line 189 onwards) that the LAAA model is an extension of the APA model. Since the APA model itself simultaneously performs the null global ancestry only model and the local ancestry model (utilised in admixture mapping), we thus considered the use of a threshold tailored to admixture mapping appropriate for the LAAA model.

      Reviewers additional comment: While the LAAA model is an extension of the APA model, the authors describe the LAAA test as 'models the combination of the minor allele and the ancestry of the minor allele at a specific locus, along with the effect of this interaction,' thus a joint allele and ancestry effects model. Grinde et al. (2019) proposed the significance threshold estimation approach, STEAM, specifically for the LA approach, which tests for ancestry effects alone and benefits from the reduced testing burden. However, it remains unclear why the authors found it appropriate to apply STEAM to the LAAA model, a joint test for both allele and ancestry effects, which does not benefit from the same reduction in testing burden.
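
      The "testing burden" at issue in point (3) can be made tangible with a rough calculation. The sketch below reads each threshold as a Bonferroni-style correction at a family-wise error rate of 0.05, which is an assumption made purely for illustration (STEAM does not derive its threshold this way), and compares the threshold quoted in the review with the conventional genome-wide level of 5 × 10<sup>-8</sup>. The function name is hypothetical.

```python
def implied_independent_tests(threshold, alpha=0.05):
    """Number of independent tests for which alpha / n_tests equals `threshold`."""
    return alpha / threshold


steam_threshold = 2.5e-6        # threshold quoted in the review for the LAAA analysis
genome_wide_threshold = 5e-8    # conventional single-variant GWAS threshold

admixture_tests = round(implied_independent_tests(steam_threshold))
gwas_tests = round(implied_independent_tests(genome_wide_threshold))
print(admixture_tests, gwas_tests, gwas_tests // admixture_tests)
```

      Under this reading, the admixture-mapping threshold corresponds to about fifty times fewer effective tests than a standard single-variant scan, which is the gap the reviewer asks the authors to justify for a joint allele and ancestry test.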

      (4) Batch effect screening and correction (line 174) is a quality control check. This section is discussed after global and local ancestry inferences in the methods. Was this QC step conducted after the inferencing? If so, the authors should justify how the removed SNPs due to the batch effect did not affect the global and local ancestry inferences or should order the methods section correctly to avoid confusion.

      Authors response: The batch effect correction method utilised a pseudo-case-control comparison which included global ancestry proportions. Thus, batch effect correction was conducted after ancestry inference. We excluded 36 627 SNPs that were believed to have been affected by the batch effect. We have amended line 186 to include the exact number of SNPs excluded due to batch effect.

      The ancestry inference by RFMix utilised the entire merged dataset of 7 510 051 SNPs. Thus, the SNPs removed due to the batch effect make up a very small proportion of the SNPs used to conduct global and local ancestry inferences (less than 0.5%). As a result, we do not believe that the removed SNPs would have significantly affected the global and local ancestry inferences. However, we did conduct global ancestry inference with RFMix on each separate dataset as a sanity check. In the Author response tables 1 and 2, we show the average global ancestry proportions inferred for each separate dataset, the average global ancestry proportions across all datasets and the average global ancestry proportions inferred using the merged dataset. The SAC and Xhosa cohorts are shown in two separate tables due to the different number of contributing ancestral populations to each cohort. The combined average global ancestry proportions across the separate cohorts do not differ significantly from the global ancestry proportions inferred using the merged dataset.

      This is an excellent response and should remain accessible to readers to clarify this issue.

    1. eLife Assessment

      Studying several allergens in different mouse strains, the authors assessed the role of IgM in airway inflammatory responses and show that IgM-deficient mice have reduced airway hyperresponsiveness. Although the findings are useful and interesting and, among others, show the expression of a protein that regulates actin in smooth muscle cells, the study remains incomplete as the data and analyses only partly support their primary claim.

    2. Reviewer #1 (Public review):

      Summary:

      The authors of this study sought to define a role for IgM in responses to house dust mites in the lung.

      Strengths:

      Unexpected observation about IgM biology.

      Combination of experiments to elucidate function.

      Weaknesses:

      Would love more connection to human disease

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Hadebe and colleagues describes a striking reduction in airway hyperresponsiveness in IgM-deficient mice in response to HDM, OVA and papain across the B6 and BALB/c backgrounds. The authors suggest that the deficit is not due to improper type 2 immune responses, nor an aberrant B cell response, despite a lack of class switching in these mice. Through RNA-Seq approaches, the authors identify few differences between the lungs of WT and IgM-deficient mice, but see that two genes involved in actin regulation are greatly reduced in IgM-deficient mice. The authors target these genes by CRISPR-Cas9 in in vitro assays of smooth muscle cells to show that these may regulate cell contraction. While the study is conceptually interesting, there are a number of limitations that stop us from drawing meaningful conclusions.

      Strengths:

      Fig. 1. The authors clearly show that IgMKO mice have strikingly reduced AHR in the HDM model, despite the presence of a good cellular B cell response.

      Weaknesses:

      Due to several technical and experimental limitations, it is unclear what leads to the reduction in airway hyperresponsiveness in IgM-KO mice. The limitations as outlined previously remain.

    1. eLife Assessment

      This important study describes how a single effector of the Type Six Secretion System (T6SS) has two distinct functions, which may contribute to bacterial survival and the development of novel antibacterials. The authors utilized various methods in biochemistry, microbiology, and microscopy to produce convincing data supporting their claims about the protein's function; however, they could clarify the implications for non-experts to enhance the accessibility of this work. This manuscript is of interest to those studying T6SS, particularly those interested in effectors and bacterial enzymes.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript performs a comprehensive biochemical, structural, and bioinformatic analysis of TseP, a type 6 secretion system effector from Aeromonas dhakensis that includes identification of a domain required for secretion and residues conferring target organism specificity. Through targeted mutations, they have expanded the target range of a T6SS effector to include a gram-positive species, which are not typically susceptible to T6SS attack. Although this is not the first dual domain effector to be described, this is the first time anyone has been able to modify a T6SS effector to have an expanded target species range.

      Strengths:

      The thorough dissection of TseP activity and modulation of target specificity represent a novel contribution to the field of antibacterial research.

      Weaknesses:

      Although the mechanistic activity of TseP is fully dissected here, there are some unaddressed questions regarding the importance/evolution of the dual activity domain organization. For example, does the modified Gram-positive targeting TseP effector still kill Gram-negative bacteria in bacterial mixtures? And if so, what is the evolutionary benefit of having a TseP that cannot target Gram-positives? And can something be inferred about the biology of Aeromonas from this?

      Comments on revisions:

      The comments and critiques from the initial submission have been addressed. However, some of them have only been addressed in the authors' rebuttal. Some of the discussion, particularly regarding the validity of using E. coli PG, the ability of TseP_C4+ to still kill E. coli, and the advantages of having dual-domain function effectors, probably should be present in the actual manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      Wang et al. investigate the role of TseP, a Type VI secretion system (T6SS) effector molecule, revealing its dual enzymatic activities as both an amidase and a lysozyme. This discovery significantly enhances the understanding of T6SS effectors, which are known for their roles in interbacterial competition and survival in polymicrobial environments. TseP's dual function is proposed to play a crucial role in bacterial survival strategies, particularly in hostile environments where competition between bacterial species is prevalent.

      Strengths:

      (1) The dual enzymatic function of TseP is a significant contribution, expanding the understanding of T6SS effectors.<br /> (2) The study provides important insights into bacterial survival strategies, particularly in interbacterial competition.<br /> (3) The findings have implications for antimicrobial research and understanding bacterial interactions in complex environments.

      Weaknesses:

      (1) The manuscript assumes familiarity with previous work, making it difficult to follow. Mutants and strains need clearer definition and references.<br /> (2) Figures lack proper controls, quantification, and clarity in some areas, notably in Figures 1A and 1C.<br /> (3) The Materials and Methods section is poorly organized, hindering reproducibility. Biophysical validation of Zn²⁺ interaction and structural integrity of proteins need to be addressed.<br /> (4) Discrepancies in protein degradation patterns and activities across different figures raise concerns about data reliability.

      Comments on revisions:

      The authors have addressed most of the comments, significantly improving the manuscript. They provided clear details of mutant constructs and strains, including additional references and a revised strain. Individual data points and statistical analyses were added to key figures, ensuring transparency and reproducibility. Supplemental data, such as protein purification details and loading controls, were included to address concerns about experimental reliability. However, the authors did not perform new experiments, such as isothermal titration calorimetry (ITC) to demonstrate the interaction between Zn<sup>2+</sup> and TsePN or stopped-flow spectroscopy to examine enzymatic kinetics, which could have further strengthened the manuscript. I trust these aspects will be addressed in future studies.

      The revised Materials and Methods section was significantly improved, providing detailed protocols for bioinformatics analyses, microscopic imaging, and enzymatic assays.

      These revisions provide a clearer and more robust presentation of TseP's dual enzymatic functions and their implications in bacterial competition. The manuscript now represents a significant contribution to understanding T6SS effectors, and I recommend it for publication in its current form.

    4. Reviewer #3 (Public review):

      Summary:

      Type VI secretion systems (T6SS) are employed by bacteria to inject competitor cells with numerous effector proteins. These effectors can kill injected cells via an array of enzymatic activities. A common class of T6SS effector are peptidoglycan (PG) lysing enzymes. In this manuscript, the authors characterize a PG-lysing effector, TseP, from the pathogen Aeromonas dhakensis. While the C-terminal domain of TseP was known to have lysozyme activity, the N-terminal domain was uncharacterized. Here, the authors functionally characterize TsePN as a zinc-dependent amidase. This discovery is somewhat novel because it is rare for PG-lysing effectors to have amidase and lysozyme activity. In the second half of the manuscript, the authors utilize a crystal structure of the lysozyme TsePC domain to inform the engineering of this domain to lyse gram-positive peptidoglycan.

      Strengths:

      The two halves of the manuscript considered together provide a nice characterization of a unique T6SS effector and reveal potentially general principles for lysozyme engineering.

      Weaknesses:

      The advantage of fusing amidase and lysozyme domains in a single effector is not discussed but would appear to be a pertinent question.

      Comments on revisions:

      The authors have adequately addressed my previous comments. The authors did not conduct any additional experiments to address the comments made by other reviewers. However, in most cases it seems that paring down the strength of claims made in the text or adding data to the supplement is sufficient to address these concerns.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript performs a comprehensive biochemical, structural, and bioinformatic analysis of TseP, a type 6 secretion system effector from Aeromonas dhakensis that includes the identification of a domain required for secretion and residues conferring target organism specificity. Through targeted mutations, they have expanded the target range of a T6SS effector to include a gram-positive species, which is not typically susceptible to T6SS attack.

      Strengths:

      All of the experiments presented in the study are well-motivated and the conclusions are generally sound.

      Thank you.

      Weaknesses:

      There are some issues with the clarity of figures. For example, the microscopy figures could have been more clearly presented as cell counts/quantification rather than representative images. Similarly, loading controls for the secreted proteins for the westerns probably should be shown.

      Also, some of the minor/secondary conclusions reached regarding the "independence" of the N and C term domains of the TseP are a bit overreaching.

      We thank the reviewer for pointing out the issues and have carefully revised the manuscript accordingly. We acknowledge the reviewer’s concern regarding the independence of the N- and C-terminal domains, and have toned down the relevant claims.

      Reviewer #2 (Public review):

      Summary:

      Wang et al. investigate the role of TseP, a Type VI secretion system (T6SS) effector molecule, revealing its dual enzymatic activities as both an amidase and a lysozyme. This discovery significantly enhances the understanding of T6SS effectors, which are known for their roles in interbacterial competition and survival in polymicrobial environments. TseP's dual function is proposed to play a crucial role in bacterial survival strategies, particularly in hostile environments where competition between bacterial species is prevalent.

      Strengths:

      (1) The dual enzymatic function of TseP is a significant contribution, expanding the understanding of T6SS effectors.

      (2) The study provides important insights into bacterial survival strategies, particularly in interbacterial competition.

      (3) The findings have implications for antimicrobial research and understanding bacterial interactions in complex environments.

      Thank you.

      Weaknesses:

      (1) The manuscript assumes familiarity with previous work, making it difficult to follow. Mutants and strains need clearer definitions and references.

      Thank you for raising the issue. We have revised the manuscript to improve clarity by including more detailed descriptions of the mutants and strains, along with references to prior work where relevant.

      (2) Figures lack proper controls, quantification, and clarity in some areas, notably in Figures 1A and 1C.

      We have now added the controls as requested by reviewers.

      (3) The Materials and Methods section is poorly organized, hindering reproducibility. Biophysical validation of Zn<sup>2+</sup> interaction and structural integrity of proteins need to be addressed.

      We have now included more details in the Materials and Methods section. While we recognize the importance of biophysical validation of the Zn<sup>2+</sup> interaction, this analysis lies beyond the primary scope of the current study. We plan to investigate the role of Zn²⁺ interaction and the EF-hand domain in greater depth as part of our follow-up studies. Thank you for this suggestion.

      (4) Discrepancies in protein degradation patterns and activities across different figures raise concerns about data reliability.

      We acknowledge the concern about discrepancies in protein degradation patterns. TseP exhibits inherent instability, which might explain the observed variations. We have added an explanation in the detailed response letter and the manuscript.

      Reviewer #3 (Public review):

      Summary:

      Type VI secretion systems (T6SS) are employed by bacteria to inject competitor cells with numerous effector proteins. These effectors can kill injected cells via an array of enzymatic activities. A common class of T6SS effector are peptidoglycan (PG) lysing enzymes. In this manuscript, the authors characterize a PG-lysing effector, TseP, from the pathogen Aeromonas dhakensis. While the C-terminal domain of TseP was known to have lysozyme activity, the N-terminal domain was uncharacterized. Here, the authors functionally characterize TsePN as a zinc-dependent amidase. This discovery is somewhat novel because it is rare for PG-lysing effectors to have amidase and lysozyme activity.

      In the second half of the manuscript, the authors utilize a crystal structure of the lysozyme TsePC domain to inform the engineering of this domain to lyse gram-positive peptidoglycan.

      Strengths:

      The two halves of the manuscript considered together provide a nice characterization of a unique T6SS effector and reveal potentially general principles for lysozyme engineering.

      Thank you.

      Weaknesses:

      The advantage of fusing amidase and lysozyme domains in a single effector is not discussed but would appear to be a pertinent question. Labeling of the figures could be improved to help readers understand the data.

      Thank you for the suggestions. We have revised the manuscript and figures to improve clarity.

      The advantage of having dual-domain functions relative to having just one of the two functions is likely an increase in competitive fitness. Although such dual-functional cell-wall-targeting effectors have not been characterized prior to this study, there are some examples in which dual functions are encoded by the same secretion module, for example the VgrG1-TseL pair in Vibrio cholerae. The C-terminus of VgrG1 not only catalyzes actin crosslinking but also recognizes and delivers the downstream-encoded lipase effector TseL through direct interaction. In this context, the VgrG1-TseL pair also represents a dual-functional module. Therefore, it is likely that fusing effector domains and coupling effector functions are parallel strategies for the evolution of T6SS effectors.

    1. eLife Assessment

      This manuscript reports an important new statistical method for calculating the significance of correlations between two time-series, which provides more accuracy than other methods when the data has few replicates. The proposed method solves a real-life problem that is frequently encountered and is broadly applicable to many realistic datasets in many experimental contexts. The technique is supported with compelling mathematical derivations as well as analysis of both computer-generated and previously published experimental data.

    1. eLife Assessment

      This important study uses extensive comparative analysis to examine the relationship between plasma glucose levels, albumin glycation levels, and diet and life history, within the framework of the "pace of life syndrome" hypothesis. The evidence that glucose is positively correlated with glycation levels and lifespan is convincing and, although there are some limitations related to data collection, they likely make the statistically significant findings more conservative. As the first extensive comparative analysis of glycation rates, life history, and glucose levels in birds, the study has the potential to be of interest to evolutionary ecologists and the aging research community more broadly.

    2. Reviewer #2 (Public review):

      Summary

      In this extensive comparative study, Moreno-Borrallo and colleagues examine the relationships between plasma glucose levels, albumin glycation levels, diet and life-history traits across birds. Their results confirmed the expected positive relationship between plasma blood glucose level and albumin glycation rate but also provided findings that are somewhat surprising or contrast with findings of some previous studies (positive relationships between blood glucose and lifespan, or absent relationships between blood glucose and clutch mass or diet). This is the first extensive comparative analysis of glycation rates and their relationships to plasma glucose levels and life history traits in birds that is based on data collected in a single study, with blood glucose and glycation measured using unified analytical methods (except for blood glucose data for 13 species collected from a database).

      Strengths

      This is an emerging topic gaining momentum in evolutionary physiology, which makes this study a timely, novel and important contribution. The study is based on a novel data set collected by the authors from 88 bird species (67 in captivity, 21 in the wild) of 22 orders, except for 13 species, for which data were collected from a database of veterinary and animal care records of zoo animals (ZIMS). This novel data set itself greatly contributes to the pool of available data on avian glycemia, as previous comparative studies either extracted data from various studies or a ZIMS database (therefore potentially containing much more noise due to different methodologies or other unstandardised factors), or only collected data from a single order, namely Passeriformes. The data further represents the first comparative avian data set on albumin glycation obtained using a unified methodology. The authors used LC-MS to determine glycation levels, which does not have problems with specificity and sensitivity that may occur with assays used in previous studies. The data analysis is thorough, and the conclusions are substantiated. Overall, this is an important study representing a substantial contribution to the emerging field of evolutionary physiology focused on ecology and evolution of blood/plasma glucose levels and resistance to glycation.

      Weaknesses

      Unfortunately, the authors did not record handling time (i.e., time elapsed between capture and blood sampling), which may be an important source of noise because handling-stress-induced increase in blood glucose has previously been reported. Moreover, the authors themselves demonstrate that handling stress increases variance in blood glucose levels. Both effects (elevated mean and variance) are evident in Figure ESM1.2. However, this likely makes their significant findings regarding glucose levels and their associations with lifespan or glycation rate more conservative, as highlighted by the authors.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The paper explored cross-species variance in albumin glycation and blood glucose levels as a function of various life-history traits. Their results show that

      (1) blood glucose levels predict albumin glycation rates

      (2) larger species have lower blood glucose levels

      (3) lifespan positively correlates with blood glucose levels and

      (4) diet predicts albumin glycation rates.

      The data presented is interesting, especially due to the relevance of glycation to the ageing process and the interesting life-history and physiological traits of birds. Most importantly, the results suggest that some mechanisms might exist that limit the level of glycation in species with the highest blood glucose levels.

      While the questions raised are interesting and the amount of data the authors collected is impressive, I have some major concerns about this study:

      (1) The authors combine many databases and samples of various sources. This is understandable when access to data is limited, but I expected more caution when combining these. E.g. glucose is measured in all samples without any description of how handling stress was controlled for. Glucose levels can easily double in a few minutes in birds, potentially introducing variation in the data generated. The authors report no caution of this effect, or any statistical approaches aiming to check whether handling stress had an effect here, either on glucose or on glycation levels.

      (2) The database with the predictors is similarly problematic. There is information pulled from captivity and wild (e.g. on lifespan) without any confirmation that the different databases are comparable or not (and here I'm not just referring to the correlation between the databases, but also to a potential systematic bias, e.g. captivity-based sources likely consistently report longer lifespans). This is even more surprising, given that the authors raise the possibility of captivity effects in the discussion, and exploring this question would be extremely easy in their statistical models (a simple covariate in the MCMCglmms).

      (3) The authors state that the measurement of one of the primary response variables (glycation) was measured without any replicability test or reference to the replicability of the measurement technique.

      (4) The methods and results are very poorly presented. For instance, new model types and variables are popping up throughout the manuscript, already reporting results, before explaining what these are e.g. results are presented on "species average models" and "model with individuals", but it's not described what these are and why we need to see both. Variables, like "centered log body mass", or "mass-adjusted lifespan" are not explained. The results section is extremely long, describing general patterns that have little relevance to the questions raised in the introduction and would be much more efficiently communicated visually or in a table.

      Reviewer #2 (Public review):

      Summary

      In this extensive comparative study, Moreno-Borrallo and colleagues examine the relationships between plasma glucose levels, albumin glycation levels, diet, and life-history traits across birds. Their results confirmed the expected positive relationship between plasma blood glucose level and albumin glycation rate but also provided findings that are somewhat surprising or contradicting findings of some previous studies (relationships with lifespan, clutch mass, or diet). This is the first extensive comparative analysis of glycation rates and their relationships to plasma glucose levels and life history traits in birds that are based on data collected in a single study and measured using unified analytical methods.

      Strengths

      This is an emerging topic gaining momentum in evolutionary physiology, which makes this study a timely, novel, and very important contribution. The study is based on a novel data set collected by the authors from 88 bird species (67 in captivity, 21 in the wild) of 22 orders, which itself greatly contributes to the pool of available data on avian glycemia, as previous comparative studies either extracted data from various studies or a database of veterinary records of zoo animals (therefore potentially containing much more noise due to different methodologies or other unstandardised factors), or only collected data from a single order, namely Passeriformes. The data further represents the first comparative avian data set on albumin glycation obtained using a unified methodology. The authors used LC-MS to determine glycation levels, which does not have problems with specificity and sensitivity that may occur with assays used in previous studies. The data analysis is thorough, and the conclusions are mostly well-supported (but see my comments below). Overall, this is a very important study representing a substantial contribution to the emerging field of evolutionary physiology focused on the ecology and evolution of blood/plasma glucose levels and resistance to glycation.

      Weaknesses

      My main concern is about the interpretation of the coefficient of the relationship between glycation rate and plasma glucose, which reads as follows: "Given that plasma glucose is logarithm transformed and the estimated slope of their relationship is lower than one, this implies that birds with higher glucose levels have relatively lower albumin glycation rates for their glucose, fact that we would be referring as higher glycation resistance" (lines 318-321) and "the logarithmic nature of the relationship, suggests that species with higher plasma glucose levels exhibit relatively greater resistance to glycation" (lines 386-388). First, only plasma glucose (predictor) but not glycation level (response) is logarithm transformed, and this semi-logarithmic relationship assumed by the model means that an increase in glycation always slows down when blood glucose goes up, irrespective of the coefficient. The coefficient thus does not carry information that could be interpreted as higher (when <1) or lower (when >1) resistance to glycation (this only can be done in a log-log model, see below) because the semi-log relationship means that glycation increases by a constant amount (expressed by the coefficient of plasma glucose) for every tenfold increase in plasma glucose (for example, with glucose values 10 and 100, the model would predict glycation values 2 and 4 if the coefficient is 2, or 0.5 and 1 if the coefficient is 0.5). Second, the semi-logarithmic relationship could indeed be interpreted such that glycation rates are relatively lower in species with high plasma glucose levels. However, the semi-log relationship is assumed here a priori and forced to the model by log-transforming only glucose level, while not being tested against alternative models, such as: (i) a model with a simple linear relationship (glycation ~ glucose); or (ii) a log-log model (log(glycation) ~ log(glucose)) assuming power function relationship (glycation = a * glucose^b). The latter model would allow for the interpretation of the coefficient (b) as higher (when <1) or lower (when >1) resistance in glycation in species with high glucose levels as suggested by the authors.
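
The arithmetic in this comment can be checked with a minimal sketch. The function names are illustrative and do not come from the manuscript. The semi-log model is assumed to use a base-10 logarithm and a zero intercept, as in the reviewer's own example, and the scale factor a of the power function is set to 1 purely for illustration.

```python
import math

def semilog_glycation(glucose, slope, intercept=0.0):
    """Semi-log model: glycation = intercept + slope * log10(glucose)."""
    return intercept + slope * math.log10(glucose)

def power_glycation(glucose, a, b):
    """Log-log (power-function) model: glycation = a * glucose**b."""
    return a * glucose ** b

def glycation_per_glucose(glucose, a, b):
    """Glycation relative to glucose under the power-function model."""
    return power_glycation(glucose, a, b) / glucose

# Semi-log: a tenfold rise in glucose adds exactly `slope` units of glycation,
# whatever the slope is, so the slope cannot be read as "resistance".
for slope in (2.0, 0.5):
    gain = semilog_glycation(100, slope) - semilog_glycation(10, slope)
    print(f"semi-log, slope {slope}: gain per tenfold glucose = {gain:.2f}")

# Log-log: only here does the exponent b tell whether glycation per unit of
# glucose falls (b < 1), stays constant (b = 1) or rises (b > 1) with glucose.
for b in (0.5, 1.0, 1.5):
    low = glycation_per_glucose(10, 1.0, b)
    high = glycation_per_glucose(100, 1.0, b)
    print(f"log-log, b {b}: glycation/glucose {low:.3f} -> {high:.3f}")
```

Under these assumptions, the semi-log slope only sets the size of the constant gain per tenfold increase in glucose, whereas the log-log exponent b changes the direction in which glycation per unit of glucose moves.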

      Besides, a clear explanation of why glucose is log-transformed when included as a predictor, but not when included as a response variable, is missing.

      We apologize for missing an answer to this part before. Indeed, glucose is always log transformed and this is explained in the text.

      The models in the study do not control for the sampling time (i.e., time latency between capture and blood sampling), which may be an important source of noise because blood glucose increases because of stress following the capture. Although the authors claim that "this change in glucose levels with stress is mostly driven by an increase in variation instead of an increase in average values" (ESM6, line 46), their analysis of Tomasek et al.'s (2022) data set in ESM1 using Kruskal-Wallis rank sum test shows that, compared to baseline glucose levels, stress-induced glucose levels have higher median values, not only higher variation.

      Although the authors calculated the variance inflation factor (VIF) for each model, it is not clear how these were interpreted and considered. In some models, GVIF^(1/(2*Df)) is higher than 1.6, which indicates potentially important collinearity (see, for example, https://www.bookdown.org/rwnahhas/RMPH/mlr-collinearity.html). This is often the case for body mass or clutch mass (e.g. models of glucose or glycation based on individual measurements).
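
To put the 1.6 threshold on a more familiar scale, the sketch below converts GVIF^(1/(2*Df)) back to an ordinary VIF. It relies on the standard identity that, for a predictor with one degree of freedom, GVIF equals VIF = 1/(1 - R^2), where R^2 comes from regressing that predictor on the others. The helper names are hypothetical and the values are illustrative, not taken from the manuscript.

```python
import math

def adjusted_gvif(gvif, df):
    """Return GVIF**(1/(2*df)), the dimension-adjusted GVIF."""
    return gvif ** (1.0 / (2 * df))

def equivalent_vif(adjusted):
    """Square the adjusted value to put it on the ordinary VIF scale."""
    return adjusted ** 2

def r2_from_vif(vif):
    """For a one-df term, VIF = 1 / (1 - R^2); solve for R^2."""
    return 1.0 - 1.0 / vif

threshold = 1.6
vif = equivalent_vif(threshold)
print(f"adjusted GVIF {threshold} ~ VIF {vif:.2f}, R^2 {r2_from_vif(vif):.2f}")
```

Squaring the adjusted value is what makes it comparable across terms with different degrees of freedom, such as a multi-level diet factor and a continuous body mass covariate.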

      It seems that the differences between diet groups other than omnivores (the reference category in the models) were not tested and only inferred using the credible intervals from the models. However, these credible intervals relate to the comparison of each group with the reference group (Omnivore) and cannot be used for pairwise comparisons between other groups. Statistics for these contrasts should be provided instead. Based on the plot in Figure 4B, it seems possible that terrestrial carnivores differed in glycation level not only from omnivores but also from herbivores and frugivores/nectarivores.
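
One way to obtain such contrasts from a fitted Bayesian model is to subtract the two coefficients draw by draw and summarise the differences. The sketch below uses synthetic draws and a simple percentile interval. The draws, the group names and the helper are all hypothetical, and MCMCglmm summaries typically report highest posterior density intervals, which this simplification does not reproduce.

```python
import random

def contrast_interval(draws_a, draws_b, level=0.95):
    """Percentile interval of (a - b), computed draw by draw."""
    diffs = sorted(a - b for a, b in zip(draws_a, draws_b))
    n = len(diffs)
    lower = diffs[int((1.0 - level) / 2.0 * n)]
    upper = diffs[min(n - 1, int((1.0 + level) / 2.0 * n))]
    return lower, upper

# Synthetic posterior draws: both coefficients are expressed relative to the
# same reference level, so they share uncertainty (the `shared` component).
rng = random.Random(1)
shared = [rng.gauss(0.0, 1.0) for _ in range(2000)]
carnivore = [1.0 + s + rng.gauss(0.0, 0.1) for s in shared]
herbivore = [0.2 + s + rng.gauss(0.0, 0.1) for s in shared]

lower, upper = contrast_interval(carnivore, herbivore)
print(f"carnivore - herbivore: [{lower:.2f}, {upper:.2f}]")
```

Because the two coefficients are correlated through the shared reference level, the interval of their difference can be much narrower than either coefficient's own interval, which is why the intervals against the reference group alone cannot settle the comparison.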

      Given that blood glucose is related to maximum lifespan, it would be interesting to also see the results of the model from Table 2 while excluding blood glucose from the predictors. This would allow for assessing if the maximum lifespan is completely independent of glycation levels. Alternatively, there might be a positive correlation mediated by blood glucose levels (based on its positive correlations with both lifespan and glycation), which would be a very interesting finding suggesting that high glycation levels do not preclude the evolution of long lifespans.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Line 84: "glycation scavengers" such as polyamines - can you specify what these polyamines do exactly?

      A clarification of what we mean with "glycation scavengers" is added.

      (2) Line 87-89: specify that the work of Wein et al. and this sentence is about birds.

      This is now clarified.

      (3) Line 95: "88 species" add "OF BIRDS". Also, I think it would be nice if you specified here that you are relying on primary data.

      This is now clarified (line 96).

      (4) Line 90-119: I find this paragraph very long and complex, with too many details on the methodology. For instance, I agree with listing your hypothesis, e.g. that with POL, but then what variables you use to measure the pace of life can go in the materials and methods section (so all lines between 112-119).

      This is explained here as a previous reviewer considered this presentation was indeed needed in the introduction.

      (5) Line 122-124: The first sentence should state that you collected blood samples from various sources, and list some examples: zoos? collaborators? designated wild captures? Stating the sample size before saying what you did to get them is a bit weird. Besides, you skipped a very important detail about how these samples were collected, when, where, and using what protocols. We know very well, that glucose levels can increase quickly with handling stress. Was this considered during the captures? Moreover, you state that you had 484 individuals, but how many samples in total? One per individual or more?

      We kindly ask the reviewer to read the multiple supplementary materials provided, in which the questions of source of the samples, potential stress effects and sample sizes for each model are addressed. All individuals contributed with one sample. More details about the general sources employed are given now in lines 125-127.

      (6) Line 135-36: numbers below 10 should be spelled out.

      Ok. Now that is changed.

      (7) Line 136: the first time I saw that you had both wild and captive samples. This should be among the first things to be described in the methods, as mentioned above.

      As stated above, details on this are included in the supplementary materials, but further clarifications have now been included in the main text (question 5).

      (8) Line 137-138: not clear. So you had 46 samples and 9 species. But what does the 3-3-3 sample mean? or for each species you chose 9 samples (no, cause that would be 81 samples in total)?

      This has now been clarified (lines 139-140).

      (9) Line 139-141: what methodological constraints? Too high glucose levels? Too little plasma?

      There were cases in which the device (glucometer) produced an unspecific error. This did not correspond to too high nor too low glucose levels, as these are differently signalled errors. Neither the manual nor the client service provided useful information to discern the cause. This may perhaps be related to the composition of the plasma of certain species, interfering with the measurement. Some clarifications have been added (lines 143-146).

      (10) Line 143: should be ZIMS.

      Corrected.

      (11) Line 120-148: you generally talk about individuals here, but I feel it would be more precise to use 'samples'.

      The use is totally interchangeable, as we never measured more than one sample for a given individual within this study. Besides, in some cases, saying “sample” could be less informative.

      (12) Line 150: missing the final number of measurements for glucose and glycation.

      Please, read the ESM6 (Table ESM6.1), where this information is given.

      (13) Line 154-155: so you took multiple samples from the same individual? It's the first time the text indicates so. Or do you mean technical replicates were not performed on the same samples?

      As previously indicated, each individual contributed only one sample. Replicates were done only for some individuals to validate the technique, as it would be unfeasible to perform replicates for all of them. This part of the text refers to the fact that not all samples were analysed at the same time, as the analysis takes a considerable amount of time and the mass spectrometry devices are shared with other teams and projects. Clarifications to this effect have now been added (lines 160-163).

      (14) Line 171-172: "After realizing that diet classifications from AVONET were not always suitable for our purpose" - too informal. Try rephrasing, like "After determining that AVONET diet classifications did not align with our research needs...", but you still need to specify what was wrong with it and what was changed, based on what argument?

      The new formulation suggested by the reviewer has now been applied (lines 181-183). The details are given in the ESM6, as indicated in the text. 

      (15) Line 174-176: You start a new paragraph, talking about missing values, but you do not specify what variable are you talking about. you talk about calculating means, but the last variable you mentioned was diet, so it's even more strange.

      We refer to life history traits. It has now been clarified in the text (line 185).

      (16) Line 177: what longevity records? Coming from where? How did you measure longevity? Maximum lifespan ever recorded? 80-90% longevity, life expectancy???

      We refer to maximum lifespan, as indicated in the introduction and in every other case throughout the manuscript. Clarifications have now been introduced (lines 188-190).

      (17) Line 180-183: using ZIMS can be problematic, especially for maximum longevity. There are often individuals who had a wrong date of birth entered or individuals that were failed to be registered as dead. The extremes in this database are often way off. If you want to combine though, you can check the correlation of lifespans obtained from different sources for the overlapping species. If it's a strong correlation it can be ok, but intuitively this is problematic.

      The species for which we used ZIMS were those for which no other database reported any values. We could try correlations for other species, but this issue is not necessarily restricted to ZIMS, as the primary origin of the data in other databases is often difficult to trace. Also, ZIMS is potentially more up to date than some of the other databases, mainly the Amniotes database, on which we rely the most, as it includes the highest number of species in the most easily accessible format.

      (18) Line 181-186: in ZIMS you calculate the average of the competing records, otherwise you choose the max. Why use different preferences for the same data?

      This is a misunderstanding, which we have now clarified (line 196). We were referring here to the fact that for maximum lifespan the maximum is always chosen, while for other variables an average is calculated.

      (19) Line 198: Burn-in and thinning interval is quite low compared to your number of iterations. How were model convergences checked?

      Please see ESM1.

      (20) Line 201-203: What's the argument using these priors? Why not use noninformative ones? Do you have some a priori expectations? If so, it should be explained.

      Models have now been rerun with no expectations on the variance partitions, so that the priors are less informative given the lack of firm expectations, and the results are similar. Smaller nu values were also tried.

      (21) Line 217: "carried" OUT.

      Corrected (now in line 229).

      (22) Line 233-234: "species average model" - what is this? it was not described in the methods.

      Please see ESM6.

      (23) Line 232-246: (a) all this would be better described by a table or plot. You can highlight some interesting patterns, but describing it all in the text is not very useful I think, (b) statistically comparing orders represented by a single species is a bit odd.

      (a) Figure 1 shows this graphically, but previous reviewers found this part too short without the descriptions. (b) We recognise this limitation, but this part is not presented as one of the main results of the article. It is an attempt to illustrate very general patterns in order to guide future research: in most groups glycation has never been measured, so this remains the best illustration of such patterns in the literature.

      (24) Line 281: the first time I saw "mass-adjusted maximum lifespan" - what is this, and how was it calculated? It should be described in the methods. But in any case, neither ratios, nor residuals should be used, but preferably the two variables should be entered side by side in the model.

      Please see ESM6 for the explanations and justifications for all of this.

      (25) Line 281: there was also no mention of quadratic terms so far. How were polynomial effects tested/introduced in the models? Orthogonal polynomials? or x+ x^2?

      Please see ESM6.

      (26) Table 1. What is 'Centred Log10Body mass', should be added in the methods.

      Please see ESM6.

      (27) Table 1: what's the argument behind separating terrestrial and aquatic carnivores?

      This was mostly based on the a priori separation made in AVONET, but it is also used in a similar way by Szarka and Lendvai 2024 (a comparative study on glucose in birds), where differences in glucose levels between piscivorous and carnivorous species are reported. We had some reasons to think that certain differences in dietary nutrient composition, as discussed later, can make this difference relevant.

      (28) Table 1: The variable "Maximum lifespan" is discussed and plotted as 'mass-adjusted maximum lifespan' and 'residual maximum lifespan'. First, this is confusing: the same name should be used throughout and it should be defined in the methods section. Second, it seems that non-linear effects were tested by using x + x^2. This is problematic statistically; orthogonal polynomials should be used instead (check the poly function in R). Also, how did you decide to test for non-linear effects in the case of lifespan but not the other continuous predictors? This should again be described in the methods.

      Please see ESM6. Data exploration was performed prior to carrying out these models. Orthogonal polynomials were considered to hinder the interpretation of the estimates, and therefore of the patterns predicted by the models, so raw polynomials were used. Clarifications have now been included in line 297.
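
      The trade-off discussed in this exchange can be illustrated with a minimal sketch. It is not the authors' analysis: the lifespan values are made up, and the orthogonal terms are built with a plain Gram-Schmidt step, which is the same idea as R's poly up to scaling. It shows that raw x and x^2 are strongly correlated, whereas the orthogonalised terms are not. Both parameterisations span the same column space, so fitted values are identical and only the interpretation of the coefficients changes.

```python
import math

def centered(v):
    """Subtract the mean from every element."""
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    cu, cv = centered(u), centered(v)
    return dot(cu, cv) / math.sqrt(dot(cu, cu) * dot(cv, cv))

# Hypothetical maximum lifespans (years); illustrative values only
lifespan = [3.0, 5.0, 8.0, 12.0, 15.0, 21.0, 30.0, 42.0]

# Raw polynomial terms: x and x^2
raw1 = lifespan
raw2 = [x ** 2 for x in lifespan]

# Orthogonal terms: centre x, then remove from x^2 its mean and
# its projection onto the centred linear term (Gram-Schmidt)
orth1 = centered(raw1)
c2 = centered(raw2)
coef = dot(c2, orth1) / dot(orth1, orth1)
orth2 = [a - coef * b for a, b in zip(c2, orth1)]

r_raw = pearson(raw1, raw2)     # close to 1: strong collinearity
r_orth = pearson(orth1, orth2)  # zero up to rounding error
print(f"raw terms r = {r_raw:.3f}, orthogonal terms r = {r_orth:.1e}")
```

      With raw terms, the linear coefficient is the slope at x = 0 and is easier to read directly, which is the authors' stated reason for keeping them. The cost is the collinearity between the two terms.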

      (29) Figure 2. From the figure label, now I see that relative lifespan is in fact residual. This is problematic, see Freckleton, R. P. (2009). The seven deadly sins of comparative analysis. Journal of evolutionary biology, 22(7), 1367-1375. Using body mass and lifespan side by side is preferred. This would also avoid forcing more emphasis on body mass over lifespan meaning that you subjectively introduce body mass as a key predictor, but lifespan and body size are highly correlated, so by this, you remove a large portion of variance that might in fact be better explained by lifespan.

      Please see ESM6 for justifications on the use of residuals.

      Reviewer #2 (Recommendations for the authors):

      (1) If the semi-logarithmic relationship (glycation ~ log10(glucose)) is to be used to support the hypothesis about higher glycation resistance in species with high blood glucose (lines 318-321 and 386-388), it should be tested whether it is significantly better than the model assuming a simple linear relationship (i.e., glycation ~ glucose). Alternatively, if the coefficient is to be used to determine whether glycation rate slows down or accelerates with increasing glucose levels, log-log model (log10(glycation) ~ log10(glucose)) assuming power function relationship (glycation = a * glucose^b) should be used (as is for example in the literature about relationships between metabolic rates and body size). Probably the best approach would be to compare all three models (linear, semi-logarithmic, and log-log) and test if one performs significantly better. If none of them, then the linear model should be selected as the most parsimonious.

      Different options (linear, both semi-logarithmic combinations and log-log) have now been tested, with similar results. All of the models confirm the pattern of a significant positive relationship between glucose and glycation. Moreover, when standardizing the variables (both glucose and glycation, either log transformed or not), the estimate of the slope is almost equal for all the models. It is also lower than one, which in the case of both the linear and log-log confirms the stated prediction. The log-log model, showing a much lower DIC than the linear version, is now shown as the final model.
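
      The three-way comparison requested by the reviewer can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: it uses ordinary least squares and AIC on simulated data in place of the Bayesian models and DIC reported in the response, and all names and values are hypothetical. Because the log-log model has a transformed response, its log-likelihood is put back on the original glycation scale with a change-of-variables term so that the three criteria are comparable.

```python
import math
import random

def ols_aic(x, y, log_jacobian=0.0):
    """Fit y = a + b*x by least squares and return the Gaussian AIC.

    log_jacobian is added to the log-likelihood when y is a transformed
    response, so that the AIC refers to the original response scale.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / n  # maximum-likelihood variance estimate
    loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1) + log_jacobian
    return 2 * 3 - 2 * loglik  # 3 parameters: intercept, slope, variance

# Simulated data following a power law, glycation = 2 * glucose^0.3,
# with small multiplicative noise (hypothetical values)
random.seed(1)
glucose = [1 + 99 * i / 79 for i in range(80)]
glycation = [2.0 * g ** 0.3 * math.exp(random.gauss(0, 0.01)) for g in glucose]

log_g = [math.log10(g) for g in glucose]
log_y = [math.log10(v) for v in glycation]

# If z = log10(y), then p(y) = p(z) / (y * ln 10)
jac = -sum(math.log(v * math.log(10)) for v in glycation)

aic = {
    "linear": ols_aic(glucose, glycation),
    "semi-log": ols_aic(log_g, glycation),
    "log-log": ols_aic(log_g, log_y, log_jacobian=jac),
}
best = min(aic, key=aic.get)
print(aic, "best:", best)
```

      In the log-log model the slope is the exponent b of the power function, so a value below one corresponds to glycation increasing less than proportionally with glucose.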

      (2) ESM6, line 46: Please note that Kruskal-Wallis rank sum test in ESM1 shows that, compared to baseline glucose levels, stress-induced glucose levels have higher median values (not only higher variation). With this in mind, what is the argument here about increased variation being the main driver of stress-induced change in glucose levels based on? It seems that both the median values and variation differ between baseline and stress-induced levels, and this should be acknowledged here.

      As discussed in the public answers, the Kruskal-Wallis test does not allow differences in means to be determined; it only indicates that the groups are “different” (implicitly, in their rank sums, which does not necessarily mean in their means), while the Levene test performed signals heteroskedasticity. This makes this feature of the data analytically better grounded. Of course, when looking at the data, a higher mean can be perceived, but nothing can be said about its statistical significance. Still, some subtle changes have been introduced in the corresponding section of ESM6.

      (3) Have you recorded the sampling times? If yes, why not control them in the models? It is at least highly advisable to include the sampling times in the data (ESM5).

      As indicated in ESM6 lines 42-43, we do not have sampling times for most of the individuals (only zebra finches and swifts), so this cannot be accounted for in the models.

      (4) If sampling times will remain uncontrolled statistically, I recommend mentioning this fact and its potential consequences (i.e., rather conservative results) in the Methods section of the main text, not only in ESM6.

      A brief description of this has now been included in the main text (lines 129-132), referencing the more detailed discussion in the supplementary materials. Some subtle changes have also been included in the “Possible effects of stress” section of ESM6.

      (5) ESM6, lines 52-53: The lower repeatability in Tomasek et al.'s study compared to your study is irrelevant to the argument about the conservative nature of your results (the difference in repeatability between both studies is most probably due to the broader taxonomic coverage of the current study). The important result in this context is that repeatability is lower when sampling time is not considered within Tomasek et al.'s data set (ESM1). Therefore, I suggest rewording "showing a lower species repeatability than that from our data" to "showing lower species repeatability when sampling time is not considered" to avoid confusion. Please also note that you refer here to species repeatability but, in ESM1, you calculate individual repeatability. Nevertheless, both individual and species repeatabilities are lower when not controlling for sampling time because the main driver, in that case, is an increased residual variance.

      We recognize that the explanation as currently presented is confusing, and have substantially reworded the section. However, we would like to point out that ESM1 shows both species and individual repeatability (for the Tomasek et al. 2022 data; for ours, only species repeatability, as we do not have repeated individual values). Changes have now been made to make this clearer.

      (6) I recommend providing brief guidelines for the interpretation of VIFs to the readers, as well as a brief discussion of the obtained values and their potential importance.

      Thank you for the recommendation. We have included a brief description in lines 230-231 and in the results section (lines 389-393).

      (7) Line 264: Please note that the variance explained by phylogeny obtained from the models with other (fixed) predictors does not relate to the traits (glucose or glycation) per se but to model residuals.

      We appreciate the indication, and this has been rephrased accordingly (lines 280-286).

      (8) Change the term "confidence intervals" to "credible intervals" throughout the paper, since confidence interval is a frequentist term and its interpretations are different from Bayesian credible interval.

      Thank you for the remark, this has now been changed.

      (9) Besides lifespan, have you also considered quadratic terms for body mass? The plot in Figure 2A suggests there might be a non-linear relationship too.

      A quadratic component of body mass has not shown any significant effect on glucose in an alternative model. Also, a model with linear instead of log glucose (as performed in other studies) did not perform better by comparing the DICs, despite both showing a significant relationship between glucose and body mass. Therefore, this model remains the best option considered as presented in the manuscript.

      (10) ESM6, lines 115-116: It is usually recommended that only factors with at least 6 or 8 levels are included as random effects because a lower number of levels is insufficient for a good estimation of variance.

      In a Bayesian approach this does not apply, as random and fixed factors are estimated similarly. 

      (11) Typos and other minor issues:

      a) Line 66: Delete "related".

      b) Figure 2: "B" label is missing in the plot.

      c) Reference 9: Delete "Author".

      d) References 15 and 83 are duplicated. Keep only ref. 83, which has the correct citation details.

      e) ESM6, line 49: Change "GLLM" to "GLMM".

      Thank you for pointing these out. They have now been corrected.

    1. eLife Assessment

      This important study introduces a fully differentiable variant of the Gillespie algorithm as an approximate stochastic simulation scheme for complex chemical reaction networks, allowing kinetic parameters to be inferred from empirical measurements of network outputs using gradient descent. The concept and algorithm design are convincing and innovative. While the proofs of concept are promising, some questions are left open about implications for more complex systems that cannot be addressed by existing methods. This work has the potential to be of significant interest to a broad audience of quantitative and synthetic biologists.

    2. Reviewer #1 (Public review):

      Summary:

      This work introduces the differentiable Gillespie algorithm, DGA, which is a differentiable variant of the celebrated (and exact) Gillespie algorithm commonly used to perform stochastic simulations across numerous fields, notably in the life sciences. The proposed DGA approximates the exact Gillespie algorithm using smooth functions, yielding a suitable approximate differentiable stochastic system as a proxy for the underlying discrete stochastic system, where DGA stochastic reactions have continuous reaction indices and species abundances. To illustrate their methodology, the authors specifically consider in detail the case of a well-studied two-state promoter gene regulation system that they analyze using a machine learning approach, and by combining simulation data with analytical results. For the two-state promoter gene system, the DGA is benchmarked by accurately reproducing the results of the exact Gillespie algorithm. For this same simple system, the authors also show how the DGA can be used for estimating kinetic parameters of both simulated and real noisy experimental data. This lets them argue convincingly that the DGA can become a powerful computation tool for applications in quantitative and synthetic biology. In order to argue that the DGA can be employed to design circuits with ad-hoc input-output relations, these considerations are then extended to a more complex four-state promoter model of gene regulation. The main strength of the paper is its clarity and its pedagogical presentation of the simulation methods.

      Strengths:

      The main strength of the paper is its clarity and its pedagogical presentation of the simulation methods.

      Weaknesses:

      It would have been useful to have a brief discussion, based on a concrete example, of what can be achieved with the DGA and is totally beyond the reach of the Gillespie algorithm and the numerous existing stochastic simulation methods. A more comprehensive and quantitative analysis of the limitations of the DGA, e.g. for rare events, and how it might be used for stochastic spatial systems would have also been helpful. However, this is arguably beyond the scope of this study whose primary goal is to introduce the DGA and demonstrate that it can achieve tasks like parameter estimation and network design.

      Comments on revisions:

      The authors have made a sound effort to address many of the comments raised in the previous reports. This has helped improve the clarity of the discussion.

    3. Reviewer #2 (Public review):

      Summary:

      In this work, the authors present a differentiable version of the widely-used Gillespie Algorithm. The Gillespie Algorithm has been used for decades to simulate the behavior of stochastic biochemical reaction networks. But while the Gillespie Algorithm is a powerful tool for the forward simulation of biochemical systems given some set of known reaction parameters, it cannot be used for the reverse process, i.e. inferring reaction parameters given a set of measured system characteristics. The Differentiable Gillespie Algorithm ("DGA") overcomes this limitation by approximating two discontinuous steps in the Gillespie Algorithm with continuous functions. This makes it possible to calculate gradients for each step in the simulation process which, in turn, allows the reaction parameters to be optimized via powerful backpropagation techniques. In addition to describing the theoretical underpinnings of DGA, the authors demonstrate different potential use-cases for the algorithm in the context of simple models of stochastic gene expression.
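
      The idea of replacing a discontinuous step with a continuous one can be illustrated with a minimal sketch of the reaction-selection step. It is not the authors' implementation: the choice of a sigmoid as the smooth stand-in for the step function, the `sharpness` parameter and the propensity values are all assumptions made here for illustration, and the sketch uses plain Python rather than an automatic-differentiation library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hard_select(u, propensities):
    """Exact Gillespie step: index of the bin of cumulative
    reaction probabilities that contains the uniform draw u."""
    total = sum(propensities)
    edge = 0.0
    for i, a in enumerate(propensities):
        edge += a / total
        if u < edge:
            return i
    return len(propensities) - 1

def soft_select(u, propensities, sharpness):
    """Smooth relaxation: each bin's indicator H(u - lo) - H(u - hi)
    is evaluated with the step function H replaced by a sigmoid,
    giving continuous weights instead of a single index."""
    total = sum(propensities)
    weights, lo = [], 0.0
    for a in propensities:
        hi = lo + a / total
        weights.append(
            sigmoid(sharpness * (u - lo)) - sigmoid(sharpness * (u - hi))
        )
        lo = hi
    return weights

propensities = [2.0, 1.0, 1.0]  # hypothetical reaction propensities
u = 0.6                         # uniform draw, fixed for reproducibility

hard = hard_select(u, propensities)
soft_sharp = soft_select(u, propensities, sharpness=200.0)
soft_smooth = soft_select(u, propensities, sharpness=5.0)
print(hard, soft_sharp, soft_smooth)
```

      A large `sharpness` recovers the exact choice almost perfectly but gives nearly flat gradients away from the bin edges, whereas a small value spreads weight across several reactions and departs further from the exact algorithm. Under these assumptions, that trade-off is one way to read the questions raised below about when the approximation remains valid.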

      Overall, the DGA represents an important conceptual step forward for the field and should lay the groundwork for exciting innovations in the analysis and design of stochastic reaction networks. At the same time, significantly more work is needed to establish when the approximations made by DGA are valid and to demonstrate the viability of the algorithm in the context of complicated reaction networks.

      Strengths:

      This work makes an important conceptual leap by introducing a version of the Gillespie Algorithm that is end-to-end differentiable. This idea alone has the potential to drive a number of exciting innovations in the analysis, inference, and design of biochemical reaction networks. Beyond the theoretical adjustments, the authors also implement their algorithm in a Python-based codebase that combines DGA with powerful optimization libraries like PyTorch. This codebase has the potential to be of interest to a wide range of researchers, even if the true scope of the method's applicability remains to be fully determined.

      The authors also demonstrate how DGA can be used in practice both to infer reaction parameters from real experimental data (Figure 7) and to design networks with user-specified input-output characteristics (Figure 8). These illustrations should provide a nice roadmap for researchers interested in applying DGA to their own projects/systems.

      Finally, although it does not stem directly from DGA, the exploration of pairwise parameter dependencies in different network architectures provides an interesting window into the design constraints (or lack thereof) that shape the architecture of biochemical reaction networks.

      Weaknesses:

      While it is clear that the DGA represents an important conceptual advancement, the authors do not do enough in the present manuscript to (i) validate the robustness of DGA inference and (ii) demonstrate that DGA inference works in the kinds of complex biochemical networks where it would actually be of legitimate use.

      It is to the authors' credit that they are open and explicit about the potential limitations of DGA due to breakdowns in its continuous approximations. However, they do not provide the reader with nearly enough empirical (i.e. simulation-based) or theoretical context to assess when, why, and to what extent DGA will fail in different situations. In Figure 2, they compare DGA to GA (i.e. ground-truth) in the context of a simple two-state model of stochastic transcription. Even in this minimal system, we see that DGA deviates notably from ground-truth both in the simulated mRNA distributions (Figure 2A) and in the ON/OFF state occupancy (Figure 2C). This raises the question of how DGA will scale to more complicated systems, or systems with non-steady state dynamics. Will the deviations become more severe? This is important because, in practice, there is really not much need for using DGA with a simple two-state system: we have analytic solutions for this case. It is the more complex systems where DGA has the potential to move the needle.

      A second concern is that the authors' present approach for parameter inference and error calculation does not seem to be reliable. For example, in Figure 5A, they show DGA inference results for the ON rate of a two-state system. We see substantial inference errors, even though the inference problem should be non-degenerate in this case. One reason for this seems to be that the inference algorithm does not reliably find the global minimum of the loss function (Figure 2B). To turn DGA into a viable approach, it is paramount that the authors find some way to improve this behavior, perhaps by using multiple random initializations to better search the loss space.
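
      The multiple-initialization strategy suggested above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not part of the manuscript: the one-dimensional loss, its gradient, the learning rate and the number of restarts are all hypothetical and chosen only so that a local and a global minimum coexist.

```python
import random

def loss(x):
    # Toy loss with a local minimum near x = 0.96
    # and the global minimum near x = -1.04
    return (x * x - 1.0) ** 2 + 0.3 * x

def grad(x):
    # Analytical derivative of the toy loss
    return 4.0 * x * (x * x - 1.0) + 0.3

def descend(x, lr=0.01, steps=2000):
    """Plain gradient descent from a single starting point."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

random.seed(0)
starts = [random.uniform(-2.0, 2.0) for _ in range(20)]
finals = [descend(x0) for x0 in starts]

# Keep the run that reaches the lowest loss
best_x = min(finals, key=loss)
n_local = sum(1 for x in finals if x > 0)
print(f"best x = {best_x:.4f}, loss = {loss(best_x):.4f}, "
      f"{n_local} of {len(finals)} runs ended in the local minimum")
```

      Any single run started in the right-hand basin stops at the local minimum, so reporting the best of several runs, together with the spread of their final losses, gives a rough indication of how reliably the optimizer finds the global minimum.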

      Finally, the authors do a good job of illustrating how DGA might be used to infer biological parameters (Figure 7) and design reaction networks with desired input-output characteristics (Figure 8). However, analytic solutions exist for both of the systems they select for examples. This means that, in practice, there would be no need for DGA in these contexts, since one could directly optimize, e.g., the expressions for the mean and Fano Factor of the system in Figure 7A. I still believe that it is useful to have these examples, but it seems critical to add a use-case where DGA is the only option.

      Comments on revisions:

      I am concerned that the results in Figure 8D may not be correct, or that the authors may be misinterpreting them. From my reading of the paper they cite (Lammers & Flamholz 2023), the equilibrium sharpness limit for the network they consider in Figure 8 should be 0.25. But both solutions shown in Figure 8D fall below this limit, which means that they have sharpness levels that could have been achieved with no energy expenditure. If this is the case, then it would imply that while both systems do dissipate energy, they are not doing so productively, meaning that the same results could be achieved while holding Phi=0.

      I acknowledge that this could be due to a difference in how they measure sharpness, but wanted to raise it here in case it is, in fact, a genuine issue with the analysis.

      There should be an easy fix for this: just set the sharper "desired response" curve in Figure 8B such that it demands non-equilibrium sharpness levels (above 0.25).