    1. Author Response

      Reviewer #1 (Public Review):

      In this study, the protein composition of exocytotic sites in dopaminergic neurons is investigated. While extensive data are available for both glutamatergic and GABA-ergic synapses, it is far less clear which of the known proteins (particularly proteins localized to the active zone) are also required for dopamine release, and whether proteins are involved that are not found in "classical" synapses. The approach used here uses proximity ligation to tag proteins close to synaptic release sites by using three presynaptic proteins (ELKS, RIM, and the beta4-subunit of the voltage-gated calcium channel) as "baits". Fusion proteins containing BirA were selectively expressed in striatal dopaminergic neurons, followed by in-vivo biotin labelling, isolation of biotinylated proteins and proteomics, using proteins labelled after expression of a soluble BirA construct in dopaminergic neurons as reference. As controls, the same experiments were performed in KO-mouse lines in which the presynaptic scaffolding protein RIM or the calcium sensor synaptotagmin 1 were selectively deleted in dopaminergic neurons. To control for specificity, the proteomes were compared with those obtained by expressing a soluble BirA construct. The authors found selective enrichments of synaptic and other proteins that were disrupted in RIM but not Syt1 KO animals, with some overlap between the different baits, thus providing a novel and useful dataset to better understand the composition of dopaminergic release sites.

      Technically, the work is clearly state-of-the-art, cutting-edge, and of high quality, and I have no suggestions for experimental improvements.

      We thank the reviewer for this summary and for pointing out the high quality of the work.

      On the other hand, the data also show the limitations of the approach, and I suggest that the authors discuss these limitations in more detail. The problem is that there is very likely to be a lot of non-specific noise (for multiple reasons) and thus the enriched proteins certainly represent candidates for the interactome in the presynaptic network, but without further corroboration it cannot be claimed that as a whole they all belong to the proteome of the release site.

      We fully agree with the reviewer. Most importantly, we have changed the final section from “Conclusions” to “Summary of conclusions and limitations” (lines 501-518) to summarize the limitations with equal weight to the conclusions. In the revised manuscript, we also included many specific additional points in this respect throughout the discussion and the results: many hits could be noise (lines 458, 478-479), thresholding affects the inclusion of proteins in the release site dataset (lines 208-215), the seven-day time window could deliver interactors from the soma to the synapse (lines 493-495), specific oddities (for example histones, lines 482-485), iBioID does not deliver an interactome per se but is simply based on proximity (lines 505-508), and several more. We also clearly state that each specific hit needs follow-up studies (lines 501-503: “Each protein will require validation through morphological and functional characterization before an unequivocal assignment to dopamine release sites is possible.”), and a similar statement was added on lines 374-375.

      Reviewer #2 (Public Review):

      The Kaiser lab has been at the forefront of understanding the mechanism of dopamine release in central mammalian neurons. Assessing dopamine neuron function has been quite difficult due to the limited experimental access to these neurons. Dopamine neurons possess a number of unique functional roles and participate in several pathophysiological conditions, making them an important target of basic research. This study has been designed to describe the proteome of the dopamine release apparatus using proximity biotin labeling via active zone protein domains fused to BirA, to test in which ways its proteome composition is similar or different to other central nerve terminals. The control experiments demonstrating proper localization as well as specificity of biotinylation are very solid, yielding a highly enriched and well characterized proteome database. Several new proteins were identified and the database will very likely be a very useful resource for future analysis of the protein composition of synapses and their function at dopamine and other synapses.

      We thank the reviewer for this positive assessment of our work.

      Major comment:

      The authors find that loss of RIM leads to a major reduction in the number of synaptically enriched proteins, while they did not see this loss of number of enriched proteins in the Syt1-KO's, arguing for an undisrupted synaptome. Maybe I missed this, but which fraction of proteins and synaptic proteins are then co-detected both in the Syt1 and control conditions when comparing the Venn diagrams of Fig2 and Fig 3 Suppl. 2? This analysis may provide an estimate of the reliability of the method across experimental conditions.

      We thank the reviewer for proposing to be clear in the comparison of the control and Syt-1 cKODA data. A direct comparison of hit numbers is included on lines 323-324, with 37% overlap between control and Syt-1 cKODA (vs. 15% between control and RIM cKODA). A direct mapping of this overlap is included in Fig. 4E. We think that this direct comparison is complicated by a number of factors, as outlined below, and did our best to include these complications in the discussion, including the last section (lines 501-518).

      First, to assess overall similarity, the initial comparison should be to assess axonal proteins identified in the BirA-tdTomato samples. These datasets are quite similar, with 671 (control) and 793 (Syt-1 cKODA) proteins detected, and a high overlap of 601 proteins. We think that this indicates that the experiment per se is quite reproducible. The comparison of the release site proteome between control and Syt-1 cKODA is more complicated. We think that the main point of this comparison is that the overall number of hits is quite similar, with 450 hits in the Syt-1 cKODA proteome and 527 hits in the control proteome, and we now show that this similarity holds across multiple thresholds (lines 298-301; ≥ 1.5: Syt-1 cKODA 602 hits, control 991, ≥ 2.0: 450/527, ≥ 2.5: 252/348). Detailed analysis of the overlap reveals that known active zone proteins such as Bassoon, CaV2 channels, RIMs, and ELKS proteins are present in both proteomes, but the overlap is partial and incomplete with 191 proteins found in both proteomes. As discussed throughout and summarized on lines 501-518, the reasons for this partial overlap may be manifold. Trivially, it could be explained by noise or non-saturation (“incompleteness”) of the proteome. We also think that the Syt-1 proteome is not expected to be identical because there is a strong release deficit in these mice. If Syt-1 has a dopamine vesicle docking function (which it does at conventional synapses [4]), this could influence the proteome. We note that protein functions in the dopamine axon are not well established, but inferred from studies of classical synapses.
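
      The overlap counts quoted above can be turned into fractions in more than one way, and the choice of denominator changes the number. The following is a minimal sketch, not the analysis used in the manuscript: the function name `overlap_summary` and its output keys are hypothetical, and only the counts (671, 793, 601 and 527, 450, 191) are taken from the response.

```python
def overlap_summary(n_a: int, n_b: int, n_shared: int) -> dict:
    """Express the overlap of two protein lists relative to three denominators.

    n_a, n_b  -- sizes of the two lists (e.g. control and Syt-1 cKODA)
    n_shared  -- number of proteins present in both lists
    """
    if n_shared > min(n_a, n_b):
        raise ValueError("shared count cannot exceed the smaller list")
    union = n_a + n_b - n_shared
    return {
        "shared_over_a": n_shared / n_a,   # fraction of list A also found in B
        "shared_over_b": n_shared / n_b,   # fraction of list B also found in A
        "jaccard": n_shared / union,       # shared proteins over all distinct proteins
    }


# Counts as reported in the response (control first, Syt-1 cKODA second).
axonal = overlap_summary(671, 793, 601)        # BirA-tdTomato axonal proteomes
release_site = overlap_summary(527, 450, 191)  # release site hits at the 2.0 threshold

for label, summary in (("axonal", axonal), ("release site", release_site)):
    print(label, {key: round(value, 3) for key, value in summary.items()})
```

      Because each denominator answers a different question, percentages derived this way need not coincide with those quoted in the manuscript, whose exact calculation is not specified here.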

      We have scrutinized the manuscript so that it does not state that the control and Syt-1 cKODA proteomes are identical; we know they are not and discuss the example of α-synuclein specifically (Fig. 6, lines 347-362). Rather, the striking part is that the two proteomes are similar in extent, each with a high hit number that is much higher than in RIM cKODA. Specific hits have to be assessed in a detailed way, one hit at a time, in future studies, as expressed unequivocally on lines 501-503.

      Reviewer #3 (Public Review):

      In this study Kershberg et al use three novel in vivo biotin-identification (iBioID) approaches in mice to isolate and identify proteins of axonal dopamine release sites. By dissecting the striatum, where dopamine axons are, from the substantia nigra and VTA, where dopamine somata are, the authors selectively analyzed axonal compartments. Perturbation studies were designed by crossing the iBioID lines with null mutant mice. Combining the data from these three independent iBioID approaches and the fact that axonal compartments are separated from somata provides a precise and valuable description of the protein composition of these release sites, with many new proteins not previously associated with synaptic release sites. These data are a valuable resource for future experiments on dopamine release mechanisms in the CNS and the organization of the release sites. The BirA (BioID) tags are carefully positioned in three target proteins so as not to affect their localization/function. Data analysis and visualization are excellent. Combining the new iBioID approaches with existing null mutant mice produces powerful perturbation experiments that lead to strong conclusions on the role of RIM1 as a central organizer of dopamine release sites and to unexpected (and unexplained) new findings on how RIM1 and synaptotagmin1 are both required for the accumulation of alpha-synuclein at dopamine release sites.

      We thank the reviewer for assessing our paper, for summarizing our main findings, and for expressing genuine enthusiasm for the approach and the outcomes.

      It is not entirely clear how certain decisions made by the authors on data thresholds may affect the overall picture emerging from their analyses. This is a purely hypothesis-generating study. The authors made little effort to define expectations and compare their results to these. Consequently, there is little guidance on how to interpret the data and how decisions made by the authors affect the overall conclusions. For instance, the collection of proteins tagged by all three tagging strategies (Fig 2) is expected to contain all known components of dopamine release sites (not at all the case), and maybe also synaptic vesicles (2 TM components detected, but not the most well-known components like vSNAREs and H+/DA-transporters), and endocytic machinery (only 2 endophilin orthologs detected). Whether or not a more complete collection of the components of release sites, synaptic vesicles or endocytic machinery is observed might depend on two hard thresholds applied in this study: (a) "Hits" (depicted in Fig 2) were defined as proteins enriched ≥ 2-fold (line 178) and (b) peptides not detected in the negative control (soluble BirA) were defined as 0.5 (line 175). How crucial are these two decisions? It would be great to know if the overall conclusions change if these decisions were made differently.

      We agree with the reviewer that the thresholding decisions are important and have now better incorporated the rationale for these decisions in the manuscript.

      Two-fold enrichment threshold. As outlined in the response to point 1 in the editorial decision letter, we now include figure supplements to illustrate the composition of the control proteome if we apply 1.5- or 2.5-fold enrichment thresholds (Fig. 2 – figure supplements 1 and 2) instead of the 2.0-fold threshold used in Fig. 2. This leads to more or fewer hits (991 and 348, respectively) compared to the 2.0-fold threshold (527 hits). It is noteworthy that the SynGO-overlap is the highest with the 2.0 threshold (37% vs. 31% at 1.5 and 33% at 2.5, Fig. 2 – figure supplement 3), justifying this threshold experimentally in addition to what was done in previous work [1,2]. These data are now described on lines 208-215 of the manuscript. When we apply these different thresholds to RIM and Syt-1 cKODA datasets, the finding that RIM ablation disrupts release site assembly persists. The following hit numbers were observed in the mutants at the 1.5, 2.0 and 2.5 enrichment thresholds, respectively: RIM cKODA 268, 198 and 82 hits; Syt-1 cKODA 602, 450 and 252 hits. Hence, the extent of the release site proteome remains much smaller after RIM ablation independent of the enrichment threshold, bolstering the conclusion that RIM is an important scaffold for these release sites. This is included in the revised manuscript on lines 298-301.

      Undetected peptides in BirA-tdTomato. We did not express this well enough in the manuscript. The undetected proteins were set to 0.5 such that a protein that was detected with a specific bait but not with BirA-tdTomato could be illustrated with a specific circle size, not to determine inclusion in the analyses. If the average peptide count across repeats with a specific bait was 1, this resulted in inclusion in Fig. 2 and consecutive analyses with the smallest circle size. Hence, this decision was made to define circle size. It did not affect inclusion in Fig. 2 beyond the following two points. If one were to further decrease it, this might result in including peptides that only appeared once as a single peptide for some of the experiments, which we wanted to avoid. If one would set it higher (to 1), this artificial threshold would be equal to proteins that were actually detected experimentally multiple times, which we wanted to avoid as well. We have now clarified this on lines 165-167 and lines 1119-1121.
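
      The interplay between the 0.5 placeholder and the fold-enrichment threshold can be illustrated with a small sketch. It assumes that enrichment is simply the mean bait peptide count divided by the BirA-tdTomato count, with 0.5 substituted when the control count is zero; the function names, the protein labels and the toy counts are hypothetical and are not taken from the dataset.

```python
CONTROL_FLOOR = 0.5  # placeholder used when a protein is undetected in the control


def fold_enrichment(bait_mean: float, control_mean: float,
                    floor: float = CONTROL_FLOOR) -> float:
    """Ratio of bait to control counts, using the floor for undetected controls."""
    denominator = control_mean if control_mean > 0 else floor
    return bait_mean / denominator


def call_hits(counts: dict, threshold: float = 2.0) -> set:
    """Return the proteins whose enrichment meets or exceeds the threshold."""
    return {
        name
        for name, (bait_mean, control_mean) in counts.items()
        if fold_enrichment(bait_mean, control_mean) >= threshold
    }


# Toy data: protein -> (mean peptide count with bait, count with BirA-tdTomato).
toy_counts = {
    "protA": (1.0, 0.0),   # undetected in control: 1.0 / 0.5 = 2.0
    "protB": (3.0, 2.0),   # 1.5-fold
    "protC": (10.0, 4.0),  # 2.5-fold
    "protD": (0.4, 0.0),   # undetected in control: 0.4 / 0.5 = 0.8
}

for threshold in (1.5, 2.0, 2.5):
    print(threshold, sorted(call_hits(toy_counts, threshold)))
```

      In this toy case, a protein with an average bait count of 1 that is absent from the control sits exactly at the 2.0-fold threshold, which mirrors the inclusion rule described above. Raising the placeholder to 1 would drop it from the 2.0-fold list.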

      Expected proteins. In general, interpreting our dataset with a strong prior of expected proteins is difficult. The literature on release site proteins specifically characterized for dopamine is limited. We have found Bassoon, RIM, ELKS and Munc13 to be present using 3D-SIM superresolution microscopy [5,6], and we indeed found these proteins in the data as discussed on lines 227-232 and lines 423-445 in the revised manuscript. The prediction for vesicular and endocytic proteins is complicated. Release sites are sparse [5,7], and vesicle clusters are widespread in the dopamine axon, in some cases filling most of the axon (for example, see extended vesicle clusters filling much of the dopamine axon in Fig. 7E of [5]). Furthermore, docking in dopamine axons has not been characterized, and it is unclear how frequently vesicles are docked. Hence, it is not clear whether vesicular proteins should be concentrated at release sites compared to the rest of the axon (the BirA-tdTomato proteome we use for normalization). Similar points can be made for proteins for endocytosis and recycling of dopamine vesicles. Within the dopamine system, it is unclear whether the recycling pathway is close to the exocytic sites. One consistent finding across functional studies is that depletion after activity is unusually long-lasting in the dopamine system, for tens of seconds, even after only mild stimulation [5,8–13]. Hence, endocytosis and RRP replenishment might be very slow in these axons. It is not certain that endocytic factors are predeployed to the plasma membrane, and if they are, it is unclear how close to release sites they would be. As such, we agree with the reviewer that the proteome we describe is a hypothesis generator. With the limited knowledge on dopamine release, predictions beyond the previously characterized proteins in dopamine axons are difficult to make.

      We thank the reviewer for suggesting to include a better analysis of different thresholds and for giving us the opportunity to clarify the other points that were raised.

      Given the good separation of the axonal compartment from the somata (one of the real experimental strengths of this study), it is completely unexpected to find two histones being enriched with all three tagging strategies (Hist1h1d and 1h4a). This should be mentioned and discussed.

      We agree with the reviewer and have addressed this point in the manuscript. This could either reflect noise, or there could be more specific reasons behind it. The manuscript now states on lines 482-485: “It is surprising that Hist1h1d and Hist1h4a, genes encoding for the histone proteins H1.3 and H4, were robustly enriched (Fig. 2A). These hits might be entirely unspecific, or their co-purification could be due to biotinylation of H1 and H4 proteins (Stanley et al., 2001). It is also possible that there are unidentified synaptic functions of some of the unexpected proteins.” Ultimately, we do not know why these proteins are enriched, and we state clearly in the section “Summary of conclusions and limitations” that each new hit has to be validated in future studies (lines 501-503).

      It would also help to compare the data more systematically to a previous study that attempted to define release sites (albeit not dopamine release sites) using a different methodology (biochemical purification): Boyken et al (only mentioned in relation to Nptn, but other proteins are observed in both studies too, e.g. Cend1).

      We agree with the reviewer that Boyken et al, 2013 [14] is an important resource for our paper and for the assessment of the proteomic composition of release sites. We have now introduced links and citations to this paper multiple times (for example, on lines 231, 241, 430, 443, 481) and have expanded the discussion of overlap between these proteomes, including on Cend1 (lines 479-482).

      We think that a systematic comparison with Boyken et al, 2013 [14] is complicated because (1) so little is known about dopamine release mechanics and (2) because the approach is very different between the two papers. In respect to (1), most prominently, it is not certain how frequently vesicles are docked in the dopamine axon. Only ~25% of the varicosities contain these release sites, and vesicle docking has not been characterized in striatal dopamine axons to the best of our knowledge. Hence, how a docking site at a classical synapse compares to a dopamine release site remains unclear at the outset. For point (2), the key difference is that “within dataset normalizations” are very different in these two studies. In our iBioID dataset, we normalize to soluble proteins defined as proximity to BirA-tdTomato. In ref. [14], the authors express enrichment over “light”, regular synaptic vesicles purified with the same approach. This has a major impact on the proteome that strongly influences a direct comparison of hits, because there are large differences in the normalization. While each normalization makes sense for the respective paper, it complicates direct comparison.

      With these points in mind, we have compared hits across both datasets class-by-class. For some classes, the datasets have reasonable overlap for ≥ 2-fold enriched proteins: for example for active zone proteins (3 of 7 hits in [14] appear in our control proteome) and adhesion and cell surface proteins (8 of 18). For other classes, the overlap is limited: for example for nucleotide metabolism/protein synthesis (0 of 16 hits in [14] appear in our dataset) and cytoskeletal proteins (5 of 29). We hope the reviewer agrees that, given these factors, the analyses and discussion needed for a systematic comparison go beyond the scope of our paper. We have instead added a number of references to Boyken et al., 2013 [14], as outlined above, when direct comparison is meaningful.

    1. eLife assessment

      This fundamental study from Gold and colleagues substantially advances our understanding of the synaptic targeting of a major postsynaptic protein kinase, CaMKII, which is the basis for the persistence of excitatory synaptic strength in synaptic plasticity. The evidence supporting the claims of the authors is convincing, with cell biological, biochemical, as well as structural biological approaches. This work will be of interest to cell and computational biologists working on learning/memory.

    2. Reviewer #1 (Public Review):

      The manuscript by Curtis et al. reports the interaction between CaMKII and alpha-actinin-2. The authors found that the interaction was elevated after NMDA receptor activation in dendritic spines. In addition, this study reveals that NMDA receptor binding to CaMKII facilitates alpha-actinin-2 access to the CaMKII regulatory segment, indicating that the NMDA receptor is involved in this interaction. The authors identified that the EF1-4 motifs mediate this interaction, and overexpression of this motif inhibited structural LTP. Moreover, biochemical measurements of affinities from various combinations of protein fragments, including autoinhibited CaMKII 1-315, regulatory segments of CaMKII, and the EF-hand motif, reveal that autoinhibited CaMKII has limited access to alpha-actinin-2. The authors also solved the structure of the interaction, supporting their finding in neurons at the molecular level. The authors claim that the interaction between CaMKII and alpha-actinin-2 is essential for structural LTP through cooperative action by the NMDA receptor and actin cytoskeleton.

      Overall, the experiments are well-designed and the results are largely convincing and well-interpreted. But some aspects of the experiments need to be clarified.

      1. Time resolution of the interaction analysis appears to be poor, as calcium elevation in a dendritic spine would be on the order of milliseconds. What is the time window for alpha-actinin-2 to interact with CaMKII during NMDA receptor activation or LTP?
      2. The authors analyzed the binding of CaMKII and alpha-actinin-2 with partial fragments. It remains unknown whether CaMKII can form a protein complex with GluN2B and alpha-actinin-2 in a single CaMKII protomer.
      3. Besides synaptic localization, the effect of the interaction on the enzymatic activity of CaMKII is not known.
      4. Although the authors quantify the effect of the EF-hand disruptor by measuring numbers of dendritic spines by shape, the specificity of the EF-hand disruptor needs to be clarified.

    3. Reviewer #2 (Public Review):

      Gold and his colleagues first ectopically expressed aACTN2 constructs with various deletions and determined the spatial proximity to CaMKII by PLA. Chemical LTP induced by brief glycine application in hippocampal cultures strongly augmented the PLA puncta density in spines (postsynaptic sites). This interaction specifically depended on the 4 EF hands near the C-terminus of aACTN. At the same time, expression of the 4 EF hands (plus the C-terminal PDZ ligand) impaired the formation of larger mushroom spines under unstimulated conditions and the increase in mushroom spines seen after chemLTP when compared to non-transfected conditions or transfection of the EF hands with a point mutation (L854R) that disrupted binding to CaMKII.

      To further define the interaction between aACTN and CaMKII the authors then solved a crystal structure formed by the aACTN EF3/4 and regulatory segment of CaMKII. This structure confirmed the role of L854 in the interaction. It also explained earlier results that phosphorylation of threonine in position 306 but not of threonine 305 of the CaMKII regulatory domain impaired aACTN binding as T306 but not T305 is engaged in critical interactions. This contrasts with Ca/CaM binding to CaMKII, which engages both threonines and is blocked by the phosphorylation of either residue. Consistently, earlier structures of Ca/CaM with the CaMKII regulatory domains showed respective differences to the new aACTN-CaMKII structure.

      Additional analysis of these data indicated that the association of the regulatory domain with the kinase domain occludes access to aACTN EF3/4. This is an important finding because it implies that only active CaMKII like T286 autophosphorylated CaMKII or bound to GluN2B would be able to effectively interact with aACTN in intact cells.

      Finally, and remarkably, binding was augmented by a protein fragment of the GluN2B C-terminus that contains the binding site for CaMKII even when Ca/CaM was still present. This result suggests that with GluN2B present aACTN can bind to CaMKII even though in the absence of GluN2B Ca/CaM occludes this binding. This finding opens up new research directions.

    4. Reviewer #3 (Public Review):

      This manuscript builds upon prior work showing that alpha-actinin-2 binds to the regulatory domain of the major postsynaptic protein kinase, CaMKII. The authors report the structure of a complex between the relevant domain in alpha-actinin-2 and a peptide based on the CaMKII regulatory domain. Data are presented indicating that the interaction of the NMDA receptor GluN2B subunit with the CaMKII catalytic domain stabilizes the complex with alpha-actinin-2. Furthermore, the authors present proximity ligation assay (PLA) data obtained in cultured neurons demonstrating that NMDA receptor activation strongly enhances the colocalization of CaMKII with alpha-actinin-2. Data obtained using mutated proteins indicate that this co-localization is mediated by the interaction characterized structurally.

      Strengths:

      Significant strengths of this work are:
      1. The high-quality structures of the complex that are reported.
      2. Integration of these findings with the much better-studied complex of CaMKII and GluN2B.
      3. The convincing PLA analyses show that NMDA receptor activation increases CaMKII colocalization with alpha-actinin-2.
      4. The careful comparisons of data from these new studies with data reported in previous publications.

      Weaknesses:

      Despite the significant strengths of the work, there are some gaps/weaknesses.
      1. Although there is abundant published evidence that activated CaMKII colocalizes with NMDA receptors, the evidence for the involvement of GluN2B in the CaMKII-alpha-actinin-2 complex in neurons is lacking.
      2. The evidence supporting a role for the EF1 and EF2 domains of alpha-actinin-2 in binding to CaMKII is not very convincing.
      3. CaMKII autophosphorylation at multiple sites plays an important role in regulating the subcellular localization of CaMKII, but the role of autophosphorylation is not explored here.

      Taken together, the manuscript describes novel data that provide a significant extension to prior work, and the data convincingly, but perhaps only partially, support an interesting proposed model for the control of CaMKII targeting in spines.

      This more sophisticated delineation of the mechanisms underlying CaMKII targeting synapses will be of interest to the broader field of investigators studying the molecular basis for the regulation of excitatory synaptic transmission, learning, and memory.

    1. Author Response

      Reviewer #2 (Public Review):

      In this paper, Xiao et al. suggest that PASK is a driver for stem cell differentiation by translocating from the cytosol to the nucleus. This phenomenon is dependent on the acetylation of PASK mediated by CBP/EP300, which is driven by glutamine metabolism. Furthermore, this study showed that PASK interferes/weakens the Wdr5-APC/C interaction, where PASK interacts with Wdr5, resulting in repression of Pax7, leading to stem cell differentiation.

      There exists huge interest in maintaining adult stem cells and ES cells in their pluripotent form, and the work painstakingly performs several experiments to show that PASK is a good target to achieve that goal.

      However, the work on the paper relies mostly on data from C2C12 cells as adult muscle stem cell models, in vivo experimental data, and primary myoblasts from mice. Using these models makes the story contextual in muscle stem cells. Authors have not tried to extrapolate similar claims in other adult stem cell models. This severely restricts the claim to muscle stem cells even though PASK is required for the onset of embryonic and adult stem cell differentiation in general. Their work could be much strengthened if it is also tried on mesenchymal stem cells as these cells are also as metabolically active as muscle cells.

      We thank the reviewers for their enthusiasm for our studies using PASKi. We have previously shown that PASKi prevented differentiation of 10T1/2 cells into the adipogenic lineage (Kikani et al, eLife, 2016). We used stem cells of embryonic (ESC) and adult (MuSC) origin to show the broad application of PASKi in preserving self-renewal independent of stem cell origin. We believe PASK function to be conserved across different stem cell paradigms, and our results in this manuscript provide a framework to further study PASK in other stem cell paradigms.

      Reviewer #3 (Public Review):

      This manuscript entitled "PASK relays metabolic signals to mitotic Wdr5-APC/C complex to drive exit from self-renewal" by Xiao et al presents an interesting story on the role of PASK in the control of muscle stem cell fate by controlling the decision between self-renewal and differentiation. While the biochemistry presented is fairly compelling, the experiments revolving around the myogenic cells are lacking in quality and data.

      Major concerns:

      1) The isolation method used by this group to isolate muscle stem cells is inappropriate for the experiments used and may contribute to the misinterpretation of some of the results. It is simply a preplating method that results in a very heterogenous cell population in terms of cell type, comprised of numerous fibroblasts. While preplating can be used to isolate muscle stem cells and culture them as myoblasts, it takes days of growth and multiple rounds of passaging that are not used in this paper in order to get a more pure population of myogenic cells. This would also explain the high number of Pax7 negative cells in their primary myoblast experiments (~50% in some conditions) as they are most likely fibroblasts, which the authors could show by staining for fibroblast markers. The increase in Pax7 cells in certain conditions could also simply be due to the loss of contaminating cell types due to the treatment. Every single experiment that was performed on myoblasts must be redone using a more appropriate cell isolation method (i.e. FACS) or by culturing these isolated cells for a much longer period of time to eventually get a more pure cell population. As it stands, none of the data from the primary myoblast experiments are trustworthy.

      We agree – and thus, we have reproduced our results using two different methods of purifying MuSCs from mice, as indicated above. We took care to stain each isolation method with vimentin (a marker for fibroblasts) to ensure the purity of our preparation. Data are included in the Essential revisions section.

      2) The authors possess a genetic mouse model where PASK is knocked out. However, the mouse model is never described and the paper that is referenced also does not describe it. Please detail your mouse model.

      3) The majority of experiments are performed on C2C12 cells. While C2C12s are adequate for biochemistry and proof of concept, when it comes to biological significance primary myoblasts should be used. The authors try to explain this use by claiming that primary myoblasts undergo precocious differentiation, but this can be avoided by using an appropriate growth medium (F10, 20% FBS, 1% P/S, 5ng/mL of bFGF).

      Kindly see the response for this comment in the Essential revision section.

      4) The authors possess a genetic mouse model, yet performed RNA-Seq on C2C12 myoblasts that were either untreated or treated with a PASK inhibitor. It would be much more informative and valuable to sequence the primary myoblasts from WT and PASK KO mice, thereby providing a more biologically relevant model.

      We used C2C12 cells for the initial transcriptome analysis using PASKi for several reasons, and validated the results from that analysis in primary myoblasts. (1) C2C12 cells are an excellent model for performing biochemical pathway characterization, including discovering new substrate targets for PASK, finding PASK interacting partners, and measuring the biochemical activity of PASK under various conditions. Thus, it would form the basis for a longer-term study of the signaling functions of PASK in one cell system (myoblasts), which can be validated and compared with the primary cell system. (2) PASKi treatment can acutely inhibit PASK catalytic activity without the genetic loss of its protein level. For many enzymatic proteins, catalytic inhibition could have a different biological effect compared with genetic loss of protein (Weiss et al.; Nat Chem Biol. 2007 Dec; 3(12): 739–744.). Thus, we chose the PASKi and C2C12 myoblasts system to study the kinase activity-dependent effect on the myoblast transcriptome. However, throughout the manuscript, we used PASKi, PASK siRNA, and PASK KO primary cells to cross-validate all our data. We believe the conditional loss of PASK in a MuSC-specific manner will be a great model to repeat the RNA-seq analysis in the future and compare the data obtained with PASKi in cultured myoblasts.

      5) The KO mouse model is rarely used and the cells isolated from it would be very useful in determining the biological role of PASK in muscle cells. The authors should isolate WT and KO cells and perform basic muscle functional experiments such as EDU incorporation for proliferation, and fusion index for differentiation to see whether the loss of PASK has an effect on these cells.

      We published the characterization of the myogenesis phenotype of the PASKKO model in our previous manuscript (Kikani et al, 2016), and we erred by not repeating those experiments in the previous version. We have now reproduced those results and markedly extended the characterization of PASKKO cells in vitro, including BrdU incorporation, myogenesis, Pax7 heterogeneity, Myogenin expression, and PASK subcellular distribution using WT cells. We have also characterized the regeneration phenotype of PASKKO mice. We thank the reviewer for helping strengthen the biological context of our manuscript.

      6) The authors never look at quiescent muscle stem cells and early activated muscle stem cells in terms of PASK protein expression and dynamics. The authors should isolate EDL myofibers and stain for PASK and PAX7 at 0, 24, 48, and 72-hour post isolation. This would allow the authors to quantify the changes in PASK expression and cell localization, as well as confirm the number of muscle stem cells in WT and KO mice, during quiescence and during the process of muscle stem cell activation, proliferation, and differentiation in a near in vivo context.

      As described in Figure 1-figure supplement 2A, PASK is not expressed in quiescent MuSCs. Therefore, we do not anticipate a functional role of PASK in the initial activation of QSCs. We do not propose that PASK plays a role in the maintenance of the QSC state or in the exit and initial activation of MuSCs following muscle injury. PASK is transcriptionally activated in proliferating myoblasts during regeneration (Kikani et al, eLife 2016) and upon isolation of MuSCs (Figure S1D). Therefore, we specifically focus on studying the biochemical functional role of PASK signaling in activated (proliferating) myoblasts isolated from mice or during early regeneration. We have ongoing studies examining the precise temporal kinetics of PASK transcriptional regulation in Pax7+ MuSCs as they are activated, and identifying its upstream transcriptional regulators. However, we respectfully suggest that these avenues are outside the purview of the current manuscript, which specifically explores the metabolic pathway that establishes the progenitor population from activated myoblasts.

      7) Contrary to their claim, MyoD is not a stemness/self-renewal gene.

      We agree, and have corrected the text.

      8) The authors state that PASK is necessary for exit from self-renewal and establishment of a progenitor population, but this is a vast overstatement. In the genetic KO mouse model, the mice are able to regenerate their muscle after injury, therefore PASK cannot be a necessary protein for the formation of progenitor cells.

      During muscle regeneration, we observed a significant inhibition of the early regenerative response in PASKKO mice, marked by severely reduced levels of eMHC. Concomitantly, we observed increased numbers of Pax7+ MuSCs at Day 5 of regeneration compared with WT muscles. We have extensively shown the requirement of PASK for myogenin induction in vitro and in vivo (Kikani et al, 2016, Kikani et al, 2019). Based on this evidence, we propose that PASK is necessary for the exit from Pax7+ self-renewing stem cells and the generation of Myog+ committed progenitor populations.

      9) In numerous figure panels, the y-axis represents the # of cells, rather than a percentage or ratio. This is uninformative as the number of cells will never be the same between conditions and experiments. These panels need to be replaced with a more appropriate y-axis.

      We have updated the axes to % cells where appropriate.

    2. eLife assessment

      This study advances the understanding of metabolic regulation underpinning self-renewal of stem cells. The authors report that glutamine-dependent acetylation of the kinase PASK regulates its nuclear localization. Evidence is provided that nuclear PASK binds and disrupts Wdr5 association with the anaphase-promoting complex/cyclosome and is a trigger for the activation of myogenic programs in cultured cells. The study will be of interest to an audience in the areas of stem cells, regeneration and metabolic signalling.

    3. Reviewer #1 (Public Review):

      The study provides mechanistic insight into molecular events occurring at the onset of differentiation mediated by the kinase PASK. Specifically, the work focuses on the multiple steps that converge on post-translational modifications of PASK and its translocation to the nucleus during myogenesis. The authors present evidence that glutamine-fueled, CBP/EP300-mediated acetylation of PASK is required for its nuclear translocation. This allows (nuclear) PASK to interact with Wdr5 and consequently disrupt its association with the anaphase-promoting complex/cyclosome and inhibit Pax7 transcription, marking the onset of muscle differentiation. The conclusions are supported by an analysis of the effects of glutamine modulation on differentiation and maintenance of stemness in primary muscle stem cells; PASK localization in myoblasts and primary muscle stem cells; and detailed biochemistry with modified forms of PASK to interrogate molecular interactions. C2C12 myoblast cells and primary muscle stem cells are the cellular systems employed in the study, with observations confirmed in cells derived from mice with genetic ablation of PASK. The study provides molecular detail on events linking glutamine metabolism to the transcriptional control of lineage differentiation, through the regulation of PASK. The analysis of these events in other systems would be of value to understanding their broader applicability.

    4. Reviewer #2 (Public Review):

      In this paper, Xiao et al. suggest that PASK is a driver for stem cell differentiation by translocating from the cytosol to the nucleus. This phenomenon is dependent on the acetylation of PASK mediated by CBP/EP300, which is driven by glutamine metabolism. Furthermore, this study showed that PASK interferes/weakens the Wdr5-APC/C interaction, where PASK interacts with Wdr5, resulting in repression of Pax7, leading to stem cell differentiation.

      There is huge interest in maintaining adult stem cells and ES cells in their pluripotent form, and the work painstakingly performs several experiments to show that PASK is a good target for achieving that goal.

      However, the work on the paper relies mostly on data from C2C12 cells as adult muscle stem cell models, in vivo experimental data, and primary myoblasts from mice. Using these models makes the story contextual in muscle stem cells. Authors have not tried to extrapolate similar claims in other adult stem cell models. This severely restricts the claim to muscle stem cells even though PASK is required for the onset of embryonic and adult stem cell differentiation in general. Their work could be much strengthened if it is also tried on mesenchymal stem cells as these cells are also as metabolically active as muscle cells.

    5. Reviewer #3 (Public Review):

      This manuscript entitled "PASK relays metabolic signals to mitotic Wdr5-APC/C complex to drive exit from self-renewal" by Xiao et al presents an interesting story on the role of PASK in the control of muscle stem cell fate by controlling the decision between self-renewal and differentiation. While the biochemistry presented is fairly compelling, the experiments revolving around the myogenic cells are lacking in quality and data.

      Major concerns:

      1. The isolation method used by this group to isolate muscle stem cells is inappropriate for the experiments used and may contribute to the misinterpretation of some of the results. It is simply a preplating method that results in a very heterogenous cell population in terms of cell type, comprised of numerous fibroblasts. While preplating can be used to isolate muscle stem cells and culture them as myoblasts, it takes days of growth and multiple rounds of passaging that are not used in this paper in order to get a more pure population of myogenic cells. This would also explain the high number of Pax7 negative cells in their primary myoblast experiments (~50% in some conditions) as they are most likely fibroblasts, which the authors could show by staining for fibroblast markers. The increase in Pax7 cells in certain conditions could also simply be due to the loss of contaminating cell types due to the treatment. Every single experiment that was performed on myoblasts must be redone using a more appropriate cell isolation method (i.e. FACS) or by culturing these isolated cells for a much longer period of time to eventually get a more pure cell population. As it stands, none of the data from the primary myoblast experiments are trustworthy.

      2. The authors possess a genetic mouse model where PASK is knocked out. However, the mouse model is never described and the paper that is referenced also does not describe it. Please detail your mouse model.

      3. The majority of experiments are performed on C2C12 cells. While C2C12s are adequate for biochemistry and proof of concept, primary myoblasts should be used when it comes to biological significance. The authors try to explain this use by claiming that primary myoblasts undergo precocious differentiation, but this can be avoided by using an appropriate growth medium (F10, 20% FBS, 1% P/S, 5 ng/mL of bFGF).

      4. The authors possess a genetic mouse model, yet performed RNA-Seq on C2C12 myoblasts that were either untreated or treated with a PASK inhibitor. It would be much more informative and valuable to sequence the primary myoblasts from WT and PASK KO mice, thereby providing a more biologically relevant model.

      5. The KO mouse model is rarely used and the cells isolated from it would be very useful in determining the biological role of PASK in muscle cells. The authors should isolate WT and KO cells and perform basic muscle functional experiments such as EDU incorporation for proliferation, and fusion index for differentiation to see whether the loss of PASK has an effect on these cells.

      6. The authors never look at quiescent muscle stem cells and early activated muscle stem cells in terms of PASK protein expression and dynamics. The authors should isolate EDL myofibers and stain for PASK and PAX7 at 0, 24, 48, and 72-hour post isolation. This would allow the authors to quantify the changes in PASK expression and cell localization, as well as confirm the number of muscle stem cells in WT and KO mice, during quiescence and during the process of muscle stem cell activation, proliferation, and differentiation in a near in vivo context.

      7. Contrary to their claim, MyoD is not a stemness/self-renewal gene.

      8. The authors state that PASK is necessary for exit from self-renewal and establishment of a progenitor population but this is a vast overstatement. In the genetic KO mouse model, the mice are able to regenerate their muscle after injury, therefore PASK cannot be a necessary protein for the formation of progenitor cells.

      9. In numerous figure panels, the y-axis represents the # of cells, rather than a percentage or ratio. This is uninformative as the number of cells will never be the same between conditions and experiments. These panels need to be replaced with a more appropriate y-axis.

    1. Author Response

      Reviewer #1 (Public Review):

      […] Overall, the results from these analyses are convincing and valuable, but still do not seem to be a big leap from their Unger 2021 paper […]. The methodology that they established should be described more clearly so that it can be shared with the research community. For example, they say cells how many donors were recruited for this experiment? are there differences in efficiency in B cell differentiation by individual?

      Also, it would be important to assay for antibodies in the culture media. How would you suggest to improve the culture system to be used to model diseases?

      We appreciate the reviewer's queries and the points raised. In response to the first set of comments, the reviewer has correctly observed that the methodology of the assay itself as employed in this paper is not new or superior to our previously published data in (Unger et al., Cells 2021), where we described a minimalistic in vitro system for efficient differentiation of human naive B cells into antibody-secreting cells (ASCs). However, the current study aims to elucidate a comprehensive evaluation of the phenotype of the cells in the in vitro system and their relationships in potential differentiation pathways. In addition, we aimed to elucidate how the detailed gene expression profiles of the differentiating cells in vitro compare to in vivo observed counterparts. In this way, we were able to uncover an antibody secreting cell phenotype in vivo that was not observed before and could only be uncovered due to our full transcriptome knowledge of these cells. In addition, we present novel findings that demonstrate that this culture system not only enables efficient ASCs generation but also recapitulates the entire in vivo B cell differentiation pathway, as evidenced by the presence of germinal-center (GC)-like and pre-memory B cells in the culture. These results have not been previously reported in the literature for human B cells in culture and represent a significant contribution to the field of human B cell biology.

      In regards to the reviewer's inquiry about the cell culture protocol, its reproducibility, donor variability, and additional experimental applications, we refer to three additional recent publications from our group that have adopted the same in vitro B cell differentiation system and have provided extensive analysis of the immunoglobulin production, intracellular signaling pathways, as well as comparison with other culture systems in the field (Marsman et al., Cells 2020; Marsman et al., Eur. J. Immunol. 2022; Marsman et al., Front. Immunol. 2022). On top of this, we now realize that the section that describes the culture system (MATERIAL AND METHODS - “In vitro naive B cell differentiation cultures”) was a bit too concise, and we thank the reviewer for mentioning it. We have now expanded it and corrected an inconsistency at lines 125-127: “After six days, activated B cells were collected and co-cultured with 1 × 10⁴ 9:1 wild type (WT) to CD40L-expressing 3T3 cells that were irradiated and seeded one day in advance (as described above), together with IL-4 (100 ng/ml) and IL-21 (50 ng/ml; Invitrogen) for five days.”

      As for the application of our in vitro system in disease modeling, as requested by the reviewer, this would require modifying the culture conditions to mimic the disease-specific biology background (if known). For instance, by inhibiting or enhancing specific transcriptional pathways that are known to be associated with the disease in question. However, it would also require the presence of antigen-specific B cells in the pool of naive B cells included in the culture, which can be difficult to achieve due to their low frequency. Alternatively, the system could be used to study antigen-specific recall responses using antigen-specific memory B cells as starting material. Our group has evaluated this approach in a recent publication (Marsman et al., Front. Immunol 2022).

      [..] B cell differentiation may also influence cell cycle regulation. Rather than normalizing its effect, can the authors analyze the effect of the cell cycle on B cell differentiation? [...]

      We very much agree with the reviewer that the cell cycle plays a significant role in B cell differentiation output trajectories (Zhou et al, Front Immunol. 2018; Duffy et al., Science 2012). While preparing the manuscript, we in fact performed a parallel analysis in which we compared clustering and marker gene selection with and without cell cycle regression. The dataset without cell cycle regression yielded different clusters from the cell-cycle-regressed dataset (figure below). However, when overlaying the clusters obtained with the cell-cycle-regressed dataset, the extra clusters turned out to be the same cell populations, now split into cycling and non-cycling cells: cluster 2 is divided into the cycling cluster “c” and the non-cycling cluster “d”, while clusters 4 and 5 are divided into the cycling cluster “e” and the non-cycling cluster “f”. A comprehensive examination of the expression of the top 50 genes associated with antibody-secreting cells in the (non)cycling clusters 4 and 5 reveals that these genes are expressed at a higher level in (non)cycling cluster 5 than in cluster 4. This suggests that the cells within cluster 5 are more advanced in their differentiation, regardless of their cell cycle state. This finding led us to present the cell-cycle-regressed data in the manuscript. Should the reviewer so desire, we are very willing to add supplementary figures to the manuscript that include the un-regressed representation.

      Figure legend: A-C) UMAP projection of single-cell transcriptomes of in vitro differentiated human naive B cells without cell cycle regression. Each point represents one cell, and colors indicate graph-based cluster assignments identified without cell cycle regression (A), with cell cycle regression (B) or with cell cycle regression and additional subdivision into cycling and non-cycling cells (C). D) Dotplot showing the top 50 differentially expressed genes in cycling and non-cycling cells from clusters 4 and 5. Point size indicates the percentage of cells in the cluster expressing the gene; color indicates average expression.

    2. eLife assessment

      In this work, Verstegen and colleagues established an in vitro system and describe human B cell differentiation pathways via germinal center B cells towards plasma cells by performing single-cell analysis of in vitro stimulated human B cells. The study provides solid evidence for the establishment of an in vitro model of B cell differentiation. This study may be valuable for the differentiation of primary naive B cells into ASCs ex vivo and will be of interest to immunologists with an emphasis on B cell biology.

    3. Reviewer #1 (Public Review):

      The authors of this study used SMART-seq to study differentiating B cells. They then performed extensive in silico analyses to validate that a subset of the cells mimicked human antibody-secreting cells. For example, they compared the gene expression profile of each cluster in the B cell developmental trajectory (Figs 1, 2), investigated gene enrichment in the ASC-like cluster (Fig 3), adopted an independent dataset (Fig 3), and compared gene expression signatures of their cells to those of GC ASCs (Fig 4). Overall, the results from these analyses are convincing and valuable, but they still do not seem to be a big leap from the authors' Unger 2021 paper, which makes this study preliminary.

      The methodology that they established should be described more clearly so that it can be shared with the research community. For example, they say cells how many donors were recruited for this experiment? are there differences in efficiency in B cell differentiation by individual?

      Also, it would be important to assay for antibodies in the culture media. How would you suggest to improve the culture system to be used to model diseases?

      At the beginning, the largest contributing factor for cell clustering was the cell cycle. But B cell differentiation may also influence cell cycle regulation. Rather than normalizing its effect, can the authors analyze the effect of the cell cycle on B cell differentiation? For example, identify the sub-clusters shown in Supplementary Fig 1g.

    4. Reviewer #2 (Public Review):

      In this work, Verstegen and colleagues try to delineate human B cell differentiation trajectories by using in vitro differentiation culture of human naive B cells. The authors adopted a protocol of B cell stimulation with CD40L-expressing fibroblasts and IL-4/IL-21, and cultured B cells were analyzed by single-cell transcriptome analysis. Five distinct clusters were identified with features of memory B cells, germinal center-like B cells, ASCs, pre-ASCs, or post-GC B cells. This work provides a precise description of gene expression profiles of activated B cell populations and some insight into the pathways of effector B cell differentiation. This work will be a solid basis for human B cell study using in vitro culture of target B cell populations, providing an excellent experimental protocol.

    1. Author Response

      Reviewer #1 (Public Review):

      Doostani et al. present work in which they use fMRI to explore the role of normalization in V1, LO, PFs, EBA, and PPA. The goal of the manuscript is to provide experimental evidence of divisive normalization of neural responses in the human brain. The manuscript is well written and clear in its intentions; however, it is not comprehensive and is limited in its interpretation. The manuscript is limited to two simple figures that support its conclusions. There is no report of behavior, so there is no way to know whether participants followed instructions. This is important as the study focuses on object-based attention and the analysis depends on the task manipulation. The manuscript does not show any clear progression towards the conclusions and this makes it difficult to assess its scientific quality and the claims that it makes.

      Strengths:

      The intentions of the paper are clear and the design of the experiment itself is simple to follow. The paper presents some evidence for normalization in V1, LO, PFs, EBA, and PPA. The presented study has laid the foundation for a piece of work that could have importance for the field once it is fleshed out.

      Weakness:

      The paper claims that it provides compelling evidence for normalization in the human brain. Very broadly, the presented data support this conclusion; for the most part, the normalization model is better than the weighted sum model and a weighted average model. However, the paper is limited in how it works its way up to this conclusion. There is no interpretation of how the data should look based on expectations, just how it does look, and how/why the normalization model is most similar to the data. The paper shows a bias in focusing on visualization of the 'best' data/areas that support the conclusions whereas the data that are not as clear are minimized, yet the conclusions seem to lump all the areas in together and any nuanced differences are not recognized. It is surprising that the manuscript does not present illustrative examples of BOLD series from voxel responses across conditions given that it is stated that it is modeling responses to single voxels; these responses need to be provided for the readers to get some sense of data quality. There are also issues regarding the statistics; the statistics in the paper are not explicitly stated, and from what information is provided (multiple t-tests?), they seem to be incorrect. Last, but not least, there is no report of behavior, so it is not possible to assess the success of the attentional manipulation.

      We appreciate the reviewer’s feedback on providing more information so that the scientific quality of our work can be assessed. We have now added a new figure including BOLD responses in different conditions, as well as how we expected the data to look and the interpretations. To provide extra evidence for data quality and reliability, we have included BOLD responses of different conditions for odd and even runs separately in the supplementary information.

      In order to avoid any bias in presentation, we have now visualized the results from all areas with the same size and in a more logical order. However, we have also modified all results to include only those voxels in each ROI that were active for the stimuli presented in the main task based on the comment of one of the reviewers. According to the current results, there is no difference in the efficiency of the normalization model in different regions, which we have reported in the results section.

      Regarding the statistics, we have corrected the problem. We have performed ANOVA tests, have corrected all results for multiple comparisons, and have added a statistics subsection in the methods section to explicitly explain the statistics.

      Finally, we have added the report of the reaction time and accuracy in the results section and the supplementary information. As stated, average performance was above 86% in all conditions, confirming that the participants correctly followed the instructions and that the attentional manipulation was successful.

      We hope that the reviewer would find the manuscript improved and that the new analyses, figures, and discussions would address the reviewer’s concerns.

      Reviewer #2 (Public Review):

      My main concern regarding the interpretation of these results has to do with the sparseness of data available to fit with the models. The authors pit two linear models against a nonlinear (normalization) model. The weighted average and summed models are both linear models doomed to poorly match the fMRI data, particularly in contrast to the nonlinear model. So, while I appreciate the verification that responses to multiple stimuli don't add up or average each other, the model comparisons seem less interesting in this light. This issue is particularly salient because the model testing endeavor seems rather unconstrained. A 'true' test of the model would likely need a whole range of contrasts tested for one (or both) of the stimuli. Otherwise, as it stands, we simply have a parameter (sigma) that instantly gives more wiggle room than the other models. It would be fairer to pit this normalization model against other nonlinear models. Indeed, this has already been done in previous work by Kendrick Kay, Jon Winawer and Serge Dumoulin's groups. So far, my concern above has only been in regards to the "unattended" data. But the same issue of course extends to the attended conditions. I think the authors need to either acknowledge the limits of this approach to testing the model or introduce some other frameworks.

      We thank the reviewer for their feedback. We have taken two approaches to answer this concern. First, we have included simulations of neural population responses to attended and unattended stimuli. The results demonstrate that with our cross-validation approach, the normalization model is only a better fit if the computation performed at the neural level for multiple-stimulus responses is divisive normalization. Otherwise, the weighted sum or the weighted average models are better fits to the population response when the neurons respectively sum or average responses. These results suggest that the normalization model provides a better fit to the data because the underlying computation performed by the neurons is divisive normalization, not because of the model’s non-linearity.

      In a second approach, we tested a nonlinear model, which was a generalization of the weighted sum and the weighted average models with an extra saturation parameter (with even more parameters than the normalization model). The results demonstrated that this model was also a worse fit than the normalization model.
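      For readers who want a concrete picture of the three model families being compared, the sketch below gives one commonly used form of each. It is an illustration only: the exact equations fitted in the manuscript are not reproduced in this response, so the functional forms and every parameter name here (`w1`, `w2`, `w`, `c1`, `c2`, `beta`, `sigma`) are assumptions. The normalization function follows the familiar pattern of a stimulus drive divided by a pooled drive plus a semi-saturation constant, with `beta` scaling the attended stimulus.

```python
# Illustrative sketch only: the functional forms and parameter names are
# assumptions, not the equations fitted in the manuscript.


def weighted_sum(r1, r2, w1=1.0, w2=1.0):
    """Linear model: the paired response is a weighted sum of the
    responses to each stimulus presented in isolation."""
    return w1 * r1 + w2 * r2


def weighted_average(r1, r2, w=0.5):
    """Linear model: the paired response is a weighted average of the
    isolated responses (the two weights sum to 1)."""
    return w * r1 + (1.0 - w) * r2


def normalization(l1, l2, c1=1.0, c2=1.0, beta=1.0, sigma=0.1):
    """Nonlinear model: excitatory drive divided by pooled drive plus sigma.

    l1, l2 : selectivity of the voxel for stimulus 1 and stimulus 2
    c1, c2 : stimulus contrasts (0 means the stimulus is absent)
    beta   : attentional gain applied to stimulus 1 (1.0 = unattended)
    sigma  : semi-saturation constant
    """
    return (beta * c1 * l1 + c2 * l2) / (beta * c1 + c2 + sigma)
```

      Under these assumed forms, the normalization model with `sigma = 0` and `beta = 1` reduces to an equal-weight average, while raising `beta` pulls the paired response towards the attended stimulus. This is one way to see why the averaging model can sit close to the data in unattended conditions yet fall behind once attention is manipulated.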

      Regarding the reviewer’s comment on testing for a range of contrasts, as we have emphasized now in the discussion, here, we have used single-, multiple-, attended- and unattended-stimulus conditions to explore the change in response and how the normalization model accounts for the observed changes in different conditions. While testing for a range of contrasts would also be interesting, it would need a multi-session fMRI experiment to test for a range of contrasts with isolated and paired stimulus conditions in the presence and absence of attention. Moreover, the role of contrast in normalization has been investigated in previous studies, and here we added to the existing literature by exploring responses to multiple objects, and investigating the role of attention. Finally, since the design of our experiment includes presenting superimposed stimuli, the range of contrasts we can use is limited. Low-contrast superimposed stimuli cannot be easily distinguished, and high-contrast stimuli block each other.

      We hope that the reviewer would find the manuscript improved and that the new models, simulations, analyses, and discussions would address the reviewer’s concerns.

      Reviewer #3 (Public Review):

      In this paper, the authors model brain responses for visual objects and the effect of attention on these brain responses. The authors compare three models that have been studied in the literature to account for the effect of attention on brain responses to multiple stimuli: a normalization model, a weighted average model, and a weighted sum model.

      The authors presented human volunteers with images of houses and bodies, presented in isolation or together, and measured fMRI brain activity. The authors fit the fMRI data to the predictions of these three models, and argue that the normalization model best accounts for the data.

      The strengths of this study include a relatively large number of participants (N=19), and data collected in a variety of different visual brain regions. The blocked design paradigm and the large number of fMRI runs enhance the quality of the dataset.

      Regarding the interpretation of the findings, there are a few points that should be considered: 1) The different models that are being studied have different numbers of free parameters. The normalization model has the highest number of free parameters, and it turns out to fit the data the best. Thus, the main finding could be due to the larger number of parameters in the model. The more parameters a model has, the higher "capacity" it has to potentially fit a dataset. 2) In the abstract, the authors claim that the normalization model best fits the data. However, on closer inspection, this does not appear to be the case systematically in all conditions, but rather more so in the attended conditions. In some of the other conditions, the weighted average model also appears to provide a reasonable fit, suggesting that the normalization model may be particularly relevant to modeling the effects of attention. 3) In the primary results, the data are collapsed across five different conditions (isolated/attended for preferred and null stimuli), making it difficult to determine how each model fares in each condition. It would be helpful to provide data separately for the different conditions.

      We thank the reviewer for their feedback.

      Regarding the reviewer’s concern about the number of free parameters, we have introduced a simulation approach, demonstrating that with our cross-validation approach, a model with a higher number of parameters is not a good fit when the underlying neural computation does not match the computation performed by the model. Moreover, we have now included another nonlinear model with 5 parameters that performs worse than the normalization model. Besides, we have used the AIC measure in addition to cross-validation for model comparison, and the AIC measure confirms the previous results.
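      As a reminder of how AIC trades goodness of fit against the number of free parameters, a minimal sketch follows. It assumes the standard least-squares form under Gaussian errors, with constant terms dropped; the response does not state which AIC variant was used, and the function name `aic_least_squares` is hypothetical.

```python
import math


def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit, assuming Gaussian errors.

    Uses AIC = n * ln(RSS / n) + 2 * k with constant terms dropped, so only
    differences between models fitted to the same data are meaningful.
    Assumes RSS > 0 (a perfect fit would make the logarithm undefined).
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params
```

      With identical residuals, a model with more parameters receives a higher (worse) AIC, so a more flexible model is preferred only if it reduces the residual error enough to offset the penalty.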

      Regarding the difference in the efficiency of the normalization model across conditions, after selecting the voxels that were active during the main task in each ROI (done according to the suggestion of one of the reviewers to compensate for the difference in size of localizer and task stimuli), we observed that the normalization model was a better fit for both attended and unattended conditions. However, since the weighted average model results were also close to the data in unattended conditions, we have discussed the unattended condition separately and have discussed the relevance of our results to previous reports of multiple-stimulus responses in the absence of attention.

      Finally, concerning model comparison for different conditions, we have calculated the models’ goodness of fit across conditions for each voxel. The reason for calculating the goodness of fit in this manner was to evaluate model fits based on their ability to predict response changes with the addition of a second stimulus and with the shifts of attention. Since correlation is blind to a systematic error in prediction for all voxels in a condition, calculating the goodness of fit across voxels would lead to misinterpretation. We have now included a figure in the supplementary information illustrating the method we used for calculating the goodness of fit.
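      The point about correlation being blind to a systematic error can be illustrated with a short sketch. The response values below are hypothetical and stand in for the responses of five voxels in a single condition; the prediction overestimates every voxel by the same constant. The helper functions `pearson_r` and `rmse` are written here for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical responses of five voxels in one condition (arbitrary units).
observed = [0.8, 1.1, 1.5, 2.0, 2.4]

# A prediction that is too high by the same amount for every voxel.
biased_prediction = [v + 1.0 for v in observed]

r = pearson_r(observed, biased_prediction)
err = rmse(observed, biased_prediction)

print(f"correlation across voxels: {r:.3f}")
print(f"RMSE across voxels:        {err:.3f}")
```

      In this toy case the across-voxel correlation is essentially perfect even though every prediction is off by a full unit, which is the kind of misinterpretation described above.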

      We hope that the reviewer would find the manuscript improved and that the new analyses, simulations, figures, and discussions would address the reviewer’s concerns.

    2. eLife assessment

      The authors state that there is scant experimental evidence of divisive normalization of neural responses in the human brain. They used fMRI BOLD response to high-level stimuli to explore normalization in V1, object-selective (LO and pFs) and category-selective regions (EBA and PPA) as well as the effects of attention on cortical responses. Specifically, the authors first test the degree to which BOLD responses to body parts and houses exhibit responses predicted by a non-linear normalization model, compared to two linear models (weighted sum and weighted average). They find that responses, when considering responses to one vs two stimuli, are best fit with the normalization model. They then suggest that object-based attention effects can be better accounted for by a normalization model of attention, compared to attention variants of the aforementioned models. The paper could potentially be an important contribution to the fields of perceptual and cognitive neuroscience, but the conclusions are not sufficiently supported by the data at this stage. Several theoretical and methodological concerns limit the conclusions of this study.

    3. Reviewer #1 (Public Review):

      Doostani et al. present work in which they use fMRI to explore the role of normalization in V1, LO, pFs, EBA, and PPA. The goal of the manuscript is to provide experimental evidence of divisive normalization of neural responses in the human brain. The manuscript is well written and clear in its intentions; however, it is not comprehensive and is limited in its interpretation. The manuscript is limited to two simple figures that support its conclusions. There is no report of behavior, so there is no way to know whether participants followed instructions. This is important as the study focuses on object-based attention and the analysis depends on the task manipulation. The manuscript does not show any clear progression towards the conclusions and this makes it difficult to assess its scientific quality and the claims that it makes.

      Strengths:
      The intentions of the paper are clear and the design of the experiment itself is simple to follow. The paper presents some evidence for normalization in V1, LO, pFs, EBA, and PPA. The presented study has laid the foundation for a piece of work that could have importance for the field once it is fleshed out.

      Weakness:
      The paper claims that it provides compelling evidence for normalization in the human brain. Very broadly, the presented data support this conclusion; for the most part, the normalization model is better than the weighted sum model and the weighted average model. However, the paper is limited in how it works its way up to this conclusion. There is no interpretation of how the data should look based on expectations, just how it does look, and how/why the normalization model is most similar to the data. The paper shows a bias in focusing on visualization of the 'best' data/areas that support the conclusions whereas the data that are not as clear are minimized, yet the conclusions seem to lump all the areas in together and any nuanced differences are not recognized. It is surprising that the manuscript does not present illustrative examples of BOLD series from voxel responses across conditions given that it is stated that it is modeling responses of single voxels; these responses need to be provided for the readers to get some sense of data quality. There are also issues regarding the statistics; the statistics in the paper are not explicitly stated, and from what information is provided (multiple t-tests?), they seem to be incorrect. Last, but not least, there is no report of behavior, so it is not possible to assess the success of the attentional manipulation.

    4. Reviewer #2 (Public Review):

      My main concern regarding the interpretation of these results has to do with the sparseness of data available to fit with the models. The authors pit two linear models against a nonlinear (normalization) model. The weighted average and summed models are both linear and doomed to match the fMRI data poorly, particularly in contrast to the nonlinear model. So, while I appreciate the verification that responses to multiple stimuli don't add up or average each other, the model comparisons seem less interesting in this light. This issue is particularly salient because the model testing endeavor seems rather unconstrained. A 'true' test of the model would likely need a whole range of contrasts tested for one (or both) of the stimuli. Otherwise, as it stands, we simply have a parameter (sigma) that instantly gives more wiggle room than the other models. It would be fairer to pit this normalization model against other nonlinear models. Indeed, this has already been done in previous work by Kendrick Kay, Jon Winawer and Serge Dumoulin's groups. So far, my concern above has only been in regard to the "unattended" data. But the same issue of course extends to the attended conditions. I think the authors need to either acknowledge the limits of this approach to testing the model or introduce some other frameworks.

    5. Reviewer #3 (Public Review):

      In this paper, the authors model brain responses for visual objects and the effect of attention on these brain responses. The authors compare three models that have been studied in the literature to account for the effect of attention on brain responses to multiple stimuli: a normalization model, a weighted average model, and a weighted sum model.

      The authors presented human volunteers with images of houses and bodies, presented in isolation or together, and measured fMRI brain activity. The authors fit the fMRI data to the predictions of these three models, and argue that the normalization model best accounts for the data.

      The strengths of this study include a relatively large number of participants (N=19), and data collected in a variety of different visual brain regions. The blocked design paradigm and the large number of fMRI runs enhance the quality of the dataset.

      Regarding the interpretation of the findings, there are a few points that should be considered: 1) The different models that are being studied have different numbers of free parameters. The normalization model has the highest number of free parameters, and it turns out to fit the data the best. Thus, the main finding could be due to the larger number of parameters in the model. The more parameters a model has, the higher "capacity" it has to potentially fit a dataset. 2) In the abstract, the authors claim that the normalization model best fits the data. However, on closer inspection, this does not appear to be the case systematically in all conditions, but rather more so in the attended conditions. In some of the other conditions, the weighted average model also appears to provide a reasonable fit, suggesting that the normalization model may be particularly relevant to modeling the effects of attention. 3) In the primary results, the data are collapsed across five different conditions (isolated/attended for preferred and null stimuli), making it difficult to determine how each model fares in each condition. It would be helpful to provide data separately for the different conditions.

    1. eLife assessment

      This manuscript describes a mouse model of a human mitofusin 2-related lipodystrophy, generated by knockin of Mfn2 R707W, and reports data suggesting adipocyte-specific effects involving the integrated stress response, mTorc signaling, and epithelial-mesenchymal transition pathways. The data will be important for understanding how mitochondria can be affected in a tissue-specific manner to contribute to metabolic disease.

    2. Reviewer #1 (Public Review):

      The article by Mann et al. describes a knockin (KI) mouse model of mitofusin 2-related lipodystrophy, in mice carrying MFN2 R707W. The mice recapitulate some but not all aspects of the human phenotype, as summarized in Table 2. The phenotypic characterization is extensive and is generally well done. There was an adipose-specific alteration of mitochondrial morphology, accompanied by activation of the integrated stress response and reduced adipokine secretion. These findings are consistent with the human phenotype. The alteration in fat distribution that is present in humans with this mutation was not observed, and the mice did not have the insulin resistance seen in humans. The transcriptome analyses revealed a reduced epithelial-mesenchymal transition (EMT) in the KI mice, suggesting possible involvement of TGF-beta related pathways. There was also upregulation of the mTorc signaling pathway, suggesting that a possible therapeutic approach in humans may involve the mTORC1 inhibitor sirolimus. The reason for the largely adipose-specific effect of the mutation remains unexplained. As well, the hypothesis that changes in EMT pathways reflect altered activity of TGF-beta pathways must remain somewhat speculative at this point. Notwithstanding these weaknesses, the manuscript provides an important advance in understanding this lipodystrophy (and potentially other lipodystrophies), and the model that has been generated will enable further studies to characterize the pathophysiology.

    3. Reviewer #2 (Public Review):

      This study generated a valuable preclinical model of patients with Mfn2-related lipodystrophy (R707W). Such a mouse model enables understanding of the pathogenic mechanism causing this lipodystrophy and testing of specific therapeutic approaches for these patients.

      The strengths are the thorough phenotypic characterization of the mice and the clear decrease in circulating leptin and adiponectin levels in the absence of changes in fat mass observed in Mfn2 R707W/R707W mice. This partially recapitulates one of the key phenotypes of human patients with these mutations.

      The major weakness is that the conclusion that the integrated stress response is activated in white adipose tissue is not supported by the data and the phenotype. The ISR caused by primary insults to mitochondria was defined as a response that decreases the translation of mitochondrial proteins, thus decreasing mitochondrial respiratory function via ATF4 without engaging ATF5 (Quiros et al., JCB 2016). In addition, the increase in ATF4 caused by phosphorylation of eIF2alpha is in ATF4 translation and translocation to the nucleus, not in ATF4 transcription. It is possible that a selective increase in ER stress is responsible for defective leptin secretion, as Mfn2 R707W/R707W adipose tissue shows no mitigation of mitochondrial function as expected from ATF4-ISR activation.

    4. Reviewer #3 (Public Review):

      Mann and colleagues have generated a knock-in mouse model carrying a recently identified mutation in the Mfn2 gene that leads to a syndrome of severe upper body adipose overgrowth in humans (Mfn2R707W). The goal was to gain a better mechanistic understanding of how this mutation leads to such a dramatic phenotype in humans. The authors consistently demonstrate how the knock-in mutation leads to abnormalities in mitochondrial shape, mtDNA content, as well as in the abundance of some mitochondrial proteins, most notably in brown adipose tissue. The authors detect some stress response signatures, which could explain the decreased leptin and adiponectin levels observed in the knockin mice.

      The authors have to be praised for their effort in trying to provide mechanistic insights into such a rare condition. This work constitutes a real tour de force in the characterization of Mfn2R707W mice. The path, however, was full of surprises. On one side, the knockin mouse model fails to recapitulate multiple aspects of the human syndrome. This is, of course, beyond the control of the researchers, but somehow tells us that there are some elements missing in our understanding of the effects of this Mfn2 mutation at the cellular level (not just organismal), and of why it affects adipose tissues so strongly. A second layer of complexity is that the authors find an interesting connection between Mfn2R707W, the integrated stress response and a severe decrease in the expression of leptin and adiponectin. However, whether these elements have any causal role in the human syndrome or in the phenotypes observed in the mice remains an open question.

    1. eLife assessment

      This study describes an interesting approach using PEGylated isoprenaline to selectively activate beta-adrenergic receptors in the surface sarcolemma of ventricular myocytes. While the concept is compelling, and the core of an interesting and impactful study is presented, the results are preliminary and incomplete at this stage, and would benefit from more rigorous validation of the approach. The work will be of interest to cardiac cell biologists and pharmacologists.

    2. Reviewer #1 (Public Review):

      In this study, Barthe et al. developed an approach to selectively activate beta-adrenergic receptors in the sarcolemma of ventricular myocytes. The approach involved the linking of a 5 kDa PEG chain to the beta agonist isoprenaline. This prevents the agonist from entering transverse tubules. Using this approach, the authors find that activation of beta-adrenergic receptors in the surface sarcolemma of ventricular myocytes leads to lower cytosolic cAMP levels but longer-lasting effects on EC coupling than when TT receptors were activated.

      Strengths of the study:
      1) The PEG-ISO, size exclusion approach is very interesting and useful.
      2) The observation that activation of beta-adrenergic receptors in the surface sarcolemma of ventricular myocytes leads to lower cytosolic cAMP levels, but longer-lasting effects on EC coupling than when TT receptors were activated is interesting.
      3) The observation that beta-adrenergic receptors in the TT lead to stronger activation of nuclear cAMP/PKA signaling is interesting.

      Weaknesses of the study:
      1) There seems to be a paucity of mechanistic insights in the study.
      2) It is unclear what would be the ideal control for these experiments. Would the addition of the PEG chain, by itself, alter the binding to and activation of beta-adrenergic receptors regardless of their location?
      3) The novelty of the findings is unclear, as other studies have suggested differential effects of beta-adrenergic receptors in membrane compartments.

      Impact on the field:
      1) PEG-ISO may become a useful strategy to selectively activate surface sarcolemmal beta-adrenergic receptors.

    3. Reviewer #2 (Public Review):

      Barthé et al. present a manuscript examining membrane-domain specific signaling by βAR stimulation in cardiomyocytes. Specifically, the authors seek to use a size exclusion approach using PEGylated-isoproterenol to allow only surface sarcolemmal βAR receptor stimulation without T-tubule βAR stimulation. This innovative approach was advanced using confocal microscopy to determine the accessibility of the PEGylated substrates to the T-tubule network. The authors show comparable responses of L-type Ca channels, Ca transients, and contraction using equipotent doses of PEG-Iso and Iso, but differences in nuclear and cytoplasmic cAMP responses based on FRET reporters.

      Strengths
      1. The size exclusion strategy using PEGylation technology is well rationalized and well supported by the physicochemical characterization of PEGylated Iso. This represents a novel strategy to decipher cardiomyocyte cell surface signaling from T-tubule network signaling resulting from the stimulation of β-adrenergic receptors. This approach can be used to study the compartmentalization of various signaling pathways in cardiomyocytes as well as in other cell types that exhibit complex cytoarchitecture. The authors use multiple cAMP FRET sensors and assay a number of relevant physiological cellular responses to assess the effects of Iso vs. PEGylated Iso, which is informative.

      Weaknesses
      1. The authors' evidence that PEG-FITC does not penetrate the TT network is not convincing as presented in Figure 1. A single confocal image from one cell showing a lack of fluorescence (Figure 1A) could be due to an outlier cell or lack of penetration to more central regions of the cell where images are taken from. More convincing would be a confocal Z-scan series comparing PEG-FITC and FITC in ARVM. Some form of quantification of T-tubule network density from multiple cells would provide even more robust evidence, similar to the many studies that have done this characterization in models of dilated cardiomyopathy showing a loss of TT network. This exclusion of PEG-FITC provides the critical foundation for the paper and it is somewhat unanticipated given the large dimensions of the t-tubules relative to PEG-Iso, so strong data here are particularly important.

      2. The conclusion on line 160 that 'the maximal efficacy of PEG-Iso was significantly lower by 30% than that of Iso,' may be overstated. What approach was used to conclude that the difference was significant, as this implies a statistical comparison? Were the concentration-response curves fit to determine maximal responses? In the examples given, the responses are continuing to increase at the highest concentrations tested, so it is difficult to simply compare the responses to the highest doses tested.

      3. For experiments using adenovirus delivery of FRET-based sensor, the culture of ARVM is required which may impact the biology. Such culture is known to result in changes in cell structure and physiology with loss of the TT network over time. It is essential for the authors to demonstrate that under the conditions of their FRET experiments, the cells continue to exhibit a robust TT network.

      4. As pointed out by the authors, the interpretation of OSM/TTM adrenergic receptor functions in this study is limited by the fact that the relative contributions of β-adrenergic receptor subtypes had not been assessed. This particularly complicates the interpretation of their results in that the authors demonstrate in Figure 2 that PEGylation increases the Ki for Iso for β1 receptors by 700-fold whereas the increase for β2 receptors is about 200-fold. Thus, the relative contribution of β1 and β2 receptors to a 'comparable' dose of Iso and PEGylated Iso will potentially be different. Could that difference in relative β1/β2 receptors be the cause of the different 'efficacy of nuclear and cytoplasmic' cAMP changes between the two tested ligands in Figure 8 and supplemental Figure 3? This would fundamentally alter the conclusions of the paper.

      5. The equipotent doses of Iso and PEG-Iso were initially defined based on their ability to elevate global [cAMP]i. The authors then further demonstrated that such equipotent doses of Iso and PEG-Iso also had equal effects on ICa,L amplitude, Ca2+ transient parameters, and cellular contractility (shortening), presumably because they raised global [cAMP]i to the same levels. These findings seem to defy the importance of nanodomain organization and local [cAMP]i in the regulation of LTCCs, Ca2+ cycling proteins, and contractile machinery. The authors argued that "Since OSM contributes to ~60% of total cell membrane in ARVMs, either β-ARs and ACs are more concentrated in OSM than TTM, or they are in large excess over what is needed to activate PKA phosphorylation of proteins involved in EC coupling. Also, cAMP produced at OSM must diffuse rapidly in the cytosol in order to activate PKA phosphorylation of substrates located deep inside the cell, such as LTCCs in TTM" (lines 336-341). Although this argument may be valid at high concentrations of Iso and PEG-Iso when PKA activation is saturated, it also implies that a discrepancy could be detectable at lower (non-saturating) doses of Iso and PEG-Iso. Thus, additional experiments using lower Iso and PEG-Iso doses are required to support this notion.

      6. The size-excluded compartment for PEG-Iso proposed by the authors is the TT network, but this ignores other forms of sarcolemmal nanodomains such as caveolae, which include β2 receptors and AC, and may exhibit similar if not greater sensitivity to the size exclusion approaches pioneered by the authors.
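      The concern in point 2 above, that responses at the highest tested dose cannot stand in for fitted maximal responses, can be sketched with a Hill equation. All parameter values below are hypothetical and are not taken from the manuscript: two agonists are given the same true maximal response and different EC50 values, and both are read out at the same top concentration.

```python
def hill_response(conc, e_max, ec50, n_hill=1.0):
    """Hill equation: response at a given agonist concentration."""
    return e_max * conc**n_hill / (ec50**n_hill + conc**n_hill)


# Hypothetical agonists with identical true maximal response (100 %)
# but a 100-fold difference in EC50 (values in nM, illustration only).
TRUE_E_MAX = 100.0
EC50_A = 10.0
EC50_B = 1000.0

# Highest concentration tested in this hypothetical experiment (nM).
top_dose = 10_000.0

resp_a = hill_response(top_dose, TRUE_E_MAX, EC50_A)
resp_b = hill_response(top_dose, TRUE_E_MAX, EC50_B)

print(f"agonist A at top dose: {resp_a:.1f} % of true maximum")
print(f"agonist B at top dose: {resp_b:.1f} % of true maximum")
```

      In this toy case the less potent agonist appears roughly 9% less effective at the top dose even though both share the same maximal response, which is why fitting the full concentration-response curve is the safer basis for comparing efficacy.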

    4. Reviewer #3 (Public Review):

      The manuscript by Barthe et al. compares the effects derived from the application of isoprenaline (Iso) or isoprenaline covalently linked to PEG (PEG-Iso) on adult rat ventricular myocytes (ARVM). Iso is a well-characterized β-AR agonist and the authors work under the assumption that PEGylation of Iso prevents it from accessing the T-tubules. Therefore, due to its larger size, PEG-Iso is only able to activate β-ARs located on the outer surface membrane (OSM), and any additional effect observed by Iso stimulation is attributed to the activation of β-ARs located in T-tubules. First, the authors determined that the affinity of PEG-Iso for β-ARs is about 100 times lower than that of Iso. Then, they analyze the effects of Iso (10 nM) and PEG-Iso (1 µM) on calcium channel currents, contractility, calcium transients, and cytosolic and nuclear PKA activity. They only found a stronger effect of Iso on nuclear PKA activity. Therefore they conclude that, while OSM β-AR stimulation mainly results in positive inotropy and lusitropy, T-tubule β-AR stimulation mainly results in increased nuclear PKA activity.

      Overall the manuscript is well written and the findings are biologically important from the perspective of understanding the mechanism of β-AR stimulation as well as in assigning the functional contribution of β-ARs in the OSM and in the T-tubules. However, the major conclusion is not strongly supported by the data. The interpretation of the results is all based on the assumption that PEG-Iso is excluded by the T-tubules, but no experiment presented here rigorously demonstrates this.

      1. The only indication that PEG-Iso may be excluded by the T-tubules is one confocal image in which FITC or PEG-FITC were applied on ARVM. No experiment has been performed to assess if PEG-Iso is indeed not able to enter the T-tubules.
      The treatment of ARVM with neuraminidase made the T-tubules accessible to PEG-FITC. If the authors could demonstrate that neuraminidase treatment followed by PEG-Iso would result in similar nuclear PKA activity as Iso, this would strengthen their conclusion.
      2. The fact that PEG-Iso treatment resulted in a lower increase of intracellular cAMP (Figure 3) could also be due to the activation of a smaller fraction of β-ARs, independent of their localization.

    1. eLife assessment

      This study thoroughly characterizes the morphology of an interesting folded membrane structure that links the epidermis to the cuticle in C. elegans. This structure, here named the meisosome, has been noted by several previous researchers. The study would be strengthened by providing additional support to the notion that the VHA-5::GFP transgenic reporter, used by the authors, faithfully labels the meisosome, and by stronger evidence that meisosomes indeed serve as attachment platforms between the cuticle and the epidermis.

    2. Public Review

      Reviewer #1 (Public Review):

      1) “In fact, it is not surprising that the collagen mutants display a detached cuticle, because the extracellular domains of MUP-4 and MUA-3 (the transmembrane receptors of apical hemidesmosomes that are primarily responsible for tethering the epidermis to the cuticle) both contain vWFA collagen-binding domain (Hong et al., JCB 2001; Bersher et al., JCB 2001). Hence loss of certain collagens in the cuticle directly affects cuticle-epidermis attachment due to defective ligand-receptor interactions is a much more plausible explanation.”

      We agree with the reviewer that a specific molecular interaction likely mediates the attachment of the cuticle to the epidermis, not only in the area above the hemidesmosomes, but also in the area of the meisosomes. The collagens that potentially associate with MUP-4 and/or MUA-3 in the muscle regions have not been identified, nor have those in the main epidermal region, where the putative receptor is not known. We have modified the text accordingly.

      “Likewise, it is more reasonable to propose that lack of certain collagens in the cuticle directly affects cuticle stiffness, rather than working indirectly through epidermal meisosomes.”

      We agree with the reviewer that the loss of specific structural components of the cuticle could well affect stiffness directly, especially if the furrows are affected; non-furrow collagen mutants do not show this phenotype. An analogy might be the increased stiffness that corrugation provides. We have modified the text accordingly. Our future research aims precisely at modelling these physical aspects.

      2) “VHA-5::GFP does not co-localize with fluorescent markers for MVB, recycling endosomes and autophagolysosomes. By claiming this, the authors made a huge assumption that the overexpressed VHA-5::GFP fusion protein can only possibly associate with four types of organelles (meisosomes, MVB, recycling endosomes and autophagolysosomes) but not any other known or to-be-identified subcellular structures. In addition, a previous study did report that VHA-5 is localized in several other places besides the apical membrane stacks (Liegeois et al., JCB 2006).”

      The reviewer cites the Liegeois paper that we mention above, which, in our opinion, and that of reviewer 2 (“VHA-5 is well known to localise to the apical membrane stacks (Liegeois 2006) and could be served as marker of apical membrane structure”), provides extremely strong support for our position. In Liegeois et al., 2006, there is a quantification of immunogold staining that shows that >85% of VHA-5 is found in meisosomes (Fig S5D). By providing the results of co-localisation analyses with 3 cytoplasmic vesicular markers, we simply wanted to illustrate the specificity of the signal to the non-initiated. Importantly, we now provide strong evidence that the VHA-5::GFP marker co-localises with apical plasma membrane macrodomains revealed by both a PH domain of PLCδ and a CAAX marker. As our ultrastructural analyses demonstrate that meisosomes are composed of apical membrane folds, this again is wholly consistent with VHA-5 being a bona fide marker of meisosomes.

      Reviewer #2 (Public Review):

      The reviewer questioned the need to give another name to the “apical membrane stacks”. We made this proposition after consultation with a broad community of researchers in the field. We believe that this simpler name provides a link to an analogous structure in yeast, the eisosome, also at the interface between the aECM and the cell.

      The reviewer wrote, “The major problem of this paper is that there is not much new information”, that it was known, for example, that “"furrowless" dpy mutants result in complete disorganization of the epidermis”. In addition to demonstrating that the furrowless Dpy mutants have very particular and specific phenotypes, without affecting the presence of hemidesmosomes (PMID: 33033182), nor different vesicular markers (Figure 6S2), we would like to point out that Reviewer #1 commented, “the work presented by Aggad et al. is rich in novelty”, and Reviewer #3, “The major strengths of the paper are the novelty”. We have re-written and reorganised the text and hope Reviewer #2 appreciates the novelty more in the revised version.

    3. Reviewer #1 (Public Review):

      In this work, Aggad et al. focused on the multi-folded membrane structure (termed meisosomes) located between the apical extracellular matrix and the epidermal cells of C. elegans. The authors performed detailed analysis on the morphology and 3D distribution of the meisosomes at different developmental stages of the C. elegans skin. They also investigated factors affecting the biogenesis and reorganization of the meisosomes, as well as the involvement of meisosomes in cuticle synthesis and maintenance. The meisosomes are particularly intriguing membrane structures connecting the epidermis to the extracellular matrix, which potentially have vital functions but were given very little attention before this study. Therefore, the work presented by Aggad et al. is rich in novelty and may greatly benefit the related fields if the main conclusions stand. However, the authors' claims are not very well-supported by the data due to improper use of reporters and mutants, as well as some flaws in experimental design.

      1. One major problem with this manuscript is the investigation of meisosome functions. Instead of generating knockdown animals or mutants that directly and specifically disrupt meisosome structures, the authors used several cuticular collagen mutants, which harbor multiple complex cuticular and epidermal defects. Therefore, the main conclusions drawn from the analysis using collagen mutants, such as "meisosomes may play an important role in attaching the cuticle to the underlying epidermal cell" or "furrow collagens are required for stiffness potentially as they are essential for the presence of normal meisosomes" do not stand well. In fact, it is not surprising that the collagen mutants display a detached cuticle, because the extracellular domains of MUP-4 and MUA-3 (the transmembrane receptors of apical hemidesmosomes that are primarily responsible for tethering the epidermis to the cuticle) both contain vWFA collagen-binding domain (Hong et al., JCB 2001; Bersher et al., JCB 2001). Hence loss of certain collagens in the cuticle directly affects cuticle-epidermis attachment due to defective ligand-receptor interactions is a much more plausible explanation. Likewise, it is more reasonable to propose that lack of certain collagens in the cuticle directly affects cuticle stiffness, rather than working indirectly through epidermal meisosomes. In a word, this study did not answer the long-standing question since the 1980s: what are the primary functions of the apical membrane stacks (AKA meisosomes) in the C. elegans epidermis?

      2. Another problem with this manuscript is the representation of meisosome structures by the VHA-5::GFP reporter alone from Figure 3 to Figure 7. The authors claim that VHA-5::GFP is a meisosome-specific marker, but only provided indirect and superficial evidence to support this claim: 1) VHA-5::GFP signal is distributed in the same general epidermal area as the majority of meisosomes (so are many other membrane organelles in the C. elegans epidermis); 2) VHA-5::GFP does not co-localize with fluorescent markers for MVB, recycling endosomes and autophagolysosomes. By claiming this, the authors made a huge assumption that the overexpressed VHA-5::GFP fusion protein can only possibly associate with four types of organelles (meisosomes, MVB, recycling endosomes and autophagolysosomes) but not any other known or to-be-identified subcellular structures. In addition, a previous study did report that VHA-5 is localized in several other places besides the apical membrane stacks (Liegeois et al., JCB 2006). In a word, there is no solid, direct evidence showing that VHA-5::GFP can specifically represent meisosomes and faithfully visualize meisosome morphology in the C. elegans epidermis. There are also no alternative approaches for meisosome morphological analysis to back up the results obtained from the VHA-5::GFP reporter. Therefore, most of the data from Figures 3-7 can only be interpreted as the influence of various factors on the distribution patterns of VHA-5::GFP, not just meisosomes.

    4. Reviewer #2 (Public Review):

      The manuscript by Aggad et al. describes an interesting folded structure that links the epidermis to the cuticle in C. elegans. They analyzed the structure by TEM and tomography and found groups of parallel folds in both L4 and adult animals. They show that VHA-5 localizes to this structure and have used a VHA-5::GFP transgenic reporter to investigate different cuticle furrow-related genes by RNAi. It is an important step to describe the character of this structure, which the authors named "meisosomes". However, the structure has been reported and well defined as "apical membrane stacks" in previous studies and reviewed in a few articles (Liegeois et al., 2006, Hyenne et al., 2015, Chisholm and Xu, 2012, Cohen and Sundaram 2020). It is very confusing that the authors want to change the name of this structure.

      The major problem of this paper is that there is not much new information. It is already known that these stacks exist, VHA-5 localizes to the stacks, cuticle damage induces AMPs, "furrowless" dpy mutants result in complete disorganization of the epidermis, defective cuticle structure causes abnormalities via gene expression, etc. The function of these stacks remains unknown. Another issue is the transgenic reporter of VHA-5::GFP, which is not endogenously expressed, and its puncta intensity only reflects the protein distribution but not the stack structure.

    5. Reviewer #3 (Public Review):

      This study by Aggad, Pujol, and colleagues provides some exciting new insights into a largely overlooked organelle/structure present in C. elegans epidermal cells, the "meisosome". Although noted by several previous researchers, this folded-membrane structure was never fully characterized. In particular, the authors provide an important and thorough characterization of meisosome morphology during development. The authors also provide data suggesting that meisosomes may function to provide attachment points between the epidermis and overlying cuticle, although this portion was less clear cut. In addition, the authors show that certain cuticle collagens can affect the morphology and position of meisosomes in addition to the formation of molting-associated actin cables. Some of these latter results, which suggest an 'outside-in' type of patterning regulation, run counter to certain previous models.

      The major strengths of the paper are the novelty of describing a 'new organelle' and the thoroughness and clarity of the morphological analysis. The various EM studies were particularly well done and likely required a good deal of technical development, which may be of use to others in the field. One clear weakness is that it's not currently clear if the reported cuticle detachment defect is due to altered meisosomes, to the altered cuticle composition, or perhaps both, and thus the exact function(s) of meisosomes is left open. Other concerns include the use of extrachromosomally expressed VHA-5::GFP as a meisosome-specific marker. Although this could certainly be the case, it wasn't proven.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, Braet et al provide a rigorous analysis of SARS-CoV-2 spike protein dynamics using hydrogen/deuterium exchange mass spectrometry. Their findings reveal an interesting increase in the dynamics of the N-terminal domain that progressed with the emergence of new variants. In addition, the authors also observe an increase in the stabilization of the spike trimeric core, which they identify originates from the early D614G mutation.

      Overall this is a timely and interesting exploration of spike protein dynamics, which have so far remained largely unexplored in the literature.

      What I find a bit missing in this manuscript is a link between how the identified changes in protein dynamics lead to increased viral fitness. While there are some possibilities listed in the discussion, I think these should be elaborated upon further. In addition, it should also be discussed how understanding the changes in the spike protein dynamics could have implications for the development of small molecule inhibitors for the virus.

      We have included information in the introduction and conclusion to make the connection between our observations, function, and viral fitness of the spike protein clearer. We have also connected specific mutations to observed function. We have reorganized the discussion for increased clarity and to improve the correlation of our observations with viral fitness.

      Reviewer #2 (Public Review):

      The study systematically looks at dynamic differences across variants longitudinally and the authors appropriately only limit their analyses to peptides that are conserved across the different variants.

      There are some concerns listed below, particularly related to the ensemble heterogeneity that is reported and need considerable revision.

      1) The authors explain that cold-temperature treatment of the S trimer ectodomain constructs has been shown to lead to instability and heterogeneity. They also show this with a comparison of untreated vs. 3-hour 37 ℃ treated samples. I'm confused as to why "During automated HDXMS experiments protein samples were stored at 0 ℃". Will this not cause issues in protein heterogeneity, where the longer the protein sits at 0 ℃ the more potential heterogeneity there will be, and thus greatly confound the analysis?

      We thank the reviewer for highlighting this point. We have carefully examined and reevaluated our analysis of both wild-type and variant spike HDXMS. During automated HDXMS experiments, protein samples are indeed maintained at 0 ℃ between runs and replicates for fixed periods of time (4 h per replicate). In the case of WT S, we did observe conformational heterogeneity between replicates (Figure 2-figure supplement 6), as correctly pointed out by the reviewer. We have repeated the analysis of WT S without 0 ℃ incubation in automated HDXMS experiments. In the revised manuscript, Figure 2 shows the more homogeneous conformation of WT S when not incubated at 0 ℃ between replicates. Extension of these analyses to D614G (Figure 2-figure supplement 7) and all subsequent variants, each of which contains D614G, showed almost no conformational heterogeneity.

      We have included a detailed description (lines 237-244 of the revised manuscript) of the effects of 0 ℃ incubation on HDXMS of WT S.

      Our results revealed that WT S was more sensitive to cold denaturation as described previously [Costello et al. 2021] where the reported half-life for conformational transitions after 0 ℃ incubation was 17 hours. We had not anticipated conformational heterogeneity revealed by deuterium exchange when using an automated HDXMS setup. Upon further review, we see a significant ensemble shift in trimer stalk peptides for the second and third replicates which sat at 0 ℃ for 4 and 8 hours respectively. This is only observed in WT but not any of the other variant S samples. We thank the reviewer for pointing this out and strengthening our conclusions.
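
      As a rough illustration of the scale of this effect, the sketch below estimates what fraction of WT S would have undergone the cold-induced transition after a given time at 0 ℃. It assumes simple first-order (single-exponential) kinetics with the 17-hour half-life quoted above; that kinetic model and the function name `fraction_converted` are illustrative assumptions, not part of the reported analysis.

```python
def fraction_converted(hours_at_0c: float, half_life_hours: float = 17.0) -> float:
    """Fraction of protein converted after a hold at 0 C.

    Assumes first-order decay of the starting conformation, so the
    remaining fraction is 0.5 ** (t / half_life). The 17 h default is
    the half-life quoted in the response; the model itself is assumed.
    """
    if hours_at_0c < 0 or half_life_hours <= 0:
        raise ValueError("time must be >= 0 and half-life must be > 0")
    return 1.0 - 0.5 ** (hours_at_0c / half_life_hours)


if __name__ == "__main__":
    # Replicates 2 and 3 sat at 0 C for 4 h and 8 h respectively.
    for hours in (0, 4, 8):
        print(f"{hours} h at 0 C: {fraction_converted(hours):.1%} converted")
```

      Under these assumptions, roughly 15% of the protein would have converted after 4 hours and roughly 28% after 8 hours, which is in line with an ensemble shift appearing in the second and third replicates.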

      2) The authors presume that the bimodal spectra that are observed reflect EX1 kinetics, however, there can be multiple reasons for an apparent bimodal distribution in the spectra. I agree that some of the spectra indicate that more than a single species is present, but what the two populations represent is murky. In Figure 2D, the apparent size of the highly deuterated population gets larger going from the 60 sec to the 600-sec spectra, as expected for an EX1 transition. However, in Figure 3D the WT highly deuterated population gets smaller going from the 60-sec to the 600-sec spectra. Were bimodal examples observed beyond those shown in Figure 2?

      We agree with the reviewer. The bimodal spectra observed in deuterium exchange of WT S peptides are not a result of EX1 kinetics alone. We have revised the explanation for the presence of the bimodal spectra. These are largely a consequence of automated HDXMS workflows that included 0 °C incubations for short periods of time between replicates. We report new experiments in which we eliminated 0 °C incubations by incubating at 20 °C between replicates and observed much lower conformational heterogeneity.

      Consequently, the shifts in bimodal spectra in Figure 3D for WT S are also likely a consequence of automated HDXMS experiments with 0 ℃ incubation. We have carried out new experiments without 0 ℃ incubation, and these are shown in a revised Figure 3. Even without 0 ℃ incubation, we do see bimodal spectra for certain peptides [figure 2 – S5]. These reflect an ensemble of prefusion and splayed conformations of WT S. Lack of baseline resolution precludes application of HDExaminer to resolve spectral envelopes quantitatively.

      3) How were the spectra that appeared broadened analyzed? There is no description of this in the methods, and the only data shown for this is in table 1. The left/right percentages are reported without any description of how they were obtained. Are these solely from a single spectrum? The most alarming issue is that Table 1B reports 9.4% for the right population of the 988-998 peptide, but the corresponding spectra in Figure 3D doesn't seem to have any highly deuterated population at all.

      We agree with the reviewer. We have removed the HDExaminer analysis of spectral broadening. Some of the spectral broadening was a consequence of 0 ℃ incubation in automated HDX analyses. These have been revised in new supplemental figures for wild-type HDXMS. Lack of baseline resolution precludes effective quantitation of spectral envelopes; Figure 2-figure supplement 5 highlights the spectral broadening qualitatively for the reader’s benefit.

      4) The authors state on page 12: "Replicate analysis of stabilized S trimers with incubation at 4C prior to deuterium exchange (see methods) showed a time-dependent reversal of stabilization as reported previously (Costello et al., 2022), most evident at the same peptides." Is this data shown anywhere? If not then it should be included somewhere, possibly in table 1 as I would expect the cold treatment to offset the left/right population sizes.

      We note that this statement was misleading and have revised the text. The time-dependent reversal of stabilization has been described previously (Costello et al., 2022) and is not part of this study.

      5) The authors state that peptide 899-913 'exhibits a slow conformational interconversion (time scale ~ 15-30 min)'. Where did this estimated rate come from? From the data shown and the limited number of time points, I don't think there is sufficient sampling of this conformational transition to really narrow down the exact timescale, especially since the ratio of left/right populations is so dependent on the pre-treatment of the sample prior to deuterium exchange. (See 1st comment)

      We thank the reviewer. The heterogeneity in deuterium exchange is attributable to the variable 0 °C incubation times in our automated HDXMS workflow. We have removed any explanations of conformational interconversion occurring in our experimental timescales.

      6) The woods plots presented in the Supporting information: (Figures 2-S4, 2-S5, 3-S4, 4-S2, 5-S2, 6-S2) are not conventional Woods plots. Normally the plots would indicate a global threshold for what is deemed to be significant based on the overall error in the dataset. From what I gather the authors used error within an individual peptide to establish significance for each specific peptide, which would be okay, but the authors don't describe the number of replicates or how the p-value was calculated. I would strongly recommend that the authors instead rely on a hybrid significance testing approach, as described recently: (PMID 31099554). What's really alarming with the current approach is that several of the Woods plots shown have data points found to be significantly different that are right at zero on the y-axis.

      We thank the reviewer. We have replaced all of the Woods plots with volcano plots. We have now applied a hybrid significance testing approach as recommended by the reviewer.
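
      For readers unfamiliar with the approach, the following is a minimal sketch of the idea behind hybrid significance testing: a difference in deuterium uptake is called significant only if it both exceeds a global threshold and passes a Welch's t-test. This is not the analysis code used in the study. The function names, the example uptake values, and the choice to pass the global threshold in as a parameter (rather than derive it from the pooled replicate error) are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t: float, dof: float, steps: int = 2000) -> float:
    """Two-sided p-value for Student's t, via Simpson integration of the pdf."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    norm = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))

    def pdf(x: float) -> float:
        return norm * (1 + x * x / dof) ** (-(dof + 1) / 2)

    h = a / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return max(0.0, 1.0 - 2.0 * area)


def welch_p(x: list[float], y: list[float]) -> float:
    """Welch's unequal-variance t-test p-value for two sets of replicates."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    se2 = vx + vy
    if se2 == 0.0:
        return 1.0 if mean(x) == mean(y) else 0.0
    t = (mean(x) - mean(y)) / math.sqrt(se2)
    # Welch-Satterthwaite degrees of freedom
    dof = se2 ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t_two_sided_p(t, dof)


def hybrid_significant(state_a, state_b, global_threshold, alpha=0.01) -> bool:
    """True only if |delta uptake| >= global threshold AND Welch p < alpha."""
    delta = mean(state_b) - mean(state_a)
    return abs(delta) >= global_threshold and welch_p(state_a, state_b) < alpha


if __name__ == "__main__":
    # Hypothetical uptake values (Da) for one peptide in two states.
    tight_small = ([2.00, 2.01, 2.02], [2.05, 2.06, 2.07])
    clear_shift = ([2.0, 2.1, 1.9], [3.0, 3.1, 2.9])
    for a, b in (tight_small, clear_shift):
        print(welch_p(a, b) < 0.01, hybrid_significant(a, b, global_threshold=0.3))
```

      The first example pair passes the t-test alone despite a difference close to zero, which is the situation the reviewer flagged in the original Woods plots; requiring the global threshold as well removes it.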

      7) Table 1: The summary of the peptides with observed bimodal behavior should include data from the replicates, particularly for assessment of how consistent the left/right population sizes are across replicates. Instead of just a percentage, the table should report an average and the standard deviation from the replicate measurements. Furthermore, the table should also include peptides that are overlapping with those presented. Based on Figure 2-figure supplement 1, there are at least two other peptides that cover the 899-913 region. These additional peptides should show a similar trend with bimodal profiles and will be important for showing how reproducible the apparent EX1 kinetics are in the dataset.

      All available replicates and overlapping peptides should be analyzed to ensure that these percentages reported are consistent across the data. It is also odd that the authors choose to use the 3+ charge state of the WT, but the 2+ for the D614G mutant. If both charge states were present, then both of them should be analyzed to ensure the population distributions are consistent within different charge states.

      We thank the reviewers for their suggestion. We have removed Table 1 since bimodal spectra are not resolvable for quantitation as described previously. We instead show spectra of overlapping peptides in these regions for interpretation by the reader.

      We show charge states that provide highest intensity for the peptides (Figure 2-figure supplement 5, Figure 3-figure supplement 3, Figure 4-figure supplement 3, Figure 5-figure supplement 3, Figure 6-figure supplement 3).

      8) The method for calculating p-values used to assess the significance of a difference in observed deuterium uptake is not described. The manuscript mentions technical replicates, but no specific information as to how many replicates were collected for each time point. These details should be included as they are also part of the summary table that is recommended for the publication of HDX data.

      We have utilized hybrid significance testing as suggested by the reviewers to determine significance as outlined by Hageman et al. We have included this in table S3 and in the text.

    2. eLife assessment

      This fundamental and timely study provides insights into the structural dynamics of several relevant mutant forms of SARS-CoV-2 spike protein, including the most recent omicron variant. The hydrogen/deuterium-exchange studies provide compelling evidence for the stabilization of the spike stalk in conjunction with increased dynamics of the N-terminal domain, where binding to the ACE2 receptor occurs. These results have profound implications for the development of small molecule inhibitors of the spike protein-ACE2 receptor interaction.

    3. Reviewer #1 (Public Review):

      In this manuscript, Braet et al provide a rigorous analysis of SARS-CoV-2 spike protein dynamics using hydrogen/deuterium exchange mass spectrometry. Their findings reveal an interesting increase in the dynamics of the N-terminal domain that progressed with the emergence of new variants. In addition, the authors also observe an increase in the stabilization of the spike trimeric core, which they identify originates from the early D614G mutation.

      Overall this is a timely and interesting exploration of spike protein dynamics, which have so far remained largely unexplored in the literature.

      What I find a bit missing in this manuscript is a link between how the identified changes in protein dynamics lead to increased viral fitness. While there are some possibilities listed in the discussion, I think these should be elaborated upon further. In addition, it should also be discussed how understanding the changes in the spike protein dynamics could have implications for the development of small molecule inhibitors for the virus.

    4. Reviewer #2 (Public Review):

      The study systematically looks at dynamic differences across variants longitudinally and the authors appropriately only limit their analyses to peptides that are conserved across the different variants.

      There are some concerns listed below, particularly related to the ensemble heterogeneity that is reported and need considerable revision.

      1) The authors explain that cold-temperature treatment of the S trimer ectodomain constructs has been shown to lead to instability and heterogeneity. They also show this with a comparison of untreated vs. 3-hour 37 C treated samples. I'm confused as to why "During automated HDXMS experiments protein samples were stored at 0 degrees". Will this not cause issues in protein heterogeneity, where the longer the protein sits at 0 C the more potential heterogeneity there will be, and thus greatly confound the analysis?

      2) The authors presume that the bimodal spectra that are observed reflect EX1 kinetics, however, there can be multiple reasons for an apparent bimodal distribution in the spectra. I agree that some of the spectra indicate that more than a single species is present, but what the two populations represent is murky. In Figure 2D, the apparent size of the highly deuterated population gets larger going from the 60 sec to the 600-sec spectra, as expected for an EX1 transition. However, in Figure 3D the WT highly deuterated population gets smaller going from the 60-sec to the 600-sec spectra. Were bimodal examples observed beyond those shown in Figure 2?

      3) How were the spectra that appeared broadened analyzed? There is no description of this in the methods, and the only data shown for this is in table 1. The left/right percentages are reported without any description of how they were obtained. Are these solely from a single spectrum? The most alarming issue is that Table 1B reports 9.4% for the right population of the 988-998 peptide, but the corresponding spectra in Figure 3D doesn't seem to have any highly deuterated population at all.

      4) The authors state on page 12: "Replicate analysis of stabilized S trimers with incubation at 4C prior to deuterium exchange (see methods) showed a time-dependent reversal of stabilization as reported previously (Costello et al., 2022), most evident at the same peptides." Is this data shown anywhere? If not then it should be included somewhere, possibly in table 1 as I would expect the cold treatment to offset the left/right population sizes.

      5) The authors state that peptide 899-913 'exhibits a slow conformational interconversion (time scale ~ 15-30 min)'. Where did this estimated rate come from? From the data shown and the limited number of time points, I don't think there is sufficient sampling of this conformational transition to really narrow down the exact timescale, especially since the ratio of left/right populations is so dependent on the pre-treatment of the sample prior to deuterium exchange. (See 1st comment)

      6) The woods plots presented in the Supporting information: (Figures 2-S4, 2-S5, 3-S4, 4-S2, 5-S2, 6-S2) are not conventional Woods plots. Normally the plots would indicate a global threshold for what is deemed to be significant based on the overall error in the dataset. From what I gather the authors used error within an individual peptide to establish significance for each specific peptide, which would be okay, but the authors don't describe the number of replicates or how the p-value was calculated. I would strongly recommend that the authors instead rely on a hybrid significance testing approach, as described recently: (PMID 31099554). What's really alarming with the current approach is that several of the Woods plots shown have data points found to be significantly different that are right at zero on the y-axis.

      7) Table 1: The summary of the peptides with observed bimodal behavior should include data from the replicates, particularly for assessment of how consistent the left/right population sizes are across replicates. Instead of just a percentage, the table should report an average and the standard deviation from the replicate measurements. Furthermore, the table should also include peptides that are overlapping with those presented. Based on Figure 2-figure supplement 1, there are at least two other peptides that cover the 899-913 region. These additional peptides should show a similar trend with bimodal profiles and will be important for showing how reproducible the apparent EX1 kinetics are in the dataset.

      All available replicates and overlapping peptides should be analyzed to ensure that these percentages reported are consistent across the data. It is also odd that the authors choose to use the 3+ charge state of the WT, but the 2+ for the D614G mutant. If both charge states were present, then both of them should be analyzed to ensure the population distributions are consistent within different charge states.

      8) The method for calculating p-values used to assess the significance of a difference in observed deuterium uptake is not described. The manuscript mentions technical replicates, but no specific information as to how many replicates were collected for each time point. These details should be included as they are also part of the summary table that is recommended for the publication of HDX data.

    5. Reviewer #3 (Public Review):

      The authors use hydrogen-deuterium exchange mass spectrometry (HDXMS) to assess the dynamics of several relevant mutant forms of SARS-CoV-2 Spike protein, including the most recent Omicron variant. The Spike protein is heavily glycosylated and is a trimer, so it is a very difficult protein to study by HDXMS. The authors confirm the glycosylation sites, which can't be covered by the HDXMS experiment, yet they still manage to cover nearly 50% of the sequence, revealing many interesting changes in dynamics in the prevalent circulating mutant forms. The beautiful HDXMS data reveal consistent trends as SARS-CoV-2 mutates to survive, including stabilization of the stalk and increased dynamics of the N-terminal domain where ACE2 receptor binding occurs. The authors incubate the protein at 37 °C and discover that additional stabilization of the trimer occurs under these conditions, explaining a lot of conflicting data in the literature obtained at different temperatures. These results have profound implications for the development of small molecule inhibitors of the Spike protein-ACE2 interaction.

    1. eLife assessment

      This paper reports a bet hedging strategy in bacteria based on chromosomal duplications and rearrangements that confer advantages in certain growth conditions. The work is of fundamental importance for understanding the role of genetic and biological variation in bacteria. The experimental work is exceptionally strong and convincing. The paper will be of interest to a broad audience including bacteriologists, geneticists and evolutionary biologists.

    2. Reviewer #1 (Public Review):

      This is an exceptional paper that investigates a 208.6 kb region of the Burkholderia thailandensis chromosome that had previously been thought to excise itself and form extrachromosomal circles. Through a series of elegant experiments, the authors conclusively show that (i) the 208.6 kb region in fact forms tandem duplications, (ii) the region can switch between duplicated and non-duplicated forms via RecA-mediated homologous recombination, and (iii) duplication provides a selective advantage in biofilms. The data are of uniformly high quality and the conclusions are fully supported by the data. The significance of the work is high because it identifies a novel form of phase variation in bacteria that represents a bet-hedging strategy to facilitate growth in diverse environments.

    3. Reviewer #2 (Public Review):

      This beautiful study identifies a genetic mechanism controlling colony morphology differences in Burkholderia thailandensis. There is a large region of the genome which can be duplicated or triplicated in a RecA-dependent recombination process, leading to phenotypic changes. In addition to colony morphology differences in cells with one, two, or three copies of the region, other phenotypes like biofilm formation are impacted. This appears to be an unstable genetic change since some of the colony types can interconvert to others after restreaking. The authors are commended for the development of elegant genetic approaches to study and carefully prove the existence of the copy number variation of this genomic region. These approaches will be of great use to the field in studying copy number variation in bacteria far beyond Burkholderia or colony morphology/biofilm formation. Bacteriology has for decades focused on average measurements of a culture, and this study helps usher the field to a new future where we appreciate and measure the behaviors of individual populations of cells within the same culture.

    4. Reviewer #3 (Public Review):

      This paper shows that RecA-mediated recombination between two insertion sequence elements can drive the duplication of a large (~200 kb) region that leads to a growth advantage in biofilms, but a disadvantage during planktonic growth. The experiments presented are incisive and definitive. While IS elements are more commonly implicated in gene inactivation, this paper reveals that they can provide a benefit by driving a reversible genome modification in the form of a large-scale duplication. The paper should appeal to readers interested in mechanisms of genome evolution, phase variation, biofilms, and bacterial pathogenesis. The final model is convincing and also lays the foundation for future studies aimed at identifying which gene(s) in the duplicated region are ultimately responsible for the biofilm growth benefit. The paper also serves to correct this lab's prior interpretation of related data in which they concluded that the genomic region being investigated excised and circularized. They very nicely lay out what led them to conclude this previously and how their new data led to a revised model, as well as many additional, important new insights. To be clear, there were no issues with the prior data, just the interpretation/model. So in my view, this is exactly how science should unfold - new data can and should lead to revised models. I applaud the authors for laying this trajectory out in such a straightforward, open manner.

    1. Author Response

      Reviewer #1 (Public Review):

      Major points:

      1) How STC1 controls changes in MSCs' ability for hampering CAR-T cell-mediated anti-tumor responses is unclear.

      In this study, we demonstrated that the presence of STC1 is critical for MSCs to exert their immunosuppressive role by inhibiting cytotoxic T cell subsets, activating key immune suppressive/escape related molecules such as IDO and PD-L1, and engaging in crosstalk with macrophages in the TME. These immunosuppressive functions of MSCs were significantly hampered when the STC1 gene was knocked down. Considering that stanniocalcin-1 is a glycoprotein hormone that is secreted into the extracellular matrix in a paracrine manner, we conclude that the role of STC-1 is not to alter the function of MSCs intracellularly. Rather, it facilitates the immunosuppressive capabilities of MSCs through extracellular secretion into the TME as a pleiotropic factor, thus affecting the functioning of T cells, cancer cells and other immune cells.

      The reviewer's question is well taken, and we have added the points mentioned above to the Discussion section to ensure a more comprehensive conclusion. Moreover, a recent study published in Cancer Cell, which was suggested by the other reviewer, is consistent with our results. It has provided further mechanistic information on how stanniocalcin-1 impacts immunotherapy efficacy and T cell activation. The reference has been cited and discussed as shown below.

      "In this model, activated macrophages or stress signals during CAR-T therapy may prompt MSCs to secrete stanniocalcin-1 into the extracellular matrix of TME, serving as a pleiotropic factor to negatively impact the function of T cells and stimulate the expression of molecules that inactivate immune responses, ultimately providing an immunosuppressive effect of MSC." (page 22, highlighted). "In line with our study, it was recently reported that stanniocalcin-1 negatively correlates with immunotherapy efficacy and T cell activation by trapping calreticulin, which abrogates membrane calreticulin-directed antigen presentation function and phagocytosis [50]." (Page 20, highlighted)

      2) Is ROS important? It is not tested directly.

      ROS, which are released by neutrophils and macrophages, play an important role during the immune response. Not only do they act as key mediators of the adaptive immune response, but they also have the ability to modulate the activation of B cells and T cells. In our study, we suggest that ROS may be involved in NLRP3 inflammasome activation and in the expression and secretion of STC1. Although we did not pursue this line of inquiry further as it was beyond the scope of our paper, we have included additional relevant research in the Discussion and provided a reference.

      "It has been proved that the expression and secretion of STC1 in multiple cell lines can be stimulated by external stimuli, including cytokines and oxidative stress [26]." (Page 21, highlighted)

      3) The changes in CD8 and Treg are not convincing. Moreover, it is not tested how these changes can be elicited by the presence of MSCs.

      We have included additional in vivo data to assess the levels of Treg and CD8+ cells in this revised manuscript. This not only confirms the alterations in CD8+ and Treg cells, but also offers an additional line of evidence to further analyze the influence of MSCs on CAR-T cells in vivo. The findings are presented in Figure 4B, and the corresponding discussion can be found on Page 17 (highlighted).

      Reviewer #2 (Public Review):

      Major points:

      1) STC-1 is expressed and secreted by many human cancer cells. This should be discussed in the introduction or discussion with more inter-related background info on both its regulation in cancer cells and secretion pattern into TME. It is important because you state that the STC-1 secreted by MSC has such strong functions, then how about those produced and secreted by cancer cells? Are those also stimulated by macrophages or other components in TME? Do they have possible functions in helping cancer cell to escape the immune surveillance mechanisms?

      Thanks for the suggestion. We have added more details about the regulation and secretion of STC-1 in cancer cells (see below). The information is added to both the introduction and discussion (highlighted on pages 4 and 21), and all the above questions are addressed.

      "It was proved that STC1 is involved in several oxidative and cancer-related signaling pathways such as NF-κB, ERK, and JNK pathways [26,27]. The expression and secretion of STC1 in cancer tissue can be stimulated by external stimuli, including cytokines and oxidative stress [26]. Under hypoxic conditions, STC1 could be modulated by HIF-1 to facilitate the reprogramming of tumor metabolism from oxidative to glycolytic metabolism [28]. STC1 was also reported to participate in the process of epithelial-to-mesenchymal transition (EMT), which is associated with tumor invasion and reshaping of the tumor microenvironment, as well as increased therapy resistance [29]." (Page 4)

      "It has been proved that the expression and secretion of STC1 in multiple cell lines can be stimulated by external stimuli including cytokines and oxidative stress [26]." (Page 21)

      2) In Figure 4B, using a single marker of IL-1β to show the immune suppressive capability of MSC in vivo is not sufficient, staining for CD4+ and CD8+ should also be included to demonstrate whether MSC could modulate T cell compositions, which can give more direct evidence about MSC's impacts on CAR-T cell.

      The above experiments were done as suggested, and the data are presented in Figure 4B. Explanations of the results are given in the Results section (page 17) and the Discussion section (page 21) (highlighted).

      3) One of the major risks associated with CAR-T therapy is an excessive immune response that causes cytokine release syndrome. MSCs have been used in clinics as a way to suppress immune response including post-CAR-T. What does the author think about using MSC with STC-1 knockout? Can it still help reduce toxicity while maintaining CAR-T efficacy? This might be a potential application.

      This is definitely an interesting idea. Based on the data presented in the current study, it is clear that knockdown of STC-1 would abrogate the immune-suppressive impact of MSC, and therefore affect CAR-T efficacy. However, whether the presence of MSC can still help reduce cytokine release syndrome once the function of STC-1 is lost requires further study. We agree with the reviewer, and we have briefly discussed this possibility at the very end of the discussion as shown below (Page 22, highlighted).

      "… the findings we presented here are no doubt that would have potential clinical applications toward improving the efficiency of CAR-T therapy as well as reducing the excessive toxicity by modulating the level of STC1 in TME".

      4) There was a recent study published in Cancer Cell (Lin et al. Stanniocalcin 1 is a phagocytosis checkpoint driving tumor immune resistance. 2021), and they also reported that STC1 negatively correlates with immunotherapy efficacy and patient survival. It should be cited, and in fact, it provided support to the authors' present study with completely different experimental settings.

      Thanks for providing this important information. It is an excellent study and consistent with our findings. The reference was added and discussed on page 20 (highlighted) as shown below.

      "In line with our study, it was recently reported that stanniocalcin-1 negatively correlates with immunotherapy efficacy and T cell activation by trapping calreticulin, which abrogates membrane calreticulin-directed antigen presentation function and phagocytosis [50]"

    2. eLife assessment

      This study provides potentially important insights into the role of mesenchymal stem cells in CAR-T therapy, and suggests that the STC1 gene could be a key factor in influencing the efficacy of this treatment. This finding has the potential to improve current therapeutic strategies based on cell therapy and may indicate new biology related to how mesenchymal stem cells affect the immune state within the tumor microenvironment. Further research is necessary to clarify the signaling pathways, but the data presented by the authors are generally well-supported and convincing.

    3. Reviewer #1 (Public Review):

      This work aims to understand whether MSCs support resistance in tumor cells upon CAR T cell treatment and whether the expression of STC1 in MSCs contributes to those changes. Overall, the in vivo data are interesting. However, the mechanistic understanding is correlative and based on many assumptions. Furthermore, the differences in Treg changes presented in Figure 2 are not convincing. The underlying mechanisms by which the presence of MSCs leads to these changes are also unclear.

      Major points:

      1. How STC1 controls changes in MSCs' ability for hampering CAR T cell-mediated anti-tumor responses is unclear.

      2. Is ROS important? It is not tested directly.

      3. The changes in CD8 and Treg are not convincing. Moreover, it is not tested how these changes can be elicited by the presence of MSCs.

    4. Reviewer #2 (Public Review):

      Zhang et al. addressed an intriguing question - whether the presence of mesenchymal stem cells (MSCs) could influence the efficacy of CAR-T therapy. After observing that CAR-T cytotoxicity was strongly inhibited by MSCs by modulating certain correlated immune response pathways, the authors sought to uncover the underlying mechanisms by examining the interaction between MSCs and macrophage, immune escaping mechanisms, and oxidative stress. Notably, the authors discovered that a single gene, STC1, played a major role in reversing the suppression when it was knocked down/out. Although more research is necessary to clarify the signaling pathways, the data presented by the authors were generally well-supported and convincing.

      Major points:

      1. STC-1 is expressed and secreted by many human cancer cells. This should be discussed in the introduction or discussion with more inter-related background info on both its regulation in cancer cells and secretion pattern into TME. It is important because you state that the STC-1 secreted by MSC has such strong functions; what, then, about the STC-1 produced and secreted by cancer cells? Is it also stimulated by macrophages or other components in TME? Does it have possible functions in helping cancer cells escape immune surveillance mechanisms?

      2. In Figure 4B, using a single marker of IL-1β to show the immune suppressive capability of MSC in vivo is not sufficient, staining for CD4+ and CD8+ should also be included to demonstrate whether MSC could modulate T cell compositions, which can give more direct evidence about MSC's impacts on CAR-T cell.

      3. One of the major risks associated with CAR-T therapy is an excessive immune response that causes cytokine release syndrome. MSCs have been used in clinics as a way to suppress immune response including post-CAR-T. What does the author think about using MSC with STC-1 knockout? Can it still help reduce toxicity while maintaining CAR-T efficacy? This might be a potential application.

      4. There was a recent study published in Cancer Cell (Lin et al. Stanniocalcin 1 is a phagocytosis checkpoint driving tumor immune resistance. 2021), and they also reported that STC1 negatively correlates with immunotherapy efficacy and patient survival. It should be cited, and in fact, it provided support to the authors' present study with completely different experimental settings.

    1. Author Response

      Reviewer #1 (Public Review):

      This theoretical (computational modelling) study explores a mechanism that may underlie beta (13-30Hz) oscillations in the primate motor cortex. The authors conjecture that traveling beta oscillation bursts emerge following dephasing of intracortical dynamics by extracortical inputs. This is a well written and illustrated manuscript that addressed issues that are both of fundamental and translational importance.

      We are pleased by the reviewer’s judgement about the importance of the question that we consider and about the presentation of our manuscript.

      Unfortunately, existing work in the field is not well considered and related to the present work. The rationale of the model network follows closely the description in Sherman et al (2016). The relation (difference/advance) to this published and available model needs to be explicitly made clear. Does the Sherman model lack emerging physiological features that the new proposed model exhibits?

      We view the work of Sherman et al (2016) and ours as complementary. Sherman et al propose a model of a single E-I module, using the terminology of our manuscript, that is much more detailed than ours since it approximately accounts for the layered structure of the cortex using two layers of multi-compartment spiking neurons, each comprising 100 excitatory neurons and 35 inhibitory neurons. This allows a detailed comparison of the model with local MEG signals. We used a much simpler description and only describe the population behavior of the local E and I neuron populations in each module. However, contrary to Sherman’s model, this allows us to address the spatial aspect of beta oscillations which is the main target of our work. Our simple description of a local E-I module allows us to consider several hundred E-I modules with a spatially-structured connectivity and to analyze the spatio-temporal characteristics of beta activity. We have now described the relation of our work with Sherman et al (2016) in the discussion section (lines 540-547).

      The authors may also note the stability analysis in: Yaqian Chen et al., “Emergence of Beta Oscillations of a Resonance Model for Parkinson’s Disease”, Neural Plasticity, vol. 2020, https://doi.org/10.1155/2020/8824760

      We thank the reviewer for pointing out this paper that had escaped our notice. It presents the stability analysis of a single E-I module with propagation delay (and instantaneous synapses). At the mathematical level, the analysis brings little as compared to the much older article of Geisler et al., J Neurophys (2005) that we cite. However, the model specifically proposes to describe beta oscillations in the motor cortex as arising from the interaction between excitatory and inhibitory neurons, as we do. Therefore, we included this reference as well as a reference to the previous work of Pavlides et al., PLoS Comp Biol (2015) where the model was developed.

      The model-based analysis of the traveling nature of the beta frequency bursts appears to be the most original component of the manuscript. Unfortunately, this is also the least worked out component. The phase velocity analysis is limited by the small number (10 x 10) of modeled (and experimentally recorded) sites and this needs to be acknowledged. How were border effects treated in the model and which are they?

      We thank the reviewer for these points which gave us the opportunity to clarify them and improve our manuscript. As described in Methods: Simulations (line 847 and seq.) and shown in Fig. S2 (Fig. S10 in the original submission), we actually simulated our model on a 24 × 24 grid and did all our measurements in a central 10×10 grid to take into account that the electrode covers only part of the motor cortex. In addition, to minimize border effects, we added on each side of the 24×24 grid two rows of E-I modules kept at their (non-oscillating) fixed points of stationary activity, as depicted in Fig. S2. In order to address the concern of the reviewer, and to check that border effects indeed had a minimal impact on our results, we have performed a new set of simulations on a 24×24 grid with periodic boundary conditions. The results are shown in the new supplementary Fig. S9 and are indistinguishable from those reported in the main text and figures. In particular, the proportion of the different wave types and the wave speeds are unaffected by this change of boundary conditions. A paragraph has been added in the revised version (lines 371-378) to discuss this point.
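
      As an illustration only, the two geometric ingredients mentioned in this reply (wrap-around distances on a periodic 24×24 grid and restriction of the measurements to a central 10×10 window) can be sketched as follows. The function names and the use of NumPy are assumptions made for this sketch and are not taken from the authors' code.

```python
import numpy as np

GRID = 24    # side of the simulated grid, as stated in the reply
WINDOW = 10  # side of the central measurement window, as stated in the reply


def periodic_distance(p, q, size=GRID):
    """Shortest Euclidean distance between grid sites p and q on a torus.

    Hypothetical helper: with periodic boundary conditions, the separation
    along each axis is the smaller of the direct and the wrapped separation.
    """
    d = np.abs(np.asarray(p) - np.asarray(q))
    d = np.minimum(d, size - d)  # wrap around each axis
    return float(np.hypot(d[0], d[1]))


def central_window(field, window=WINDOW):
    """Return the central window x window block of a square 2D array."""
    size = field.shape[0]
    start = (size - window) // 2
    return field[start:start + window, start:start + window]
```

      With these definitions, sites (0, 0) and (23, 0) are nearest neighbors on the periodic grid, and `central_window` applied to a 24×24 array returns rows and columns 7 to 16.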

      How much of the phase velocities are due to unsynchronized random fluctuations? At least an analysis of shuffled LFPs needs to be performed.

      The phase velocities are indeed due to unsynchronized random fluctuations (coming from the finite number of neurons in each of our modules as well as, and more importantly, from the uncorrelated local external inputs). In order to check that the spatial structure of connectivity was important, we followed the suggestion of the reviewer and also performed a new set of simulations to provide a further test. As proposed by the reviewer, after performing the simulations we shuffled in space the signal of the different electrodes and also did a parallel analysis where we shuffled the signal from different electrodes in the recording. We then reclassified the shuffled simulations/recordings in exactly the same way as the original ones. As shown in the new additional Fig. S16, this resulted in the full elimination of time frames classified as “planar waves” both in the model and in the experimental recordings. Additionally, it only slightly modified the proportion of “synchronized” or “random” episodes, which is intuitively understandable since shuffling does not change the nature of these states. In order to further assess the impact of connections between modules, we also decided to suppress them, namely to set their range l to zero. In order to avoid modifying the working point of a local module by this manipulation, we focused on the case without propagation delay. Without long-range connections, the local dynamics of each module is little modified. However, as shown in the new Fig. S18a, synchronization between neighboring modules is strongly decreased and the proportion of the different wave types is entirely changed: synchronized states and planar waves disappear and are replaced by random states. These results are described in two new paragraphs (lines 401-414 and lines 431-435).
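
      The spatial shuffling control described in this reply can be sketched, under stated assumptions, as a single random permutation of electrode positions that leaves each channel's time series intact. The array layout (rows, columns, time) and the function name are hypothetical; the authors' actual procedure may differ in detail, for instance in whether the permutation is redrawn for each episode.

```python
import numpy as np


def shuffle_channels(signals, rng):
    """Randomly reassign electrode positions while keeping each time series intact.

    signals: array of shape (rows, cols, time), one time series per electrode
             (an assumed layout, chosen for this sketch).
    rng:     a numpy.random.Generator, passed in for reproducibility.
    Returns a new array of the same shape; the input is not modified.
    """
    rows, cols, n_t = signals.shape
    flat = signals.reshape(rows * cols, n_t)
    perm = rng.permutation(rows * cols)       # one permutation for the whole recording
    return flat[perm].reshape(rows, cols, n_t)
```

      Because the temporal content of every channel is preserved, such a surrogate keeps the global level of synchrony across channels but destroys any systematic spatial phase gradient, which is the property a planar-wave classification relies on.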

      Is there a relationship between the localizations of the non-global external input and the starting sites of the traveling waves?

      This is also an interesting question that parallels some asked by the other reviewers and which we did our best to address. As described in the “Essential revisions” point 5) above, we aligned all “planar wave events” in space and time with the help of the spatio-temporal phase maps of the oscillations. We did find that planar waves were preceded by an increase in the global synchronization index σp, both in simulations and in experiments. In simulations this increase also corresponded to a shift of the global inputs away from their mean, as depicted in the new Fig. 4 in the main manuscript. However, no significant average spatio-temporal profile of the local inputs emerged when we used these temporal alignments. This is presumably due to the large variability of local inputs that can give rise to planar waves. We have described these results in the new section “Properties of planar waves and characteristics of their inputs”.

      In summary, this work could benefit from a widening of its scope to eventually inspire new experimental research questions. While the model is constructed well, there is insufficient evidence to conclude that the presented model advances over another published model (e.g. Sherman et al., 2016).

      As described in the “Essential revisions” and the discussion section of the manuscript, our work highlights a number of questions that can (and hopefully will) inspire new experimental research. We also hope that we have clarified above that our model complements Sherman et al.’s model and advances it as far as the spatial aspects of beta oscillations in motor cortex are concerned.

      Reviewer #2 (Public Review):

      Kang et al. model the cortical dynamics, specifically distributions of beta burst durations and proportion of different kinds of spatial waves using a firing rate model with local E-I connections and long range and distance dependent excitatory connections. The model also predicts that the observed cortical activity may be a result of non-stationary external input (correlated at short time scales) and a combination of two sources of input, global and local. Overall, the manuscript is very clear, concise and well written. The modeling work is comprehensive and makes interesting and testable predictions about the mechanism of beta bursts and waves in the cortical activity. There are just a few minor typos and curiosities if they can be addressed by the model. Notwithstanding, the study is a valuable contribution towards developing data-driven firing rate models.

      We really appreciate the positive comments of the reviewer and thank her/him for them. We have done our best to correct the typos and to address the questions raised by the reviewer.

      1) The model beautifully reproduces the proportion of different kinds of waves that can be seen in the data (Fig 3); however, the manuscript does not comment on when a planar/random wave would appear for a given set of parameters (e.g. fixed v_ext, tau_ext, c) from the mechanistic point of view. If these spatio-temporal activities are functional in nature, their occurrence is unlikely to be just stochastic and a strong computational model like this one would be a perfect substrate to ask this question. Is it possible to characterize what aspects of the global/local input fluctuations or interaction of input fluctuations with the network lead to a specific kind of spatio-temporal activity, even if just empirically?

      This is an important question that parallels some asked by the other reviewers and which we did our best to address. As described in the “Essential revisions” paragraph above, we aligned all “planar wave events” either in phase or at their starting time points. We did find that planar waves were preceded by an increase in the global synchronization index σp, both in simulations and in experiments. In simulations this increase also corresponded to a shift of the global inputs away from their mean, as depicted in the new Fig. 4 in the main manuscript. When we used the same alignment to average spatio-temporal local inputs, we did not see the emergence of any significant patterns. This presumably reflects the high variability of local inputs able to produce a planar wave.

      Do different waves appear in the same trial simulation or does the same wave type persist over the whole trial? If the former, are the transition probabilities between the different wave types uniform, i.e. is the probability of a planar wave transiting into a synchronized wave equal to the probability of a random wave transiting into a synchronized wave?

      In the same trial simulation, different types of waves indeed successively appear. The curiosity of the reviewer led us to investigate this interesting point. Since time frames classified as random or synchronized are much more numerous than the planar (and radial) wave ones, it is much more probable that a planar wave transits into a synchronized or a random pattern than the reverse process (i.e., synchronized and random patterns preferentially transit into each other). Nonetheless, we considered questions related to the one of the reviewer. What are the states preceding a planar wave event? Given that a planar wave episode is preceded by a random (or synchronous) episode, is it more likely to be followed by a random or by a synchronous event? We actually find that the entry state is prominently a synchronized state. Furthermore, when the entry state is synchronized, the exit state is also synchronized much more often than would be expected by chance. This shows that most often, planar waves are created from an underlying synchronized persistent state. This has been described in the revised manuscript (lines 443-451).
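
      A minimal sketch of the kind of entry/exit bookkeeping described above is given below, assuming that each time frame has already been classified into a wave type. The label strings and the function name are hypothetical and serve only to make the counting explicit; this is not the authors' analysis code.

```python
from collections import Counter
from itertools import groupby


def planar_entry_exit(labels, target="planar"):
    """Count (entry state, exit state) pairs around each episode of `target`.

    labels: sequence of per-frame class labels, e.g. "sync", "random", "planar".
    Consecutive identical labels are first collapsed into episodes; target
    episodes at the very start or end of the sequence are skipped because
    their entry or exit state is unknown.
    """
    episodes = [state for state, _ in groupby(labels)]
    pairs = Counter()
    for i in range(1, len(episodes) - 1):
        if episodes[i] == target:
            pairs[(episodes[i - 1], episodes[i + 1])] += 1
    return pairs
```

      Comparing the resulting counts with those expected if entry and exit states were independent would indicate whether, for example, synchronized-to-planar-to-synchronized sequences occur more often than chance.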

      2) Denker et al 2018 also report a strong relationship between the spatial wave category, beta burst amplitude, the beta burst duration and the velocity (Fig 6E - Denker et al.), e.g. synchronized waves are fastest with the highest beta amplitude and duration. Was this also observed in the model?

      We had long exchanges with Michael Denker about his analysis since there are some differences between his code and what is described in Denker et al. (2017), possibly because of several typos in the Method section of Denker et al (2017). We have checked that the results of our code agree with his but there are some differences between the results obtained on the available datasets and those reported in Denker et al from other data sets. We have now provided the detailed statistics of the different wave types as obtained by our analysis in the simulation of model SN (Fig. S9) and SN’ (Fig. S11) and in the recordings for monkey L (Fig. S10) and monkey N (Fig. S12). In the recording data, the amplitude and speed of the synchronized and planar waves are comparable and higher than in the radial and random wave types. The duration of synchronized events is longer than that of planar waves and of the other wave types. Comparable results are obtained in the simulations with nonetheless a few differences: the mean amplitude of planar waves is somewhat larger than that of synchronized states, and the hierarchy of durations in the different states is respected but the durations themselves are longer in the simulations than in the recordings (about 40 % for the planar waves and almost two times longer for the synchronized states). We attribute these differences to the fact that synchronization is slightly less effective in the recordings than in the model. Long synchronization episodes in the recordings are often cut off by a few time frames where the synchronization index goes below the threshold value for a synchronized pattern. This happens rarely enough not to affect much the global statistics of the different states but it has a much more visible effect on the measured duration of the synchronized states.

      Reviewer #3 (Public Review):

      In this manuscript, the authors consider a rate model with recurrently connected excitatory-inhibitory (E-I) modules coupled by distance-dependent excitatory connections. The rate-based formulation with adaptive threshold has been previously shown to agree well with simulations of spiking neurons, and simplifies both analytical analysis and simulations of the model. The cycles of beta oscillations are driven by fluctuating external inputs, and traveling waves emerge from the dephasing by external inputs. The authors constrain the parameters of external inputs so that the model reproduces the power spectral density of LFPs, the correlation of LFPs from different channels and the velocity of propagation of traveling waves. They propose that external inputs are a combination of spatially homogeneous inputs and more localized ones. A very interesting finding is that wave propagation speed is on the order of 30 cm/s in their model, which is consistent with the data but does not depend on propagation delays across E-I modules; this may suggest that propagation speed is not a consequence of unmyelinated axons as has been suggested by others. Overall, the analysis looks solid, and we found no inconsistency in their mathematical analysis.

      We thank the reviewer for his comments and for his expert review.

      However, we think that the authors should discuss more thoroughly how their modeling assumptions affect their result, especially because they use a simple rate-based model for both theory and simulations, and a very simplified proxy for the LFPs.

      In the revised manuscript, we have performed additional simulations to test different modeling assumptions as suggested by the reviewer and discussed further below.

      The authors introduce anisotropy in the connectivity to explain the findings of Rubino et al. (2006), showing that motor cortical traveling waves propagate preferentially along a specific axis. They introduce anisotropy in the connectivity by imposing that the long range excitatory connections be twice as long along a given axis, and they observe waves propagating along the orthogonal axis, where the connectivity is shorter range. Referring specifically to the direction of propagation found by Rubino et al, could the authors argue why we should expect longer range connections along the orthogonal axis? In fact, Gatter and Powell (1978, Brain) documented a preponderance of horizontal axons in layers 2/3 and 5 of motor cortex in non-human primates that were more spatially extensive along the rostro-caudal dimension as compared with the medio-lateral dimension, and Rubino et al. (2006) showed the dominant propagation direction was along the rostro-caudal axis. This is inconsistent with the modeling work presented in the current manuscript.

      This is an important comment and we thank the reviewer for pointing out these data in Gatter and Powell (1978). Since the experimental data show that planar wave propagation directions are anisotropically distributed, we have tried to investigate what the underlying mechanism of this anisotropy could be in the framework of our model. Anisotropy in connectivity is an obvious possibility. Given our result, and the data of Gatter and Powell, it appears however that it is not the underlying cause of the observed anisotropy direction in the motor cortex (in the framework of our model). We have thus investigated another possibility, namely that the local external inputs are anisotropically targeting the motor cortex, being more spread out along a given axis (lines 510-529 and new Fig. 5g-l). We find that planar waves propagate preferentially along the orthogonal axis. This leads us to conclude that the observed propagation anisotropy could be a consequence of the external input being more spread out along the medio-lateral axis. Data addressing this issue could be obtained using retroviral tracing techniques.

      The clarity and significance of the work would greatly improve if the authors discussed more thoroughly how their modeling assumptions affect their result. In particular, the prediction that external inputs are a combination of local and global ones relies on fitting the model to the correlation between LFPs at distant channels. The authors note that when the model parameter c=1, LFPs from distant channels are much more correlated than in the data, and thus have to include the presence of local inputs. We wonder whether the strong correlation between distant LFPs would be lower in a more biologically realistic model, for example a spiking model with sparse connectivity and a spiking external population, where all connections are distance dependent. While the analysis of such a model is beyond the scope of the present work, it would be helpful if the authors discussed whether their prediction on the structure of external inputs would still hold in a more realistic model.

      This is a legitimate question that we indeed asked ourselves. In a previous work with a simpler chain model, we only considered finite-size fluctuations. We found good agreement between our simplified description of finite-size fluctuations and simulations of a spiking network with fully connected modules and sparse distance-dependent connectivity. This leads us to believe that our description of finite-size fluctuations is reliable in this setting. Assuming that it is the case, we find that with 10^4 neurons or more per module, finite-size noise is not strong enough to replace our local external inputs. Even with 2000 neurons per module, the network driven by intrinsic fluctuations alone is very synchronized (new Fig. S15e-g). With 200 neurons per module, the intrinsic fluctuations are strong enough to replace the fluctuating local inputs (Fig. S15a-d) but this is quite a low number. Our description of local noise would have to underestimate the fluctuation in a more sparsely connected network by a significant amount for agreement with the data to be obtained without local inputs. Moreover, it seems to us quite plausible that different regions of motor cortex receive different inputs but, of course, this can only be settled by further experiments. Together with the new Fig. S15, we have added a paragraph to address this question in the manuscript (lines 379-400).

    2. eLife assessment

      This manuscript makes a valuable contribution to the field. The authors have developed a compelling network model to study mechanisms for the emergence of oscillations in the beta range in the primary motor cortex during movement preparation, and their propagation as traveling waves across the cortical sheet. The model is able to recapitulate several features of motor cortical activity acquired experimentally. Due to the recent results suggesting a functional role for traveling waves, it is of great interest to discover the mechanisms underlying such phenomena, and this work is an interesting step in that direction. However, the evidence for the reported new insights is incomplete at this stage, due to some weaknesses that remain to be addressed.

    3. Reviewer #1 (Public Review):

      This theoretical (computational modelling) study explores a mechanism that may underlie beta (13-30Hz) oscillations in the primate motor cortex. The authors conjecture that traveling beta oscillation bursts emerge following dephasing of intracortical dynamics by extracortical inputs. This is a well written and illustrated manuscript that addressed issues that are both of fundamental and translational importance. Unfortunately, existing work in the field is not well considered and related to the present work. The rationale of the model network follows closely the description in Sherman et al (2016). The relation (difference/advance) to this published and available model needs to be explicitly made clear. Does the Sherman model lack emerging physiological features that the new proposed model exhibits? The authors may also note the stability analysis in: Yaqian Chen et al., "Emergence of Beta Oscillations of a Resonance Model for Parkinson's Disease", Neural Plasticity, vol. 2020, https://doi.org/10.1155/2020/8824760

      The model-based analysis of the traveling nature of the beta frequency bursts appears to be the most original component of the manuscript. Unfortunately, this is also the least worked out component. The phase velocity analysis is limited by the small number (10 x 10) of modeled (and experimentally recorded) sites and this needs to be acknowledged. How much of the phase velocities are due to unsynchronized random fluctuations? At least an analysis of shuffled LFPs needs to be performed. How were border effects treated in the model and which are they? Is there a relationship between the localizations of the non-global external input and the starting sites of the traveling waves?

      In summary, this work could benefit from a widening of its scope to eventually inspire new experimental research questions. While the model is constructed well, there is insufficient evidence to conclude that the presented model advances over another published model (e.g. Sherman et al., 2016).

    4. Reviewer #2 (Public Review):

      Kang et al. model the cortical dynamics, specifically distributions of beta burst durations and proportion of different kinds of spatial waves using a firing rate model with local E-I connections and long range and distance dependent excitatory connections. The model also predicts that the observed cortical activity may be a result of non-stationary external input (correlated at short time scales) and a combination of two sources of input, global and local.

      Overall, the manuscript is very clear, concise and well written. The modeling work is comprehensive and makes interesting and testable predictions about the mechanism of beta bursts and waves in the cortical activity. There are just a few minor typos and curiosities if they can be addressed by the model. Notwithstanding, the study is a valuable contribution towards developing data-driven firing rate models.

      1) The model beautifully reproduces the proportion of different kind of waves that can be seen in the data (Fig 3), however the manuscript does not comment on when would a planar/random wave appear for a given set of parameters (eg. fixed v_ext, tau_ext, c) from the mechanistic point of view. If these spatio-temporal activities are functional in nature, their occurrence is unlikely to be just stochastic and a strong computational model like this one would be a perfect substrate to ask this question. Is it possible to characterize what aspects of the global/local input fluctuations or interaction of input fluctuations with the network lead to a specific kind of spatio-temporal activity, even if just empirically ? Do different waves appear in the same trial simulation or does the same wave type persist over the whole trial? If former, are the transition probabilities between the different wave types uniform, i.e probability of a planar wave to transit into a synchronized wave equal to the probability of a random wave into synchronized wave?

      2) Denker et al. (2018) also report a strong relationship between the spatial wave category, beta burst amplitude, beta burst duration, and velocity (Fig 6E in Denker et al.); e.g., synchronized waves are the fastest and have the highest beta amplitude and duration. Was this also observed in the model?

    5. Reviewer #3 (Public Review):

      In this manuscript, the authors consider a rate model with recurrently connected excitatory-inhibitory (E-I) modules coupled by distance-dependent excitatory connections. The rate-based formulation with adaptive threshold has been previously shown to agree well with simulations of spiking neurons, and simplifies both analytical analysis and simulations of the model. The cycles of beta oscillations are driven by fluctuating external inputs, and traveling waves emerge from the dephasing by external inputs. The authors constrain the parameters of external inputs so that the model reproduces the power spectral density of LFPs, the correlation of LFPs from different channels and the velocity of propagation of traveling waves. They propose that external inputs are a combination of spatially homogeneous inputs and more localized ones. A very interesting finding is that wave propagation speed is on the order of 30 cm/s in their model, which is consistent with the data but does not depend on propagation delays across E-I modules. This may suggest that propagation speed is not a consequence of unmyelinated axons, as has been suggested by others. Overall, the analysis looks solid, and we found no inconsistency in their mathematical analysis. However, we think that the authors should discuss more thoroughly how their modeling assumptions affect their result, especially because they use a simple rate-based model for both theory and simulations, and a very simplified proxy for the LFPs.

      The authors introduce anisotropy in the connectivity to explain the findings of Rubino et al. (2006), showing that motor cortical traveling waves propagate preferentially along a specific axis. They introduce anisotropy in the connectivity by imposing that the long range excitatory connections be twice as long along a given axis, and they observe waves propagating along the orthogonal axis, where the connectivity is shorter range. Referring specifically to the direction of propagation found by Rubino et al, could the authors argue why we should expect longer range connections along the orthogonal axis? In fact, Gatter and Powell (1978, Brain) documented a preponderance of horizontal axons in layers 2/3 and 5 of motor cortex in non-human primates that were more spatially extensive along the rostro-caudal dimension as compared with the medio-lateral dimension, and Rubino et al. (2006) showed the dominant propagation direction was along the rostro-caudal axis. This is inconsistent with the modeling work presented in the current manuscript.

      The clarity and significance of the work would greatly improve if the authors discussed more thoroughly how their modeling assumptions affect their result. In particular, the prediction that external inputs are a combination of local and global ones relies on fitting the model to the correlation between LFPs at distant channels. The authors note that when the model parameter c=1, LFPs from distant channels are much more correlated than in the data, and thus have to include the presence of local inputs. We wonder whether the strong correlation between distant LFPs would be lower in a more biologically realistic model, for example a spiking model with sparse connectivity and a spiking external population, where all connections are distance dependent. While the analysis of such a model is beyond the scope of the present work, it would be helpful if the authors discussed whether their prediction on the structure of external inputs would still hold in a more realistic model.

    1. Author Response

      Reviewer #2 (Public Review):

      Weaknesses (major)

      1) Adding control groups (sham stimulation) to Experiment 5 and Experiment 8 would be needed to increase confidence that NITESGON's memory-enhancing effects do not depend on sleep but do depend on dopamine receptor activity.

      Thank you for highlighting this major weakness within our research; we will be sure to include control groups in future research if we conduct replication studies. Additionally, upon review of your comment, we have addressed the lack of control/sham groups in Experiment 5 and 8 in the Discussion section when acknowledging the limitations of the research.

      Please see the newly added text from the Discussion section on pages 21-22 below:

      “Moreover, it must also be acknowledged that Experiments 5 and 8 did not include a control-sham stimulation group, thus limiting the interpretation of these two experimental findings. Control-sham stimulation groups would increase our confidence in our findings that NITESGON’s memory-enhancing effects depend not on sleep but on DA receptor activity.”

      2) Task order in the interference study in Experiment 4 was randomized during the first visit for task training as well as during the memory test, however, the word-association and spatial navigation tasks used in Experiments 3 and 4 were not counterbalanced during training or memory testing. Thus, the authors cannot rule out the possibility of order effects.

      Upon reading your comment and reviewing the paper, we have decided to add a limitations paragraph to the paper which highlights the concern of Experiments 3 and 4 not being counterbalanced during training or memory testing. Additionally, the new section provides an explanation of how not counterbalancing Experiments 3 and 4 introduced the possibility of order effects being present in the results.

      Please see the new addition from the Discussion section on page 21 below:

      “When interpreting the current findings, it must be considered that some limitations exist within the research; limitations on experimental design are noted below, followed by a discussion of utilizing indirect proxy measures. The task order for Experiment 4 was randomized during the first visit for training and the recall-only memory test 7-days later; however, the word association and spatial navigation task used in Experiments 2 and 3 were not counterbalanced; therefore, the findings of Experiments 2 and 3 could have been impacted by a potential order effect.”

      3) It is unclear how Experiment 3 and Experiment 4 differ. Percent of words recalled is the measure of memory performance, however, there is not a clear measure of interference in Experiment 4 (i.e., words recalled during Memory task II that were from Memory task I).

      Thank you for highlighting the difficulty in distinguishing the differences between Experiment 3 and Experiment 4. To clarify what the differences are between Experiment 3 and Experiment 4, we explained in Experiment 4’s introductory paragraph that the object-location task used in Experiment 3 was replaced with a Japanese-English verbal associative learning task in Experiment 4.

      Please see the paragraph from the Experiment 4 subsection on page 10 below:

      “Experiments 2 and 3 revealed both retroactive and proactive memory effects 7-days after initial learning of the two tasks. To further explore if NITESGON is linked to behavioral tagging and evaluate if interference impacts NITESGON as the strong stimulus, Experiment 4 removed the object-location task used in Experiments 2 and 3 and replaced it with a Japanese-English verbal associative learning task similar to the Swahili-English verbal associative task. Considering how memory formation and persistence are susceptible to interference occurring pre-and post-encoding(37-39) and are heavily influenced by commonality amongst the learned and intervening stimuli(40); it is believed that conducting two consecutive, like-minded word-association (i.e., Swahili-English and Japanese-English) tasks will result in one’s consolidation process interfering with that of the other(41). Considering how our previous experiments suggest the effect obtained by NITESGON improves the consolidation of information via behavioral tagging, it is possible that NITESGON on the first task might help reduce the overall interference effect on the second task.”

      Additionally, we explained in further detail that comparing the percentage of correctly recalled word pairs on the second task 7-days after learning from the percentage of correctly recalled word pairs on the first task 7-days after learning was done to measure for an interference effect.

      Please see the adapted text from the Experiment 4 subsection on page 11 below:

      “Upon assessment for a potential interference effect, the active group displayed no significant difference in how many words participants were able to recall between the first and the second task (difference: .76 ± 4.93) (F = .29, p = .60), whereas the sham group demonstrated the first task rendered an interference effect on the second task (difference: 5.16 ± 5.99) (F = 14.11, p = .001).”

      Lastly, the text in the Methods section describing how the interference effect was calculated was changed. The newly edited text better explains that the percentages of word pairs recalled on the two tasks were subtracted from one another to measure the interference one task may have had on the other.

      Please see the amended text in the Methods section on page 38 below:

      “In addition, an interference effect was calculated by subtracting the percentage of correctly recalled word pairs on the second task 7-days after learning from the percentage of correctly recalled word pairs on the first task 7-days after learning. This number gave a proxy of interference.”
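
      The subtraction described in the amended Methods text can be sketched as follows. This is only a minimal illustration and not the authors' analysis code; the function name `interference_effect` and the example percentages are hypothetical.

      ```python
      def interference_effect(first_task_pct: float, second_task_pct: float) -> float:
          """Proxy of interference: recall on the first task minus recall on the
          second task, both as percentages of correctly recalled word pairs
          7 days after learning."""
          return first_task_pct - second_task_pct


      # Hypothetical percentages for one participant (not data from the study).
      print(interference_effect(80.0, 75.0))
      ```

      Under this sign convention, a positive value means fewer word pairs were recalled on the second task than on the first.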

      4) In Experiment 5 the learning and test phases for the two sleep groups were conducted at different times of day (sleep group: training at 8pm and testing the next morning at 8am, sleep deprivation group: training at 8am and testing at 8pm) which introduces the possibility of circadian effects between the two groups. Additionally, the memory test occurred at the 12h point for this experiment instead of the 7-day point. Therefore, the authors' conclusions are not addressed by this experiment, and it remains unclear whether the 7-day long-term memory effects of NITESGON are sleep-dependent.

      Upon reading your comment and reviewing the paper, we have decided to add a limitations paragraph to the paper which highlights the two sleep groups being conducted at different times of day and the memory test occurring at the 12-hour point as opposed to 7-days after initial learning. In addition to acknowledging these limitations, we have also provided explanations regarding what potential effects are introduced by having the sleep groups learn and test at different times of day, such as circadian effects between the two groups, and the memory tests occurring at 12-hours rather than 7-days after initial learning.

      Please see the new addition from the Discussion section on page 21 below:

      “Additionally, in Experiment 5, the learning and test phases for the two groups were conducted at different times of day (i.e., sleep group: training at 8 p.m. and testing at 8 a.m., sleep deprivation group: training at 8 a.m. and testing at 8 p.m.), thus introducing the potential for circadian effects between the two groups. Furthermore, the recall-only memory testing occurred at the 12-hour point rather than 7-days later, allowing us to conclude that the observed effect seen 12-hours later was not affected by sleep; however, it remains unclear whether the 7-day long-term memory effects of NITESGON are sleep-dependent.”

      Weaknesses (minor)

      1) Salivary amylase is being used as a proxy of noradrenergic activity; however, salivary amylase levels increase with stress as well, which impacts memory performance. It would be helpful if the authors addressed this and whether they measured other physiological indicators of stress/sympathetic nervous system activation.

      Upon review of your comment, we have edited the paper so that the Discussion section notes that stress can increase salivary amylase and advises readers to consider this when interpreting the results. We also added a measure of pupil size, a well-known measure of sympathetic activity. In addition, we added a VAS score to ask participants about their stress levels.

      Please see the new addition from page 22 below:

      “Although the use of indirect proxy measures, such as sAA for NA activity and sEBR for DA activity, enabled the tracking of LC-NA activity changes from baseline measurements and demonstrated the potential of an LC-DA relationship, caution must be advised when interpreting results considering these proxy measures are affiliated with limitations, such as being substantially variable, as well as the potential of other brain regions and monoamine neurotransmitters being associated with changes seen in sAA concentration levels(80), an enzyme that is provoked by both central parasympathetic and sympathetic nervous system activation, including acute stress responses(81). Additionally, although sEBR has been increasingly linked to DA, it has been defined as a more viable measure of striatal DA activity(52, 82). At the same time, some evidence suggests that sEBR and DA levels may be unrelated(83, 84), thus requiring further validation as a behavioral proxy measure.”

      2) Insufficient details of how the blinding experiment was conducted make it difficult to determine whether participants had awareness or subjective responses during the NITESGON stimulation. Adding physiological indicators of heart rate, skin conductance, and respiration would provide a better indicator of a sympathetic nervous system response. Additionally, a series of randomized stimulation and sham trials delivered to the participant would provide a more objective measure of the detectability of the stimulation.

      Thank you for your comment regarding the portion of the experiments that were included to determine the efficacy of the measures taken to ensure the experiments were well blinded. After reviewing the comment and reading over the paper, we were concerned that it was not clear enough to the reader that the efficacy of blinding was determined by having each participant of every experiment complete the same single-answer questionnaire after all NITESGON and testing had been experienced. Therefore, we edited the wording below to elucidate that there was not an individual blinding experiment but that there was a questionnaire for every participant in every experiment to help determine the efficacy of blinding for each experiment and the research.

      Please see the text from the Blinding section on pages 17-18 below:

      “Blinding. To determine if the stimulation was well blinded, all participants in Experiments 1-7 were asked to guess if they thought they were placed in the active or control group (i.e., what stimulation participants received compared to what participants expected). Our findings demonstrated that participants could not accurately determine if they were assigned to the active or sham NITESGON group in each experiment, suggesting that our sham protocol is reliable and well-blinded (see fig. 8).”

      Additionally, please see the text in the Methods section that has been reworded to clarify how the questionnaire of blinding was conducted on page 47 below:

      “Blinding: To determine if the stimulation for all experiments was well blinded, all participants who participated in Experiments 1-7 were asked to complete a single-response questionnaire after the conclusion of the NITESGON procedure. Here, participants were asked to guess if they thought they were placed in the active or control group. A χ2 analysis was used to determine if there was a difference between what stimulation participants received compared to what participants expected.”
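
      A χ2 analysis of this kind compares the stimulation each participant received with the group they guessed. The sketch below shows one way such a test could be computed for a 2x2 table. It is an assumption-laden illustration rather than the authors' procedure: the function name `chi_square_2x2` is hypothetical, the counts are invented, no continuity correction is applied, and all row and column totals are assumed to be non-zero.

      ```python
      import math


      def chi_square_2x2(table):
          """Pearson chi-square test of independence for a 2x2 table of counts.

          Rows: stimulation received (active, sham).
          Columns: participant's guess (active, sham).
          Returns (statistic, p_value) for 1 degree of freedom.
          """
          (a, b), (c, d) = table
          n = a + b + c + d
          row_totals = (a + b, c + d)
          col_totals = (a + c, b + d)
          statistic = 0.0
          for i, row in enumerate(table):
              for j, observed in enumerate(row):
                  # Expected count if guess and assignment were independent.
                  expected = row_totals[i] * col_totals[j] / n
                  statistic += (observed - expected) ** 2 / expected
          # With 1 degree of freedom, the chi-square survival function
          # equals erfc(sqrt(x / 2)).
          p_value = math.erfc(math.sqrt(statistic / 2.0))
          return statistic, p_value


      # Invented counts (not data from the study).
      statistic, p_value = chi_square_2x2([[12, 13], [11, 14]])
      print(statistic, p_value)
      ```

      A large p-value would be consistent with participants being unable to tell active from sham stimulation, although a non-significant test alone does not prove that blinding succeeded.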

      3) It would be appreciated if the authors could speak to the possible role of the amygdala in the memory-enhancing effects of NITESGON, as this region is a well-known modulator of many types of memory consolidation and is implicated in noradrenergic-related memory enhancement.

      Upon consideration of your comment, we added text providing the reader with insight into how NITESGON has activated the amygdala in previous research, similar to the VTA in the current study, and how the LC and amygdala were shown to be activated during emotionally arousing stimuli in another study. Furthermore, we have acknowledged that the amygdala is understood to have modulatory implications in long term memory and how future investigations are needed to establish the amygdala’s role with NITESGON.

      Please see the text from the Discussion section on page 20 below:

      “Additionally, it is well-known that the amygdala is not the final place of memory storage, but rather has major modulatory influences on the strength of a memory(74). Similar to the VTA in the current study, prior research has shown that the amygdala is activated during NITESGON but ceased post-stimulation; however, NITESGON was not accompanied by a task during the experiment(14). Moreover, a recent fMRI study spotlights the dynamic behavior of the LC during arousal-related memory processing stages whereby emotionally arousing stimuli triggered engagement from the LC and the amygdala during encoding; however, during consolidation and recollection stages, activity shifted to more hippocampal involvement(75). Considering the impact the VTA and amygdala can have on memory, future experimental investigations are needed to establish their role in the memory-enhancing effects of NITESGON.”

    2. eLife assessment

      This paper will be of fundamental interest to many sub-disciplines of neuroscience, ranging from cognitive neuroscientists to cellular neuroscience. It provides compelling and substantial brain and behavioral evidence of a novel intervention that can boost long-term memory. The key claims of the manuscript are generally well supported by the data, though the correlational nature of the data in different types of experiments raises some issues about interpretation.

    3. Reviewer #1 (Public Review):

      Luckey et al. used a sophisticated, multimodal approach to test the hypothesis that engaging LC-hippocampal pathways promotes behavioral tagging processes in humans. To activate this mechanism in a causal manner, they apply transcutaneous electrical stimulation of the greater occipital nerve (NITESGON), a relatively novel and non-invasive technique for stimulating brainstem pathways linked to arousal-related neuromodulation. To test the behavioral tagging hypothesis, they use a variety of indirect methods, including pharmacology, EEG, fMRI, saliva assays, and eye-tracking to measure LC-related activity, hippocampal activity/connectivity, and potential dopamine states/release. At the behavioral level, they demonstrate that NITESGON stimulation during or after learning benefits long-term but not immediate associative memory. These long-term memory improvements were related to increased gamma power in the MTL. In another set of experiments, they show that NITESGON during associative learning promotes associative learning on a subsequent unrelated (object-location) or highly overlapping (paired word associates) task. Consistent with prior VNS and other NITESGON studies, they show robust evidence that this intervention leads to significant increases in salivary alpha-amylase, a putative marker of central noradrenergic activity. This increase in sAA was also correlated with long-term associative memory across several experiments using paired word associates. Using fMRI, they demonstrate resting-state increases in local hippocampal, LC, and VTA low-frequency fluctuations as well as increased rs-FC between the LC and hippocampus during and after stimulation. Finally, they show that NITESGON does not enhance long-term associative memory in individuals taking a dopamine antagonist medication, implicating a potential dopamine mechanism in these stimulation-induced memory effects.

      This paper is impressive in scope and takes advantage of both causal and indirect methods to cross-validate their results. Behavioral tagging is a relatively nascent area of research in humans, and this paper provides compelling evidence for the role of noradrenergic activity (whether related to behavioral tagging or more general arousal-related consolidation processes) in facilitating memory encoding and consolidation. Beyond basic science research, these findings also have important clinical implications. In recent years, there has been intense interest in studying the LC's role in promoting healthy cognitive function and its involvement in AD-related neuropathology. The LC is one of the earliest sites of tau pathology and thereby represents an important target for clinical intervention in early AD. The current study advances our understanding of a non-invasive technique that may be used to bolster learning in both healthy populations and potentially in older individuals with AD.

      The key claims of the manuscript are generally well supported by the data. However, while the large number of studies is a significant virtue of this paper, it is also - at times - a potential weakness. There are many measures and pieces to this puzzle to assemble. While the multimodal approach is admirable and rigorous, the fit between some of these pieces is sometimes overstated. The correlational nature of the data helps cross-validate some of the predictions about the LC mechanisms involved in behavioral tagging. But the most compelling test of this hypothesis would be to link the LC/hipp/VTA fMRI data - arguably the most direct outcome measure in this study - to long-term memory performance and the other neurophysiological measures (e.g., sAA, blink rate, etc.). Many of the results are compelling but they are often observed in parallel studies. Thus, interpreting them as engaging a common mechanism is tenuous. This important shortcoming notwithstanding, there is still a strong replication in other findings (e.g., sAA-memory correlations) across experiments that lend support to some of the hypotheses.

      A related issue is that the reliability of these indirect measures of noradrenergic signaling and dopaminergic receptors, including salivary alpha-amylase and spontaneous eyeblink rate, is oversold. While this stimulation technique elicits parallel increases in many of the neurophysiological and behavioral measures, these patterns might not reflect the engagement of a shared underlying mechanism. It's an especially big stretch to interpret the eyeblink effects as relating to LC-DA, which cannot be verified using the current methods. In addition, the spatial resolution of the neuroimaging data is poorly suited for testing predictions about such a small brain structure. This represents a potential weakness of the paper, as the large smoothing kernel in the fMRI data may capture the contributions of other brainstem nuclei and regions activated by NITESGON. It is also worth noting that many of the individual differences findings are confounded by group clustering effects. That is, the between-group effects belie whether the same linear relationships exist in the sham and stimulation groups individually. This necessitates additional correlation analyses within groups to verify that stimulation doesn't decorrelate the relationship between physiological measures and performance.
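
      The group clustering concern can be illustrated with a small sketch: when two groups differ in their means on both measures, the pooled correlation can be strong even if no relationship exists within either group. The code below is a hypothetical illustration and not an analysis from the manuscript; the function names, group labels, and numbers are all invented.

      ```python
      import math


      def pearson_r(x, y):
          """Pearson correlation coefficient of two equal-length sequences."""
          n = len(x)
          mean_x = sum(x) / n
          mean_y = sum(y) / n
          sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
          sxx = sum((xi - mean_x) ** 2 for xi in x)
          syy = sum((yi - mean_y) ** 2 for yi in y)
          if sxx == 0 or syy == 0:
              raise ValueError("correlation is undefined for constant input")
          return sxy / math.sqrt(sxx * syy)


      def pooled_and_within_group_r(groups):
          """Return (pooled r, {group name: within-group r}).

          `groups` maps a group name to a pair (x values, y values).
          """
          pooled_x = [v for x, _ in groups.values() for v in x]
          pooled_y = [v for _, y in groups.values() for v in y]
          within = {name: pearson_r(x, y) for name, (x, y) in groups.items()}
          return pearson_r(pooled_x, pooled_y), within


      # Invented values: the groups differ in their means on both measures,
      # but x and y are uncorrelated inside each group.
      example = {
          "sham": ([0, 1, 0, 1], [0, 0, 1, 1]),
          "active": ([10, 11, 10, 11], [10, 10, 11, 11]),
      }
      pooled, within = pooled_and_within_group_r(example)
      print(pooled, within)
      ```

      In this constructed example the pooled coefficient is close to 1 while both within-group coefficients are 0, which is the pattern the within-group analyses are meant to rule out.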

      While the behavioral tagging predictions are intriguing and supported by some findings in the literature, they may not be entirely appropriate for this study. In short, I'm not fully convinced these data satisfy all assumptions of BT (see Dunsmoor et al., 2022 for an overview). Behavioral tagging is thought to be a process that stabilizes weak learning. While it's very difficult to operationalize the "strength" of a memory representation, I'm not sure if the current paired-associates paradigm yields weak learning. Participants have multiple opportunities to learn the memoranda, which casts some doubt as to whether these are weak memory representations. This possibility is supported by the generally high memory performance (~80% on average) during the immediate test and even accurate recall after 7 days.

      Behavioral tagging also does not make any explicit predictions about interference effects. Much of this theory centers upon the idea that arousing learning events lead to memory enhancements/benefits; but it does not speak directly as to whether these events confer protection from memory interference (and there was no baseline condition in Dunsmoor et al., 2015 to test any predictions regarding reduced retroactive interference for CS+ stimuli, for example). I find the protective effects of stimulation in Experiment 4 very interesting, and they speak to the importance of this technique as a memory intervention. However, I think this is an example of the authors relying too heavily on a behavioral tagging framework when these could simply reflect arousal-related (Nielson et al., 1996; 2014) and/or noradrenergic-related (e.g., McGaugh, 2013) consolidation benefits more broadly. In summary, I think it would strengthen the paper to walk back claims related to behavioral tagging specifically and address the possibility of alternative (but related) mechanisms.

      To summarize, the results of this study are very interesting and the project is very ambitious. There is much therapeutic potential for NITESGON to improve memory and this study represents an important advance towards achieving that goal. The work would primarily be improved by not relying on too many assumptions or inferences, and being more agnostic with respect to certain mechanisms (e.g., whether this is behavioral tagging or general consolidation mechanisms).

    4. Reviewer #2 (Public Review):

      Luckey et al. investigated the mechanisms by which non-invasive transcutaneous electrical stimulation of the greater occipital nerve (NITESGON) enhances long-term memory. They find that NITESGON applied during or after a word-association task enhances memory recall at a retrieval test 7 days later but not at an immediate test, suggesting NITESGON's memory-enhancing effect involves the consolidation process. They show that NITESGON applied during a second spatial memory task not only enhances later recall for that task, but also for an initial word-association memory task unpaired with stimulation administered before the second task. This highlights NITESGON's ability to retroactively strengthen memories and provides further evidence for behavioral tagging. Furthermore, the authors perform a series of in-depth experiments to examine the mechanisms by which NITESGON enhances memory consolidation. They show that NITESGON increases salivary α-amylase levels, a marker of endogenous noradrenergic activity, and spontaneous eye blink levels, a proxy for dopamine levels, both in support of locus coeruleus involvement. Resting-state fMRI results further suggest NITESGON induces increased communication between the locus coeruleus and hippocampus, suggesting a circuit-based mechanism by which NITESGON enhances memory consolidation. Interestingly, the data also indicate that NITESGON's memory-enhancing effect is not sleep-dependent but is dopamine-receptor-dependent.

      The conclusions of this paper are mostly well supported by the data, however, some of the key mechanistic findings lack the appropriate controls required for the authors' claims.

      Strengths

      1) The manuscript is written in an easy-to-read manner with clarity for each of the individual experiments conducted.

      2) The authors provide convincing evidence that NITESGON targets the memory consolidation process and enhances long-term but not short-term memory. This provides a unique non-invasive method for enhancing memory and has an important potential impact on neurocognitive disorders.

      3) The manuscript provides convincing evidence that NITESGON increases LC-hippocampus connectivity as well as MTL gamma power, providing a circuit-based mechanism by which stimulation enhances memory.

      Weaknesses (major)

      1) Adding control groups (sham stimulation) to Experiment 5 and Experiment 8 would be needed to increase confidence that NITESGON's memory-enhancing effects do not depend on sleep but do depend on dopamine receptor activity.

      2) Task order in the interference study in Experiment 4 was randomized during the first visit for task training as well as during the memory test, however, the word-association and spatial navigation tasks used in Experiments 3 and 4 were not counterbalanced during training or memory testing. Thus, the authors cannot rule out the possibility of order effects.

      3) It is unclear how Experiment 3 and Experiment 4 differ. Percent of words recalled is the measure of memory performance, however, there is not a clear measure of interference in Experiment 4 (i.e. words recalled during Memory task II that were from Memory task I).

      4) In Experiment 5 the learning and test phases for the two sleep groups were conducted at different times of day (sleep group: training at 8pm and testing the next morning at 8am, sleep deprivation group: training at 8am and testing at 8pm) which introduces the possibility of circadian effects between the two groups. Additionally, the memory test occurred at the 12h point for this experiment instead of the 7-day point. Therefore, the authors' conclusions are not addressed by this experiment, and it remains unclear whether the 7-day long-term memory effects of NITESGON are sleep-dependent.

      Weaknesses (minor)

      1) Salivary amylase is being used as a proxy of noradrenergic activity, however, salivary amylase levels increase with stress as well, which impacts memory performance. It would be helpful if the authors addressed this and whether they measured other physiological indicators of stress/sympathetic nervous system activation.

      2) Insufficient details of how the blinding experiment was conducted make it difficult to determine whether participants had awareness or subjective responses during the NITESGON stimulation. Adding physiological indicators of heart rate, skin conductance, and respiration would provide a better indicator of a sympathetic nervous system response. Additionally, a series of randomized stimulation and sham trials delivered to the participant would provide a more objective measure of the detectability of the stimulation.

      3) It would be appreciated if the authors could speak to the possible role of the amygdala in the memory-enhancing effects of NITESGON, as this region is a well-known modulator of many types of memory consolidation and is implicated in noradrenergic-related memory enhancement.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, Cover et al. examine the role of thalamic neurons of the rostral intralaminar nuclei (rILN) that project to the dorsal striatum (DS) in mice performing a reinforced action sequence task. Using patch-clamp electrophysiology, they find that neurons from the three rILN (CM, PC, and CL) have similar electrophysiological properties. Using fiber photometry recordings of calcium activity from rILN neurons that project to DS, they show that these neurons increase in activity at the first lever press and reward acquisition in mice performing a lever pressing operant task. They additionally demonstrate that this action initiation and reward-related activity exists more generally in mice performing other movements or rewarded tasks. Building on their lab's previous work, the authors further find that by optogenetically activating or inhibiting these rILN-DS neurons, mice will increase or decrease task performance, respectively. Lastly, the authors show that a variety of cortical and subcortical areas have input to rILN-DS neurons suggesting that these neurons might act as an integrator of signals from such areas during task performance.

      • The authors beautifully show that the electrophysiological properties of CM, PC, and CL neurons are similar and go on to treat the rILN as one homogenous nucleus for functional fiber photometry recordings and optogenetic stimulations. It seems that these recordings and stimulations were only performed in CL, as indicated in the images (Fig. 2A, 4A). Is this the case, or were CM, PC, and CL neurons sampled? It would be helpful to clarify if DS projecting neurons from all rILN nuclei show the reported action initiation and reward acquisition activity or only CL neurons.

      The arrangement of the rILN nuclei presents a technical challenge for experiments attempting to selectively record from or manipulate a single nucleus in this grouping. Based on our findings that the three nuclei do not differ in electrophysiological properties, we approached the in vivo experiments with the intent to target the rILN as a unit. As the reviewer points out, the medial-lateral coordinate for optic fiber placement tended to align above the CL and PC nuclei. However, variability in fiber placement and spread of light within tissue resulted in inclusion of CM activity as well. Given the spread of light through tissue (Shin, et al., 2016; PMID: 27895987), it would be very difficult to confidently determine from histology which photometry recordings were primarily obtained from CL vs PC vs CM neuronal activity. We agree with the reviewer that these three nuclei may signal differently during reward-driven behavior. Our di-synaptic tracing study supports this possibility as it revealed unique afferent connectivity to rILN-DS projecting neurons. We now mention this limitation of our approach in the discussion (lines 324 - 330).

      • Along similar lines, to what extent of rILN was targeted for optogenetic activation and inhibition? It seems that the authors implanted a total of 4 optic fibers, two on each side (please clarify in methods). What was the reasoning behind this? Please show that only rILN and not PF was activated/inhibited.

      We apologize for the confusion in our description of this method. For our optogenetic experiments, we infused viruses at four locations (bilateral striatum and rILN) and implanted only two fibers (bilateral rILN) to selectively target striatally-projecting rILN neurons. We have added clarification on this detail to the methods section.

      To prevent inadvertent modulation of Pf neurons, we used virus injection coordinates and volumes that prevented viral spread to the Pf and furthermore implanted the optic fibers in the more rostral regions of the rILN. We histologically confirmed viral expression and fiber placement for all mice and excluded any mice with viral spread to the Pf or off-target fiber placement. We include these criteria for post-hoc exclusion in the methods.

      • While AAV1 is becoming a popular tool for transsynaptic labeling, performing confirmatory patch-clamp recordings with optogenetic activation of inputs, would provide better evidence for the synaptic connection between upstream regions, such as ACC and OFC, and rILN neurons.

      We agree that electrophysiological confirmation of these inputs to the rILN would complement our tracing study. As our focus for this experiment was to specifically identify inputs that synapse on striatally-projecting rILN neurons, we interrogated putative afferents that were already established to project to the rILN. There are several studies that demonstrate the physiological circuits from some of these afferent projections to the rILN (without di-synaptic specificity), such as the SNr→rILN projection (Rizzi & Tan, 2019; PMID: 31091455).

      • In addition, the transsynaptic tracing experiments would benefit from showing the cell count quantifications in CM, PC, and CL. It seems that the authors have already performed this quantification for constructing their diagrams on the right. To make any point about the relative strength of afferent innervation to rILN-DS neurons showing such quantification would be necessary.

      Thank you for this suggestion. We now include cell counts for 2 cases per investigated afferent (Supplemental Table S2).

      • Why is the injection site for the retrograde cre-dependent tdTomato AAV (Fig. 5 middle left panels) showing expression? Is the cre coming through transsynaptic AAV1 from direct projections of each AAV1 injection site (AAV1 is not supposed to spread across a second synapse)? The diagrams suggest that not all regions (e.g. SUM or SC) have direct projections to DS.

      We apologize for this confusion. The tdTomato fluorophore expression observed in the striatum may arise from several possible circuit configurations. To survey just a couple: 1) tdTomato expression in the DS arises from direct projections from the afferent bypassing the thalamus (e.g. ipsilateral ACC→Striatum), which would result in labeled striatal somata (ACC pyramidal neurons delivering AAV1-cre to an MSN, and those local MSN collaterals retrogradely picking up rAAV-DIO-tdtomato) and ACC labeled axon terminals in the DS (ACC interneurons delivering AAV1-cre to DS-projecting ACC pyramidal neurons that pick up rAAV-DIO-tdtomato); 2) terminal projections arising from the labeled rILN neurons shown in the middle-right panels (i.e. ACC→rILN→Striatum).

      Reviewer #2 (Public Review):

      This manuscript details the role of the rILN to the DS pathway in the onset of operant behavior that promotes the delivery of a reward and in the ultimate acquisition of that reward. The strengths of the paper are in the detailed fiber photometry study that encompasses several behavioral domains that correlate to the signal observed in the rILN to DS pathway. I am especially interested in how the "encoding" shifts across time as the animals refine their behavior both in a temporal sense and in the magnitude of the signal. Further, the authors demonstrate then that this is dependent on action, as they do not observe signals in a Pavlovian behavioral task, but do observe reward-based signals in a "free consumption" task (the strawberry milk). The examination into devaluation also enhances the understanding of this pathway, even though there were no differences between a valued and devalued task. Finally, the authors examine bi-directional optogenetic manipulation of the pathway, and its impact on how the trials are completed, omitted, or incomplete. They find that manipulation alters the % completed trials and regulates trial omission. This paper really does not have any glaring weaknesses to point out, however, the physiological assessment does seem to have a few strong trends and even though the studies are well powered, and included both sexes, sex as a biological variable was not commented on that I could find. My estimation of the data doesn't suggest strong sex differences in any metric measured. Additionally, the data that included projections to the rILN were very interesting, and future studies looking into the physiology of these neurons, and/or how the physiology of these neurons adapt after operant training may be very interesting to understand plasticity within the adaptation across the training from FR1 to FR5 with time limits.

      Thank you for your review. We analyzed our data for sex differences but did not identify any significant differences between male and female subjects for any of the experiments.

    1. eLife assessment

      The authors present interesting information regarding the possibility of targeting the oncogenic K-Ras(G13C) mutant with nucleotide competitors. The experiments represent a solid support of the claims and show that this approach can work despite concerns about the high affinity of GTP and its high cellular concentration. These results will be of high interest for all working in the Ras field and in targeting oncogenes with small molecules. A weakness of the manuscript is the lack of direct physiological insights.

    2. Reviewer #1 (Public Review):

      Ras is the first discovered oncogene and KRAS is the most frequently mutated isoform. Recent studies led to the development of mutation-specific inhibitors, especially against the KRASG12C mutant. Unfortunately, patients treated with Adagrasib or other such inhibitors develop resistance due to further gain-of-function mutations and amplification of the KRASG12C allele, apart from mutations in the downstream signaling components. One of the oldest approaches to target small GTPases like RAS is to compete with the nucleotide binding of RAS, and this has long remained difficult owing to the picomolar affinity for GTP/GDP. Gray and colleagues in 2014 tried to overcome these issues by employing GDP derivatives that can undergo covalent reaction with disease-specific mutations, but Muller et al. reported in their previous work (Sci. Reports 2017) that the issue with these beta-modified derivatives was a loss of reversible affinity for RAS of at least 10,000-fold compared to GDP and GTP. Here the authors present novel GDP derivatives, different from those of Gray and colleagues, and demonstrate with a set of biochemical, structural and cellular assays that they can covalently lock KRASG13C, another important KRAS mutant, in an inactive form.

      However, the issue is a lack of evidence to demonstrate "target engagement" in cells and these derivatives need to be developed further as they cannot pass through cell membranes. The complete covalent modification of the compound is achieved at very high pH. Also, it is not clear if addition of edaGDP would disrupt KRASG13C and effector interaction directly.

    3. Reviewer #2 (Public Review):

      The authors have demonstrated a covalent strategy to target the oncogenic K-Ras(G13C) mutation, which is found in about 3,000 cancer patients in the US each year. G13C is a major contributor to G13 mutations, the next hotspot mutation after codon 12. Moreover, there is no approved therapy for G13 mutations and no published inhibitors of any KRAS G13 mutant proteins, making this a particularly important contribution to the rapidly expanding repertoire of RAS inhibitors. In striking contrast to G12 mutations, mutations occurring at codon 13 impair the picomolar nucleotide binding affinity of K-Ras. This weaker nucleotide affinity offered the authors the opportunity to develop a nucleotide-based inhibitor of a RAS protein. Exploiting the high nucleophilicity of the cysteine introduced by the G13C mutation, the authors set out to target this mutant oncogene.

      The authors developed several covalent molecules derived from GDP/GTP, the natural substrate of K-Ras's nucleotide binding pocket, interestingly, not through the oligophosphate chain (explored by Gray and co-workers in an earlier report) but the 2',3'-diol of the ribose. This turned out to be a judicious choice for targeting G13C because of its closer proximity to the 2',3' positions than to the phosphates. Previous work by Gray et al. used the phosphate attachment point for the electrophile but this compromised binding affinity overall, whereas the relatively tolerant modifications at 2',3' led to higher affinity electrophilic ligands. This change led to much tighter binders and effective covalent modifiers through C13. With two co-crystal structures resolved, the authors unambiguously showed the covalent cross-linking between artificial G-nucleotides and K-Ras(G13C).

      It is not surprising that one of the major limitations of these GDP-based competitive ligands is poor permeability. The GDP and GTP analogs made in this study were not permeable through the plasma membrane. The authors nicely worked around these limitations by delivering the fully modified proteins to the cells and measuring cell signaling effects. Through electroporation, the authors demonstrated that the covalent adduct is able to inhibit downstream signaling by comparing the introduction of K-Ras WT, K-Ras(G13C), or the K-Ras(G13C) covalent adduct.

      A number of very intriguing aspects of the covalent adduct were noted which should guide others in the field, including that the adduct with eda-GTP could get hydrolysed to eda-GDP after the covalent modification of the protein--furthermore GAP stimulation of this adduct still occurred. By use of a non-hydrolyzable form of GTP (CP) this could be prevented and could be a very useful method for preventing hydrolysis after introduction in cells--an application Goody and coworkers applied to a previous covalent base adduct.

      Overall, the manuscript addresses an important problem relating to whether covalent small molecules can engage K-Ras(G13C) and provided two timely co-crystal structures for future research and development.

    4. Reviewer #3 (Public Review):

      Ras mutations are found in almost 25 percent of cancer patients. It has been difficult to directly target Ras proteins due to the lack of druggable pockets on the surface of the protein and the extremely high binding affinity of nucleotides to Ras proteins. Recently a mutant specific irreversible drug that targets the mutation G12C has been FDA approved. This drug binds to a shallow pocket on the surface of Ras and attacks the G12C mutation irreversibly. Another approach is to compete with the nucleotides bound to Ras. An attempt to generate nucleotide competitors that can take advantage of the G12C mutant has been proposed. Nevertheless, these published competitors had much lower affinities compared to endogenous nucleotides which would hinder the covalent modification in the presence of other nucleotides.

      To overcome this, the authors propose to introduce a warhead in the ribose ring. Indeed, this modification did not affect the reversible binding affinity of these nucleotides to Ras wild type, in comparison to GDP and GTP. This finding represents a new opportunity to target G13C Ras by competing with the nucleotides in cells. The authors support their claims with the appropriate in vitro experiments. Nevertheless, these experiments were performed at non-physiologically high pH (9.5) and those compounds were not able to cross the cellular membrane. Thus, it is too early to draw conclusions regarding the appropriateness of the approach and whether it will prove successful in cells or if it will have medical application.

    1. Author Response

      Reviewer #2 (Public Review):

      Wu Yang et al. investigated how exophers (large vesicles released from neuronal somas) are degraded. They find that the hypodermal skin cells surrounding the neuron break up the exophers into smaller vesicles that are eventually phagocytosed. The neuronal exophers accumulate early phagosomal markers such as F-actin and PIP2, and blocking actin assembly suppressed the formation of smaller vesicles and the clearance of neuronal exophers. They show the smaller vesicles are labeled with various markers for maturing phagosomes, and inhibiting phagosome maturation blocked the breakdown of exophers into smaller vesicles. Interestingly, they discover that GTPase ARF-6, effector SEC-10/Exocyst, and the phagocytic receptor CED-1 in the hypodermis are required for efficient production of exophers by neurons.

      Strength

      The study clearly demonstrates that exophers are eliminated via hypodermal cell-mediated phagocytosis. Exophers are broken down into smaller vesicles that accumulate phagocytic markers, and inhibiting this process shows that exophers are not resolved. The paper does a thorough examination of various markers and mutants to demonstrate this process.

      The hypodermal cells not only engulf these small vesicles, but they also play a role in the formation of exophers. Exopher production is reduced when ARF-6, SEC-10, or CED-1 are knocked down in the hypodermis. This is intriguing because phagocytosis is a critical step in the final elimination of cells, but in this unique situation, it appears that the neuron fails to extrude the exopher without phagocytes.

      Weakness

      Non-professional phagocytes engulfing cell corpses and many other types of cellular debris (e.g. degenerating axons) have been shown in multiple systems and the observations here are not surprising. Many of the markers used in the study are well-established phagocytic markers and do not bring forward a new technological advance.

      What's interesting is that the breakdown of exophers into smaller vesicles and eventual clearance follows a different sequence of events than macrophages. Exophers appear to undergo phagosomal fission before interacting with lysosomes. This would be difficult to appreciate by a general reader.

      While the paper has strengths, it appears that the message is not clear. The title suggests that the reader will learn about how ARF-6 and CED-1 control exopher extrusion. Although this observation is intriguing and maybe the main point of the paper, there does not appear to be a substantial amount of data to support this claim. The only data to back this up is in the final figure and the majority of the paper is focused on how hypodermal cells phagocytose exophers.

      The title has been revised.

      To show exopher secretion is dependent on the hypodermal cells-

      1) Could authors induce exopher production through other means? And test any involvement of CED-1? For example, authors note exopher production increases under stress conditions including expression of mutant Huntingtin protein. It would be intriguing if loss of CED-1 would be sufficient to block or reduce exopher production in that context and would highlight an exciting role for phagocytic cell types.

      We interpreted this question as an inquiry into whether the neuron intrinsic exopher inducer was relevant to reliance on hypodermal interaction for exophergenesis, given our use of aggregating mCherry as the inducer. Unfortunately, our Huntingtin expressor lines now display high levels of transgene silencing, precluding their use in this experiment. To address this concern, we switched to a low toxicity GFP expressing transgene from the Chalfie lab, uIs31[Pmec17::GFP]. We found that arf-6 mutations suppressed exophers in this background as effectively as they did in previous mCherry experiments, indicating that our results are not dependent upon the particular transgene marking the touch neurons, or the specific protein they express (Fig 6E).

      2) It is not clear if the CED-1 localization to the exopher is due to CED-1 expression during phagocytosis or is it involved in the extrusion. Perhaps the basal level of CED-1 is important for the extrusion but the strong expression is important for recognition of the exopher.

      In the experiments we performed we used a constitutively expressed hypodermis-specific CED-1::GFP to show localization to exophers, so the recruitment of CED-1::GFP in hypodermal membranes to the site where the neighboring neuron is producing an exopher is not caused by changes in expression, but rather is more likely to reflect protein recruitment. We now point this out more explicitly in the text. Added text: “Since the hypodermal CED-1DC::GFP we used is constitutively expressed, we attribute the exopher surrounding CED-1DC::GFP signal to CED-1 recruitment by exopher-surface signals.”

      3) While the data with ttr-52 and anoh-1 alleles is compelling, do we know that exophers actually expose PS? Especially since at a certain point, the exopher is still attached to the neuronal soma. Is PS still exposed by exopher in CED-1 background?

      We are also very interested in this. Unfortunately, we have had difficulty obtaining sufficient MFGE8 PS-biosensor expression in the adult to test this question directly.

      4) What is the fate of a neuron that is unable to produce exophers? Could one look at lifespan of ALMR neuron in CED-1, ARF-6 or Sec-10 allele (potentially with specificity to hypodermis)?

      To address this question we measured the function of the mechanosensory touch neurons, using the classic gentle touch response assay in mCherry expressing animals, comparing controls to arf-6 and ced-1 mutants. For both arf-6 and ced-1 alleles, we found reduced response to gentle touch in older adults (Ad10), indicating a deficit in neuronal function. These results are consistent with exopher production maintaining neuronal health into old age, but interpretation is limited since neither ced-1 nor arf-6 acts specifically in exophergenesis and therefore both also affect the animals in additional ways. Currently, there are no known genetic perturbations that act specifically in exophergenesis, so there is no better approach to do the analysis. We had already published similar results in our 2017 Nature paper that first described exophers, showing that gentle touch response is better preserved in a touch neuron HttQ128::CFP strain that produced a touch neuron exopher than in the same mutant background in which the touch neurons had not produced an exopher.

    2. eLife assessment

      This manuscript will be of interest to a wide range of cell biologists interested in understanding cell-cell communication. The discovery that an engulfing cell can control the extrusion and degradation of large vesicles from its target cell is important and intriguing. The authors present compelling data that show that exophers (large neuronal extrusions proposed to discard toxic cargo) are taken up by adjacent hypodermal cells, split into smaller fragments, and eventually degraded by lysosome fusion. The authors identify a number of small GTPases and accessory components, as well as the phagocytic receptor (CED-1) and the likely eat-me signal (phosphatidylserine).

    3. Reviewer #1 (Public Review):

      In this manuscript, Wang et al provide a pathway required for the production and degradation of exophers - large neuronal extrusions proposed to discard toxic cargo. Exophers were fairly recently described by this group and have now been observed in mammalian neurons, suggesting a broad importance in neuronal health. How exophers were disposed of by surrounding tissues was not known. Here, the authors identify a pathway required for exopher degradation into small debris (starry night), and intriguingly, genes proposed to be required in the degrading cells (hypodermis) for exopher production in neurons.

      Strengths of the manuscript include significant new insights into a problem that had not been investigated in mechanistic detail, and the combined use of genetics and cell biology to sort genes into pathways involved in exopher production and degradation. Several differences are found between exopher and cell corpse disposal, highlighting the importance of the study. The findings should be of interest to a broad audience.

    4. Reviewer #2 (Public Review):

      Wu Yang et al. investigated how exophers (large vesicles released from neuronal somas) are degraded. They find that the hypodermal skin cells surrounding the neuron break up the exophers into smaller vesicles that are eventually phagocytosed. The neuronal exophers accumulate early phagosomal markers such as F-actin and PIP2, and blocking actin assembly suppressed the formation of smaller vesicles and the clearance of neuronal exophers. They show the smaller vesicles are labeled with various markers for maturing phagosomes, and inhibiting phagosome maturation blocked the breakdown of exophers into smaller vesicles. Interestingly, they discover that GTPase ARF-6, effector SEC-10/Exocyst, and the phagocytic receptor CED-1 in the hypodermis are required for efficient production of exophers by neurons.

      Strength

      The study clearly demonstrates that exophers are eliminated via hypodermal cell-mediated phagocytosis. Exophers are broken down into smaller vesicles that accumulate phagocytic markers, and inhibiting this process shows that exophers are not resolved. The paper does a thorough examination of various markers and mutants to demonstrate this process.

      The hypodermal cells not only engulf these small vesicles, but they also play a role in the formation of exophers. Exopher production is reduced when ARF-6, SEC-10, or CED-1 are knocked down in the hypodermis. This is intriguing because phagocytosis is a critical step in the final elimination of cells, but in this unique situation, it appears that the neuron fails to extrude the exopher without phagocytes.

      Weakness

      Non-professional phagocytes engulfing cell corpses and many other types of cellular debris (e.g. degenerating axons) have been shown in multiple systems and the observations here are not surprising. Many of the markers used in the study are well-established phagocytic markers and do not bring forward a new technological advance.

      What's interesting is that the breakdown of exophers into smaller vesicles and eventual clearance follows a different sequence of events than macrophages. Exophers appear to undergo phagosomal fission before interacting with lysosomes. This would be difficult to appreciate by a general reader.

      While the paper has strengths, it appears that the message is not clear. The title suggests that the reader will learn about how ARF-6 and CED-1 control exopher extrusion. Although this observation is intriguing and maybe the main point of the paper, there does not appear to be a substantial amount of data to support this claim. The only data to back this up is in the final figure and the majority of the paper is focused on how hypodermal cells phagocytose exophers.

      To show exopher secretion is dependent on the hypodermal cells-

      1. Could authors induce exopher production through other means? And test any involvement of CED-1? For example, authors note exopher production increases under stress conditions including expression of mutant Huntingtin protein. It would be intriguing if loss of CED-1 would be sufficient to block or reduce exopher production in that context and would highlight an exciting role for phagocytic cell types.

      2. It is not clear if the CED-1 localization to the exopher is due to CED-1 expression during phagocytosis or is it involved in the extrusion. Perhaps the basal level of CED-1 is important for the extrusion but the strong expression is important for recognition of the exopher.

      3. While the data with ttr-52 and anoh-1 alleles is compelling, do we know that exophers actually expose PS? Especially since at a certain point, the exopher is still attached to the neuronal soma. Is PS still exposed by exopher in CED-1 background?

      4. What is the fate of a neuron that is unable to produce exophers? Could one look at lifespan of ALMR neuron in CED-1, ARF-6 or Sec-10 allele (potentially with specificity to hypodermis)?

    5. Reviewer #3 (Public Review):

      In this paper, the authors examine the fate of exophers ejected from C. elegans neurons overexpressing a presumably aggregated mCherry protein. They show that exophers are taken up by adjacent hypodermal cells, split into smaller fragments, and eventually degraded by lysosome fusion. They identify a number of small GTPases and accessory components, as well as the phagocytic receptor (CED-1) and the likely eat-me signal (phosphatidylserine).

      The manuscript follows up on previous exopher work from some members of the current collaboration, and provides a detailed analysis of exopher fate, that will likely be useful for understanding similar events in other settings. The studies are well done, the images and data are convincing, and the interpretations are generally appropriate.

    1. eLife assessment

      The manuscript proposes a mechanism by which different S-adenosylmethionine (SAM) synthase enzymes exhibit specificity towards target sequences, thereby proposing a novel layer of control over H3K4 trimethylation (H3K4me3). Such specificity is demonstrated in the context of responses to heat stress for two Caenorhabditis elegans SAM synthase enzymes, supporting the existence and importance of this novel mechanism of epigenetic control.

    2. Reviewer #1 (Public Review):

      This study investigated the roles of sams-1 and sams-4, two enzymes that generate the major methyl donor SAM, in heat stress response and the associated molecular changes. The authors provided evidence that loss of sams-1 resulted in enhanced resistance to heat stress, whereas loss of sams-4 resulted in heightened sensitivity to heat stress. The authors further showed that whereas the basal level of the histone modification H3K4me3 in intestinal nuclei was substantially reduced in sams-1 loss-of-function mutants, H3K4me3 level greatly increased upon heat stress, and this increase depended on sams-4. Additional RNA-seq results revealed largely distinct heat stress-induced RNA expression changes in the sams-1 mutant and sams-4 knockdown worms. The authors further profiled genomic locations of H3K4me3 in sams-1 mutant and sams-4 knockdown worms. Unfortunately, the lack of sufficient technical detail made it difficult to evaluate the H3K4me3 profiling data.

      The paper provided several conceptual advances:

      - Uncovering interesting and opposing heat stress phenotype associated with the loss of two related SAM synthases. Thus, even though both SAMS-1 and SAMS-4 produce SAM, the source of SAM production appears to have distinct consequences on the organismal heat stress response.
      - Demonstration that SAMS-4 appeared able to compensate for the loss of SAMS-1 upon heat shock, resulting in restoration of the histone mark H3K4me3 in intestinal cells.
      - Revealing largely different gene expression changes upon heat shock in animals lacking sams-1 or sams-4. Thus, the gene expression profiles corroborated the differential heat stress response.

      This paper describes one of the first adaptations of CUT&TAG in C. elegans, which can be of high impact on the field. Unfortunately, the lack of experimental detail made it difficult to evaluate the quality of the CUT&TAG data and the consequent interpretations.

      Overall, the paper reported a number of interesting findings that will be of substantial interest to the field. However, the paper in its current form has substantial shortcomings, particularly related to the difficulty in evaluating the validity of H3K4me3 profiling data. The paper would also benefit from further discussion that attempts to reconcile some of the inconsistent results.

    3. Reviewer #2 (Public Review):

      In this manuscript titled "S-adenosylmethionine synthases specify distinct H3K4me3 populations and gene expression patterns during heat stress", the authors Godbole et al investigated how C. elegans SAM synthases, SAMS-1 and SAMS-4, affected gene expression, H3K4 trimethylation (H3K4me3), and the survival under heat stress. They found in this study that SAMS-4 was required for survival during heat shock. They reasoned that SAM supplied by SAMS-4 but not SAMS-1 might be responsible for generating H3K4me3 under heat shock and claimed that the two SAM synthases differentially affected histone methylation and thus gene expression in the heat shock response. This study suggested a stress-responsive mechanism by which the specific isozyme of SAM synthetase provided a specific pool of cellular SAM for H3K4me3. Overall, this study is interesting but descriptive. Lacking necessary controls and mechanistic details weakened the significance of this work.

      Strengths: Very interesting survival phenotypes in the loss of different SAM synthetases; technical success in CUT&tag in C. elegans.

      Weaknesses: No clear conclusion can be drawn about whether and how SAM synthetases affect H3K4me3.

    4. Reviewer #3 (Public Review):

      The manuscript "S-adenosylmethionine synthases specify distinct H3K4me3 populations and gene expression patterns during heat stress" by Godbole et al proposes a novel mechanism by which different S-adenosylmethionine (SAM) synthase enzymes exhibit specificity towards target sequences, thereby providing a layer of control over H3K4 trimethylation (H3K4me3) in Caenorhabditis elegans. The authors detail an extensive investigation of the function of two C. elegans SAM synthase enzymes, SAMS-1 and SAMS-4. They provide evidence that mutation or knockdown of these two enzymes affected gene expression of distinct gene sets and that loss of these enzymes has opposite effects on survival under heat stress. These differential effects are linked to differential effects on histone modification H3K4me3 of specific target gene sets. It is unclear from this work how exactly this specificity may be achieved and some of the data regarding the role of other components of the methylation machinery are somewhat superficial and confusing. Nevertheless, the study suggested a novel mechanism by which H3K4me3 of specific gene sets may be controlled and this mechanism is novel and potentially important.

    1. Author Response

      Reviewer 2 (Public Review):

      The authors’ coarse-grained mathematical model is based upon proteome partitioning constraints. Similar models have been developed in the past, although the authors do an excellent job distinguishing their work. The interdependence among growth rate, growth yield, and carbon transport (together with the comparatively few state variables) makes the proposed model an attractive general framework for predictive metabolic engineering and strain optimization in bio-manufacturing.

      Strengths:

      1) The recognition that the constant biomass concentration (1/beta) can be used to recast the growth-rate versus growth yield trade-off in terms of a growth rate versus carbon uptake trade-off (lines 147-155, Eq. 2), and coupling of the growth- and carbon uptake-rates through proteome partitioning, are powerful ideas. They transform the traditional (false) dichotomy of a negative correlation between growth and yield into a feasible space of growth-yield combinations (e.g. Figs 2BC).

      2) The authors calibrate the model for E. coli (BW25113) grown in glycerol/glucose, batch/continuous culture (lines 157-164), then apply the model to an impressive variety of E. coli strains. This is not typically done with semi-mechanistic models and elevates the authors’ approach by implying that their model is sufficiently-general so as to apply across strains, yet sufficiently-constrained so as to provide quantitative predictions.

      Weaknesses:

      1) The tension between generality and constraint leads to some category errors where strain-specific empirical invariants are taken as general strain-independent operating conditions. This happens at least twice: a minor case involving the growth-rate threshold for acetate overflow, and a serious case where the magnitude of the ’housekeeping’ proteome fraction φq is taken to be strain- and condition-independent.

      a) (lines 82-86) The growth-rate threshold for the acetate overflow switch in E. coli was observed in ’studies with a single strain in different conditions’ [i.e. different carbon sources in batch]. The interpretation provided in the references cited (lines 83-84) is that the threshold is a manifestation of a tipping point between carbon uptake rate and the costs of energy generation. The carbon uptake rate is implicitly strain-dependent; there is no reasonable expectation that all strains growing in glucose will be fermenting (or all respiring). The conclusion (line 84) that ’the model predicted no correlation between growth rate and acetate secretion rate in the case of different strains growing in the same environment’ is tautological when the carbon uptake rate (vmc) is used by the authors to distinguish among strains. This error is easily fixed by simply changing the wording, but it serves to illustrate how constraints operating at the strain level can be tacitly (and erroneously) applied at the genus level.

      The emphasis we put on the comparison between batch growth on glucose of different strains vs batch growth in different environments of a single strain may have been misleading. The point we wanted to make was that the occurrence of fermentation (acetate overflow) during fast growth on glucose is not a necessary consequence of intrinsic physical constraints on metabolism, but the consequence of strain-specific regulatory mechanisms. This is demonstrated by the existence of E. coli strains that do not ferment while growing on glucose, but that have essentially the same metabolic capacities as strains that do. When we started this study, we did expect (perhaps naively) that growth on glucose at a high rate necessarily comes with low yield due to the higher relative acetate overflow, that is, the ratio of the acetate secretion and glucose uptake rates (Supplementary Figure 4 in the revised manuscript).

      In the new version of the manuscript, we have modified the analysis of the glucose uptake and acetate secretion data, by plotting them against growth rate and growth yield in separate 2D plots, as suggested by Reviewer 1. This has led to a perspective that is more in line with the comment of this reviewer that the model explores different ways in which a carbon uptake rate can be converted into a growth rate, depending on the selected resource allocation strategy, and that this gives rise to trade-offs between growth rate and growth yield. In the context of this analysis, we do come back to the original point we wanted to make, but phrased differently (and hopefully more clearly this time).

      Changes in manuscript: The comparison between batch growth on glucose of different strains and batch growth on different carbon sources of a single strain is less emphasized. We have rewritten the section and rephrased our claims accordingly throughout the paper (notably in the Abstract, Introduction, and Discussion).

      b) The second example of this strain-genus confusion is more serious, and perhaps is enough to unravel the model. One of the strengths of the current framework is that although there are four degrees of freedom via the proteome allocation parameters, the model is sufficiently-constrained that the behavior can be meaningfully projected onto lower-dimensional observables like growth rate and yield (e.g. Figs 2BC).

      One of the main constraints in the model that allows this meaningful projection is the assumption that the fraction of ’housekeeping’ proteins φq is constant irrespective of strain and growth conditions (line 172) and that these proteins carry flux synthesizing non-protein macromolecules (lines 141-142). Neither of these claims is supported by the references provided.

      The ’housekeeping’ fraction φq was inferred in Scott et al. 2010 (line 172) from a nearly-growth-medium-independent maximum in the RNA/protein ratio under translation limitation of strain MG1655. The magnitude of that intercept is highly strain-dependent and can vary nearly 2-fold, especially in ALE strains. Furthermore, subsequent proteomic data (e.g. Hui et al. 2015 cited by the authors) have clarified that this ’housekeeping’ fraction is, for the most part, composed of growth-rate independent offsets in the metabolic proteins.

      The origin of these offsets is thought to be related to substrate-saturation (Eqs. 1 and 2 of Dourado et al. 2021 cited by the authors) and consequently, these offsets (and by extension most of φq) carry no flux. Substrate saturation is perhaps at the root of the discrepancy in the Fig. 4 fits that necessitates adjustment of the catalytic constants (line 338). It is not correct to say that ’external substrate concentration S is assumed constant’ (bottom p. 25) therefore the catabolic rate vmc is an environment-dependent [i.e. substrate-concentration-independent] parameter. The ’mc’ proteins include carbon uptake and metabolism (e.g. Fig 1, or Table 2) so that intracellular changes in S could arise from strain differences thereby affecting vmc and the magnitude of the ‘housekeeping’ fraction.

      It is not clear to me how the predictive power of the model will be affected by relaxing the constant φq assumption and replacing it with the more justifiable assumption that all metabolic proteins contribute some small fraction to φq based upon substrate saturation.

      The reviewer criticizes two assumptions made in the construction and analysis of the model: (i) the fraction of housekeeping proteins is constant irrespective of strain and growth conditions, and (ii) the housekeeping proteins carry flux because they synthesize macromolecules other than proteins. Below, we summarize how we have tried to clarify these assumptions and which additional work we have performed to build model variants relaxing the assumptions.

      We identified the housekeeping protein category with the Q-sector in the original paper of Scott et al. [13], which was misleading. The Hwa group indeed defines the Q-sector as not carrying flux [7], whereas we do allow this for the housekeeping protein category. Our housekeeping protein category, which we refer to as ”other proteins” or ”residual proteins” (Mu) in the new version of the manuscript, consists of all proteins not labelled as proteins in the categories of ribosomes and translation-affiliated proteins (R), enzymes in central carbon metabolism (Mc), or enzymes in energy metabolism (Mer+Mef). Mu carries flux, because it includes (among other things) the machinery for DNA and RNA synthesis (DNA polymerase, RNA polymerase, ...). When plotting the proteome fraction of this category determined from the data of Schmidt et al. [12], we found that the fraction remains approximately constant over a large range of growth conditions. This motivated the simplifying assumption to keep the proteome fraction for Mu constant in the simulations.

      The reviewer is right, however, that this may not be the case when considering a variety of E. coli strains growing on glucose, especially the strains resulting from laboratory evolution experiments. We have therefore redone the simulations while allowing the Mu category to vary, by a percentage corresponding to experimentally-observed variations of this category over the range of growth conditions considered by Schmidt et al. [12] (Supplementary Figure 1). In comparison with the original results, the relaxation of this condition enlarges the attainable range of growth rates by about 10%, but the overall shape of the cloud of rate-yield phenotypes remains the same. These new simulation results are shown in the main figures of the revised manuscript.

      In parallel, we have developed a model variant that includes a Q category in the sense of Scott et al., defined by the (growth-rate independent) offsets of the linear relations between growth rate and protein fractions [7]. We have retained an Mu category of other proteins in the model, interpreted as consisting of the growth-rate dependent fraction of other proteins, including the molecular machinery responsible for the synthesis of other macromolecules. Whereas the Mu category carries a flux, this is not the case for the Q category. We have calibrated the model variant from the same data as the original model, and predicted the admissible rate-yield phenotypes. While the cloud of predicted rate-yield phenotypes is slightly displaced in comparison with the reference model, the overall qualitative shape is the same. We explain this robustness by the fact that, despite the different interpretation of the protein categories, the models are structurally very similar and calibrated from data for the same reference strain. This gives rise to different values of the catalytic constants, which compensate for the differences in protein concentrations. Note that more data are needed for the calibration of the model with the Q category, because it requires estimation of the growth-rate-independent proteome fraction for all individual protein categories. In particular, in addition to carbon limitation, conditions of nitrogen and sulfur limitation are necessary [7]. In the absence of such data, additional assumptions need to be made, as we have explained in the new version of the manuscript.

      We could not find a discussion of the relation between substrate saturation and growth-rate independent offsets in proteomics data in the paper by Dourado et al. [2]. In the revised version of the manuscript, however, we have exploited their idea to compare substrate saturation for different predicted and observed rate-yield phenotypes. As a prerequisite, this has required a refinement of the estimation of the half-saturation constants during model calibration, for which we have used the dataset of Km values collected by Dourado et al. [2]. The finding that high-rate, high-yield growth comes with high substrate saturation, indicating an efficient utilization of proteomic resources, has been given more emphasis in the revised manuscript. Note that each resource allocation strategy will give rise to a different concentration of metabolites, and therefore to a different level of substrate saturation of the enzymes.

      The reviewer is right that the phrase ”the external substrate concentration S is assumed constant” is not correct for batch growth, although it approximately holds for continuous growth in a chemostat. In the case of balanced growth in batch, the external substrate concentration S is much higher than the half-saturation constant, so that the kinetic equation for the macroreaction can be approximated by vmc = mc es, where es = kmc. In the revised manuscript, we have explicitly distinguished between these two situations (batch and continuous growth). Note that S is not the intracellular, but the extracellular concentration of substrate.
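
      The batch-growth approximation above can be written out as a short worked step. This is only a sketch: it assumes a Michaelis-Menten form for the carbon-uptake macroreaction, and the symbol K for the half-saturation constant is a placeholder introduced here because the original symbol is not legible in the text.

```latex
v_{mc} \;=\; k_{mc}\, m_c\, \frac{S}{S + K}
\qquad\xrightarrow{\;S \,\gg\, K\;}\qquad
v_{mc} \;\approx\; k_{mc}\, m_c \;=\; m_c\, e_s ,
\qquad e_s = k_{mc}.
```

      Under the same assumed form, a chemostat at steady state would hold S fixed, so the factor multiplying mc would again be a constant, though not necessarily equal to kmc.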

      Changes in manuscript: We have better explained the meaning of the residual protein category Mu and corrected the misleading identification of this category with the Q-sector of Scott et al. [13] in the section Coarse-grained model with coupled carbon and energy fluxes and in Appendix 1. In new subsections of Appendix 1 and Appendix 2, we discuss the construction and calibration of a model variant with an additional growth-rate independent protein category corresponding to the Q-sector of Scott et al. In the Discussion, we explain that the rate-yield predictions obtained from this model and the reference model are essentially the same, indicating the robustness of the model predictions.

      We have redone all simulations using a resource allocation parameter for the housekeeping protein fraction Mu that is allowed to vary within experimentally-determined bounds (Coarse-grained model with coupled carbon and energy fluxes and Methods). The bounds are determined from the data of Schmidt et al. [12], as shown in the new Supplementary Figure 1. These simulations also include refined estimates for the half-saturation constants in the metabolic macroreactions.

      In the final Results section, Resource allocation strategies enabling fast and efficient growth of Escherichia coli, we develop the point that higher saturation of enzymes and ribosomes is key to high-rate, high-yield growth of E. coli, in agreement with observations from other recent studies [2, 5, 9]. In Appendix 1, we emphasize that S is the extracellular substrate concentration and we distinguish between simplifications of vmc for batch and continuous growth.

    1. Author Response

      Reviewer #1 (Public Review):

      Castelán-Sánchez et al. analyzed SARS-CoV-2 genomes from Mexico collected between February 2020 and November 2021. This period spans three major spikes in daily COVID-19 cases in Mexico and the rise of three distinct variants of concern (VOCs; B.1.1.7, P.1., and B.1.617.2). The authors perform careful phylogenetic analyses of these three VOCs, as well as two other lineages that rose to substantial frequency in Mexico, focusing on identifying periods of cryptic transmission (before the lineage was first detected) and introductions to and from the neighboring United States. The figures are well presented and described, and the results add to our understanding of SARS-CoV-2 in Mexico. However, I have some concerns and questions about sampling that could affect the results and conclusions. The authors do not provide any details on the distribution of samples across the various Mexican States, making it hard to evaluate several key conclusions. Although this information is provided in Supplementary Data 2, it is not presented in a way that enables the reader to evaluate if lineages were truly predominant in certain regions of the country, or if these results are attributable purely to sampling bias. Specifically, each lineage is said to be dominant in a particular state or region, but it was not clear to me if sampling across states was even at all time points. For example, the authors state that most B.1.1.7 genome sampling is from the state of Chihuahua, but it is not clear if this was due to more sequenced samples from that region during the time that B.1.1.7 was circulating, or if the effects of B.1.1.7 were truly differential across the country. The authors do mention sequencing biases several times, but need to be more specific about the nature of this bias and how it could affect their conclusions.

      It is surprising to see in this manuscript that the B.1.1.7 lineage did not rise above 25% prevalence in the data presented, despite its rapid rise in prevalence in many other parts of the world. This calls into question if the presented frequencies of each lineage are truly representative of what was circulating in Mexico at the time, especially since the coordinated sampling and surveillance program across Mexico did not start until May 2021.

      We thank the reviewer for the constructive comments. We recognize the need to better explain how the sequencing efforts in the country were set up and carried out, and this has now been clarified throughout the main text (L43-51, L95-105). A new figure comparing the overall cumulative proportion of genomes generated per state between 2020-2021 is now available as Supplementary Figure 1c. The cumulative proportion of genomes sampled across states per lineage of interest, and corresponding to the period of circulation of the given lineage, was originally provided as maps in Figures 2-4. This has been further clarified in the Results section and in the corresponding figure legends. We also now provide additional maps representing the geographic distribution of the clades identified per lineage, integrating in the figures the information previously available in Supplementary Data 2, Supplementary Figures 4 and 5. As a note, for our analyses, we used the total cumulative genome data available from the country (and not only that generated by CoViGen-Mex, representing one third of the SARS-CoV-2 genomes from Mexico). This is expected to reduce any sampling biases related to the scheme adopted by CoViGen-Mex, and is now clearly stated in the main text.

      However, we believe that there has been a misunderstanding related to the genome sampling scheme adopted by CoViGen-Mex, namely the statement that the ‘coordinated sampling and surveillance program across Mexico did not start until May 2021’. Although it is true that further improvements were implemented after this date (enabling genome sampling and sequencing to become more homogeneous across the country), the overall virus genome sequencing in Mexico was already sufficient from February 2021. This is represented by the cumulative number of viral genomes sequenced throughout 2020-2021 (both by CoViGen-Mex and other contributing institutions) correlating with the number of cases officially reported in the country during this time (see Supplementary Figure 1 a). This has now been clarified in the Results section (L94-105). Therefore, we hold that “SARS-CoV-2 sequencing in Mexico has been sufficient to explore the spatial and temporal frequency of viral lineages across national territory, and now to further investigate the number of lineage-specific introduction events, and to characterize the extension and geographic distribution of associated transmission chains, as we present in this study” (L102-105). In this context, “a more homogenous sampling across the country is unlikely to impact our main findings, but could i) help pinpoint additional clades we are currently unable to detect, ii) provide further details on the geographic distribution of clades across other regions of the country, and iii) deliver a higher resolution for the viral spread reconstructions we present” (discussed in L466-470).

      For the B.1.1.7 lineage in Mexico, we have clarified the issue raised as follows: “during its circulation period, most B.1.1.7 genomes from Mexico were generated from the state of Chihuahua, with these representing the earliest B.1.1.7-assigned genomes from the country. However, our phylodynamic analysis revealed that only a small proportion of these grouped within a larger clade denoting an extended transmission chain (C2a), with the rest falling within minor clusters, or representing singleton events. Relative to other states, Chihuahua generated an overall lower proportion of viral genomes throughout 2020-2021. Thus, more viral genomes sequenced from a particular state does not necessarily translate into more well-supported clades denoting extended transmission chains, whilst the geographic distribution of clades is somewhat independent to the genome sampling across the country.” (L202-211). Again, these observations are supported by a sufficient overall genome sampling from Mexico.

      We would further like to make clear that “our results confirm that the B.1.1.7 lineage reached an overall lower sampling frequency of up to 25% (relative to other virus lineages circulating in the country), as was noted prior to this study (for example, see Zárate et al. 2022)” (L189-193). As similar observations were independently made for other Latin American countries such as Brazil, Chile, and Peru (some with better genome representation than others, like Brazil https://www.gisaid.org/), it is possible that “the overall epidemiological dynamics of the B.1.1.7 in Latin America may have substantially differed from what was observed in the USA and UK. Such differences could be partly explained by competition between cocirculating lineages, exemplified in Mexico by the regional co-circulation of B.1.1.7, P.1 and B.1.1.519. Nonetheless, the lack of a representative number of viral genomes for most of these countries prevents exploring such hypothesis at a larger scale, and further highlights the need to strengthen genomic epidemiology-based surveillance across the region” (now discussed in L372-379). We hope the reviewer considers that the issues raised have now been resolved.

      Reviewer #2 (Public Review):

      The authors use a series of subsampling methods based on phylogenetic placement and geographic setting, informed by human movement data to control for differences in sampling of SARS-CoV-2 genomes across countries. Of note, the authors show that 2 variants likely arose in Mexico and spread via multiple introductions globally, while other variant waves were driven by repeat introductions into Mexico from elsewhere. Finally, they use human mobility data to assess the impact of movement on transmission within Mexico. Overall, the study is well done and provides nice data on an under-studied country. The authors take a thoughtful approach to subsampling and provide a very thorough analysis. Because of the care given to subsampling and the great challenge that proper subsampling represents for the field of phylodynamics, the paper would benefit from a more thorough exploration of how their migration-informed subsampling procedure impacts their results. This would not only help strengthen the findings of the paper, but would likely provide a useful reference for others doing similar studies. Additionally, I would suggest the authors provide a bit more discussion of this subsampling approach and how it may be useful to others in the discussion section of the paper.

      We thank the reviewer for the constructive comments, and appreciate the recognition of our sub-sampling scheme as a valuable tool with potential application in other studies. We acknowledge the need for a ‘more thorough exploration and discussion of how a different migration-informed subsampling approach could impact our results’. To address this issue, “we further sought to validate our migration-informed genome subsampling scheme (applied to B.1.617.2+, representing the best sampled lineage in Mexico). For this, an independent dataset was built using a different migration sub-sampling approach, comprising all countries represented by B.1.617.2+ sequences deposited in GISAID (available up to November 30th 2021). In order to compare the number of introduction events, the new dataset was analysed independently under a time-scaled DTA (as described in Methods Section 4).” (L517-524). In the new dataset, <100 genome sequences from the USA were retained for further analysis (Supplementary Figure 2b), compared to approximately 2000 ‘USA’ genome sequences included in the original B.1.617.2+ alignment. Thus, we expected a lower number of inferred introduction events into Mexico, as an undersampling of viral genome sequences from the USA is likely to result in ‘Mexico’ clades not fully segregating (particularly impacting C5d).

      Our original results revealed a minimum number of 142 introduction events into Mexico (95% HPD interval = [125-148]), with 6 clades identified as denoting extended transmission chains. The DTA results derived from the new dataset (subsampling all countries) revealed a minimum number of 84 introduction events into Mexico (95% HPD interval = [81-87]), with again 6 major clades identified. Thus, a significantly lower number of introduction events into Mexico was inferred, as was expected. On the other hand, the number of clades identified was consistent between both datasets, supporting the robustness of our phylogenetic methodological approach. However, in the new dataset, we observe that C5d displayed a reduced diversity (represented by the AY.113 and AY.100 genomes from Mexico, but excluding the B.1.617.2 genome sampled from the USA). This highlights the relevance of our genome sub-sampling using migration data as a proxy.
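
      The comparison of the two HPD intervals above can be illustrated with a small sketch. It is not the DTA pipeline used in the study: the function names are hypothetical, and the HPD helper simply takes the narrowest window over sorted posterior samples, which is one common way to compute such an interval. Only the two reported intervals are taken from the text.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval covering `mass` of the samples.

    Illustrative helper (hypothetical name): sorts the samples and slides a
    window of fixed sample count, keeping the window with the smallest width.
    """
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(mass * n)  # number of samples the interval must cover
    if k < 1 or k > n:
        raise ValueError("mass must lie in (0, 1] and samples must be non-empty")
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


def intervals_overlap(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


# 95% HPD intervals reported in the response for the number of introductions
original_dataset = (125, 148)   # top-5 countries subsampling
new_dataset = (81, 87)          # all-countries subsampling

print(intervals_overlap(original_dataset, new_dataset))  # False
```

      The two reported intervals do not overlap, which is consistent with the lower number of introductions described for the new dataset.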

      In further agreement with these observations, publicly available data on global human mobility (https://migration-demography-tools.jrc.ec.europa.eu/data-hub/index.html?state=5d6005b30045242cabd750a2) shows that migration into Mexico is mostly represented by movements from the USA, followed by Indonesia, Guatemala, Belize and Colombia. However, the volume of movements from the USA into Mexico is much higher (up to 6 orders of magnitude above the volumes recorded into Mexico from any other country).

      Given time constraints related to performing additional analyses, we decided to exclude the subsampling scheme for ‘top ten countries’ suggested by the reviewer. However, we consider that the results derived from the comparison between the original and the new dataset (top-5 vs all countries) are sufficient to support our migration-informed subsampling approach. A full description of the methodology and the results obtained, as well as a short discussion, is now available as Supplementary Text 2, and Supplementary Figure 2b and 2c. We hope the reviewer considers that the issues raised have been addressed.

    2. eLife assessment

      The authors document an in-depth analysis of introduction patterns of 5 variant waves in Mexico. This is an important analysis and dataset since the genomic epidemiology of SARS-CoV-2 in Mexico is generally understudied, and this paper contributes important missing information. The phylogenetic analyses are solid and well-presented, but the lack of detail regarding the collection of samples across Mexican states makes it difficult to evaluate conclusions about the relationship between observed viral lineages and local case counts. Additionally, in its current form, the manuscript is mostly descriptive, without clear hypotheses tested or discussion of implications.

    3. Reviewer #1 (Public Review):

      Castelán-Sánchez et al. analyzed SARS-CoV-2 genomes from Mexico collected between February 2020 and November 2021. This period spans three major spikes in daily COVID-19 cases in Mexico and the rise of three distinct variants of concern (VOCs; B.1.1.7, P.1., and B.1.617.2). The authors perform careful phylogenetic analyses of these three VOCs, as well as two other lineages that rose to substantial frequency in Mexico, focusing on identifying periods of cryptic transmission (before the lineage was first detected) and introductions to and from the neighboring United States. The figures are well presented and described, and the results add to our understanding of SARS-CoV-2 in Mexico. However, I have some concerns and questions about sampling that could affect the results and conclusions:

      1) The authors do not provide any details on the distribution of samples across the various Mexican States, making it hard to evaluate several key conclusions. Although this information is provided in Supplementary Data 2, it is not presented in a way that enables the reader to evaluate if lineages were truly predominant in certain regions of the country, or if these results are attributable purely to sampling bias. Specifically, each lineage is said to be dominant in a particular state or region, but it was not clear to me if sampling across states was even at all time points. For example, the authors state that most B.1.1.7 genome sampling is from the state of Chihuahua, but it is not clear if this was due to more sequenced samples from that region during the time that B.1.1.7 was circulating, or if the effects of B.1.1.7 were truly differential across the country. The authors do mention sequencing biases several times but need to be more specific about the nature of this bias and how it could affect their conclusions.

      2) It is surprising to see in this manuscript that the B.1.1.7 lineage did not rise above 25% prevalence in the data presented, despite its rapid rise in prevalence in many other parts of the world. This calls into question if the presented frequencies of each lineage are truly representative of what was circulating in Mexico at the time, especially since the coordinated sampling and surveillance program across Mexico did not start until May 2021.

    4. Reviewer #2 (Public Review):

      The authors use a series of subsampling methods based on phylogenetic placement and geographic setting, informed by human movement data to control for differences in sampling of SARS-CoV-2 genomes across countries. Of note, the authors show that 2 variants likely arose in Mexico and spread via multiple introductions globally, while other variant waves were driven by repeat introductions into Mexico from elsewhere. Finally, they use human mobility data to assess the impact of movement on transmission within Mexico.

      Overall, the study is well done and provides nice data on an under-studied country. The authors take a thoughtful approach to subsampling and provide a very thorough analysis. Because of the care given to subsampling and the great challenge that proper subsampling represents for the field of phylodynamics, the paper would benefit from a more thorough exploration of how their migration-informed subsampling procedure impacts their results. This would not only help strengthen the findings of the paper but would likely provide a useful reference for others doing similar studies. Additionally, I would suggest the authors provide a bit more discussion of this subsampling approach and how it may be useful to others in the discussion section of the paper.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors sought to identify the relationship between social touch experiences and the endogenous release of oxytocin and cortisol. Female participants who received a touch from their romantic partner before a stranger exhibited a blunted hormonal response compared to when the stranger was the first toucher, suggesting that social touch history and context influence subsequent touch experiences. Concurrent fMRI recordings identified key brain networks whose activity corresponded to hormonal changes and self-report.

      The strengths of the manuscript are in the power achieved by collecting multi-faceted metrics: plasma hormones across time, BOLD signal, and self-report. The experiment was cleverly designed and nicely counterbalanced. Data analysis was thorough and statistically sophisticated, making the findings and conclusions convincing.

      This work sheds new light on potential mechanisms underlying how humans place social experiences in context, demonstrating how oxytocin and cortisol might interact to modulate higher-level processing and contextualizing of familiar vs. stranger encounters.

      Thank you very much for this generous evaluation of the study.

      Reviewer #2 (Public Review):

      To test how oxytocin impacts the brain and the psychological, neural, and hormonal response to touch, the authors tested human females during two counterbalanced fMRI sessions wherein females were stroked on the arm or the palm, by a real-world romantic partner or a stranger, while blood levels of oxytocin and cortisol were collected at multiple time points.

      This combination of measures, and the number of hypotheses that could be tested with them, is remarkable - virtually unheard of. This impressive, difficult, and more ecological design than is typical for the field is a major strength of the study, which allowed the authors to test many important hypotheses concurrently and to show contextual effects that could not otherwise be observed. The only potential drawback perhaps is that with such a large design, including many measures, the authors produced so many significant interactions and results that it could be hard for the casual reader to appreciate the importance of each.

      The authors supported their hypothesis that oxytocin effects are context-sensitive, as they found a key interaction wherein experiencing the partner first increased oxytocin for the partner relative to when they came first the OT levels were low but then increased if they were preceded by the partner (excepting one timepoint). Cortisol responses (which reflect hormonal stress) were also higher when the stranger came first than when he was preceded by the partner. In addition, touch was experienced more positively on the arm than on the palm, supporting the role of c-fibers in conveying specifically felt responses to warm, tender touch.

      These data indicate significant context sensitivity with real-world implications. For example, experiencing warm touch on the arm can make us more receptive to other people in subsequent encounters. Conversely, when strangers try to approach and get close to us "out of the blue" people experience this as stressful, which reduces the pleasantness of the interaction and may reduce trust in the moment...perhaps even subsequently.

      This research is critical to the basic science of neurohormonal modulation, given that most of this research occurs in rodents or in simplified studies in humans, usually through intranasal oxytocin administration with unclear impacts on circulating levels in the brain and blood. Oxytocin in particular has suffered from oversimplification as the "love drug" - wherein people assume that it always renders people more loving and trusting. The reality is more complex, as they showed, and these demonstrations are needed to clarify for the field and the public that neurohormones adaptively shift with the context, location, and identity of the social partner in an adaptive way. These results also help us understand the many null effects of oxytocin on trusting strangers in human neuroeconomic studies. In a modern world that is characterized by significant loneliness, interactions with strangers and outsiders, and touch-free digital interactions, our ability to understand the human need for genuine social contact and how it impacts our response to outsiders (welcomed in versus a source of stress) is critical to human health and the wellbeing of individuals and society.

      Thank you very much for this nice summary of the study and its implications.

      As you pointed out, the design was ambitious and involved a broad range of measures and levels of hypothesis-testing. This presented challenges in reporting the results. In this paper we have tried to provide interpretation of the basic results, such as that social encounters (even in the scanner environment) are sufficient to evoke changes in endogenous oxytocin levels over the course of the experimental session, and that various interactions arise due to an influence of contextual factors such as the familiarity of the person and the recent social interaction history. For the more complex results, such as the nature of relationships between BOLD signal change and the degree of change in individuals’ plasma oxytocin levels, we have tried to outline provisional interpretations.

      We hope that the picture will gradually become more filled-in by work from our and others’ labs. Maybe these findings and interpretations will look very different in a few years’ time. We consider this study a starting point for future research into the dynamics and function of human endogenous oxytocin.

      Reviewer #3 (Public Review):

      In an ambitious, multimodal effort, Handlin, Novembre et al. investigated how the endogenous release of oxytocin and cortisol as well as functional brain activity are modulated by social touch under different contextual circumstances (e.g. palm vs. arm touch, stranger vs. partner touch) in neurotypical female participants.

      Using serial sampling of plasma hormone levels in blood during concurrent functional MRI neuroimaging, the authors show that the familiarity of the interactant during social touch not only impacts current hormonal levels but also subsequent hormonal responses in a successive touch interaction. Specifically, endogenous oxytocin levels are significantly heightened (and cortisol levels dampened) during touch from a romantic partner compared to touch from an unfamiliar stranger, at least during the first touch interaction. During the second touch interaction, however, oxytocin levels plummeted when being touched by a stranger following partner touch (although a recovery was made), whereas the normally elevated oxytocin responses to partner touch were dampened when following stranger touch. These results are paralleled by similar familiarity- and order-related effects in neural regions involving the hypothalamus, dorsal raphe, and precuneus.

      However, an important distinction to be made is that, although a significant main effect of familiarity was encountered in several brain regions when taking peak plasma oxytocin levels into account, subsequent t-tests showed no activation differences in the BOLD response between partner and stranger touch within the same subjects. Significant interaction maps seem thus mainly driven by between-subject effects at the different time points, which is arguably due to differences between subjects in their initial calibration of neural/hormonal responses, and not session-to-session changes within the same subjects.

      A similar comment can be made for the reported covariance between (changes in) maximal oxytocin levels and (changes in) BOLD activity for the hypothalamus.

      In an effort to delineate the complex cascade of responses induced by afferent tactile stimulation, the authors report an exploratory regression analysis to identify BOLD activation that precedes the pattern of serial plasma changes in oxytocin levels (looking backwards; i.e. implying changes in brain activation drive changes in hormonal plasma levels). Although the authors are appropriately modest about the significance of the encountered effects, additional control analyses could bring further clarifications about the temporal (e.g., can similar covariations also be found when looking forward) and hormonal specificity (e.g. can similar findings be found for cortisol-variations) of the encountered results. Nevertheless, despite the 'dynamically' covarying relationships between BOLD and max plasma oxytocin levels (i.e. dynamic as in the sense across conditions, not across timepoints), claims about the directionality of this effect (i.e. 'hormonal neuromodulation' vs. 'neural modulation of hormonal levels') remain speculative.

      A particular strength of this study is the employment of a "female-first" strategy since experimental data concerning endogenous oxytocin levels in women are sparse. Adequate control analyses are reported to take potential variability due to differences in contraception and phase in the hormonal cycle into account.

      Thank you for your attentive reading of the study, and for raising several very important points.

      You are right that the BOLD activation maps showing interactions between the change in OT levels and other factors (familiarity, order) reflect differences between subjects in the two runs of the experiment. The effect of familiarity emerged from the full model for the whole group (all participants, whether they started with partner or stranger), as an interaction between the partner/stranger factor and the change in OT. As you point out, this reflects interindividual-level covariation between OT changes and BOLD changes. For example, individuals showing greater OT increase were also more likely to show higher BOLD in certain clusters during partner compared to stranger touch. Similarly, the partner vs stranger contrast showing hypothalamus and raphe reflects greater OT-BOLD covariance in the stranger first compared to the partner first groups: in the stranger first group, BOLD was greater the lower the mean OT was across individuals.

      The t-tests with OT as covariate further indicate that the interaction was driven by group differences in the second run. As you point out, within groups (partner or stranger first), there was no significant change in the OT-BOLD covariance from the first to the second run, though these relationships were different between groups. We agree with you that this lack of difference in within-group OT-BOLD covariance from the first to the second run is likely because responses in the first run biased responses in the second run—but in different ways depending on whether the partner or the stranger was presented first. Both groups did show a meaningful correlation in mean OT levels between the first and the second run (we have now included this information in the paper).

      In general, we agree that it is very important to make clear that, as in many covariation/correlation effects in fMRI studies, the effects are driven by interindividual differences for a given covariant relationship, rather than the within-subject BOLD response increasing or decreasing.

      We also agree that it is not possible to determine the direction of modulation from these results. The creation of the temporal OT regressor as “backward-looking” was informed by evidence from animal models for central-to-peripheral effects from hypothalamus to pituitary to bloodstream. We assumed this directionality in the analysis. Given the exploratory nature of this regressor, “looking forward” from temporal OT sample patterns to BOLD patterns with different time intervals would be an equally valid approach. It could reveal activation related to any systematic influence of peripheral OT levels on cortical responses. As the premise of the temporal OT regressor analysis in the present study was an assumed central-to-peripheral modulation, we have kept this as the focus but will explore any specific peripheral-to-central covariation in future work.

      We believe that the full causal picture is likely to involve bidirectional modulation: a modulatory loop (or even loops) in which peripheral and central changes influence one another. Unfortunately, it is difficult to address such temporal feedback with the poor time resolution of fMRI.

    1. Author Response

      Reviewer #1 (Public Review):

      This is one of the most careful analyses of sexual dimorphism in dinosaurs, based on a remarkable assemblage of 61 ornithomimosaur fossils from the Early Cretaceous of western France. The dimorphism is expressed in variations in the shaft curvature and the distal epiphysis width, analysed appropriately here and plausible because these are the kinds of morphological features that vary between males and females among birds and crocodilians, among others.

      In the Introduction, it is right to highlight the shortage of convincing cases of demonstrated sexual dimorphism (SD) in dinosaurs. But note the points made by Hone, Saitta and others that SD can exist in many species today without major morphological differences, making it hard to demonstrate in fossils with such types of dimorphism. Also, some proposed statistical tests to ensure that SD has been convincingly demonstrated in fossils are so stringent they would be hard ever to pass (requiring enormous and constant morphological distinctiveness). In other words, we are conditioned not to find SD in dinosaurs, and yet may be massively under-reporting it because of preservation difficulties (of course) but also because of some overly rigorous demands for proof. These issues help argue that the current study is especially valuable because the data set is large (itself a rarity), and 3D bone shape analysis and proper statistical testing have been applied.

      We are grateful that Reviewer 1 raised this point regarding the occurrence of many subtle sexual dimorphisms among modern populations, and we have added a sentence in the introduction to further emphasize the importance of a large dataset composed of coeval organisms.

      It's interesting the dinosaur example shows the same two dimorphic traits (femoral obliquity = bicondylar angle; width of distal epiphysis = bicondylar breadth) seen in mammals (MS, lines 117-123), where the femur angle may vary because of the need for broader hips in the female to accommodate the birth canal, and yet dinosaurs laid eggs. These are small dinosaurs, so perhaps their eggs were relatively large in proportion to body size. Perhaps the authors could comment on this. There is some discussion with regard to modern birds at MS lines 187-199.

      We agree with the comments from Reviewer 1. We raise the question of eggs possibly constraining the pelvic and proximal hindlimb morphology in lines 170 to 189, and how this relates to modern archosaurs in lines 189 to 202. We also originally intended to discuss how the kiwi hindlimb morphology accommodates large eggs, but no significant dimorphism was demonstrated in the pelvic and hindlimb morphology of this bird.

    1. Author Response

      Reviewer #2 (Public review):

      Ansari et al. describe a web-based software for the design of guide RNA (gRNA) sequences and primers for CRISPR-Cas-based identification of single nucleotide variants (SNVs). The use of CRISPR-Cas to rapidly identify specific mutations in both cancer and infection is an evolving field with good potential to play a role in future research and diagnostics.

      The software described by Ansari et al. is easy to use for the design of guide RNAs. The most important question is how good the gRNAs that the software suggests are. As such, the manuscript would benefit from better describing the parameters used for the gRNA design as well as including more validation experiments. Clearly, the scope of the manuscript is not about developing different detection methods, but I would argue that performing more wet lab experiments is needed to support the usability of the software.

      We thank the reviewer for taking interest in this manuscript and raising an important point about increasing the number of targets for our wet lab experiments. To address this, we have tried to include more supporting data in the updated version of the manuscript.

      Reviewer #3 (Public review):

      This manuscript by Ansari and coworkers describes CriSNPr, a tool for designing gRNAs for CRISPR-based diagnostics for SNP detection. CriSNPr allows one to design assays to detect human and SARS-CoV-2 mutations, positioning the mismatches for optimal detection based on results from the literature. Designs can be generated for six different CRISPR effector proteins. The authors test their approach by designing assays to detect a single SNV using three different CRISPR effectors. A strength of the manuscript is that the method does appear to work, at least for the E484K mutation, for multiple CRISPR effector proteins.

      The weaknesses of this manuscript are the lack of data demonstrating that the method works. There is only one very small experimental demonstration using a single mutation (Figure 4) and some very high-level analyses using two SNP/SNV databases (Figure 5). The authors do not provide any data to answer any basic questions about how well their designs work, how fast and easy it is to run their method, or which designs are predicted to work better than others. These weaknesses ultimately limit the impact of the work on the field, as it is not clear what the benefits of using the authors' approach are versus simply applying the rules for the individual CRISPR effector proteins outlined in Figure 1 of the manuscript.

      We thank the reviewer for taking interest in this manuscript and appreciate the constructive feedback and suggestions. In the new version of this paper, we've added more data to back up other SNVs with different CRISPR systems and the CriSNPr pipeline for sgRNA design. Even in these datasets, we see that for particular SNVs, the choice of the CRISPR system used might affect the sensitivity of detecting the mutation (Figures 5 and 6). This would be a huge task to do again for multiple targets and targeting systems, which is outside the scope of this study. Importantly, such large datasets are currently missing for the different CRISPRDx systems since we have not come across studies where users have comparatively determined the best methodology for their assay. In our opinion, CriSNPr gives users this opportunity by providing a unified platform, and our validation assays show how this can be done in a relatively fast manner.

      A stand-alone version of the server is made available for download at https://github.com/asgarhussain/CriSNPr to increase its speed and accessibility for the end user.

      Addressing the point of determining which crRNAs work best for a given assay requires a large amount of data on target SNPs for individual Cas systems, which is currently scarce. In the current version of CriSNPr, we have considered prioritizing crRNA mismatch-sensitive positions based on original published studies. For example, for AaCas12b, mismatch positions are ranked as follows: 1&4 > 1&5 > 4&11 > 4&16 > 5&8 > 5&11 > 16&19. Similarly, crRNA mismatch-sensitive positions for individual Cas systems (as shown in Figure 1) have been used to prioritize crRNAs. Improving on these design principles further would require studying the biology of individual Cas:DNA/RNA interactions, which is beyond the scope of this study. However, in the updated version of the CriSNPr, we attempted to improve the scoring algorithm by taking into account off-targets for a crRNA design, and priority is given to the combinatorial positions with the fewest off-targets as well as the weightage of their efficacy.
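
      The prioritization described above can be illustrated with a minimal sketch. This is not CriSNPr's actual code: the function name, the candidate fields (`positions`, `off_targets`), and the choice to sort by off-target count first and by published position rank second are all assumptions made for illustration. Only the AaCas12b position order is taken from the response.

```python
# Illustrative sketch only; not the CriSNPr implementation.
# Position-pair order for AaCas12b as quoted in the author response.
AACAS12B_POSITION_RANK = [
    (1, 4), (1, 5), (4, 11), (4, 16), (5, 8), (5, 11), (16, 19),
]


def rank_crrnas(candidates, position_rank=AACAS12B_POSITION_RANK):
    """Order hypothetical crRNA candidates for one Cas system.

    Each candidate is a dict with:
      'positions'   -- tuple of the two mismatch positions
      'off_targets' -- number of predicted off-target hits
    Assumption: fewer off-targets wins; ties are broken by the
    published position-pair rank. The real weighting may differ.
    """
    priority = {pair: i for i, pair in enumerate(position_rank)}
    # Drop candidates whose position pair is not in the published list.
    usable = [c for c in candidates if tuple(c["positions"]) in priority]
    return sorted(
        usable,
        key=lambda c: (c["off_targets"], priority[tuple(c["positions"])]),
    )


if __name__ == "__main__":
    demo = [
        {"name": "a", "positions": (1, 5), "off_targets": 0},
        {"name": "b", "positions": (1, 4), "off_targets": 2},
        {"name": "c", "positions": (1, 4), "off_targets": 0},
    ]
    for cand in rank_crrnas(demo):
        print(cand["name"], cand["positions"], cand["off_targets"])
```

      A tuple sort key keeps the two criteria explicit, so a different weighting (for example, position rank before off-target count) would only require swapping the order of the key's elements.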

    1. eLife assessment

      This important work advances our understanding of the factors that affect the speed of colour evolution in birds and the resulting diversification patterns. It provides compelling evidence that more complex plumage coloration can lead to rapid colour evolution in kingfishers, and could pave the way for more comprehensive analyses that fully embrace the multidimensional nature of colour variation. Hence, the results will be of broad interest to ornithologists and evolutionary biologists in general, once the authors have streamlined the theoretical framework and explained the novel methodological approaches in more detail.

    2. Reviewer #1 (Public Review):

      Animal colour evolution is hard to study because colour variation is extremely complex. Colours can vary from dark to light, in their level of saturation, in their hue, and on top of that different parts of the body can have different colours as well, as can males and females. The consequence of this is that the colour phenotype of a species is highly dimensional, making statistical analyses challenging.

      Herein the authors explore how colour complexity and island versus mainland dwelling affect the rates of colour evolution in a colourful clade of birds: the kingfishers. Island-dwelling has been shown before to lead to less complex colour patterns and darker coloration in birds across the world, and the authors hypothesise that lower plumage complexity should lead to lower evolutionary rates. In this paper, the authors explore a variety of different and novel statistical approaches in detail to establish the mechanism behind these associations.

      There are three main findings: (1) rates of colour evolution are higher for species that have more complex colour phenotypes (e.g. multiple different colour patches), (2) rates of colour evolution are higher on island kingfishers, but (3) this is not because island kingfishers have a higher level of plumage complexity than their mainland counterparts.

      I think that the application of these multivariate methods to the study of colour evolution and the results could pave the way for new studies on colour evolution.

      I do, however, have a set of suggestions that should hopefully improve the robustness of results and clarity of the paper as detailed below:

      1) The two main hypotheses tested linking plumage complexity and island-dwelling to rates of colour evolution seem rather disjointed in the introduction. This section should integrate these two aspects better justifying why you are testing them in the same paper. In my opinion, the main topic of the paper is colour evolution, not island-mainland comparisons. I would suggest starting with colours and the challenges associated with the study of colour evolution and then introducing other relevant aspects.

      2) Title: the title refers to both complex plumage and island-dwelling, but the potential effects of complexity should apply regardless of being an island or mainland-dwelling species, am I right? Consider dropping the reference to islands in the title.

      3) The results encompass a large variety of statistical results, some closely related to the main hypothesis tested (e.g., island/mainland differences) and others that seem more tangential (differences between body parts, sexes). Moreover, quite a few different approaches are used. I think that it would be good to be a bit more selective and concentrate the paper on the main hypotheses, in particular, because many results are not mentioned or discussed again outside the Results section.

      4) Related to the previous section, the variety of analytical approaches used is a bit bewildering and for the reader, it is unclear why different options were used in different sections. Again, streamlining would be highly desirable, and given the novel nature of the analytical approach (as far as I know, many analytical approaches are applied for the first time to study colour evolution) it would be good to properly explain them to the reader, highlighting their strengths and weaknesses.

      5) The Results section contains quite a bit of discussion (and methods) despite there being a separate Discussion section. I suggest either separating them better or joining them completely.

      6) The main analyses of colour evolutionary rates only include chromatic aspects of colour variation. Why was achromatic variation (i.e. light to dark variation) not included in the analyses? I think that such variation is an important part of the perceived colour (e.g. depending on their lightness the same spectral shape could be perceived as yellow or green, black or grey or white). I realize that this omission is not uncommon and I have done so myself in the past, but I think that in this case, it is highly relevant to include it in the analyses (also because previous work suggests that island birds are darker than their mainland counterparts). This should be possible, as achromatic variation may be estimated using double cone quantum catches (Siddiqi et al., 2004) and the appropriate noise-to-signal ratios (Olsson et al., 2018). Adding one extra dimension per plumage patch should not pose substantial computational difficulties, I think.

      7) The methods need to be much better explained. Currently, some methods are explained in the main text and some in the methods section. All methods should be explained in detail in the methods section and I suggest that it would be better to use a more traditional manuscript structure with Methods before Results (IMRaD), to avoid repetition (provided this is allowed by the journal). Whenever relevant the authors need to explain the choice of alternative approaches. Many functions used have different arguments that affect the outcome of the analyses, these need to be properly explained and justified. In general, most readers will not check the R script, and the methods should be understandable to readers that are not familiar with R. This is particularly important because I think that the methodological approach used will be one of the main attractions of the manuscript, and other researchers should be able to implement it on their own data with ease. Judging from the R script, there are quite a few analyses that were not reported in the manuscript (e.g. multivariate evolutionary rates being higher in forest species). This should be fixed/clarified.

    3. Reviewer #2 (Public Review):

      In "Complex plumages spur rapid color diversification in island kingfishers (Aves: Alcedinidae)", Eliason et al. link intraspecific plumage complexity with interspecific rates of plumage evolution. They demonstrate a correlation here and link this with the distinction between island and mainland taxa to create a compelling manuscript of general interest on drivers of phenotypic divergence and convergence in different settings.

      This will be a fantastic contribution to the literature on the evolution of plumage color and pattern and to our understanding of phenotypic divergence between mainland and island taxa. A few key revisions can help it get there. This paper needs to get, fairly quickly, up to a point where the difference between plumage complexity and color divergence is defined carefully. That should include hammering home that one is an intraspecific measure, while one is an interspecific measure. It took me three reads of the paper to be able to say this with confidence. Leading with that point will greatly improve the paper; if that point gets forgotten, then the premise of the paper feels very circular.

      Also importantly, somewhere early on a hypothesized causal pathway by which insularity, plumage complexity, and color divergence interact needs to be laid out. The analyses that currently follow are good ones, and not wrong, but it's challenging to assess whether they are the right ones to run because I'm not following the authors' reasoning very well here. I think it's possible a more holistic analysis could be done here, but I'll refrain from any such suggestions until I better get what the authors are trying to link.

      We also need something near the top that tells us a bit more about the biogeography of kingfishers. Are kingfisher species always allopatric? I know the answer is no, but not all readers will. What I know less well though is whether your insular species are usually allopatric. I suspect the answer is yes, but I don't actually know.

      In short, how do the authors think allopatry/sympatry/opportunity for competition link to mainland vs. island link to plumage complexity? And rates of color evolution? Make this clear upfront.

    4. Reviewer #3 (Public Review):

      In this article, the authors examined color evolution in the kingfishers, a group of birds that have achieved a spectacular diversity of colors and color patterns as they have diverged across the continents and island chains of the globe. Like many other avian taxa, kingfishers on islands often exhibit color patterns distinct from their close relatives. The authors focus here on putting this informally recognized pattern of evolutionary change to a formal test, asking if plumage color diversity and evolutionary rate are elevated on islands. They also explore whether a notable characteristic of some kingfishers - their simultaneous use of many of the coloration mechanisms available in birds - contributes to the evolutionary lability of their color patterns.

      The authors have previously explored how when color varies in birds it is not just in dimensions of color, but also in the distribution of those colors in patches on the body. Summarizing this variation is challenging, and there are statistical obstacles to comparing it in a holistic manner. In this study, the authors use an exceptional set of analyses to study color in total as a multivariate trait. These are the major strengths of the paper. The authors' efforts are somewhat less convincing when they pursue a univariate model fitting on a small number of principal components, but these analyses are not central to the study. And as with all studies using ancestral state reconstruction to test hypotheses, it's an important tool and one that contributes to this study's effectiveness, but we should acknowledge some level of uncertainty with its results.

      The authors report two important relationships in this study. They provide convincing evidence that rates of color evolution are elevated in island kingfishers, without convergence towards a particular island phenotype. They also describe a relationship between the complexity of plumage patterns and the rate at which they evolve, which has fundamental implications for our understanding of the tempo of trait evolution.

      Islands make up a tiny portion of the earth's surface but are home to a seemingly disproportionate amount of life's diversity. This paper makes an important contribution to our understanding of how this diversity is generated, by showing that the evolutionary rate is elevated on islands for traits relevant to mate choice and recognition. The authors find that "plumage complexity, rather than uniformity, provides more phenotypic traits for natural selection to act upon". Given the number of different coloration mechanisms they express, the kingfishers are a unique group in which to study this issue, so I look forward to reading and hearing more from the authors on this issue in the future.

    1. eLife assessment

      This valuable manuscript describes a fully automated touchscreen cognitive testing system for rats that reduces the length of training required to learn a task and eliminates the need for daily handling. These features make it possible to assess cognitive behaviors in conjunction with other neurobehavioral paradigms during adolescence, an important advance in the field. The data convincingly show that cognitive flexibility does not promote susceptibility to severe weight loss in the activity-based anorexia (ABA) paradigm. However, the claim that cognitive deficits seen in rats that had been exposed to ABA adequately capture an important clinical feature of the pathophysiology of anorexia nervosa is incompletely supported.

    2. Reviewer #1 (Public Review):

      In this manuscript, Huang et al. assess cognitive flexibility in rats trained on an animal model of anorexia nervosa known as activity-based anorexia (ABA). For the first time, they do this in a way that is fully automated and free from experimenter interference, as apparently experimenter interference can affect both the development of ABA as well as the effect on behaviour. They show that animals that are more cognitively flexible (i.e. animals that had received reversal training) were better able to resist weight loss upon exposure to ABA, whereas animals exposed to ABA first show poorer cognitive flexibility (reversal performance).

      Strengths:

      - The development of a fully-automated, experimenter-free behavioural assessment paradigm that is capable of identifying individual rats and therefore tracking their performance.

      - The bidirectional nature of the study - i.e. the fact that animals were tested for cognitive flexibility both before and after exposure to ABA, so that direction of causality could be established.

      - The analyses are rigorous and the sample sizes sufficient.

      - The use of touchscreens increases the translational potential of the findings.

      Weaknesses:

      - Some descriptions of methods and results are confusing or insufficiently detailed.

      - It seems to me that performance on the pairwise discrimination task cannot be directly (statistically) compared to performance on reversal (as in Figure 4E), as these are tapping into fundamentally different cognitive processes (discrimination versus reversal learning). I think comparing groups on each assessment is valid, however.

      - Not necessarily a 'weakness' but I would have loved to see some assessment of the alterations in neural mechanisms underlying these effects, and/or some different behavioural assessments in addition to those used here. In particular, the authors mention in the discussion that this manipulation can affect cholinergic functioning in the dorsal striatum. We (Bradfield et al., Neuron, 2013) and a number of others have now demonstrated that cholinergic dysfunction in the dorsomedial striatum impairs a different kind of reversal learning that is based on alterations in outcome identity and thus relies on a different cognitive process (i.e. 'state' rather than 'reward' prediction error). It would be interesting perhaps in the future to see if the ABA manipulation also alters performance on this alternative 'cognitive flexibility' task.

      Nevertheless, I certainly think the manuscript provides a solid appraisal of cognitive flexibility using more traditional tasks, and that the authors have achieved their aims. I think the work here will be of importance, certainly to other researchers using the ABA model, but perhaps also of translational importance in the future, as the causal relationship between ABA and cognitive inflexibility is near impossible to establish using human studies, but here evidence points strongly towards this being the case.

    3. Reviewer #2 (Public Review):

      Huang and colleagues present data from experiments assessing the role of cognitive inflexibility in the vulnerability to weight loss in the activity-based anorexia paradigm in rats. The experiments employ a novel in-home cage touchscreen system. This system allows reduced testing time and increased throughput compared with the more widely used systems, making it possible to assess ABA following tests of cognitive flexibility in relatively young female rats. The data demonstrate that, contrary to expectations, cognitive inflexibility does not predispose to greater ABA weight loss, but instead, rats that performed better in the reversal learning task lost more weight in the ABA paradigm. Prior ABA exposure resulted in poorer learning of the task and reversal. An additional experiment demonstrated that rats that had been trained in reversal learning resisted weight loss in the ABA paradigm. The findings are important and are clearly presented. They have implications for anorexia nervosa both in terms of potentially identifying those at risk and in understanding the high rates of relapse.

    4. Reviewer #3 (Public Review):

      Activity-based anorexia (ABA), which combines access to a running wheel and restricted access to food, is one of the most common paradigms used to study anorexic behavior in rodents. And yet, the field has been plagued by persistent questions about its validity as a model of anorexia nervosa (AN) in humans. This group's previous studies supported the idea that the ABA paradigm captures cognitive inflexibility seen in AN. Here they describe a fully automated touchscreen cognitive testing system for rats that makes it possible to ask whether cognitive inflexibility predisposes individuals to severe weight loss in the ABA paradigm. They observed that cognitive inflexibility was predictive of resistance to weight loss in the ABA, the opposite of what was predicted. They also reported reciprocal effects of ABA and cognitive testing on subsequent performance in the other paradigm. Prior exposure to the ABA decreased subsequent cognitive performance, while prior exposure to the cognitive task promoted resistance to the ABA. Based on these findings, the authors argue that the ABA model can be used to identify novel therapeutic targets for AN.

      The strength of this manuscript is primarily as a methods paper describing a novel automated cognitive behavioral testing system that obviates the need for experimenter handling and single housing, which can interfere with behavioral testing, and that accelerates learning on the task. Together, these features make it feasible to perform longitudinal studies to ask whether cognitive performance is predictive of behavior in a second paradigm during adolescence, a peak period of vulnerability for many psychiatric disorders. The authors also used machine learning tools to identify specific behaviors during the cognitive task that predicted later susceptibility to the ABA paradigm. While the benefits of this system are clear, the rigor and reproducibility of experiments using this paradigm would be enhanced if the authors provided clear guidelines about which parameters and analyses are most useful. In their absence, the large amount of data generated can promote p-hacking.

      The authors use their automated behavioral testing paradigm to ask whether cognitive inflexibility is a cause or consequence of susceptibility to ABA, an issue that cannot be addressed in AN. They provide compelling evidence that there are reciprocal effects of the two behavioral paradigms, but do not perform the controls needed to evaluate the significance of these observations. For example, the learning task involves sucrose consumption and food restriction, conditions that can independently affect susceptibility to the ABA. Similarly, the ABA paradigm involves exercise and restricted access to food, which can both affect learning.

      In the Discussion, the authors hypothesize that the ABA paradigm produces cognitive inflexibility and argue that uncovering the underlying mechanism can be used to identify new therapeutic targets for AN. The rationale for their claim of translational relevance is undermined by the fact that the biggest effect of the ABA paradigm is seen in the pair discrimination task, and not reversal learning. This pattern does not fit clinical observations in AN.

      In summary, the significance of this manuscript lies in the development of a new system to test cognitive function in rats that can be combined with other paradigms to explore questions of causality. While the authors clearly demonstrate that cognitive flexibility does not promote susceptibility to ABA, the experiments presented do not provide a compelling case that their model captures important features of the pathophysiology of AN.

    1. eLife assessment

      This study provides fundamental insights into the relationship between single neuron activity in superficial layers of the cortex and electrical signals recorded at the cortical surface. Based on solid measurements, the results indicate a weak correlation between individual layer 2/3 neuron activity and multiunit activity recorded at the surface, whose interpretation could be reinforced. In particular, a strong contribution of layer 1 axons to surface signals is suggested but relies on incomplete evidence.

    2. Reviewer #1 (Public Review):

      This article describes simultaneous surface recordings with a transparent electrode array and two-photon calcium imaging in the mouse cortex. The study shows that spiking activity recorded by surface electrodes and imaged layer 2/3 activity are decoupled. Moreover, simulations indicate that this decoupling may be due to a dominance of L1 projecting axons (input to the cortex) in surface spiking activity.

      This is a rigorous study capitalizing on the new Windansee surface recording device, which provides extremely useful evidence that surface electrodes may not be able to capture information processed in the cortical layers. Recordings and simulations seem adequately performed. The indication that axons contribute significantly to multiunit activity is extremely important for the interpretation of multiunit activity in surface recordings. Here the claim is limited to surface recording, and one wonders to which extent this conclusion would transpose to recordings made with penetration electrodes.

    3. Reviewer #2 (Public Review):

      The manuscript describes a novel transparent electrode array and demonstrates its combination with two-photon calcium imaging in mouse neocortex. Using a computational model, the authors propose that surface multi-unit activity mainly reflects L1 axonal activity and they find a small population of L2/3 neurons that correlates with this activity. While the multi-modal approach with the innovative device in our view is interesting and potentially useful, we have several technical and scientific concerns that should be addressed by the authors.

      Strengths:

      We find the general scope of this manuscript, to establish a hybrid electrophysiological and optical approach for studying neocortical activity, very interesting and relevant. The authors provide a compelling use case for combined ECoG and two-photon imaging. While extracellular action potentials have been recorded from the cortical surface, the underlying source is unknown and the device and techniques introduced by the authors are appropriate to address this question. The introduced device can be implanted chronically and has good long-term stability, providing longitudinal optical and electrical recordings from the cortex. The authors perform recordings in awake, head-fixed animals which provides the opportunity to relate ECoG and single-cell data to the animal's behavioral state. The combination of empirical data and biophysical modelling is a powerful means by which to answer such questions.

      Weaknesses:

      The central claim of the paper relies heavily on the computational model and the physiological data could be more completely analyzed. Based on a sample of 136 L2/3 neurons the authors find a small proportion (13%) that correlates with the ECoG MUA (eMUA). Based on this, they use a model to show that ECoG MUA likely reflects axonal spikes. They then posit that these layer 2/3 neurons are tightly correlated to the layer 1 input. The presentation of their data and the specifics of their model make it difficult to assess the validity of this claim. They do not sufficiently discuss possible confounds in the data, caveats of their model, or alternative explanations of the observed low proportion of L2/3 neurons that correlate with the ECoG MUA.

      Most relevantly, the authors do not measure single units with their ECoG. The eMUA is a complex mixture of many neuronal sources, and interpretation is therefore difficult. They relate the calcium transients of small populations of single L2/3 neurons with the aggregate measure of population activity reflected in eMUA. It is possible that the eMUA reflects population activity in the local circuit and might therefore have a low correlation with individual single units. Critically, there is no information on the sensitivity of calcium recordings. Do the imaging data detect single action potentials, or are they biased to bursts of more than 1 AP?

      The analysis pipeline and values used for computing the correlation coefficients are counterintuitive. The fluorescence data are first interpolated from 15 Hz to 4 kHz and then both eMUA and imaging data are effectively down-sampled to 2 Hz. A single correlation coefficient is then estimated for each neuron, regardless of behavioral state, even though the authors themselves show that the activity of single neurons and the ECoG signal depend on the state of the animal.
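
      To make this concern concrete, the pipeline as described can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: linear interpolation, non-overlapping mean binning, Pearson correlation, and the function names are all hypothetical choices; only the 15 Hz, 4 kHz and 2 Hz rates come from the description above. The second function shows the state-resolved alternative, assuming one behavioural-state label per 2 Hz bin.

```python
import numpy as np

def bin_mean(x, fs, bin_hz):
    """Average a signal sampled at fs (Hz) into non-overlapping bins at bin_hz (Hz)."""
    n = int(round(fs / bin_hz))   # samples per bin
    k = len(x) // n               # number of complete bins
    return x[: k * n].reshape(k, n).mean(axis=1)

def binned_traces(fluo, emua, fs_img=15.0, fs_ephys=4000.0, bin_hz=2.0):
    """Upsample fluorescence to the ephys rate, then bin both signals to bin_hz."""
    t_img = np.arange(len(fluo)) / fs_img
    t_eph = np.arange(len(emua)) / fs_ephys
    # Assumed linear interpolation from the imaging rate to the ephys rate.
    fluo_up = np.interp(t_eph, t_img, fluo)
    return bin_mean(fluo_up, fs_ephys, bin_hz), bin_mean(emua, fs_ephys, bin_hz)

def correlation_by_state(a, b, state):
    """Pearson r computed separately for each behavioural state (one label per bin)."""
    return {s: np.corrcoef(a[state == s], b[state == s])[0, 1]
            for s in np.unique(state)}
```

      A single coefficient per neuron corresponds to `np.corrcoef(a, b)[0, 1]` over all bins; splitting the bins by state, as in `correlation_by_state`, would show whether that single value averages over state-dependent relationships.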

      There is also insufficient information on the weight of the implant and its effect on mouse behavior. How does the movement of implanted and non-implanted mice differ? Must mice be singly housed? Finally, the modeling parameters are highly specific, using independently driving spikes, while the activity of neurons can be highly correlated. Likewise, the contribution of tangentially oriented axons that could relate to long-range connections conveying information related to the animal's motion or level of arousal is not considered. The manuscript would benefit from further analysis of the physiological data, consideration of alternative explanations and forthright discussion of limitations and caveats of their device and approach.

    4. Reviewer #3 (Public Review):

      The authors have developed a new form of transparent surface multielectrode integrated into an imaging window, enabling simultaneous recording of electrical activity at the surface of the cortex combined with two-photon imaging through the window and electrode. The authors characterise the electrical signals and use simulations to argue that they reflect the activity of axons in layer 1. This is then correlated with calcium imaging signals from layer 2/3 pyramidal cells. A subset of these displayed strong correlations with the layer 1 activity.

      The raw electrical recordings appear to be contaminated by large movement artefacts. The authors attempt to decompose the signal into neuronal activity and artefact. The independent component analysis (ICA) employed yields plausible results. However, there is no definitive validation of this procedure.

      The simulations strongly suggest that only layer 1 axons will generate significant neuronal signals at the surface, but the authors have not attempted to reconstruct the multiunit activity in the simulations, which could provide additional assurance for their interpretation.

      A small fraction of pyramidal cells has activity strongly correlated with the signal at the surface electrode. However, the authors have not examined whether the distance from neuron to the electrode influences the strength of correlation. It remains possible that the differential correlation reflects a distance effect rather than the existence of two populations.

    1. eLife assessment

      This manuscript provides valuable insight into the molecular mechanism by which destabilized mitochondrial proteins 'clog' import channels and contribute to the pathologic mitochondrial and cellular dysfunction implicated in human disease. The evidence supporting this conclusion is solid, utilizing yeast, mammalian cell culture, and mouse models. However, additional characterization of import clogging in the mammalian model systems would strengthen this study. This work will be of broad interest to researchers in the fields of mitochondrial biology, protein quality control and proteostasis.

    2. Reviewer #1 (Public Review):

      This is an interesting manuscript that highlights the potential for 'clogging' of import channels by mutant proteins to promote mitochondrial dysfunction in disease. One of the challenges with this study is deconvoluting potential loss-of-function phenotypes associated with reductions in ANT1/AAC2 from gain-of-toxicity phenotypes linked to import clogging. This was addressed primarily in yeast, through phenotypes associated with overexpression of the mutants (e.g., reduced growth on glucose media). The experiment showing that yeast AAC2 clogs import was also convincing, including both in vitro and in vivo characterization, although it isn't clear why the proteomic experiments were performed with acute expression of A128P instead of the 'superclogger' double mutant. The extension of this work to mammalian cells and then mice is also admirable. However, the quality of characterization does begin to decline when moving into mammalian models. For example, there is no clear evidence that observed phenotypes can be attributed to gain of toxicity instead of loss of function in mammalian cells and mice. There are similarities to yeast, but this needs to be better defined in my opinion. Lastly, I have questions related to the mouse model, such as how these phenotypes compare to KO animals and why homozygous mice were used in some situations and heterozygous mice in others.

      Overall, this manuscript is interesting, as it describes a mechanism whereby mutant proteins can lead to import deficiencies in the context of disease. The strengths primarily reside with the yeast work, where the demonstration of import clogging and the functional implications of this clogging are best defined. The transition to mammalian cells and mice is admirable as well, but doesn't reach the same level of characterization, leaving open the possibility that the observed effects could be attributed (at least in part) to loss of function of ANT1.

    3. Reviewer #2 (Public Review):

      Mitochondrial dysfunction is now widely recognized as an underlying cause of many human diseases. In many cases, however, very little is known about the molecular etiology of mitochondrial disorders. In this comprehensive study Coyne et al. describe a mechanism by which dominant pathogenic variants of adenine nucleotide translocase Aac2p/ANT1 impair the mitochondrial protein import pathway, leading to cytotoxicity and mitochondrial dysfunction. By elucidating the fate of this protein in yeast, human cell culture, and murine models, the authors showed that mutant Aac2p variants accumulate at the outer membrane translocase TOM complex, jamming up mitochondrial protein import and affecting the TIM22-mediated carrier import pathway, thus causing proteostatic stress. Furthermore, they showed that the i-AAA protease Yme1p and not the ubiquitin-proteasome system is responsible for proteolytic removal of the mutant Aac2p variants. Finally, they demonstrated that mitochondrial protein import clogging caused by the ANT1 A114P, A123D variant causes a severe dominant neurodegenerative phenotype in mice, which resembles neuromuscular disease manifestations in humans. The authors propose this as a candidate pathological mechanism in ANT1-linked human disorders and, by extension, in other diseases arising from defects in mitochondrial protein import.

      Overall, this is a well-designed and thoroughly executed study that reports on a novel aspect of ANT1 associated dysfunction and provides mechanistic insights into the pathological mechanisms at play.

    4. Reviewer #3 (Public Review):

      Dominant pathogenic variants of the Aac2/Ant1 ATP transporter cause disease by an unknown mechanism. In this manuscript the authors aim to reveal how these gain of function mutants impair cellular and mitochondrial health. To characterize the phenotype of Aac2 mutants in yeast, the authors use a series of single and double Aac2 mutations, within the 2nd and 3rd transmembrane domains, that are associated with human diseases. The Aac2A128P,A137D mutant, which caused high toxicity and damaged the mitochondrial DNA, was selected for further analysis. This mutant was not imported efficiently into mitochondria and exhibited an increased association with TOM, suggesting that it clogs the TOM translocase. As a result, expression of Aac2A128P,A137D led to impaired import of other mitochondrial proteins. Several findings suggested that the single mutant Aac2A128P impaired mitochondrial import in a similar manner: 1. mass spec analysis revealed its increased association with cytosolic chaperones, TOM and TIM22 subunits, 2. Aac2A128P overexpression led to global mitochondrial protein import deficiency, demonstrated by HSP60 precursor accumulation and activation of stress responses (transcription of chaperones, proteasome induction, and CIS1).

      Parallel mutants of human Ant1 (Ant1A114P and Ant1A114P,A123D) were ectopically expressed in HeLa cells. The mutants were demonstrated to clog TOM and cause a global defect in mitochondrial protein import. This was confirmed in tissues from Ant1A114P,A123D/+ knock-in mice. The Ant1A114P,A123D/+ mice exhibited decreased maximal mitochondrial respiration in muscles. Examination of the skeletal muscle myofiber diameter and COX and SDH activity revealed that Ant1A114P,A123D expression in heterozygous mice acts dominantly and causes a myopathic phenotype and in some cases neurodegeneration.

      Major strengths -

      The ability of proteins to clog TOM and subsequently disrupt protein import into mitochondria was demonstrated in recent years. However, until now this was achieved using chemicals, artificial cloggers and overexpression of mitochondrial proteins. This study reveals, for the first time, that disease-associated variants of native mitochondrial proteins can clog the entry into the organelle. Thus, this work demonstrates that TOM clogging is a physiologically relevant phenomenon that is involved in human diseases.

      The manuscript is well-written and the experiments are well-designed, presenting convincing data that mostly support the conclusions. The methods used are well-established and suitable techniques that are often used in the field. This work took advantage of 3 different biological systems/model organisms (yeast, cell culture, and mouse tissues) to validate the results, show conservation, and exploit the strengths of each system.

      Overall, this study is impactful, greatly contributes to the field and should be of interest to the general scientific community. The work sheds light on the mechanisms by which Ant1 pathogenic mutants impact cellular health and provides evidence for the involvement of translocase clogging and impaired protein import in human diseases. The gain of function Aac2/Ant1 mutants will provide a new and powerful tool for future studies of mitochondrial quality control and repair mechanisms.

      Major weaknesses -

      1. The evidence for clogging of mitochondrial translocases and for a general defect in protein import is solid. However, there is not enough evidence to conclude that all phenotypes seen in mice and yeast are directly connected to clogging.

      2. This work implies that Aac2/Ant1 variants can clog TOM, TIM22, or both. Clogging of TIM22 is novel and interesting but is not fully discussed in the manuscript, nor is the possibility that clogging of different translocases can result in different defects.

    1. eLife assessment

      This manuscript tests an important assumption about how sensory information is processed and used to guide motor choices. The widely held assumption is that sensory-motor circuits are capable of integrating evidence, but the validity and generality of this 'principle' have been recently questioned by studies suggesting that other computational operations may lead to similar psychophysical results, mimicking integration without actually performing it. This study makes a compelling case that the integration assumption was likely correct all along and that the model mimicry can be easily disambiguated by using appropriate sensory stimuli and task designs that permit rigorous analyses.

    2. Reviewer #1 (Public Review):

      This manuscript presents a comparison between models that may explain psychophysical performance in sensory integration tasks, where a subject essentially has to count stimulus samples and make a motor report about the final count.

      The work has many technical strengths:

      - The problem of model mimicry is clearly articulated.

      - The work shows that the use of discrete sample stimulus (DSS) is key for being able to disambiguate multiple candidate mechanisms that could possibly underlie the observed behavioral data.

      - The authors use rigorous model comparison and analysis techniques, some (like the integration maps) newly developed for the current application.

      - The model comparison involves both qualitative and quantitative contrasts between alternative models.

      - Consistent results are obtained with several data sets involving humans, monkeys, and rats.

      - The results provide insight into why the simpler alternative models (the snapshot and extrema detection models) fail.

      No glaring weaknesses were found in this manuscript. However, there are some limitations that are worth noting, to put things into context:

      - The results are consistent with what has become a well-known principle of operation of sensory-motor circuits, namely, that they are highly effective at integrating sensory evidence over time. Thus, the results are not particularly surprising.

      - The results are valuable in that they specifically refute two mechanisms that had been recently proposed as potential alternatives to the more standard temporal integration. To some, these alternative mechanisms may have seemed somewhat far-fetched to begin with, as they would lead to suboptimal performance in general. Nevertheless, settling the question was important.

      - Temporal integration and accumulation of evidence have been the focus of many computational studies in systems neuroscience. Although these are certainly important functions, sensory-guided choices require the deployment and coordination of numerous sensory, motor, and cognitive mechanisms, of which integration is just one.

      Overall, this is a valuable study that has important theoretical implications in the field of computational neuroscience. It presents a compelling case that temporal integration is a common capability of sensory-motor circuits and that it explains a variety of behavioral data sets much better than two simpler, alternative mechanisms.

    3. Reviewer #2 (Public Review):

      The authors' goal is to uncover the most likely method used by mammals to make choices based on a time-limited stream of noisy incoming sensory data. To achieve this, they analyze with great rigor several large datasets obtained from tightly controlled two-alternative forced choice behavioral experiments. The tight control of fluctuating incoming sensory input over a large number of trials allows the authors to extract the influence of different components of that input on the behavioral choice. The conditional analysis, showing the impact of early information on the importance of later information, or vice versa, is an excellent new technique.

      They compare three models and find one based on a form of weighted integration of evidence across time is very strongly favored compared to models in which only short segments of the sensory input are used, or the most extreme fluctuations of the sensory input generate a response. Overall, the results clearly do indicate that the integration-like family of models outperforms the other families. The authors succeed well in giving a fair comparison of the different families of models, allowing multiple parameters to be optimized to test different versions of each model.
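
      For readers unfamiliar with the three model families, their decision rules can be sketched as follows. This is a deliberately simplified, noise-free illustration and not the fitted models from the manuscript: the function names, the default uniform weights, the fixed threshold and the random guess when no sample crosses it are all assumptions made here for clarity.

```python
import numpy as np

def integration_choice(samples, weights=None):
    """Choose the sign of the (optionally weighted) sum of all evidence samples."""
    samples = np.asarray(samples, dtype=float)
    w = np.ones_like(samples) if weights is None else np.asarray(weights, dtype=float)
    return 1 if np.dot(w, samples) > 0 else -1

def snapshot_choice(samples, index):
    """Choose based on a single attended sample and ignore the rest."""
    return 1 if samples[index] > 0 else -1

def extrema_choice(samples, threshold, rng=None):
    """Commit to the sign of the first sample whose magnitude reaches the threshold."""
    for s in samples:
        if abs(s) >= threshold:
            return 1 if s > 0 else -1
    # Assumed fallback: guess at random if no sample is extreme enough.
    rng = np.random.default_rng() if rng is None else rng
    return int(rng.choice([-1, 1]))
```

      With a stimulus such as `[0.4, 0.4, 0.4, -1.0, 0.4]`, the integration rule favours the positive choice, whereas extrema detection with a threshold of 0.9 and a snapshot on the fourth sample both favour the negative one. Discrete sample stimuli make such dissociations measurable, because the contribution of each sample to the choice can be estimated.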

      It should be said that the integration model is a strange type of integration, as the weight of incoming evidence depends on the time at which it arrives (by a factor of 4 in one animal; Fig. 2), with an over-weighting of evidence in the middle of the sequence in one case and the more expected effects of primacy and recency (over-weighting of early or late evidence) in another. It would be nice to see more discussion of how these differences might arise across animals, what it may say about the neural circuit performing such unbalanced integration, and how suboptimal such differential weighting of evidence is. This is important, as in some discussions integration is contrasted with state transitions, which are akin to integration over a barrier, and not necessarily ruled out by the models compared here.

    1. Author Response:

      We would like to thank both reviewers and editors for their time and effort in reviewing our work, and the thoughtful suggestions made.

      Reviewer #1 (Public Review):

      […] The experiments are well-designed and carefully conducted. The conclusions of this work are in general well supported by the data. There are a couple of points that need to be addressed or tested.

      1) It is unclear how LC phasic stimulation used in this study gates cortical plasticity without altering cellular responses (at least at the calcium imaging level). As the authors mentioned that Polack et al 2013 showed a significant effect of NE blockers on membrane potential and firing rate in V1 layer 2/3 neurons during locomotion, it would be useful to test the effect of LC silencing (coupled to mismatch training) on both cellular response and cortical plasticity, or to apply NE antagonists in V1 in addition to LC optical stimulation. The latter experiment will also address which neuromodulator mediates plasticity, given that LC could co-release other modulators such as dopamine (Takeuchi et al. 2016 and Kempadoo et al. 2016). An LC silencing experiment would establish a causal effect more convincingly than the activation experiment.

      Regarding the question of how phasic stimulation could alter plasticity without affecting the response sizes or activity in general, we believe there are possibilities supported by previous literature. It has been shown that catecholamines can gate plasticity by acting on eligibility traces at synapses (He et al., 2015; Hong et al., 2022). In addition, all catecholamine receptors are metabotropic and influence intracellular signaling cascades, e.g., via adenylyl cyclase and phospholipases. Catecholamines can gate LTP and LTD via these signaling pathways in vitro (Seol et al., 2007). Both of these influences on plasticity at the molecular level do not necessitate or predict an effect on calcium activity levels. We will expand on this in the discussion of the revised manuscript.

      While a loss of function experiment could add additional corroborating evidence that LC output is required for the plasticity seen, we did not perform loss-of-function experiments for three reasons:

      1. The effects of artificial activity changes around the physiological set point are likely not linear for increases and decreases. The problem with a loss of function experiment here is that neuromodulators like noradrenaline affect general aspects of neuronal function. This is apparent in Polack et al., 2013: during the pharmacological blocking experiment, the membrane hyperpolarizes, membrane variance becomes very low, and the cells are effectively silenced (Figure 7 of (Polack et al., 2013)), demonstrating an immediate impact on neuronal function when noradrenaline receptor activation is presumably taken below physiological/waking levels. In light of this, if we reduce LC output/noradrenergic receptor activation and find that plasticity is prevented, this could be the result of a direct influence on the plasticity process, or the result of a disruption of another aspect of neuronal function, like synaptic transmission or spiking. We would therefore challenge the reviewer’s statement that a loss-of-function experiment would establish a causal effect more convincingly than the gain-of-function experiment that we performed.

      2. The loss-of-function experiment is technically more difficult both in implementation and interpretation. Control mice show no sign of plasticity in locomotion modulation index (LMI) on the 10-minute timescale (Figure 4J), thus we would not expect to see any effect when blocking plasticity in this experiment. We would need to use dark-rearing and coupled-training of mice in the VR across development to elicit the relevant plasticity ((Attinger et al., 2017); manuscript Figure 5). We would then need to silence LC activity across days of VR experience to prevent the expected physiological levels of plasticity. Applying NE antagonists in V1 over the entire period of development seems very difficult. This would leave optogenetically silencing axons locally, which in addition to the problems of doing this acutely (Mahn et al., 2016; Raimondo et al., 2012), has not been demonstrated to work chronically over the duration of weeks. Thus, a negative result in this experiment will be difficult to interpret, and likely uninformative: We will not be able to distinguish whether the experimental approach did not work, or whether local LC silencing does nothing to plasticity.

        Note that pharmacologically blocking noradrenaline receptors during LC stimulation in the plasticity experiment is also particularly challenging: they would need to be blocked throughout the entire 15 minute duration of the experiment with no changes in concentration of antagonist between the ‘before’ and ‘after’ phases, since the block itself is likely to affect the response size, as seen in Polack et al., 2013, creating a confound for plasticity-related changes in response size. Thus, we make no claim about which particular neuromodulator released by the LC is causing the plasticity.

      3. There are several loss-of-function experiments reported in the literature using different developmental plasticity paradigms alongside pharmacological or genetic knockout approaches. These experiments show that chronic suppression of noradrenergic receptor activity prevents ocular dominance plasticity and auditory plasticity (Kasamatsu and Pettigrew, 1976; Shepard et al., 2015). Almost absent from the literature, however, are convincing gain-of-function plasticity experiments.

      Overall, we feel that loss-of-function experiments may be a possible direction for future work but, given the technical difficulty and – in our opinion – limited benefit that these experiments would provide in light of the evidence already provided for the claims we make, we have chosen not to perform these experiments at this time. Note that we already discuss some of the problems with loss-of-function experiments in the discussion.

      2) The cortical responses to NE often exhibit an inverted U-curve, with higher or lower doses of NE showing more inhibitory effects. It is unclear how responses induced by optical LC stimulation compare or interact with the physiological activation of the LC during the mismatch. Since the authors only used one frequency stimulation pattern, some discussion or additional tests with a frequency range would be helpful.

      This is correct: we do not know how the artificial activation of LC axons relates to physiological activation, e.g. under mismatch. The stimulation strength is intrinsically consistent in our study in the sense that the stimulation level to test for changes in neuronal activity is similar to that used to probe for plasticity effects. We suspect that the artificial activation results in much stronger LC activity than seen during mismatch responses, given that no sign of the plasticity in LMI seen in high ChrimsonR occurs in low ChrimsonR or control mice (Figure 4J). Note that our conclusions do not rely on the assumption that the stimulation is matched to physiological levels of activation during the visuomotor mismatches that we assayed. The hypothesis that we put forward is that increasing levels of activation of the LC (reflecting increasing rates or amplitude of prediction errors across the brain) will result in increased levels of plasticity. We know that LC axons can reach levels of activity far higher than that seen during visuomotor mismatches, for instance during air puff responses, which constitute a form of positive prediction error (unexpected tactile input) (Figures 2C and S1C). The visuomotor mismatches used in this study were only used to demonstrate that LC activity is consistent with prediction error signaling. We will expand on these points in the discussion as suggested.

      Reviewer #2 (Public Review):

      […] The study provides very compelling data on a timely and fascinating topic in neuroscience. The authors carefully designed experiments and corresponding controls to exclude any confounding factors in the interpretation of neuronal activity in LC axons and cortical neurons. The quality of the data and the rigor of the analysis are important strengths of the study. I believe this study will make an important contribution to the field of systems neuroscience by shedding new light on the role of a key neuromodulator. The results provide strong support for the claims of the study. However, I also believe that some results could have been strengthened by providing additional analyses and experimental controls. These points are discussed below.

      Calcium signals in LC axons tend to respond with pupil dilation, air puffs, and locomotion as the authors reported. A more quantitative analysis such as a GLM model could help understand the relative contribution (and temporal relationship) of these variables in explaining calcium signals. This could also help compare signals obtained in the sensory and motor cortical domains. Indeed, the comparison in Figure 2 seems a bit incomplete since only "posterior versus anterior" comparisons have been performed and not within-group comparisons. I believe it is hard to properly assess differences or similarities between calcium signal amplitude measured in different mice and cranial windows as they are subject to important variability (caused by different levels of viral expression for instance). The authors should at the very least provide a full statistical comparison between/within groups through a GLM model that would provide a more systematic quantification.

      We will implement an improved analysis in the revised version of the manuscript.

      Previous studies using stimulations of the locus coeruleus or local iontophoresis of norepinephrine in sensory cortices have shown robust response modulations (see McBurney-Lin et al., 2019, https://doi.org/10.1016/j.neubiorev.2019.06.009 for a review). The weak modulations observed in this study seem at odds with these reports. Given that the density of ChrimsonR-expressing axons varies across mice and that there are no direct measurements of their activation (besides pupil dilation), it is difficult to appreciate how they impact the local network. How does the density of ChrimsonR-expressing axons compare to the actual density of LC axons in V1? The authors could further discuss this point.

      In terms of estimating the percentage of cortical axons labelled based on our axon density measurements: we refer to cortical LC axonal immunostaining in the literature to make this comparison. In motor cortex, an average axon density of 0.07 µm/µm² has been reported (Yin et al., 2021), and 0.09 µm/µm² in prefrontal cortex (Sakakibara et al., 2021). Density of LC axons varies by cortical area, with higher density in motor cortex and medial areas than sensory areas (Agster et al., 2013): V1 axon density is roughly 70% of that in cingulate cortex (adjacent to motor and prefrontal cortices) (Nomura et al., 2014). So, we approximate a maximum average axon density in V1 of approximately 0.056 µm/µm². Because these published measurements were made from images taken of tissue volumes with larger z-depth (~ 10 µm) than our reported measurements (~ 1 µm), they appear much larger than the ranges reported in our manuscript (0.002 to 0.007 µm/µm²). We repeated the measurements in our data using images of volumes with 10 µm z-depth, and find that the density of labelled axons in high ChrimsonR-expressing mice ranges from 0.012 to 0.039 µm/µm². This corresponds to between 20% and 70% of the density we would expect based on previous work. Note that this is a potentially significant underestimate, and therefore should be used as a lower bound: analyses in the literature use images from immunostaining, where the signal to background ratio is very high. In contrast, we did not transcardially perfuse our mice, leading to significant background (especially in the pia/L1, where axon density is high - (Agster et al., 2013; Nomura et al., 2014)), and the intensity of the tdTomato is not especially high. We are therefore likely missing some narrow, dim, and superficial fibers in our analysis.
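
      The arithmetic behind this estimate can be retraced in a few lines. The sketch below is illustrative only: it assumes that the reference density is the plain mean of the two published values (0.07 and 0.09), which reproduces the 0.056 figure quoted above but is not stated explicitly in the response. All variable names are hypothetical.

```python
# Worked arithmetic for the V1 axon density estimate (all values in µm/µm²).
motor_density = 0.07       # motor cortex (Yin et al., 2021)
prefrontal_density = 0.09  # prefrontal cortex (Sakakibara et al., 2021)
v1_fraction = 0.70         # V1 relative to cingulate cortex (Nomura et al., 2014)

# Assumption: the reference density is the mean of the two published values.
reference_density = (motor_density + prefrontal_density) / 2
expected_v1_density = v1_fraction * reference_density

# Range measured in high ChrimsonR-expressing mice at 10 µm z-depth.
measured_low, measured_high = 0.012, 0.039

fraction_low = measured_low / expected_v1_density
fraction_high = measured_high / expected_v1_density

print(f"expected V1 density: {expected_v1_density:.3f}")
print(f"labelled fraction: {fraction_low:.0%} to {fraction_high:.0%}")
```

      Under this assumption the labelled fraction comes out at roughly 21% to 70%, in line with the range quoted above.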

      We also can quantify how our variance in axonal labelling affects our results: For the dataset in Figure 3, there doesn’t appear to be any correlation between the level of expression and the effect of stimulating the axons on the mismatch or visual flow responses for each animal (Figure R1: https://imgur.com/gallery/Yl60hnT), while there is a significant correlation between the level of expression and the pupil dilation, consistent with the dataset shown in Figure 4. Thus, even in the most highly expressing mice, there is no clear effect on average response size at the level of the population. We will add these correlations to the revised manuscript.
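
      A per-animal correlation of this kind could be computed as in the minimal sketch below. The function name and the numbers are hypothetical placeholders chosen only to exercise the code; they are not data from the study, and the response does not state which correlation measure was used (Pearson's r is assumed here).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Synthetic placeholder values, one entry per animal (NOT real measurements).
expression_level = [1.0, 2.0, 3.0, 4.0]
pupil_dilation = [0.1, 0.2, 0.3, 0.4]     # built to track expression level
response_change = [0.2, -0.2, -0.2, 0.2]  # built to be uncorrelated with it

r_pupil = pearson_r(expression_level, pupil_dilation)
r_response = pearson_r(expression_level, response_change)
print(r_pupil, r_response)
```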

      To our knowledge, there has not yet been any similar experiment reported utilizing local LC axonal optogenetic stimulation while recording cortical responses, so when comparing our results to those in the literature, there are several important methodological differences to keep in mind. The vast majority of the work demonstrating an effect of LC output/noradrenaline on responses in the cortex has been done using unit recordings, and while results are mixed, these have most often demonstrated a suppressive effect on spontaneous and/or evoked activity in the cortex (McBurney-Lin et al., 2019). In contrast to these studies, we do not see a major effect of LC stimulation either on baseline or evoked calcium activity (Figure 3), and, if anything, we see a minor potentiation of transient visual flow onset responses (see also Figure R2). There could be several reasons why our stimulation does not have the same effect as these older studies:

      1. Recording location: Unit recordings are often very biased toward highly active neurons (Margrie et al., 2002) and deeper layers of the cortex, while we are imaging from layer 2/3 – a layer notorious for sparse activity. In one of the few papers to record from superficial layers, it has been demonstrated that deeper layers in V1 are affected differently by LC stimulation methods compared to more superficial ones (Sato et al., 1989), with suppression more common in superficial layers. Thus, some differences between our results and those in the majority of the literature could simply be due to recording depth and the sampling bias of unit recordings.

      2. Stimulation method: Most previous studies have manipulated LC output/noradrenaline levels by either iontophoretically applying noradrenergic receptor agonists or electrically stimulating the LC. Arguably, even though our optogenetic stimulation is still artificial, it represents a more physiologically relevant activation compared to iontophoresis, since the LC releases a number of neuromodulators including dopamine, and these will be released in a more physiological manner in the spatial domain and in terms of neuromodulator concentration. Electrical stimulation of the LC as used by previous studies differs from our optogenetic method in that LC axons will be stimulated across much wider regions of the brain (affecting both the cortex and many of its inputs), and it is not clear whether the cause of cortical response changes is cortical or subcortical. In addition, electrical LC stimulation is not cell-type specific.

      3. Temporal features of stimulation: Few previous studies had the same level of temporal control over manipulating LC output that we had using optogenetics. Given that electrical stimulation generates electrical artifacts, coincident stimulation during the stimulus was not used in previous studies. Instead, the LC is often repeatedly or tonically stimulated, sometimes for many seconds, prior to the stimulus being presented. Iontophoresis also does not have the same temporal specificity and will lead to tonically raised receptor activity over a time course determined by washout times.

      4. State specificity: Most previous studies have been performed under anesthesia – which is known to impact noradrenaline levels and LC activity (Müller et al., 2011). Thus, the acute effects of LC stimulation are likely not comparable between the anesthetized and the awake animal.

      Due to these differences, it is hard to infer why our results differ compared to other papers. The study with the most similar methodology to ours is (Vazey et al., 2018), which used optogenetic stimulation directly into the mouse LC while recording spiking in deep layers of the somatosensory cortex with extracellular electrodes. Like us, they found that phasic optogenetic stimulation alone did not alter baseline spiking activity (Figure 2F of Vazey et al., 2018), and they found that in layers 5 and 6, short latency transient responses to foot touch were potentiated and recruited by simultaneous LC stimulation. While this finding appears more overt than the small modulations we see, it is qualitatively not so dissimilar from our finding that transient responses appear to be slightly potentiated when visual flow begins (Figure R2). Differences in the degree of the effect may be due to differences in the layers recorded, the proportion of the LC recruited, or the fact that anesthesia was used in Vazey et al., 2018.

      Note that we only used one set of stimulation parameters for optogenetic stimulation, and it is always possible that using different parameters would result in different effects. We will add a discussion on the topic to the revised manuscript.

      In the analysis performed in Figure 3, it seems that red light stimulations used to drive ChrimsonR also have an indirect impact on V1 neurons through the retina. Indeed, figure 3D shows a similar response profile for ChrimsonR and control with calcium signals increasing at laser onset (ON response) and offset (OFF response). With that in mind, it is hard to interpret the results shown in Figure 3E-F without seeing the average calcium time course for Control mice. Are the responses following visual flow caused by LC activation or additional visual inputs? The authors should provide additional information to clarify this result.

      This is a good point. When we plot the average difference between the stimulus response alone and the optogenetic stimulation + stimulus response, we do indeed find that there is a transient increase in response at the visual flow onset (and the offset of mismatch, which is where visual flow resumes), and this is only seen in ChrimsonR-expressing mice (Figure R2: https://imgur.com/gallery/cqN2Khd). We therefore believe that these enhanced transients at visual flow onset could be due to the effect of ChrimsonR stimulation, and indeed previous studies have shown that LC stimulation can reduce the onset latency and latency jitter of afferent-evoked activity (Devilbiss and Waterhouse, 2004; Lecas, 2004), an effect which could mediate the differences we see. We will add this analysis to the revised manuscript.
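
      The difference analysis described here amounts to subtracting the trial-averaged response to the stimulus alone from the trial-averaged response to the stimulus paired with optogenetic stimulation. The sketch below illustrates that subtraction under simplifying assumptions (equal-length, already aligned trials); the function names and the toy traces are hypothetical and are not taken from the manuscript's analysis code or data.

```python
def mean_trace(trials):
    """Average a list of equal-length trials sample by sample."""
    if not trials:
        raise ValueError("at least one trial is required")
    n_samples = len(trials[0])
    if any(len(trial) != n_samples for trial in trials):
        raise ValueError("all trials must have the same number of samples")
    return [sum(trial[i] for trial in trials) / len(trials)
            for i in range(n_samples)]


def difference_trace(stim_plus_opto_trials, stim_only_trials):
    """Mean (stimulus + optogenetic stimulation) minus mean (stimulus alone)."""
    with_opto = mean_trace(stim_plus_opto_trials)
    without_opto = mean_trace(stim_only_trials)
    if len(with_opto) != len(without_opto):
        raise ValueError("both conditions must share the same time base")
    return [a - b for a, b in zip(with_opto, without_opto)]


# Toy traces (NOT real data): three time samples, two trials per condition.
stim_only = [[0.0, 1.0, 0.5], [0.0, 1.0, 0.5]]
stim_plus_opto = [[0.0, 1.5, 0.5], [0.0, 1.5, 0.5]]

diff = difference_trace(stim_plus_opto, stim_only)
print(diff)
```

      Computing the same difference separately for ChrimsonR-expressing and control mice is what allows a light-evoked visual response, which would be present in both groups, to be told apart from an effect of axonal stimulation.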

      Some aspects of the described plasticity process remained unanswered. It is not clear over which time scale the locomotion modulation index changes and how many optogenetic stimulations are necessary or sufficient to saturate this index. Some of these questions could be addressed with the dataset of Figure 3 by measuring this index over different epochs of the imaging session (from early to late) to estimate the dynamics of the ongoing plasticity process (in comparison to control mice). Also, is there any behavioural consequence of plasticity/update of functional representation in V1? If plasticity gated by repeated LC activations reproduced visuomotor responses observed in mice that were exposed to visual stimulation only in the virtual environment, then I would expect to see a change in the locomotion behaviour (such as a change in speed distribution) as a result of the repeated LC stimulation. This would provide more compelling evidence for changes in internal models for visuomotor coupling in relation to its behavioural relevance. An experiment that could confirm the existence of the LC-gated learning process would be to change the gain of the visuomotor coupling and see if mice adapt faster with LC optogenetic activation compared to control mice with no ChrimsonR expression. Authors should discuss how they imagine the behavioural manifestation of this artificially-induced learning process in V1.

      Regarding the question of plasticity time course: Unfortunately, owing to the paradigm used in Figure 3, the time course of the plasticity will not be quantifiable from this experiment. This is because in the first 10 minutes, the mouse is in closed loop visuomotor VR experience, undergoing optogenetic stimulation (this is the time period in which we record mismatches). We then shift to the open loop session to quantify the effect of optogenetic stimulation on visual flow responses. Since the plasticity is presumably happening during the closed loop phase, and we have no read-out of the plasticity during this phase (we do not have uncoupled visual flow onsets to quantify LMI in closed loop), it is not possible to track the plasticity over time.

      Regarding the behavioral relevance of the plasticity: The type of plasticity we describe here is consistent with predictive, visuomotor plasticity in the form of a learned suppression of responses to self-generated visual feedback during movement. Intuitive purposes of this type of plasticity would be 1) to enable better detection of external moving objects by suppressing the predictable (and therefore redundant) self-generated visual motion and 2) to better detect changes in the geometry of the world (near objects have a larger visuomotor gain than far objects). In our paradigm, we have no intuitive read-out of the mouse’s perception of these things, and it is not clear to us that they would be reflected in locomotion speed, which does not differ between groups (manuscript Figure S5). Instead, we would need to turn to other paradigms for a clear behavioral read-out of predictive forms of sensorimotor learning: for instance, sensorimotor learning paradigms in the VR (such as those used in (Heindorf et al., 2018; Leinweber et al., 2017)), or novel paradigms that reinforce the mouse for detecting changes in the gain of the VR, or moving objects in the VR, using LC stimulation during the learning phase to assess if this improves acquisition. This is certainly a direction for future work. In the case of a positive effect, however, the link between the precise form of plasticity we quantify in this manuscript and the effect on the behavior would remain indirect, so we see this as beyond the scope of the manuscript. We will add a discussion on this topic to the revised manuscript.

      Finally, control mice used as a comparison to mice expressing ChrimsonR in Figure 3 were not injected with a control viral vector expressing a fluorescent protein alone. Although it is unlikely that the procedure of injection could cause the results observed, it would have been a better control for the interpretation of the results.

      We agree that this indeed would have been a better control. However, we believe that this is fortunately not a major problem for the interpretation of our results for two reasons:

      1. The control and ChrimsonR expressing mice do not show major differences in the effect of optogenetic LC stimulation at the level of the calcium responses for all results in Figure 3, with the exception of the locomotion modulation indices (Figure 3I). Therefore, in terms of response size, there is no major effect compared to control animals that could be caused by the injection procedure, apart from marginally increased transient responses to visual flow onset – and, as the reviewer notes, it is difficult to see how the injection procedure would cause this effect.

      2. The effect on locomotion modulation index (Figure 3I) was replicated with another set of mice in Figure 4C, for which we did have a form of injected control (‘Low ChrimsonR’), which did not show the same plasticity in locomotion modulation index (Figure 4E). We therefore know that at least the injection itself is not resulting in the plasticity effect seen.

      References:

      • Agster, K.L., Mejias-Aponte, C.A., Clark, B.D., Waterhouse, B.D., 2013. Evidence for a regional specificity in the density and distribution of noradrenergic varicosities in rat cortex. Journal of Comparative Neurology 521, 2195–2207. https://doi.org/10.1002/cne.23270

      • Attinger, A., Wang, B., Keller, G.B., 2017. Visuomotor Coupling Shapes the Functional Development of Mouse Visual Cortex. Cell 169, 1291-1302.e14. https://doi.org/10.1016/j.cell.2017.05.023

      • Devilbiss, D.M., Waterhouse, B.D., 2004. The Effects of Tonic Locus Ceruleus Output on Sensory-Evoked Responses of Ventral Posterior Medial Thalamic and Barrel Field Cortical Neurons in the Awake Rat. J. Neurosci. 24, 10773–10785. https://doi.org/10.1523/JNEUROSCI.1573-04.2004

      • He, K., Huertas, M., Hong, S.Z., Tie, X., Hell, J.W., Shouval, H., Kirkwood, A., 2015. Distinct Eligibility Traces for LTP and LTD in Cortical Synapses. Neuron 88, 528–538. https://doi.org/10.1016/j.neuron.2015.09.037

      • Heindorf, M., Arber, S., Keller, G.B., 2018. Mouse Motor Cortex Coordinates the Behavioral Response to Unpredicted Sensory Feedback. Neuron 0. https://doi.org/10.1016/j.neuron.2018.07.046

      • Hong, S.Z., Mesik, L., Grossman, C.D., Cohen, J.Y., Lee, B., Severin, D., Lee, H.-K., Hell, J.W., Kirkwood, A., 2022. Norepinephrine potentiates and serotonin depresses visual cortical responses by transforming eligibility traces. Nat Commun 13, 3202. https://doi.org/10.1038/s41467-022-30827-1

      • Kasamatsu, T., Pettigrew, J.D., 1976. Depletion of brain catecholamines: failure of ocular dominance shift after monocular occlusion in kittens. Science 194, 206–209. https://doi.org/10.1126/science.959850

      • Lecas, J.-C., 2004. Locus coeruleus activation shortens synaptic drive while decreasing spike latency and jitter in sensorimotor cortex. Implications for neuronal integration. European Journal of Neuroscience 19, 2519–2530. https://doi.org/10.1111/j.0953-816X.2004.03341.x

      • Leinweber, M., Ward, D.R., Sobczak, J.M., Attinger, A., Keller, G.B., 2017. A Sensorimotor Circuit in Mouse Cortex for Visual Flow Predictions. Neuron 95, 1420-1432.e5. https://doi.org/10.1016/j.neuron.2017.08.036

      • Mahn, M., Prigge, M., Ron, S., Levy, R., Yizhar, O., 2016. Biophysical constraints of optogenetic inhibition at presynaptic terminals. Nat Neurosci 19, 554–556. https://doi.org/10.1038/nn.4266

      • Margrie, T.W., Brecht, M., Sakmann, B., 2002. In vivo, low-resistance, whole-cell recordings from neurons in the anaesthetized and awake mammalian brain. Pflugers Arch. 444, 491–498. https://doi.org/10.1007/s00424-002-0831-z

      • McBurney-Lin, J., Lu, J., Zuo, Y., Yang, H., 2019. Locus coeruleus-norepinephrine modulation of sensory processing and perception: A focused review. Neurosci Biobehav Rev 105, 190–199. https://doi.org/10.1016/j.neubiorev.2019.06.009

      • Müller, C.P., Pum, M.E., Amato, D., Schüttler, J., Huston, J.P., De Souza Silva, M.A., 2011. The in vivo neurochemistry of the brain during general anesthesia. Journal of Neurochemistry 119, 419–446. https://doi.org/10.1111/j.1471-4159.2011.07445.x

      • Nomura, S., Bouhadana, M., Morel, C., Faure, P., Cauli, B., Lambolez, B., Hepp, R., 2014. Noradrenalin and dopamine receptors both control cAMP-PKA signaling throughout the cerebral cortex. Front Cell Neurosci 8. https://doi.org/10.3389/fncel.2014.00247

      • Polack, P.-O., Friedman, J., Golshani, P., 2013. Cellular mechanisms of brain-state-dependent gain modulation in visual cortex. Nat Neurosci 16, 1331–1339. https://doi.org/10.1038/nn.3464

      • Raimondo, J.V., Kay, L., Ellender, T.J., Akerman, C.J., 2012. Optogenetic silencing strategies differ in their effects on inhibitory synaptic transmission. Nat Neurosci 15, 1102–1104. https://doi.org/10.1038/nn.3143

      • Sakakibara, Y., Hirota, Y., Ibaraki, K., Takei, K., Chikamatsu, S., Tsubokawa, Y., Saito, T., Saido, T.C., Sekiya, M., Iijima, K.M., n.d. Widespread Reduced Density of Noradrenergic Locus Coeruleus Axons in the App Knock-In Mouse Model of Amyloid-β Amyloidosis. J Alzheimers Dis 82, 1513–1530. https://doi.org/10.3233/JAD-210385

      • Sato, H., Fox, K., Daw, N.W., 1989. Effect of electrical stimulation of locus coeruleus on the activity of neurons in the cat visual cortex. Journal of Neurophysiology. https://doi.org/10.1152/jn.1989.62.4.946

      • Seol, G.H., Ziburkus, J., Huang, S., Song, L., Kim, I.T., Takamiya, K., Huganir, R.L., Lee, H.-K., Kirkwood, A., 2007. Neuromodulators control the polarity of spike-timing-dependent synaptic plasticity. Neuron 55, 919–929. https://doi.org/10.1016/j.neuron.2007.08.013

      • Shepard, K.N., Liles, L.C., Weinshenker, D., Liu, R.C., 2015. Norepinephrine is necessary for experience-dependent plasticity in the developing mouse auditory cortex. J Neurosci 35, 2432–2437. https://doi.org/10.1523/JNEUROSCI.0532-14.2015

      • Vazey, E.M., Moorman, D.E., Aston-Jones, G., 2018. Phasic locus coeruleus activity regulates cortical encoding of salience information. Proceedings of the National Academy of Sciences 115, E9439–E9448. https://doi.org/10.1073/pnas.1803716115

      • Yin, X., Jones, N., Yang, J., Asraoui, N., Mathieu, M.-E., Cai, L., Chen, S.X., 2021. Delayed motor learning in a 16p11.2 deletion mouse model of autism is rescued by locus coeruleus activation. Nat Neurosci 24, 646–657. https://doi.org/10.1038/s41593-021-00815-7

    2. eLife assessment

      This important study provides convincing evidence that locus coeruleus is activated during visuomotor mismatches. Gain of function optogenetic experiments complement this evidence and indicate that locus coeruleus could be involved in the learning process that enables visuomotor predictions. This study therefore sets the groundwork for the circuit dissection of predictive signals in the visual cortex. Loss-of-function experiments would strengthen the evidence of the involvement of locus coeruleus in prediction learning. These results will be of interest to systems neuroscientists.

    3. Reviewer #1 (Public Review):

      Jordan and Keller investigated the possibility that sensorimotor prediction error (mismatch between expected and actual inputs) triggers locus coeruleus (LC) activation, which in turn drives plasticity of cortical neurons that detect the mismatch (e.g. layer 2/3 neurons in V1), thus updating the internal representation (expected) to better match the sensory input. Using genetic tools to selectively label LC neurons in mice and in vivo imaging of LC axonal calcium responses in the V1 and motor cortex in awake mice in virtual reality training, they showed that LC axons responded selectively to a mismatch between the visual input and locomotion. The greater the mismatch (the faster the locomotion in relation to the visual input), the larger the LC response. This seemed to be a global response as LC responses were indistinguishable between sensory and motor cortical areas. They further showed that LC drove learning (updating the internal model) even though LC optical stimulation failed to alter acute cellular responses. Responses in the visual cortex increased with locomotion, and this was suppressed following LC phasic stimulation during visuomotor coupled training (closed loop). In the last section, they showed that artificial optogenetic stimulation of LC permitted plasticity over minutes, which would normally take days in non-stimulated mice trained in the visuomotor coupling mode. These data enhance our understanding of LC functionality in vivo and support the framework that LC acts as a prediction error detector and supervises cortical plasticity to update internal representations.

      The experiments are well-designed and carefully conducted. The conclusions of this work are in general well supported by the data. There are a couple of points that need to be addressed or tested.

      1) It is unclear how LC phasic stimulation used in this study gates cortical plasticity without altering cellular responses (at least at the calcium imaging level). As the authors mentioned that Polack et al 2013 showed a significant effect of NE blockers on membrane potential and firing rate in V1 layer 2/3 neurons during locomotion, it would be useful to test the effect of LC silencing (coupled to mismatch training) on both cellular response and cortical plasticity or applying NE antagonists in V1 in addition to LC optical stimulation. The latter experiment will also address which neuromodulator mediates plasticity, given that LC could co-release other modulators such as dopamine (Takeuchi et al. 2016 and Kempadoo et al. 2016). An LC silencing experiment would establish a causal effect more convincingly than the activation experiment.

      2) The cortical responses to NE often exhibit an inverted U-curve, with higher or lower doses of NE showing more inhibitory effects. It is unclear how responses induced by optical LC stimulation compare or interact with the physiological activation of the LC during the mismatch. Since the authors only used one frequency stimulation pattern, some discussion or additional tests with a frequency range would be helpful.

    4. Reviewer #2 (Public Review):

      The work presented by Jordan and Keller aims at understanding the role of noradrenergic neuromodulation in the cortex of mice exploring a visual virtual environment. The authors hypothesized that norepinephrine released by Locus Coeruleus (LC) neurons in cortical circuits gates the plasticity of internal models following visuomotor prediction errors. To test this hypothesis, they devised clever experiments that allowed them to manipulate visual flow with respect to locomotion to create prediction errors in visuomotor coupling and measure the related signals in LC axons innervating the cortex using two-photon calcium imaging. They observed calcium responses proportional to absolute prediction errors that were non-specifically broadcast across the dorsal cortex. To understand how these signals contribute to computations performed by V1 neurons in layers 2/3, the authors activated LC noradrenergic inputs using optogenetic stimulations while imaging calcium responses in cortical neurons. Although LC activation had little impact on evoked activity related to visuomotor prediction errors, the authors observed changes in the effect of locomotion on visually evoked activity after repeated LC axons activation that were absent in control mice. Using a clever paradigm where the locomotion modulation index was measured in the same neurons before and after optogenetic manipulations, they confirmed that this plasticity depended on the density of LC axons activated, the visual flow associated with running, and the concurrent visuomotor coupling during LC activation. Based on similar locomotion modulation index dependency on speed observed in mice that develop only with visuomotor experience in the virtual environment, the authors concluded that changes in locomotion modulation index are the result of experience-dependent plasticity occurring at a much faster rate during LC axons optogenetic stimulations.

      The study provides very compelling data on a timely and fascinating topic in neuroscience. The authors carefully designed experiments and corresponding controls to exclude any confounding factors in the interpretation of neuronal activity in LC axons and cortical neurons. The quality of the data and the rigor of the analysis are important strengths of the study. I believe this study will make an important contribution to the field of systems neuroscience by shedding new light on the role of a key neuromodulator. The results provide strong support for the claims of the study. However, I also believe that some results could have been strengthened by providing additional analyses and experimental controls. These points are discussed below.

      Calcium signals in LC axons tend to respond with pupil dilation, air puffs, and locomotion as the authors reported. A more quantitative analysis such as a GLM model could help understand the relative contribution (and temporal relationship) of these variables in explaining calcium signals. This could also help compare signals obtained in the sensory and motor cortical domains. Indeed, the comparison in Figure 2 seems a bit incomplete since only "posterior versus anterior" comparisons have been performed and not within-group comparisons. I believe it is hard to properly assess differences or similarities between calcium signal amplitude measured in different mice and cranial windows as they are subject to important variability (caused by different levels of viral expression for instance). The authors should at the very least provide a full statistical comparison between/within groups through a GLM model that would provide a more systematic quantification.

      Previous studies using stimulation of the locus coeruleus or local iontophoresis of norepinephrine in sensory cortices have shown robust response modulations (see McBurney-Lin et al., 2019, https://doi.org/10.1016/j.neubiorev.2019.06.009 for a review). The weak modulations observed in this study seem at odds with these reports. Given that the density of ChrimsonR-expressing axons varies across mice and that there are no direct measurements of their activation (besides pupil dilation), it is difficult to appreciate how they impact the local network. How does the density of ChrimsonR-expressing axons compare to the actual density of LC axons in V1? The authors could further discuss this point.

      In the analysis performed in Figure 3, it seems that red light stimulations used to drive ChrimsonR also have an indirect impact on V1 neurons through the retina. Indeed, Figure 3D shows a similar response profile for ChrimsonR and control with calcium signals increasing at laser onset (ON response) and offset (OFF response). With that in mind, it is hard to interpret the results shown in Figure 3E-F without seeing the average calcium time course for control mice. Are the responses following visual flow caused by LC activation or additional visual inputs? The authors should provide additional information to clarify this result.

      Some aspects of the described plasticity process remain unanswered. It is not clear over which time scale the locomotion modulation index changes and how many optogenetic stimulations are necessary or sufficient to saturate this index. Some of these questions could be addressed with the dataset of Figure 3 by measuring this index over different epochs of the imaging session (from early to late) to estimate the dynamics of the ongoing plasticity process (in comparison to control mice). Also, is there any behavioural consequence of plasticity/update of functional representation in V1? If plasticity gated by repeated LC activations reproduced visuomotor responses observed in mice that were exposed to visual stimulation only in the virtual environment, then I would expect to see a change in the locomotion behaviour (such as a change in speed distribution) as a result of the repeated LC stimulation. This would provide more compelling evidence for changes in internal models for visuomotor coupling in relation to its behavioural relevance. An experiment that could confirm the existence of the LC-gated learning process would be to change the gain of the visuomotor coupling and see if mice adapt faster with LC optogenetic activation compared to control mice with no ChrimsonR expression. The authors should discuss how they imagine the behavioural manifestation of this artificially induced learning process in V1.

      Finally, control mice used as a comparison to mice expressing ChrimsonR in Figure 3 were not injected with a control viral vector expressing a fluorescent protein alone. Although it is unlikely that the procedure of injection could cause the results observed, it would have been a better control for the interpretation of the results.

    1. eLife assessment

      The paper provides a valuable, in-depth mathematical analysis of the coevolutionary dynamics resulting from a coupling of players' strategies and (collective) risk, as well as illustrative numerical simulations of the system's trajectories for different starting conditions. It is therefore a solid contribution to our understanding of how cooperation can be sustained when there is feedback between individual decisions and the global risk of disaster. This paper will be of interest to scientists working on mathematical biology/ecology, and more generally various aspects of human decision-making, the interplay between human decisions and the environment, and public goods provision.

    2. Reviewer #1 (Public Review):

      This is a nice piece of work with sound scientific substance, underpinned by a solid mathematical approach.

      The authors base their model on a public goods game (PGG) with a threshold M (1 < M < N, where N indicates the game size) that determines whether cooperation bears fruit; following the commonly used parameterization, b and c denote the fruit of cooperation and the cost of cooperation, respectively. At the core of their model, they assume that an individual loses his endowment (the fruit of cooperation in this context) with probability r, which represents the risk level of collective failure (Eqs. (1 & 2)). In addition, they assume a well-mixed, infinite population so that an analytical formulation and analysis are possible and the replicator dynamics can be applied. Subsequently, they assume co-evolution of the cooperation fraction x and the risk level r by introducing another dynamical system for r, the general form of which is defined by Eq. (3).

      For a concrete discussion, they consider two specific forms of the non-linear function U(x,r). Both build on the so-called logistic form, containing r*(1 - r). One is what they call Linear, Eq. (5). The other is Eq. (7), called Exponential. Up to this point, the modeling approach is well described and quite understandable.
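
      As an illustration of this kind of coupled system, the following is a rough sketch rather than the authors' implementation. It assumes the standard collective-risk payoff difference for a well-mixed, infinite population, C(N-1, M-1) x^(M-1) (1-x)^(N-M) r b - c, and a linear feedback of the form U(x,r) = u(1 - x) - x; both expressions, the parameter values, and the function names are assumptions made for this example and should be checked against Eqs. (1)-(5) of the paper. Integration uses a simple forward Euler scheme.

```python
from math import comb


def payoff_gap(x, r, n, m, b, c):
    """Assumed cooperator-minus-defector payoff in a threshold PGG.

    A cooperator only changes the group outcome when exactly m - 1 of the
    other n - 1 players cooperate; in that case it averts a loss of r * b.
    """
    pivotal = comb(n - 1, m - 1) * x ** (m - 1) * (1 - x) ** (n - m)
    return pivotal * r * b - c


def simulate(x0, r0, u, n=6, m=3, b=1.0, c=0.1, dt=0.01, steps=20000):
    """Forward-Euler integration of the assumed coupled dynamics.

    dx/dt = x (1 - x) * payoff_gap(x, r)      (replicator equation)
    dr/dt = r (1 - r) * (u (1 - x) - x)       (logistic form, linear feedback)
    """
    x, r = x0, r0
    for _ in range(steps):
        dx = x * (1 - x) * payoff_gap(x, r, n, m, b, c)
        dr = r * (1 - r) * (u * (1 - x) - x)
        # Clamp to [0, 1] to guard against small Euler overshoots.
        x = min(1.0, max(0.0, x + dt * dx))
        r = min(1.0, max(0.0, r + dt * dr))
    return x, r


# With few initial cooperators, the trajectory approaches the
# defection-dominated state (x, r) = (0, 1).
x_end, r_end = simulate(x0=0.05, r0=0.5, u=0.5)
```

      Under these assumptions, a run that starts with few cooperators ends close to full defection and maximal risk. Other initial conditions may behave differently, and a more careful study would use an adaptive ODE solver rather than fixed-step Euler.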

      By exploring numerical results backed by their theoretical analysis, the authors obtain phase diagrams (Figs. 3 and 5) showing whether the co-evolutionary outcome for (x,r) is absorbed into the state dominated by the unwilling (less cooperative) strategy (say, D-dominant), (0,1), or is bi-stable (either a better state or D-dominant, depending on the initial condition), as a function of u (a parameter appearing in the dynamical equation for r) and c/b (which, roughly speaking, reflects the dilemma strength).

      The results seem interesting and plausible. Broadly speaking, the two types of U(x,r) differ little. However, the higher absorbing point of (x,r) in the bi-stable case differs between the two (yellow region). The authors carefully illustrate the time series and trajectories of (x,r) for some representative cases in Figs. 4 and 6.

      On the whole, I find this work impressive.

    3. Reviewer #2 (Public Review):

      Liu, Chen and Szolnoki investigated the coupled dynamics of individual cooperation level and collective risk (i.e. the probability of future loss of all endowment). Their model encapsulates the assumption that not only does risk affect individual decision-making, but that there is also feedback between individual strategies, i.e. the level of individual contributions, and the level of risk. The authors investigate two main forms of this feedback, considering strategies linearly affecting the evolution of risk as well as non-linear (exponential) feedback. They mathematically analyze both these dynamical systems, identifying the fixed points, parametrized by the enhancement rate of defection u and the cost/benefit ratio of cooperation, and analyzing the stability of these points. The results of this systematic analysis show that, while the undesirable equilibrium state of full defection and high risk is always stable independent of the form of the feedback, the coevolutionary dynamics can exhibit a wide range of behaviors. In particular, depending on the initial conditions (frequency of cooperators), sustainable cooperation levels can be reached. This can happen by convergence to a stable fixed point with positive cooperation rates; additionally, the authors also prove that a Hopf bifurcation can take place in the system, such that a stable limit cycle with persistent oscillations in strategy and risk state can appear. Interestingly, the evolutionary outcomes do not depend significantly on the character of the feedback between strategy and risk. These theoretical results are supplemented by representative numerical examples, visualizing the phase plane and temporal dynamics of cooperation and risk for particular initial conditions and parameters.

      The main conclusions of the paper are fully supported by the results, as they are directly derived from the comprehensive mathematical analysis of the coevolutionary dynamics and do not rely on external data. Additionally, the stability analysis is clean and the comprehensive numerical examples deepen the reader's understanding. Another strength of the paper is the fact that the considered model is complex enough to be able to still represent somewhat realistic settings while being simple enough to rigorously analyze. One particularly interesting finding is the fact that the exact form of the risk feedback function or its speed does not play a very significant role in the outcome of the dynamics.

      The paper hence adds to the literature on the coevolution of environment and strategies in a productive way and will be of interest to various research communities in mathematical biology/ecology and decision-making.

    1. Author Response

      Reviewer #2 (Public Review):

      Weaknesses: The authors do not make a direct link between TOR and REPTOR2 signalling. This seems important since REPTOR2 is a novel gene that arose from the duplication of REPTOR.

      We have added several experiments to strengthen the connection between TOR and REPTOR2, and determined the effect of co-silencing TOR and REPTOR2 on autophagy and the proportion of the winged morph. Please see the details below in our response to your comment 3.

    1. Author Response

      Reviewer #2 (Public Review):

      This paper has collected an impressive data set of the visual response properties of neurons in the visual layers of the mouse superior colliculus. There are 3 main findings of the study. First, the authors identify 24 functional classes of neurons based on the clustering of each neuron's visual response properties. Second, unlike in the retina where each cell type is regularly spaced, functional classes in the superior colliculus appear to cluster near each other. Third, visual representation has a lower dimensionality in the superior colliculus compared to the retina. The dataset has the potential to support the conclusions of the paper, but further analysis is required to make the claims convincing.

      Strengths:

      The main strength of the paper is its impressive dataset of more than 5000 neurons from the visual layers of the superior colliculus. This data set includes recordings from both an interesting set of genetically labelled classes of cells and from a reasonably large portion of the superior colliculus. This dataset offers the opportunity to support the major claims of the paper. This includes i) the identification of 24 functional classes of neurons, ii) the intriguing possibility that functional classes form local patches within the superior colliculus and iii) that the representation of visual information in the superior colliculus has a lower dimensionality compared to the retina.

      Weaknesses:

      The weakness of the paper is that its main claims are not adequately supported by the presented data or analysis. First, support for the existence of 24 functional classes is not clear enough. Our major concern is that it is not clear that each class of neurons was distributed across different mice. Are certain cell types overrepresented in individual animals, or do you find examples of each cell type in most animals?

      The new Supplementary Figure 7G shows how individual mice contribute to the functional types for all neurons. Further, the new Supplementary Figure 12 shows the receptive field locations derived from recordings in each of the animals.

      In addition, it should be made explicit how the responses of each genetically labeled class of neurons are distributed among the 24 functional clusters.

      We have added a new Figure 5D to show this.

      Second, the analysis of the spatial clustering of functional cell types is not complete. Do the same functional clusters sample the same retinotopic locations in different mice? How are clusters of the functional type distributed in visual space?

      Please see our point-by-point responses below to the concerns.

      Third, the lower dimensionality of representation in the superior colliculus may be the result of selective projections of retinal ganglion cells, as not all retinal ganglion cell types project to the superior colliculus. Please estimate the dimensionality of the visual representation of those retinal ganglion cell types that project to the superior colliculus.

      Certainly part of the dimensionality reduction may come from the incomplete retino-geniculate projection; we have added discussion on this topic.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, the authors describe a one-step genome editing method to replace endogenous EB1 with their previously-developed light-sensitive variant, in order to examine the effect of acute and local optogenetic inactivation of EB1 in human neurons. They then attempt to assess the effects of EB1 inactivation on microtubule growth, F-actin dynamics, and growth cone advance and turning. They also perform these experiments in neurons that are lacking EB3, in order to determine whether EB1 can function in a direct and specific way without possible EB3 redundancy.

      First, the experiments depicting the methodology are rigorous and compelling. Most previous studies of +TIP function use knockout or knockdown studies in which the proteins are inactivated over many hours or days in non-human systems. This is the first study to acutely and locally inactivate a +TIP in human neurons. While this group previously published the effects of replacing endogenous EB1 with the light-sensitive variant, the novelty in this current study is that they use a one-step gene editing replacement method (using CRISPR/Cas9) along with using human neurons derived from iPSCs. After proving their new experimental system works, the authors next seek to test the effect that acutely inactivating EB1 (alongside chronic EB3 knockdown) has on microtubule dynamics, and they observe a marked reduction in MT growth and MT length. They then seek to investigate whether F-actin dynamics are immediately affected by EB1 inactivation.

      While measured F-actin flow rates are not significantly affected, which leads the authors to conclude that EB1 inactivation does not have any immediate effect, the included figures and movies show a different phenotype, which is not discussed. Finally, they examine the effect of EB1 inactivation on growth cone advance and growth cone turning, and find that both are affected. However, the lack of certain controls in these final experiments (specifically for Figures 3, 4, and 5) reduces the strength of their findings.

      Thus, the first part of this paper describing the new methodology is very compelling and should be of interest to a wide readership, while the second part describing the functional analysis is mostly solid, with very high-quality imaging data. However, additional analysis and controls would be needed to increase confidence in their conclusions.

      1) Analysis of F-actin dynamics is not thorough, and their claim is not completely supported by the data. Figure 3 only depicts F-actin dynamics data from growth cones of π-EB1 EB3-/- i3Neurons and does not [include] control growth cones (to compare dark and light conditions). While their conclusion is that F-actin dynamics are not affected, there do appear to be immediate changes in the F-actin images, other than flow rates. For example, the F-actin bundles do not appear to emanate straight out with the light condition, compared to the dark condition. There also appears to be more F-actin intensity in the transition domain of the growth cone, compared to the dark condition. If the reason is due to the effects of four minutes of blue light exposure, this would be made clear by doing this experiment with control growth cones as well.

      In Figure 3, we wanted to specifically test if π-EB1 photoinactivation has an immediate effect on growth cone leading edge actin polymerization (for example because of rapid changes in Rho GTPase activity) by measuring F-actin retrograde flow. Because of photobleaching, these experiments are limited to relatively short time-lapse data sets, and within 4-5 min of blue light exposure, we found no significant difference between the dark and light conditions. As requested by this and another reviewer, we added a few more data points as well as a wild-type control. Statistical analysis by ANOVA shows no difference in retrograde flow between any of the four groups.

      We did not see a consistent difference in overall F-actin organization after a few minutes of blue light, and we now include control and π-EB1 growth cones in Fig. 3 that are more similar to one another with the dark image shown more immediately before blue light exposure. The growth cone that we had in the original figure (and that remains in Video 5 to illustrate retrograde flow and how dynamic these growth cones are) was a poor choice for this figure as it undergoes quite dramatic F-actin reorganization before the blue light is turned on, and the morphology immediately before blue light exposure is much more similar to the growth cone during blue light compared with the -5 min time point that we had originally shown.

      Lastly, the apparent relocalization of F-actin to the growth cone center is seen in both control and experimental conditions and we believe that has to do with photobleaching of the F-actin probe at the relatively high frame rates required to observe retrograde flow. We agree with the reviewer that it is important to know this, and we included a note in the figure legend.

      2) Analysis of the effect of EB1 inactivation on growth cone advance and growth cone turning. Figure 4C, showing the neurite unable to cross the blue light barrier, is potentially quite compelling data, but it would be even more convincing if there were also data showing that the blue light barrier has no effect on a control neurite. Given that a number of previous recent studies have shown a detrimental effect of blue light on neurons, it seems important to include these negative controls in this current study.

      The experiment growing neurites on a micropatterned laminin surface in combination with photoinactivation in (now) Figure 4D is incredibly low throughput but serves to illustrate repeated retraction from blue light over many hours of imaging. To show that blue light barriers do not affect control cells we have instead included a quantification of the retraction response of control and π-EB1 neurites growing randomly on a laminin-coated surface (not micropatterned stripes) in new Fig. 4C. It is also worth noting that the dose of blue light used for π-EB1 photoinactivation is much lower than what is typically used for fluorescence imaging (we analyzed and discussed this in great detail in our original π-EB1 publication), and especially in experiments with a blue light barrier, cells are not exposed to any blue light before they hit the barrier.

      3) This concern also holds true for the final experiment, in which the authors examine whether localized blue light would lead to growth cone turning. The authors report difficulty with performing this technically challenging experiment of accurately targeting the light to only a localized region of the growth cone. Thus, the majority of the growth cones (72%) were completely retracted, and so only a small subset of growth cones showed turning. However, this data would be more compelling if there were also a control condition of blue light with neurons that are not expressing the light-inactivated EB1. Another useful control would be to examine whether precise region-of-interest blue light leads to localized loss of EGFP-Zdk1-EB1C on MT plus-ends within the growth cone, or if the loss extends throughout the growth cone. Either outcome would be helpful to potential readers.

      We modified Fig. 5 to include control i3Neurons in this experiment. We also included a supplement to Fig. 5 showing that π-EB1 photodissociation remains localized to the blue light-exposed region. However, because in our π-EB1 line the C-terminal π-EB1 half is EGFP-tagged, we cannot show before and after images of local π-EB1 photodissociation.

      Reviewer #3 (Public Review):

      The major strength of the study was the approach of using photosensitive protein variants to replace endogenous protein with 1-step CRISPR-based gene editing, which not only allowed acute manipulation of protein function but also mimicked the endogenous targeted protein. However, the same strategy has been used by the same first author previously in dividing cells, somewhat reducing the novelty of the current study. In addition, the results obtained from the study were the same as those from previous studies using different approaches. In other words, the current study only confirmed the known findings without any novel or unexpected results. As a result, the study did not provide strong evidence regarding the advantage of the new experimental approach in our understanding of the function of EB1. Some specific comments are listed below.

      1) In Figure 1, to show that the photosensitive EB1 variant did not affect stem cell properties and their neuronal differentiation, Oct4 staining and western blot of KIF2C and EB3 were not strong evidence. Some new experiments more specifically related to stem cell properties or iPSC-derived neurons are necessary.

      While we did not attempt to fully characterize stemness in our π-EB1 edited i3N lines, we believe, most importantly, we show that π-EB1 i3N hiPSCs differentiate normally into i3Neurons. We show this morphologically as well as by immunoblotting and RT-qPCR experiments looking at marker proteins also including DCX, a well-established neuronal differentiation marker. Although not directly related to stemness, we included one additional RT-qPCR experiment more carefully analyzing the expression level of π-EB1 in the edited lines compared with EB1 in control i3N hiPSCs (new Fig. 1E).

      In addition, the effect of EB1 inactivation on microtubule growth was quantified in stem cells but not in differentiated neurons, which are supposed to be the focus of the study.

      Quantification of MT dynamics in the hiPSCs parallels our previous experiments in cancer cell lines to demonstrate that π-EB1 photoinactivation had a similar inhibitory effect on MT growth in interphase cells. This serves as an additional control that our new system works as expected. Because of our inability to efficiently transfect i3Neurons, we could not measure MT growth in i3Neurons with the same method (i.e. automated EB1N tracking). However, as further outlined below we have added a quantification of MT growth rates in i3Neuron growth cones by additional manual tracking of SPY555-tubulin-labelled growth cone MTs after at least one minute of blue light exposure.

      In Figure S2D, quantification is needed to show the effect of blue light-induced EB1 inactivation in growth cones.

      Fig. 1 – supplement 2D (together with Video 3, and Fig. 2A) is simply to illustrate that the C-terminal π-EB1 half dissociates in blue light as expected. We previously characterized the kinetics of π-EB1 photodissociation and do not think redoing this would add substantially to the current manuscript. The remainder of the manuscript, however, examines the functional consequences of π-EB1 photoinactivation in i3Neurons.

      2) In Figure 2, the effect of blue light on microtubule retraction in the control cells was examined, showing little effect. However, it is still unclear if the blue light per se would have any effect on microtubule plus end dynamics, a more sensitive behavior than that of retraction. In Figure 2C, the length of individual microtubules in different growth cones was presented, showing microtubule retraction after blue light. Quantification and statistical analysis are necessary to draw a strong conclusion.

      Figure 2 shows that growth cone MTs in π-EB1 lines shorten in response to blue light and we did this by analyzing MTs that were visible in a short time window before and after blue light exposure. In response to another reviewer’s comment, we have redesigned this figure to better illustrate this result. We have now included statistical analysis comparing relative MT length 20 s before and during blue light exposure. In control cells, the difference was not statistically significant. We also report the statistical difference between control and π-EB1 lines at 20 s by ANOVA in the text. Lastly, we also measured MT growth rates after at least one minute of blue light exposure, showing that MT growth is greatly attenuated in π-EB1 lines (new Fig. 2D).

      The results showed that EB3 did not seem to contribute to stabilizing microtubules in growth cones. It was discussed that EB3 might have a different function from that of EB1 in the growth cone, although they are markedly up-regulated in neurons. In the differentiated neuronal growth cones examined in the study, does EB3 actually bind to the microtubule plus ends? In the EB3 knockout cells without the blue light, the microtubules were stable, indicating that EB3 had no microtubule stabilization function in these cells. Is such a result consistent with previous studies? If not, some explanation and discussion are needed.

      Other papers have shown that EB3 localizes to growth cone MT ends; for example, in rat cortical neurons (Poobalasingam et al., 2022). We did not test if endogenous EB3 is present on MT ends in i3Neurons, but transfected EB3 certainly is. Interestingly, it was reported by multiple groups that EB1 and EB3 do not bind to the exact same place near MT ends. EB3 trails behind EB1, which would be consistent with functional differences especially in controlling MT growth. We have expanded the discussion of such differences in the text, and thank Phillip Gordon-Weeks, who reminded us of this in a comment on the bioRxiv preprint.

      3) In Figure 3, for the potential roles of EB1 in actin organization and dynamics, only the rates of retrograde flow were measured for 5 min, and no change was observed. However, based on the images presented, it seemed that there was a reduced number of actin bundles after blue light and the actin structure was somewhat disrupted. Some additional examination and measurement of actin organization are necessary to get a clear result.

      This point was also raised by reviewer #1, and we now include images and quantification of retrograde flow in control growth cones and we increased the number of data points. We still see no difference in retrograde flow between all these groups. The original π-EB1 growth cone in Fig. 3A was a poor example because it underwent large morphological changes before the blue light was even turned on, and its morphology just before light exposure is much more like the end point image. We therefore replaced this image with a different growth cone that is more similar to the wild-type growth cone shown, and also show images more immediately before blue light exposure. The bottom line is that we do not see a consistent difference in overall F-actin organization after a few minutes of blue light.

      4) In Figure 4, the effect of blue light and EB1 inactivation on neurite extension needs to be quantified in some way, such as the neurite length changes in a fixed time period, and the % of growth cones passing the blue light barrier compared with growth cones of the control cells.

      We have included a statistical comparison (by ANOVA) at the 15 min time point, and a quantification of neurite retraction of growth cones encountering a blue light barrier.

      5) For the quantification of growth cone turning, a control condition is needed to show that blue light itself has no effect on turning.

      We have also added a control experiment to Fig. 5.

    1. Author Response

      Reviewer #1 (Public Review):

      1) The role of increased temperature on immunity and homeostasis in cold-blooded vertebrates is an understudied yet important field. This work not only examines how immunity is impacted by fever, but also incorporates an infection model and examines resolution of the response. This work can serve as a model for other groups interested in the study of hyperthermia and immunity.

      Thank you very much.

      2) Generally speaking, I agree with the authors' strategy and interpretations of the data.

      • In the Introduction, the authors chose to begin with how fever in endotherms impacts the immune system. Considering that this work exclusively examines the response of a teleost (goldfish), the authors might consider flipping the way they present this work. After all, cold-blooded vertebrates rely on this response because of their basic physiology.

      We chose to begin with a description of fever in endotherms because we know less about those immune mechanisms impacted by fever in ectotherms. The goal was to provide points of comparison based on published datasets. Indeed, we also expect differences between cold- and warm-blooded vertebrates based on their basic physiologies. However, it is interesting that despite different physiologies and thermoregulatory strategies, common biochemical pathways appear to regulate fever across cold- and warm-blooded vertebrates. This is now captured more clearly in the Introduction section (lines 134-136). Added support also comes from the work that we present in this study, including fever inhibition experiments using ketorolac tromethamine (lines 244-253; Figure 3C).

      3) I thought the setup of the work in Figure 1 was innovative and could provide an example of how to study such a problem.

      Thank you. Very much appreciated.

      4) Figure 2 was (to me) unexpected. One would not expect such a tight response to hyperthermia and infection. This experiment in and of itself was quite interesting, and worth following up in future experiments (by the authors and other groups).

      The level of homogeneity in the behavioural responses shown in Figure 2 was a big part of why we pursued this work. It was striking that fish would display such consistency in behaviour during the febrile window, regardless of whether they were evaluated in groups or individually. To us, this suggested that the temperature chosen and the kinetics of this thermal preference are central for modulation of downstream biological processes. Added support for the importance of precise thermal selection comes from "failed" experiments during this study where incoming aquatic facility water temperatures fluctuated due to factors outside of our control. This caused temporary disruption to the temperatures available to these fish in the annular thermal preference tank. In these cases, we noted disruption of both classical behaviours shown in Figure 2 as well as downstream benefits.

      • The other work, on the response to infection and the resolution of infection, was unique to this paper, and (sorry to be repetitive) can be an example of how to devise such studies.

      Thank you.

      • On the other hand, I am not sure this is a study of "fever." That implies how increased temperature impacts immunity and resolution in endotherms. Perhaps the authors could temper the comparisons between cold- and warm-blooded vertebrates regarding the response to hyperthermia.

      We believe that for those mechanisms that are evolutionarily conserved, the teleost system will offer an opportunity for novel insights into the effects of fever induction and disruption. Indeed, this animal model offers multiple advantages. But we agree that much work remains to establish the extent of this conservation and now highlight this issue more clearly (lines 454-455).

      An additional note on hyperthermia versus fever: although both terms are sometimes used interchangeably in the literature, we make a distinction between them. Hyperthermia captures an increase in core body temperature. However, this alone is not sufficient to engage the CNS (representative results shown in Figure 3-figure supplement 1). Consistent with prior descriptions of fever (e.g. Nat Rev Immunol (2015)15:335-49; Arch Intern Med (1998)158:1870-81), we also show that our model results in CNS engagement (Figure 3A), induces systemic pyrogen release (Figure 3B), triggers classical sickness behaviours (Figure 2), and promotes immune function (Figures 4-7).

    2. eLife assessment

      This study addressed a long-standing question in biology - the role of fever during infections. Using an innovative research strategy, the authors provide compelling evidence for the positive impact of higher body temperature on both pathogen clearance and tissue repair. This study thus provides important advances in our understanding of host defense and its connection with physiology and behavior.

    3. Reviewer #1 (Public Review):

      The role of increased temperature on immunity and homeostasis in cold-blooded vertebrates is an understudied yet important field. This work not only examines how immunity is impacted by fever, but also incorporates an infection model and examines resolution of the response. This work can serve as a model for other groups interested in the study of hyperthermia and immunity.

      Generally speaking, I agree with the authors' strategy and interpretations of the data.

      - In the Introduction, the authors chose to begin with how fever in endotherms impacts the immune system. Considering that this work exclusively examines the response of a teleost (goldfish), the authors might consider flipping the way they present this work. After all, cold-blooded vertebrates rely on this response because of their basic physiology.

      - I thought the set up of the work in figure 1 was innovative and could provide an example of how to study such a problem.

      - Figure 2 was (to me) unexpected. One would not expect such a tight response to hyperthermia and infection. This experiment in and of itself was quite interesting, and worth following up in future experiments (by the authors and other groups).

      - The other work, on the response to infection and the resolution of infection, was unique to this paper, and (sorry to be repetitive) can be an example of how to devise such studies.

      - On the other hand, I am not sure this is a study of "fever." That implies how increased temperature impacts immunity and resolution in endotherms. Perhaps the authors could temper the comparisons between cold- and warm-blooded vertebrates regarding the response to hyperthermia.

    4. Reviewer #2 (Public Review):

      Fever is an ancient and conserved response to infection from invertebrates to humans. However, the functional benefits of engaging fever responses are not clear, especially when it comes to moderate fever responses where pathogen growth is not impaired by temperature. This study aims to develop a natural in vivo fever model in fish that overcomes many of the technical challenges of investigating fever in mammals. In ectotherms, fever is manifested as a behavioral response by which animals move to warmer temperatures. By using this newly developed in vivo behavioral ring, the present study reveals new functional roles for fever in vertebrates. Additionally, upon infection, sickness behavior consisted not only of fever but also of two novel lethargic behaviors not previously described in fish. The experimental evidence is compelling and supports the authors' conclusions. The data presented strongly indicate that moderate fever levels are critical for fine-tuning immune responses to pathogens. By triggering earlier but weaker antimicrobial defenses, moderate fever in teleosts results in controlled inflammation and improved wound healing. These exciting results reveal novel roles of fever as a way to minimize the collateral damage that inflammatory responses often cause to the host. This work advances our conceptual view of the evolutionary advantages that fever brings to host-pathogen interactions. The technological development of the annular temperature preference tank can now become the gold standard platform to investigate the consequences of fever during teleost infection.

    1. Author Response

      Reviewer 1 (Public Review):

      The authors in this manuscript investigate the effect of co-substrate cycling on the metabolic flow. The main finding is that this cycling can limit the flux through a pathway. The authors examine implications of this effect in different simple configurations to highlight the potential impact on metabolic pathways. Overall, the manuscript follows logical steps and is accessible. Once the main point-reduction in flux of a pathway with limited pool of a cycled co-substrate-is established, some of the following steps become expected (e.g. the fraction of the flux in a branched pathway). Nevertheless, it is understandable that the authors have picked a few simple examples of the metabolic network motifs to highlight the implications. The results presented in the manuscript overall support the conclusions. One weakness is that some of the details of the assumptions (e.g. the choices of rates) are not explicitly spelt out in the manuscript. This work is impactful because it brings into light how cycling of some of the intermediates in a pathway can influence metabolic fluxes and dynamics. This is a factor in addition to (and separate from) reaction rates which are often considered as the main driver of metabolic fluxes.

      We thank the reviewer for this accurate summary. Regarding the effect of parameters on the presented results, we note that the first part of the results are based on analytical solutions provided in the Appendix (formerly the SI). These results are given as inequalities comprising parameters, allowing direct evaluation of parameter effects. We have now made this point explicit in the presentation of the results.

      In the second part of the results, we utilise numerical simulations, and in this case the observed results can depend on parameters. We have explored the effects of key parameters, that is, k_in and total substrate concentration, through the presented 'phase diagram' style figures - see Figures 2 and 4. For additional parameters, we have now included additional simulations exploring their effects - e.g. see Appendix - Figure 11 and Appendix – Figure 13.

      Reviewer 2 (Public Review):

      The cycling of "co-substrates" in metabolic reactions is possibly a very important but often overlooked determinant of metabolic fluxes. To better understand how the turnover dynamics of co-substrates affect metabolic fluxes, the authors dissect a few metabolic reaction motifs. While these motifs are necessarily much simpler than real metabolic networks with dozens or hundreds of reactions, they still include important characteristics of the full network but allow for a deeper mathematical analysis. I found the mathematical approach of the manuscript convincing and an important contribution to the field, as it provides more intuitive insights into how co-substrate cycling could affect metabolic fluxes. In the manuscript, the authors stress particularly how the pool sizes of co-substrates and the enzymes involved in the cycling of those can constrain metabolic fluxes, but the presented results also go substantially beyond this statement, as the authors further illustrate how turnover characteristics of substrates in branches/coupled reactions can affect the ratio of produced substrates.

      The authors further present an analysis of previously published experimental data (around Figure 3). This is a very nice idea, as it can in principle add more direct proof that the cycling of co-substrates is indeed an important constraint shaping fluxes in real metabolic networks (instead of being merely a theoretical phenomenon which occurs only in unphysiological parameter regimes). However, as currently presented, it remains unclear to what extent the data analysis adds convincing support that co-substrate cycling substantially constrains metabolic fluxes. Particularly, it remains unclear for which organisms and conditions the used experimental dataset holds, how it has been generated, and with what uncertainty different measured values come. For example, the comparison requires an estimation of v_max. How can these values be determined in vivo? Are (expected) uncertainties sufficiently low to allow for the statement that fluxes are higher than what enzyme kinetics predict? Furthermore, I am wondering to what extent the correlations between co-substrate pool levels and flux support the idea that co-substrate cycling is important. The positive relation between ATP/AMP/ADP levels, for example, is a nice observation. However, it remains a correlation which might occur due to many other factors beyond the limitations of co-substrate cycling and which might change with provided conditions.

      We thank the reviewer for this accurate summary. We would like to clarify, however, that we neither observe nor analyse any relation between ATP/AMP/ADP levels. Rather, in the analysis presented in Fig. 3B-D, we are looking at the relation between fluxes in co-substrate utilising reactions and the pool size of that co-substrate (e.g. total ATP, AMP, and ADP level for reactions utilising any one of these three co-substrates).

      In their summary, the reviewer raises several valid points about the data analysis and its possible limitations. We address them here point by point:

      How are Vmax values gathered/estimated? We have now added more information regarding how the Vmax values were gathered and from which organisms and conditions. Specifically, we used previously published values of Vmax from (Davidi et al. 2016), where Vmax was estimated by multiplying the in vitro determined kcat by the concentration of the enzyme from proteomic measurements under different conditions - all for the model organism Escherichia coli. See also below, reply to recommendation 2.

      Are (expected) uncertainties sufficiently low? It is difficult to have an estimate for the uncertainty since much of the error in the previous analysis probably comes from the fact that the kinetic parameters determined in vitro are used to estimate fluxes under in vivo conditions - the main source of error is expected to be this discrepancy, which is hard to estimate. However, since the plot is in log-scale, we highlight only gaps that are more than 1 order of magnitude (dashed diagonal lines) and hopefully the uncertainty is lower than that. Furthermore, high uncertainty would probably contribute equally to over- and under-estimating the maximal flux, while we can clearly see that the flux rarely exceeds the Vmax. We have now included a statement in the revised text capturing this point.
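
      As a minimal sketch of the two replies above, the snippet below estimates Vmax as kcat multiplied by enzyme concentration and flags reactions whose measured flux falls more than one order of magnitude below that estimate. The function names, reaction labels, units, and all numerical values are hypothetical and chosen only for illustration; they are not taken from Davidi et al. (2016) or from our analysis.

```python
import math

def vmax_estimate(kcat_per_s, enzyme_conc_mM):
    """Vmax (mM/s) as in vitro kcat (1/s) times enzyme concentration (mM)."""
    return kcat_per_s * enzyme_conc_mM

def log10_flux_ratio(flux, vmax):
    """log10 of measured flux over estimated Vmax (negative if flux < Vmax)."""
    return math.log10(flux / vmax)

def below_one_order(flux, vmax):
    """True if the flux is more than 10-fold lower than the Vmax estimate."""
    return log10_flux_ratio(flux, vmax) < -1.0

# Hypothetical example values (not real measurements).
reactions = {
    "rxn_A": {"kcat": 100.0, "enzyme": 0.002, "flux": 0.15},
    "rxn_B": {"kcat": 50.0, "enzyme": 0.010, "flux": 0.01},
}

for name, r in reactions.items():
    vmax = vmax_estimate(r["kcat"], r["enzyme"])
    print(name, round(log10_flux_ratio(r["flux"], vmax), 3),
          below_one_order(r["flux"], vmax))
```

      In this toy example only rxn_B would be highlighted, since its flux is roughly 50-fold below the Vmax estimate, whereas rxn_A sits within one order of magnitude.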

      Correlations offer weak evidence. Unfortunately, as we do not have measurements on co-substrate pool sizes and cycling kinetics under all conditions, our analyses of experimental data from cycling-involving reactions are admittedly limited. However, they do show that (1) measured fluxes are lower than those predicted by kinetics of the primary enzyme (i.e. enzyme involved in co-substrate and substrate conversion) alone, and (2) there is - for some cycling-involving reactions - a correlation between flux and co-substrate pool size. Both observations could indicate co-substrate pool sizes and/or co-substrate cycling dynamics being limiting. As the reviewer points out, we cannot state this as a certainty.

      Other possible limitations include thermodynamic effects, i.e. limitation by the concentration of either substrate or product, or substrate saturation. We already explored the latter possibility and found that there is still a lower flux when taking into account the primary substrate saturation (see Fig. S6). The former effect is very difficult to analyse without more data, as calculating reaction thermodynamics requires knowledge of concentrations for all substrates and products, as well as enzyme Michaelis-Menten constants in both forward and backward directions. This information is currently not available except for a few of the reactions we analysed. Nevertheless, to give as much insight as possible on the thermodynamic effect, we added a new figure (Appendix – Figure 8) where we plot the physiological Gibbs free energy (calculated assuming that all reactants are at 1 mM and pH=7) against the normalized flux. The plot shows that although in a few cases, such as malate dehydrogenase (MDH), the normalised flux seems to be greatly reduced by the thermodynamic barrier, the general picture is that there is little correlation between physiological Gibbs free energy and normalised flux. We have now included the resulting figure and associated discussion in the revised manuscript.
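
      To illustrate what "physiological Gibbs free energy" means in the reply above, the sketch below applies the standard relation ΔG'm = ΔG'° + RT ln Q with every reactant set to 1 mM, so that Q depends only on the difference between the number of products and substrates. The function name, the temperature of 298.15 K, the convention of not counting water, and the example ΔG'° value are assumptions made for illustration; they are not values from our analysis.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)

def physiological_dg(dg0_prime, n_substrates, n_products,
                     temp_k=298.15, conc_molar=1e-3):
    """Gibbs energy (kJ/mol) with all reactants at conc_molar (default 1 mM).

    dg0_prime is the standard transformed Gibbs energy at the chosen pH.
    With equal concentrations, ln Q = (n_products - n_substrates) * ln(conc).
    Water is assumed to be excluded from both counts.
    """
    ln_q = (n_products - n_substrates) * math.log(conc_molar)
    return dg0_prime + R_KJ_PER_MOL_K * temp_k * ln_q

# Hypothetical reaction with a standard value of +10 kJ/mol.
print(physiological_dg(10.0, 2, 2))  # equal stoichiometry: no correction
print(physiological_dg(10.0, 1, 2))  # one extra product: about 17 kJ/mol lower
```

      For a reaction with the same number of substrates and products, the 1 mM assumption leaves ΔG'° unchanged, which is why reactions with a large positive ΔG'° remain thermodynamically unfavourable under this convention.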

      In relation to all these points on data-based support of the theory, we would also like to point out the comments from reviewer 3 and the fact that our theoretical work provides motivation for further experimental studies of co-substrate cycling dynamics. Our main analysis about co-substrate dynamics becoming limiting is based on analytical solutions. These solutions provide an inequality of system parameters relating pathway influx, co-substrate pool size, and co-substrate related enzymatic parameters. When this inequality is satisfied, there will be flux limitation due to co-substrate cycling. Future experimental studies can now be devised to explore this inequality under different conditions by measuring the key parameters more explicitly. This key point and aspects of the above replies are incorporated at the relevant points in the main text. In addition, we have included a new paragraph in the Discussion section (see reply to second recommendation of reviewer 3) and the following paragraph at the end of the Results section:

      In summary, these results show that for reactions involving co-substrate cycling (1) measured fluxes are lower than those predicted by kinetics of the primary enzyme (i.e. enzyme involved in substrate conversion) alone, and (2) there is - for some reactions - a correlation between flux and co-substrate pool size. Both observations could indicate co-substrate pool sizes and/or co-substrate cycling dynamics being a main limiting factor for flux. We cannot state this as a certainty, however, as there are possibly other factors acting as the extra limitation, including thermodynamic effects. These points call for further experimental analysis of co-substrate cycling within the study of metabolic system dynamics.

      Reviewer 3 (Public Review):

      In the study, the authors present a mathematical framework and data analysis approach that revisits an "old" idea in cell physiology: The role of co-substrate cycling as potential key determinant of reaction flux limits in enzyme-catalyzed reaction systems. The aim of the study is to identify metabolic network properties that indicate potential global flux regulatory capacities of co-substrate cycling.

      The authors approached this aim in two steps. First, a mathematical framework based on ODEs was developed, which reflects small abstract metabolic pathways including kinetic parameters of the involved reactions. While the modeled pathways are abstract, the considered pathway motifs are motivated by structures of known existing pathways such as glycolysis (as an example of a linear pathway) and certain amino acid biosynthesis pathways (as an example of branched pathways). The developed ODE-based models were used for steady state analysis and symbolic and numerical simulations of flux dynamics. As a main result of the first step, the authors highlight that co-substrate cycling can act as a mechanism which limits specific metabolic fluxes across the metabolic network and that co-substrate cycling can facilitate flux regulation at branching points of the network. Second, the authors re-analyzed data on flux rates (experimental measurements and flux-balance-analysis predictions) from previous publications in order to assess whether the predicted role of co-substrate cycling could explain the observed flux distributions. In this data analysis, the authors provide evidence that the fluxes of specific reactions in central metabolism could be constrained by co-substrate cycling, because their observed fluxes are often lower than expected by the kinetics of the corresponding enzymes.

      A particular strength of the study is that the authors highlight that co-substrates are not limited to ATP and NAD(P)H, but could include a range of other metabolites, which could also be organism-specific. Building on this broad definition of co-substrates, the authors developed an abstract mathematical framework that can be used to study the general potential 'design principle' of co-substrate cycling in cellular metabolism and to adapt the framework to study different co-substrates in specific organisms in future works.

      Experimental data (i.e. measured fluxes using mass-spectrometry data and labeled substrates) that is available to date is limited and therefore also limits the broad evaluation of the developed mathematical framework across various different organisms and environmental conditions. However, with advances in metabolomics and derived metabolic flux measurements, the mathematical framework will serve as a valuable resource to understand the potential role of co-substrate cycling in more biological systems. The framework might also guide new experiments that generate data for a systematic evaluation of when and to what extent co-substrate cycling governs flux distributions, e.g. depending on growth rates or response to environmental stress.

      We thank the reviewer for this accurate summary. We agree with the reviewer's final comments on limitations of current testing of our theory, due to limitations in existing data, and that this analysis will now motivate further experimental study of co-substrate dynamics. We have revised the manuscript to further highlight and discuss limitations of the data-based analysis.

    1. Author Response

      Reviewer #1 (Public Review):

      This study investigates the psychological and neurochemical mechanisms of pain relief. To this end, 30 healthy human volunteers participated in an experiment in which tonic heat pain was applied. Three different trial types were applied. In test trials, the volunteers played a wheel of fortune game in which wins and losses resulted in decreases and increases of the stimulation temperature, respectively. In control trials, the same stimuli were applied but the volunteers did not play the game so that stimulation decreases and increases were passively perceived. In neutral trials, no changes of stimulation temperature occurred. The experiment was performed in three conditions in which either a placebo, a dopamine agonist, or an opioid antagonist was applied before stimulations. The results show that controllability, surprise, and novelty-seeking modulate the perception of pain relief. Moreover, these modulations are influenced by the dopaminergic but not the opioidergic manipulation.

      Strengths

      • The mechanisms of pain relief are a timely and relevant basic science topic with potential clinical implications.

      • The experimental paradigm is innovative and well-designed.

      • The analysis includes advanced assessments of reinforcement learning.

      Weaknesses

      • There is no direct evidence that the opioidergic manipulation has been effective. This weakens the negative findings in the opioid condition and should be directly demonstrated or at least critically discussed.

      We agree that we cannot provide direct evidence on the effectiveness of the opioidergic manipulation in our study. However, previous literature strongly suggests that a dose of 50 mg naltrexone (p.o.) is effective in blocking μ-opioid receptors in humans. Using positron emission tomography, Weerts et al. (2013) found a blockage of μ-opioid receptors of more than 90% with 50 mg naltrexone (p.o.), although given repeatedly 4 days in a row. In addition, convincing effects on behavioral functions have been reported with comparable doses that support the efficacy of the opioidergic manipulation. For example, Chelnokova et al. (2014) found attenuating effects of 50 mg naltrexone (p.o.) on wanting as well as liking of social rewards, implicating the involvement of endogenous opioids in the processing of rewarding stimuli. The same dose was also found to attenuate reward directed effort exerted in a value-based decision-making task (Eikemo et al., 2017). Moreover, 50 mg of naltrexone (p.o.) have been shown to reduce endogenous pain inhibition induced by conditioned pain modulation (King et al., 2013) and to reduce the perceived pleasantness of pain relief (Sirucek et al., 2021). Thus, based on the available literature we assume the effectiveness of our opioidergic manipulation. A corresponding reasoning, including a note of caution on the lack of a direct manipulation check of the opioidergic manipulation, can be found in the manuscript in the Discussion:

      “The doses and methods used here are comparable to those used in other contexts which have identified opioidergic effects. Using positron emission tomography, Weerts et al. (2013) found a blockage of opioid receptors of more than 90% by 50 mg of naltrexone (p.o.) in humans given repeatedly over 4 days. In addition, effects on behavioral functions have been reported with comparable doses that support the efficacy of the opioidergic manipulation. Chelnokova et al. (2014) found attenuating effects of 50 mg naltrexone (p.o.) on wanting as well as liking of social rewards, implicating the involvement of endogenous opioids in the processing of rewarding stimuli. The same dose was also found to attenuate reward directed effort exerted in a value-based decision-making task (Eikemo et al., 2017). Moreover, 50 mg of naltrexone (p.o.) have been shown to reduce endogenous pain inhibition induced by conditioned pain modulation (King et al., 2013). Thus, based on the literature we assume that the opioidergic manipulation was effective in this study, although we do not have a direct manipulation check of this pharmacological manipulation. Despite its effectiveness in blocking endogenous opioid receptors, the effect of naltrexone on reward responses was found to be small (Rabiner et al., 2011). Hence, a lack of power may have limited our chances to find such effects in the present study.”

      • The negative findings are exclusively based on the absence of positive findings using frequentist statistics. Bayesian statistics could strengthen the negative findings which are essential for the key message of the paper.

      We agree with the reviewers that the power may not have been sufficient to detect potentially small effects of the pharmacological manipulations. The power calculation was based on the design and the medium effect size found in a previous study using a comparable experimental procedure for assessing pain-reward interactions (Becker et al., 2015). To acknowledge this weakness, we clarified in the manuscript the description of the a priori sample size calculation as follows:

      “The power estimation was based on the design and the finding of a medium effect size in a previous study using a comparable version of the wheel of fortune game without pharmacological interventions (Becker et al., 2015). The a priori sample size calculation for an 80% chance to detect such an effect at a significance level of 𝛼=0.05 yielded a sample size of 28 participants (estimation performed using G*Power version 3.1 (Faul et al., 2007) for a repeated-measures ANOVA with a three-level within-subject factor).”

      Further, we did not aim to claim that endogenous opioids do not affect the perception of pain relief. Our phrasing in describing the results was in several instances too bold. The aim of the pharmacological manipulations was to investigate effects of dopamine and endogenous opioids on endogenous modulation of perceived intensity of pain relief. Here, we expected dopamine to enhance such endogenous modulation and naltrexone to reduce this modulation. The higher average pain modulation under naltrexone compared to placebo found in VAS ratings (naltrexone: -10.09, placebo: -7.31, see Table 1) suggests an increase in pain modulation by naltrexone compared to placebo, although this did not reach statistical significance, which is the opposite of what we had expected (see comment #11). Therefore, we concluded that we have no evidence to support our hypothesis of reduced endogenous modulation of pain relief by naltrexone. We do not want to claim that there are no effects of endogenous opioids on pain modulation. Although Bayesian statistics might be used to support such an interpretation, we think this might be misleading in our context here due to the considerations on the lack of power (which also affects null-hypothesis testing in Bayesian statistics) and the lack of a direct manipulation check mentioned above. Since we expected opposite effects of levodopa and naltrexone on pain modulation, we did not intend to compare these effects directly to avoid a distortion of the results. According to our hypotheses, we expected to see increased modulation of pain relief with enhanced dopamine availability and decreased modulation of pain relief with blocking of opioid receptors (see also comment #11). However, we had no a priori assumptions on potential differences in the absolute changes induced by the drug manipulations. Based on these considerations, we did not include further direct comparisons of the effects of both drugs. Rather, we carefully went through the manuscript to tone down the descriptions and interpretations of our null findings and adjusted the respective section of the discussion to better reflect this interpretation.

      • The effects were found in one (pain intensity ratings) but not the other (behaviorally assessed pain perception) outcome measure. This weakens the findings and should at least be critically discussed.

      We thank the reviewers for highlighting this important aspect. We have considered the two outcome measures as indicative of two different aspects or dimensions of the pain experience, based also on previous results in the literature. Within our procedure, the ratings indicate the momentary perception of the stimulus intensity after phasic changes in nociceptive input (outcomes), while the behavioral measure indicates perceptual within-trial sensitization or habituation in response to the tonic stimulation within each trial. Supporting the assumption of such two different aspects, it has been shown before that pain intensity ratings and behavioral discrimination measures can dissociate (Hölzl et al., 2005). In line with the assumption that both outcome measures assess different aspects of the pain experience, a differential effect of controllability on these two outcome measures is conceivable. Similarly, Becker et al. (2015), using a very similar experimental paradigm, did only find endogenous pain facilitation in the losing condition of the wheel of fortune game in pain ratings but not in the behavioral outcome measure, while they found endogenous inhibition in both measures. Compared to Becker et al. (2015), we implemented here smaller changes in stimulation intensity as outcomes in the wheel of fortune game (-3°C vs -7°C for win trials, +1°C vs +5°C for lose trials), potentially resulting in the differential effects here. Nevertheless, we agree that this reasoning needs a more explicit discussion in the manuscript and we included the following sentences to the Discussion section:

      “Although we did not assess the affective component of the relief experience, we implemented two outcome measures that are assumed to capture independent aspects of the pain experience: VAS ratings indicate perception of phasic changes (outcomes), while the behavioral measure indicates perceptual within-trial sensitization or habituation in response to the tonic stimulation within each trial. We found enhanced endogenous modulation by controllability and unpredictability in the VAS ratings, in line with the view that endogenous modulation enhances behaviorally relevant information. In contrast, the within-trial sensitization did not differ between the active and passive conditions under placebo. In contrast, in a previous study using a similar experimental paradigm Becker et al. (2015) found a reduction of within-trial sensitization after pain relief outcomes by controllability. Compared to this study, we implemented here smaller changes in stimulation intensity as outcomes in the wheel of fortune (-3 °C vs -7 °C for pain relief), potentially explaining the differential results.”

      • The instructions given to the participants should be specified. Moreover, it is essential to demonstrate that the instructions do not yield differences in other factors than controllability (e.g., arousal, distraction) between test and control trials. Otherwise, the main interpretation of a controllability effect is substantially weakened.

      Thanks for pointing out that specific information on instructions given to the participants was missing. We agree that factors other than controllability would confound the interpretation of differences between test and control trials. With our instructions, we aimed to minimize nonspecific effects of arousal and/or distraction while still giving all needed information (see below). In addition, control and test trials were kept as similar as possible. In order to check for unspecific effects of arousal and/or distraction, we also included lose trials in the game as an additional control condition. To clarify the participants’ instructions, we added the following paragraph to the Materials and methods section: “The participants were instructed that there were two types of trials: trials in which they could choose a color to bet on the outcome of the wheel of fortune and trials in which they had no choice. Specifically, they were told that in the first type of trials they could use the left and right mouse button, respectively, to choose between the pink and blue section of the wheel of fortune. Participants were further instructed that if the wheel lands on the color they had chosen they will win, i.e. that the stimulation temperature will decrease, while if the wheel lands on the other color, they will lose, i.e. that the stimulation temperature will increase. For the second type of trials, participants were instructed that they could not choose a color, but were to press a black button, and that after the wheel stopped spinning the temperature would by chance either increase, decrease, or remain constant.”

      In general, both arousal and distraction can be assumed to affect pain perception. If the active condition in the wheel of fortune resulted in higher arousal and/or distraction this should result in comparable effects on intensity ratings in both the win and lose outcomes compared to the passive condition. In contrast, controllability is expected to have opposite effects on pain perception in win and lose trials (decreased pain perception after winning and increased pain perception after losing in the active compared to the passive condition). These opposite effects of controllability are tested by the interaction ‘outcome × trial type’ when fitting separate models for each drug condition, which should be zero if unspecific effects of arousal and/or distraction predominated. Instead, we found a significant interaction in these models, confirming opposing effects of controllability in win and lose outcomes and contradicting such unspecific effects. We added this reasoning, marked in red here, to the Results section to better highlight this line of reasoning, as follows:

      “To test whether playing the wheel of fortune induced endogenous pain inhibition by gaining pain relief during active (controllable) decision-making, a test condition in which participants actively engaged in the game and ‘won’ relief of a tonic thermal pain stimulus in the game was compared to a control condition with passive receipt of the same outcomes (Figure 1). As a further comparator the game included an opposite (‘lose’) condition in which participants received increases of the thermal stimulation as punishment. This active loss condition was also matched by a passive condition involving receipt of the same course of nociceptive input. Comparing the effects of active versus passive trials between the pain relief and the pain increase condition (interaction ‘outcome × trial type’) allowed us to test for unspecific effects such as arousal and/or distraction. If effects seen in the active compared to the passive condition were due to such unspecific effects, then actively engaging in the game should affect comparably pain in both win and lose trials. In contrast, if the effects were due to increased controllability, pain inhibition should occur in win trials and pain facilitation in lose trials.”
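
      The logic of the ‘outcome × trial type’ interaction can be sketched with a simple 2 × 2 contrast on cell means. The function name and the rating values below are hypothetical and only illustrate the reasoning; the actual analysis used mixed models fitted separately for each drug condition.

```python
def interaction_contrast(active_win, passive_win, active_lose, passive_lose):
    """Difference of the active-minus-passive effect between win and lose trials."""
    return (active_win - passive_win) - (active_lose - passive_lose)

# Hypothetical mean pain ratings on a 0-100 scale.
# Unspecific effect (e.g. distraction): active play lowers ratings equally
# after wins and losses, so the interaction contrast is zero.
unspecific = interaction_contrast(45.0, 50.0, 65.0, 70.0)

# Controllability: active play lowers ratings after wins but raises them
# after losses, so the interaction contrast is non-zero.
controllability = interaction_contrast(45.0, 50.0, 75.0, 70.0)

print(unspecific, controllability)
```

      A contrast near zero would be expected if arousal or distraction dominated, whereas opposing shifts in win and lose trials produce a clearly non-zero contrast.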

      • The blinding assessment does not rule out that the volunteers perceived the difference between placebo on the one hand and levodopa/naltrexone on the other hand. It is essential to directly show that the participants were not aware of this difference.

      We based our assessment of blinding on the fact that for none of the drug conditions the frequency of guessing correctly which drug was ingested was above chance (see Results section, page 8, lines 201ff). In addition, the frequency of side effects reported by the participants did not differ between the three drug conditions, supporting this notion indirectly. However, we agree with the reviewer that this does not rule out completely that participants may have perceived a difference between the placebo and the levodopa/naltrexone conditions. We ran additional analyses to test whether participants were more likely to answer correctly that they had ingested an active drug and whether they were more likely to report side effects in the active drug conditions compared to the placebo condition. In 7 out of 28 placebo sessions (25%) the participants incorrectly assumed that they had ingested one of the active drugs. In 12 out of 55 drug sessions (21.8%) the participants correctly assumed that they had ingested one of the active drugs. These frequencies did not differ between placebo sessions on the one hand and the levodopa and naltrexone active drug sessions on the other hand (χ²(1) = 0.11, p = 0.737). In 9 out of 28 placebo sessions (32.1%) and in 23 out of 55 drug sessions (41.8%) participants reported being tired at the end of the session. The frequency of reporting tiredness did not significantly differ between placebo sessions on the one hand and drug sessions on the other hand (χ²(1) = 1.06, p = 0.304). No other side effects were reported. We added the following information, marked in red here, to the Results section:

      “In 32 out of 83 experimental sessions subjects reported tiredness at the end of the session. However, the frequency did not significantly differ between the three drug conditions (χ²(2) = 2.17, p = 0.337) or between the placebo condition compared to the levodopa and naltrexone condition (χ²(1) = 1.06, p = 0.304). No other side effects were reported. To ensure that participants were kept blinded throughout the testing, they were asked to report at the end of each testing session whether they thought they received levodopa, naltrexone, placebo, or did not know. In 43 out of 83 sessions that were included in the analysis (52%), participants reported that they did not know which drug they received. In 12 out of 28 sessions (43%), participants were correct in assuming that they had ingested the placebo, in 6 out of 27 sessions (22%) levodopa, and in 2 out of 28 sessions (7%) naltrexone. The amount of correct assumptions differed between the drug conditions (χ²(2) = 7.70, p = 0.021). However, posthoc tests revealed that neither in the levodopa nor in the naltrexone condition participants guessed the correct pharmacological manipulation significantly above chance level (p’s > 0.997) and the amount of correct assumptions did not differ significantly between placebo compared to levodopa and naltrexone sessions (χ²(1) = 0.11, p = 0.737), suggesting that the blinding was successful.”

      • The effects of novelty seeking have been assessed in the placebo and the levodopa but not in the naltrexone conditions. This should be explained. Assessing novelty seeking effects also in the naltrexone condition might represent a helpful control condition supporting the specificity of the effects in the naltrexone condition.

      We thank the reviewer for this interesting suggestion. Indeed, we did not report the association of pain modulation with novelty seeking in the naltrexone condition, because we did not have an a-priori hypothesis for this relationship. We now included correlations for all three drug conditions, testing if higher novelty seeking was associated with greater perceptual modulation in the active vs. passive condition. In line with comment 3, we applied a correction for multiple comparisons here (Bonferroni-Holm correction). This correction caused the correlation in the placebo condition to be no longer significant with an adjusted p-value of 0.073 (r = -0.412), while the correlation stays significant in the levodopa condition (r = -0.551, p = 0.013). Because of a reasonable effect size of the correlation under placebo (i.e. r = -0.412), we still report this correlation to highlight the increase under levodopa, while emphasizing that this correlation is not significant. We carefully toned down the interpretation of this correlation to reflect the change in significance with the correction for multiple testing.
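
      The Bonferroni-Holm step-down correction mentioned above can be sketched as follows. This is a minimal illustration only: the function name `holm_adjust` and the raw p-values are hypothetical and are not taken from the study, which does not report uncorrected values here.

```python
def holm_adjust(p_values):
    """Return Holm (Bonferroni-Holm) adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p-value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three correlations (illustration only).
raw = [0.030, 0.004, 0.400]
adjusted = holm_adjust(raw)
print(adjusted)
```

      In this hypothetical case, a raw p-value of 0.030 no longer passes the 0.05 threshold after adjustment, which is the kind of change in significance described in the response.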

      We added the following information, marked in red here, in the Results section:

      “Previous data suggest that endogenous pain inhibition induced by actively winning pain relief is associated with a novelty seeking personality trait: greater individual novelty seeking is associated with greater relief perception (pain inhibition) induced by winning pain relief (Becker et al., 2015). Similar to these results, we found here that endogenous pain modulation, assessed using self-reported pain intensity, induced by winning was associated with participants’ scores on novelty seeking in the NISS questionnaire (Need Inventory of Sensation Seeking; Roth & Hammelstein, 2012; subscale ‘need for stimulation’ (NS)), although this correlation failed to reach statistical significance after correction for multiple comparisons using the Bonferroni-Holm method (r = -0.412, p = 0.073). A significant association between novelty seeking and endogenous pain modulation was found in the levodopa condition (r = 0.551, p = 0.013). More importantly, the higher a participant’s novelty seeking score in the NISS questionnaire, the greater the levodopa-related endogenous pain modulation when winning compared to placebo (NISS NS: r = -0.483, p = 0.034; Figure 7). In contrast, higher novelty seeking scores were not correlated with stronger pain modulation induced by winning in the naltrexone condition (r = 0.153, p = 0.381) and the naltrexone-induced change in pain modulation showed no significant association with novelty seeking (r = 0.239, p = 0.499). Pain modulation after losing was not associated with novelty seeking in placebo (r = 0.083, p = 0.866), levodopa (r = -0.164, p = 0.783), or naltrexone (r = 0.405, p = 0.133).

      No significant correlations with NISS novelty seeking score were found for behaviorally assessed pain modulation in the placebo, levodopa and naltrexone conditions during pain relief or pain increase (|r|’s < 0.35, p’s > 0.238). Similarly, the difference in pain modulation during pain relief or pain increase between the levodopa and the placebo condition and between the naltrexone and the placebo condition also did not correlate with novelty seeking (|r|’s < 0.22, p’s > 0.576).”

      We also edited the interpretation of the correlation in the Discussion:

      “Overall, all three predictions were largely borne out by the data: relief perception as measured by VAS ratings was enhanced by controllability, unpredictability and showed a medium sized - although not significant - association with the individual novelty-seeking tendency,”

      • The writing of the manuscript is sometimes difficult to follow and should be simplified for a general readership. Sections on the information-processing account of endogenous modulation in the introduction (lines 78-93), unpredictability and endogenous pain modulation in the results (lines 278-331) are quite extensive and add comparatively little to the main findings. These sections might be shortened and simplified substantially. Moreover, providing a clearer structure for the discussion by adding subheadings might be helpful.

      We have reworked the manuscript to make it easier to follow. Specifically, we reworked the Introduction section to simplify it and to make it more concise. Further, we also shortened the extensive descriptions of modeling procedures that are not central for understanding the main findings. We think that these changes make it easier to follow the manuscript and our line of arguments, and to understand the applied analysis strategies.

      • Effect sizes are generally small. This should be acknowledged and critically discussed. Moreover, effect sizes are given in the figures but not in the text. They should be included to the text or at least explicitly referred to in the text.

      We agree that the effect sizes we report appear generally small. Importantly, the effect sizes were calculated by dividing differences in marginal means by the pooled standard deviation of the residuals and the random effects to obtain an estimate of the effect size of the underlying population rather than only for our sample. This procedure was used for the purpose of achieving more generalizable estimates. Due to considerable variance between subjects in our sample, this procedure resulted in comparatively small effect sizes. Nevertheless, we think this calculation of effect sizes results in more informative values because they can be viewed as estimates of population effects. We added specific information on the calculation of the effect sizes and a brief explanation that this procedure results in comparatively small effect size estimates to the Materials and methods and to the Results section (see below). In addition, we included standardized effect sizes whenever we report the respective post-hoc comparisons in the Results section.

      “Effect sizes were calculated by dividing the difference in marginal means by the pooled standard deviation of the random effects and the residuals providing an estimate for the underlying population (Hedges, 2007).” (Materials and methods section)

      “We used post-hoc comparisons to test direction and significance of differences in either outcome condition and report standardized effect sizes for these differences. Note that all reported effect sizes account for random variation within the sample, providing an estimate for the underlying population; due to considerable variance between participants in the present study, this results in comparatively small effect sizes.” (Results section)
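
      The effect size calculation described above can be illustrated with a short sketch. It assumes that the "pooled standard deviation" is the square root of the summed random-effect and residual variances, which is one common reading of this description and not necessarily the authors' exact procedure. The function name and all numbers are hypothetical.

```python
import math


def standardized_effect(mean_a, mean_b, random_effect_variances, residual_variance):
    """Difference in marginal means divided by a pooled standard deviation.

    Assumption: the pooled SD is the square root of the sum of all
    random-effect variances plus the residual variance.
    """
    total_variance = sum(random_effect_variances) + residual_variance
    return (mean_a - mean_b) / math.sqrt(total_variance)


# Hypothetical values: marginal means of 52 and 48 on a 0-100 rating scale,
# two random-effect variance components and a residual variance.
d_population = standardized_effect(52.0, 48.0, [150.0, 50.0], 200.0)

# For comparison: standardizing by the residual variance alone ignores
# between-participant variability and gives a larger value.
d_residual_only = standardized_effect(52.0, 48.0, [], 200.0)

print(d_population, d_residual_only)
```

      With these made-up numbers, including the between-participant variance in the denominator shrinks the standardized effect, which matches the authors' explanation of why their reported effect sizes are comparatively small.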

      • The directions of dopamine and opioid effects on pain relief should be discussed.

      We amended our explanation of the hypothesis on the expected drug effects. As outlined there, we indeed expected opposite effects of levodopa and naltrexone on endogenous pain modulation in the active vs. the passive condition of the wheel of fortune.

      Reviewer #2 (Public Review):

      This study used the tonic heat stimulation combined with the probabilistic relief-seeking paradigm (which is a wheel of fortune gambling task) to manipulate the level of controllability and predictability of pain on 30 healthy participants. The authors focused on the influence of controllability and unpredictability on pain relief using pain reports and computational models and examined the involvement of dopamine and opioids in those effects. For that, the authors conducted the three-day experiments, which involved placebo, levodopa (dopamine precursor), and naltrexone (opioid receptor antagonist) administration on separate days. Lastly, the authors examined the relationship between dopamine-induced pain relief and novelty-seeking traits.

      This is a strong and well-performed study on an important topic. The paper is well-written. I really enjoyed reading the introduction and discussion and learned a lot. Below, I have a few minor comments.

      First, given that the Results section comes before the Methods section, it would be helpful to include some method and experimental design-related information crucial for the understanding of the results in the Results section. For example, how long was the thermal stimulus? What was the baseline temperature? etc. Maybe this information can be included in the caption of Figure 1.

      We thank the reviewer for this helpful suggestion. We agree that due to the order of the manuscript sections, more information on experimental design and the statistical analysis strategies should be included in the results section. Accordingly, we included more detailed information on the analysis strategies in the Results section (please see responses to comments #5 & #9). In addition, we added more detailed information on the experimental design and information such as the duration of the stimuli and the baseline temperature, marked in red below, to the caption of Figure 1 (Results section).

      “Figure 1: Time line of one trial with active decision-making (test trials) of the wheel of fortune game. Experimental pain was implemented using contact heat stimulation on capsaicin sensitized skin on the forearm. In each trial, the temperature increased from a baseline of 30 °C to a predetermined stimulation intensity perceived as moderately painful. In each testing session, one of the two colors (pink and blue) of the wheel was associated with a higher chance to win pain relief (counterbalanced across subjects and drug conditions). Pain relief (win) as outcome of the wheel of fortune game (depicted in green) and pain increase (loss; depicted in red) were implemented as phasic changes in stimulation intensity offsetting from the tonic painful stimulation. Based on a probabilistic reward schedule for these outcomes, participants could learn which color was associated with a better chance to win pain relief. In passive control trials and neutral trials participants did not play the game, but had to press a black button after which the wheel started spinning and landed on a random position with no pointer on the wheel. Trials with active decision-making were matched by passive control trials without decision making but the same nociceptive input (control trials), resulting in the same number of pain increase and pain decrease trials as in the active condition. In neutral trials the temperature did not change during the outcome interval of the wheel. Two outcome measures were implemented in all trial types: i) after the phasic changes during the outcome phase participants rated the perceived momentary intensity of the stimulation on a visual analogue scale (‘VAS intensity’); ii) after this rating, participants had to adjust the temperature to match the sensation they had memorized at the beginning of the trial, i.e. the initial perception of the tonic stimulation intensity (‘self-adjustment of temperature’). This perceptual discrimination task served as a behavioral assessment of pain sensitization and habituation across the course of one trial. One trial lasted approximately 30 s, phasic offsets occurred after approximately 10 s of tonic pain stimulation. Adapted from Becker et al. (2015).”

      Second, it would be helpful if the authors could provide their prior hypotheses on the drug effects. The goal of using these drugs could be a little confusing given that levodopa is a precursor of dopamine, whereas naltrexone is an opioid antagonist, i.e., the effects on the target neurotransmitters seem to be opposite. I therefore wondered whether the authors expected to see opposite effects, e.g., levodopa enhances pain relief while naltrexone inhibits pain relief, or similar effects, e.g., both enhance pain relief. Clarifying the expected direction of the effects would be helpful for novice readers.

      We thank the reviewer for pointing out that information on the expected drug effects should be explained in more detail. Indeed, we expected opposite effects of levodopa and naltrexone with respect to the effect of controllability on pain relief. Levodopa, as a precursor of dopamine, enhances dopamine availability and thus, phasic release of dopamine in response to events, for example, the reception of reward. Accordingly, we hypothesized that endogenous modulation by relief outcomes is increased in the active (reward) compared to the passive condition. In contrast, naltrexone blocks opioid receptors and as such it has been reported that naltrexone blocks placebo analgesia as a type of endogenous pain inhibition. Correspondingly, we hypothesized that naltrexone decreases endogenous pain modulation induced by actively winning pain relief compared to the passive condition. We expanded the explanation of these hypotheses in the Introduction section as follows:

      “We expected increased dopamine availability to enhance phasic release of dopamine in response to rewards, and hence, to increase the effect of active compared to passive reception of pain relief. In contrast, we expected the inhibition of endogenous opioid signaling to decrease the effect of active controllability on pain relief. The latter is based on the observation that blocking of opioid receptors attenuates other types of endogenous pain inhibition such as placebo analgesia (Benedetti, 1996; Eippert et al., 2009) or conditioned pain modulation (King et al., 2013).”

      Third, on the "Behaviorally assessed pain perception" results in Figs. 2D-F, I wonder why the results for the "pain increase" were still positive. Were the y values on the plots the temperature that participants adjusted (i.e., against the temperature right before the temperature adjustment)? or are the values showing the differences from the baseline (i.e., against the baseline temperature)?

      The behavioral measure was calculated as the difference in temperatures between the memorization interval at the beginning of the trial (i.e. the predetermined temperature perceived as moderately painful) minus the self-adjusted temperature at the end of the trial so that positive values indicate sensitization (i.e. an increase in sensitivity) and negative values indicate habituation (i.e. a decrease in sensitivity) across the stimulation within one trial (i.e. approx. 30 seconds of stimulation). In general, for a stimulation of approximately 30 seconds with intensities perceived as painful, perceptual sensitization is expected to occur (Kleinböhl et al., 1999).

      The outcome of the wheel of fortune game, i.e. the phasic decrease (winning) or increase (losing) in stimulation intensity, should indeed have opposite effects on this sensitization. A decrease in nociceptive input negatively reinforces pain perception, as seen in stronger sensitization in win trials, while an increase in nociceptive input punishes pain perception, as seen in reduced perceptual sensitization in lose trials. Using a very similar task, Becker et al. (2015) found values indicating habituation within trials with temperature increases in lose outcomes. However, in this previous study, increases of +5 °C were used for lose outcomes (as compared to +1 °C in the present study). Thus, in the present study the comparatively small increase in absolute stimulation temperature may not have been sufficient to induce within trial habituation to the tonic heat pain stimulation.

      Nevertheless, independent of the effect of the outcome (increase or decrease of the stimulation intensity) our focus was on the additional effect that controllability (active vs. passive condition) had on the perception of the underlying tonic stimulation within each outcome condition (i.e. on the same nociceptive input). Here we expected to see endogenous inhibition after winning and endogenous facilitation after losing in the active compared to the passive condition.

      We added more detailed information on the calculation of the behavioral measure and the expected perceptual modulation within each trial due to the stimulus duration in the Methods section as well as in the Results section.

      Methods section:

      “After this rating, participants had to adjust the stimulation temperature themselves to match the temperature they had memorized at the beginning of the trial. This self-adjustment operationalizes a behavioral assessment of perceptual sensitization and habituation within one trial (Becker et al., 2011, 2015; Kleinböhl et al., 1999). Participants adjusted the temperature using the left and right button of the mouse to increase and decrease the stimulation temperature. The behavioral measure was calculated as the difference in temperatures in the memorization interval at the beginning of each trial minus this self-adjusted temperature at the end of each trial. Positive values, i.e. self-adjusted temperatures lower than the stimulation intensity at the beginning of the trial, indicate perceptual sensitization, while negative values indicate habituation.”

      Results section:

      “Positive values (i.e. lower self-adjusted temperatures compared to the stimulation intensity at the beginning of the trial) indicate perceptual sensitization across the course of one trial of the game, negative values indicate habituation. For tonic stimulation at intensities that are perceived as painful, perceptual sensitization is expected to occur (Kleinböhl et al., 1999). Differences between the outcome conditions (win, lose) reflect the effect of the phasic changes on the perception of the underlying tonic stimulus. Differences between active and passive trials reflect the effect of controllability on this perceptual sensitization within each outcome condition.”
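
      The sign convention of the behavioral measure described above can be made explicit with a small sketch. The function names and the example temperatures are hypothetical and serve only to illustrate the direction of the difference.

```python
def behavioral_measure(initial_temp_c, self_adjusted_temp_c):
    """Initial (memorized) temperature minus the self-adjusted temperature, in °C."""
    return initial_temp_c - self_adjusted_temp_c


def interpret(delta_c):
    """Map the sign of the measure onto the interpretation given in the text."""
    if delta_c > 0:
        return "sensitization"  # lower self-adjusted temperature than at trial start
    if delta_c < 0:
        return "habituation"    # higher self-adjusted temperature than at trial start
    return "no change"


# Hypothetical trials: stimulation starts at 45.0 °C in both cases.
lower_adjustment = behavioral_measure(45.0, 44.5)   # participant adjusts downward
higher_adjustment = behavioral_measure(45.0, 45.5)  # participant adjusts upward

print(lower_adjustment, interpret(lower_adjustment))
print(higher_adjustment, interpret(higher_adjustment))
```

      A positive value in a lose trial therefore still indicates sensitization relative to the start of the trial, which is why the values for "pain increase" in the plots can remain above zero.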

      Lastly, I wonder if it is feasible or not, but examining the effects of dopamine antagonists will be helpful for obtaining a more definitive answer to the role of dopamine in information-related pain relief. This could be a good suggestion for future studies.

      We thank the reviewer for this suggestion. We agree that antagonistic manipulation of the dopaminergic system could provide further insights and confirm the role of dopamine in shaping pain related perception and behavior. Moreover, we think that bidirectional manipulations of opioidergic signaling could also provide valuable insights and should be used for future research. We added the following sentences to the Discussion section:

      “Because the mechanisms underlying learning from pain and pain relief and their recursive influence on pain perception may contribute to the development and maintenance of chronic pain, it is crucial to better understand the roles of dopamine and endogenous opioids in these mechanisms. Accordingly, bidirectional manipulations of both transmitter systems should be used in future studies to better characterize their respective roles in shaping behavior and perception.”

    1. eLife assessment

      This manuscript reports novel and important findings on the mechanisms of regulation of CRAC channels. Collectively, the work represents an important conceptual advancement, showing that stromal interaction molecule-1 is not necessary for Ca2+-dependent inactivation of the Orai1 channel and that Orai1 likely contains a Ca2+ sensor for autoregulation. The experiments are carefully conducted, and the data are of high quality and support the major conclusions of the authors.

    2. Reviewer #1 (Public Review):

      In this report, Yeung et al studied a mutation in Orai1 channels (L138F) that is associated with constitutive CRAC channel activity and tubular aggregate myopathy (TAM) in humans. They put forth a model whereby substitution with large amino acids at position L138 on TM2 or the neighboring T92 on TM1 causes a steric clash between TM1 and TM2 and elicits a highly Ca2+ selective current in the absence of STIM1, the ER Ca2+ sensor protein that is the physiological activator of Orai channels. The authors went on to study one typical biophysical property of Orai1-mediated CRAC channels which is the fast Ca2+-dependent inactivation (CDI), after the surprising finding of the presence of CDI in CRAC currents mediated by T92 and L138 Orai1 mutants in the absence of STIM1. The authors showed differences in CDI between WT and mutants when using weak vs strong buffers and through computation and experimentation, they show that the Orai1 mutants have enhanced cytosolic Ca2+ sensitivity, which could be normalized when STIM1 was present. The experiments are carefully conducted and the manuscript is clearly written. The study has significant novelty and impact.

    3. Reviewer #2 (Public Review):

      The manuscript "A human tubular aggregate myopathy mutation unmasks STIM1-independent rapid inactivation of Orai1 channels" describes the effects of a disease-related gating checkpoint at the TM1-TM2 interface. The authors suggest that the mutation of one of the two oppositely located positions T92 - L138 into a large amino acid leads to constitutive activity due to steric clash. Notably, the mutants also exhibit robust Ca2+ dependent inactivation (CDI) suggesting that this feature is intrinsic to the Orai1 channel, and not as previously thought a key process that is triggered by STIM1. Nevertheless, STIM1 is able to fine-tune Ca2+ selectivity and CDI.

      This study provides an extensive electrophysiological characterization of the tubular aggregate myopathy (TAM)-disease-related Orai1 L138F mutation and based on mutational studies provides compelling evidence that constitutive activity is caused by a steric clash between TM1/TM2 Orai helices. Additionally, yet unexpectedly, the constitutive Orai1 mutants exhibit CDI behavior which is thoroughly characterized by experiments using various intracellular Ca2+-buffering reagents. By this, it is proposed that the Orai1 T92W mutant shows increased sensitivity to intracellular Ca2+. This is further revealed in a sophisticated two-step protocol, which would profit from additional control experiments. The unusual behavior of the T92W Orai1 mutant is "corrected" to that of the Orai1 wild-type form by the presence of STIM1.

    4. Reviewer #3 (Public Review):

      In this paper, Yeung et al., use patch-clamp electrophysiology measurements combined with structural analyses and mutagenesis to compellingly reveal how the tubular aggregate myopathy (TAM)-associated Orai1 L138F mutation leads to the gain of CRAC channel function. They discover that L138F not only constitutively activates Orai1-composed channels but also enhances Ca2+-dependent inactivation (CDI). The authors find that the L138F gain of function occurs due to a steric clash with T92 from an adjacent subunit and that introduction of a bulky residue at the T92 position similarly activates CRAC channels and enhances CDI in the absence of STIM1. Nevertheless, co-expression of STIM1 with strongly activating T92W or L138F mutants regularized the CDI to wild-type levels. Collectively, the work represents an important conceptual advancement, exposing that STIM1 is not necessary for CDI and that Orai1 likely contains the Ca2+ sensor intrinsically for this phenomenon.

      Strengths:

      The authors use rigorous and careful electrophysiological measurements to probe how the TAM-related mutation (L138F) affects the biophysical properties of CRAC channels. The extensive and systematic mutagenesis (i.e. substitution to every possible amino acid at the T92 and L138 sites) coupled with these functional assessments reveal a steric clash between L138F and T92 and provide a complete picture of how any residue type at the so-called T92/L138 lever point may contribute to constitutive CRAC and CDI activity. The use of available high-resolution structural data to interpret functional data, rationalize the consequence of new mutations related to the mechanisms of L138F dysfunction, and generate new hypotheses is a strength of the research. Overall, the work provides a considerable conceptual advance in terms of understanding the molecular requirements for CRAC and CDI activity; in particular, the discovery that CDI can occur independently of STIM1 and the notion that Orai1 may contain an intrinsic Ca2+ sensor that regulates CDI are important steps forward for the field.

      Weaknesses:

      While the work provides a phenomenological advancement regarding CRAC channel regulation and pinpoints new important residues for function, some aspects of the study appear incomplete. It was shown that STIM1 can normalize the enhanced CDI caused by the T92W mutation, but it is not clear how this happens. Further, the authors propose a "push" - "pull" mechanism for the complementary roles L138 and H134 in channel regulation but do not provide any structural dynamics data to support this idea. The authors provide a mathematical explanation for chelator-specific differences in CDI observed for the T92W compared to WT Orai1 but do not show any fitted data to accompany and support the model. Finally, the authors show that a considerable portion of the CDI can be eliminated after a C-terminal Orai1 deletion (i.e. residues 267-301) and probe the idea that N-terminal W76, Y80, and R83 residues may contribute to the residual CDI effect; however, after W76E, Y80E, R83E mutations showed enhanced CDI (rather than suppressed) in the context of the T92W mutation, no further experiments were pursued to account for the residual CDI.

      Overall, the strengths far outweigh the weaknesses of this study, and the conclusions drawn based on the data are compelling. The work represents an important conceptual advancement as future studies can now steer towards identifying the STIM-independent Ca2+ sensor underlying the CDI of CRAC channels and revealing structural mechanisms by which Ca2+ sensing leads to pore closure.

    1. eLife assessment

      This important paper advances the understanding, in the model plant Arabidopsis thaliana, of the molecular basis of the promotion of flowering in the spring by exposure to winter cold through a process known as vernalization. In Arabidopsis, there are two classes of long non-coding RNAs produced only when plants are in the cold, and this work provides compelling evidence that the cold-induced expression of one of these (COOLAIR) involves C-repeat binding factor proteins that bind to cognate binding elements in the COOLAIR promoter, but also that COOLAIR is not required for the vernalization-mediated promotion of flowering under standard laboratory conditions in which the vernalization response is measured.

    2. Reviewer #1 (Public Review):

      FLOWERING LOCUS C (FLC) is a key repressor of flowering in Arabidopsis thaliana. FLC expression creates a requirement for vernalization which is the acquisition of competence to flower after exposure to the prolonged cold of winter. Vernalization in Arabidopsis and other Brassicas results in the suppression of FLC expression.

      How exposure to winter cold initiates the vernalization process (i.e., the silencing of FLC) is not fully understood. It is known that cold exposure causes several long non-coding RNAs, including COOLAIR and COLDAIR, to be transcribed from FLC. This work shows that COOLAIR induction by cold requires the binding of CRT/DRE-binding factors (CBFs) to their cognate recognition elements, which reside at the 3' end of the FLC locus. The authors demonstrate this regulation in many ways including studying the effect on vernalization of knocking out all CBFs and also by showing that constitutive CBF expression causes COOLAIR levels to be elevated even without cold exposure. Intriguingly, plants with genetic alterations that eliminate COOLAIR expression (loss of CBF activity and FLC deletion mutants that eliminate COOLAIR expression) do not have a significant impairment in becoming vernalized.

      The work appears to be done properly and provides much important information about how this remarkable environmentally-induced epigenetic switch operates.

    3. Reviewer #2 (Public Review):

      Here the authors questioned the regulation and functional roles of anti-sense transcripts at the 3'end of an important flowering-time regulator FLC.

      The authors present compelling genetic, molecular biology, transgene, and biochemical data on the molecular details of how COOLAIR is induced by cold temperatures. They report that cold-induction of COOLAIR is mediated by C-repeat/dehydration-responsive elements (CRT/DREs) at the 3'-end of the FLC and relatively small deletions of the CRT/DREs prevent cold-induction of COOLAIR. They also report that long-term cold results in an increase in the expression of CRT/DRE BINDING FACTORs (CBFs) that bind to the CRT/DREs and result in the activation of genes containing CRT/DREs.

      Interestingly, in lines in which COOLAIR is not induced the vernalization proceeds normally with respect to flowering behavior and cold-mediated FLC chromatin changes, a result that is at odds with some publications but consistent with other reports.

      The major strength of this research is the comprehensive battery of relevant assays used to address the authors' aim. Using ChIP, they demonstrate that CBF3 directly binds to the 3' end of FLC in vivo and, of less interest but still very relevant, that CBF3 binds to CRT/DRE motif-containing oligonucleotides in vitro using an EMSA. Using CRISPR-mediated genetic deletion of these sequences in vivo, they demonstrate that the downstream antisense transcripts are no longer transcribed. Interestingly, in these CRISPR mutants and in higher-order CBF mutants, the vernalisation response (chromatin modifications) is not impaired. They also show that CBF mRNA transcription occurs in at least two waves: an early peak and a later wave over a prolonged cold period.

      Although the CRISPR deletion mutants are relatively small (a few hundred base pairs), ideally they would have been smaller still, encompassing only the CRT/DRE motif.

      The authors clearly achieved their aims and the presented results strongly support their conclusions. The compelling data clearly question a widely held view in the vernalisation field. The presented methods are widely transferable to a broader research community.

    4. Reviewer #3 (Public Review):

      The authors start by examining the COOLAIR promoter and identifying a CRT/DRE motif that is bound by the CBF transcription factor family, which is involved in the short-term cold response. This is confirmed by gel shift assays and chromatin immunoprecipitation. However, it should be noted that the gel shift assays are in vitro assays and the chromatin immunoprecipitation is carried out with plants over-expressing CBF3-myc from the pSuper promoter, so neither necessarily reflects the native state. The authors then examine COOLAIR expression in lines over-expressing each of the three CBF proteins of Arabidopsis and find COOLAIR expression elevated in the warm in all three, but with small differences in the variants of COOLAIR that are expressed. Examination of COOLAIR expression after short-term cold shows that transcript abundance increases after 6 hours; this increase was not observed in the cbfs-1 mutant, in which all three CBFs are knocked out. Taken together, this provides good evidence that COOLAIR transcription is rapidly induced via CBFs on exposure to cold.

      The authors then go on to look at the roles of CBFs in longer-term cold. COOLAIR has previously been shown to increase during long-term cold (multiple weeks duration), so the question was whether this increase is CBF-dependent. The increase in COOLAIR abundance is similar to other CBF targets but does begin to decline with 40-day cold periods, presumably reflecting the shutdown of the FLC locus. The lack of COOLAIR expression in the cbfs-1 mutant is good evidence that increased COOLAIR expression is CBF-dependent. The authors also present evidence that CBFs are required for COOLAIR induction by the first seasonal frost, which is consistent with this being a short-term cold response.

      The authors then examine deletions of the COOLAIR promoter. In agreement with the hypothesis that CBFs regulate COOLAIR transcription via the CRT/DREs in the COOLAIR promoter, deletions that remove the two elements do not show cold induction of COOLAIR, while one that retains them does. It should be noted that these deletions are relatively coarse, so they could also remove elements other than the CRT/DREs.

      The authors then use the finding that COOLAIR is not induced in the cbfs-1 mutant or in the deltaCOOLAIR1 and 3 lines to ask whether COOLAIR is required for the repression of FLC in the vernalization response. The data in Figures 6 and 7 show that these lines do not respond differently to vernalization treatment in terms of FLC expression, FLC chromatin modifications, or flowering time/leaf number to flowering. This supports the conclusion that the COOLAIR transcript does not play an essential role in the vernalization response.

      The Discussion is well-balanced, considers previous publications in this area, and highlights differences with this study. The conservation of COOLAIR in other Brassica species suggests that it does have a biological function, but the data here suggest it is not an essential component of the vernalization response. The authors consider the possibility that COOLAIR has a function under more natural conditions, where the temperature fluctuates diurnally during the vernalization period. When the data presented here are taken together with other publications, the precise biological role of COOLAIR remains enigmatic.

    1. Author Response

      Joint Public review:

      1) Line 215: The authors state that pairing TCRseq with RNAseq reflects the magnitude of TCR signaling. This is absolutely not the case. TCR sequencing does not reflect TCR signaling strength.

      Thank you for the comments; we apologize for this misleading description. In this part, we were trying to quantitatively assess the activation states of CD8 T cells based on the average expression of previously described activation-related gene signatures1 (also shown in Supplementary file 3). TCRseq data were therefore not involved in this analysis, and the magnitude of TCR signaling could not be reflected either. We apologize again for this mistake and have corrected the corresponding texts and figures as follows (line 210-217): "Meanwhile, the activation states of CD8 T cell subpopulations were quantitatively assessed based on the average expression of previously described activation-related gene signatures1 (also shown in Supplementary file 3). Our results showed that the T-Tex cluster was the most activated, followed by the two P-Tex clusters (Fig. 2b left). In addition, CD8 T cells in tumor tissues were more activated than those in adjacent normal tissues (Fig. 2b, right top). And no significant difference in T cell activation states was observed between HPV-positive and HPV-negative samples (Fig. 2b right bottom)."

      2) A lot of discussion around "activation" is presented, but there is no evidence to support which genes or gene programs are associated with "activation".

      Thanks for the comments. The activation states of CD8 T cell subpopulations were quantitatively assessed based on the average expression of previously described activation-related gene signatures1 (also shown in Supplementary file 3). More specifically, activation-related gene signatures are as follows: "CD69, CCR7, CD27, BTLA, CD40LG, IL2RA, CD3E, CD47, EOMES, GNLY, GZMA, GZMB, PRF1, IFNG, CD8A, CD8B, CD95L, LAMP1, LAG3, CTLA4, HLA-DRA, TNFRSF4, ICOS, TNFRSF9, TNFRSF18".
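
The scoring described in this response can be illustrated with a minimal sketch. It assumes the activation score is a plain per-cell mean of normalized expression over the listed signature genes; the response does not state the exact procedure, so the function name `activation_score`, the handling of missing genes, and the toy expression values are all hypothetical.

```python
from statistics import mean

# Activation-related signature genes, as listed in the author response.
ACTIVATION_SIGNATURE = [
    "CD69", "CCR7", "CD27", "BTLA", "CD40LG", "IL2RA", "CD3E", "CD47",
    "EOMES", "GNLY", "GZMA", "GZMB", "PRF1", "IFNG", "CD8A", "CD8B",
    "CD95L", "LAMP1", "LAG3", "CTLA4", "HLA-DRA", "TNFRSF4", "ICOS",
    "TNFRSF9", "TNFRSF18",
]


def activation_score(expression, signature=ACTIVATION_SIGNATURE):
    """Return the mean expression over signature genes found in one cell.

    `expression` maps gene symbol -> normalized expression value.
    Genes absent from the profile are skipped (an assumption of this
    sketch); a cell with no signature genes scores 0.0.
    """
    values = [expression[gene] for gene in signature if gene in expression]
    return mean(values) if values else 0.0


# Hypothetical toy profiles, for illustration only (not data from the study).
toy_cells = {
    "cell_A": {"CD69": 2.0, "GZMB": 4.0, "ACTB": 9.0},
    "cell_B": {"CD69": 0.0, "GZMB": 1.0, "ACTB": 8.0},
}

scores = {name: activation_score(profile) for name, profile in toy_cells.items()}
```

Averaging such per-cell scores within each cluster or sample group would then allow the kind of comparison reported in the revised text, although the authors' actual pipeline may differ.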

      3) Line 249: It is unclear why the authors are indicating that TCRseq was used in pseudotime analysis. This type of analysis does not take TCRs into account but rather looks at the proportion of spliced mRNA of individual genes from the DGE data.

      Thank you for the comments; we apologize for this misleading description. As the reviewer points out, pseudotime analysis has nothing to do with TCRseq data. In this part, we separately performed clonality analysis of CD8 T cells based on TCRseq data and pseudotime analysis based on RNAseq data. Shared TCRs were identified among certain cell subclusters, which could partially validate the potential lineage relationships simulated by pseudotime analysis. Therefore, we have corrected the texts as follows to avoid the misunderstanding that TCRseq was used in pseudotime analysis: "Given the clonal accumulation of CD8 T cells was a result of local T cell proliferation and activation in the tumor environment2, we further conducted clonality analysis of CD8 T cells based on TCRseq data." (line 246-248) and "To further investigate their lineage relationships, we performed pseudotime analysis for CD3+ T cells on the basis of transcriptional similarities (Fig. 3j-l, Figure 3-figure supplementary 2d)." (line 277-279).

    2. eLife assessment

      This study provides fundamental insight into the functional impact of CDK4 inhibition on cells in the tumor microenvironment, which is of high importance and interest to the field. The compelling conclusion that proliferative exhausted T cells are associated with response in HPV+ head and neck cancer is supported by the cohort of 14 patients with paired tumor and adjacent normal tissue and rigorous bioinformatic analysis of nearly 50,000 single CD3+ T cell transcriptomes. This work will be of interest to researchers across tumor types and in other immunological fields of study.

    3. Joint Public Review:

      In this study, the authors transcriptomically characterize TIL from head and neck cancers and associate their transcriptional programs with overall survival as a function of HPV positivity. Specifically, they study the impact of CDK4 inhibition on TIL from these tumors. They find an exhausted T cell subset that preferentially expresses CDK4. They then perform some in vitro studies to test the function of exhausted T cells and the impact of CDK4 inhibition on different TIL subsets from head and neck tumors. Understanding the functional impact of different cancer therapies on cells in the TME is of high importance and interest to the field.

      1. Line 215: The authors state that pairing TCRseq with RNAseq reflects the magnitude of TCR signaling. This is absolutely not the case. TCR sequencing does not reflect TCR signaling strength.
      2. A lot of discussion around "activation" is presented, but there is no evidence to support which genes or gene programs are associated with "activation".
      3. Line 249: It is unclear why the authors are indicating that TCRseq was used in pseudotime analysis. This type of analysis does not take TCRs into account but rather looks at the proportion of spliced mRNA of individual genes from the DGE data.
      4. There is no way to know if the differences in proliferation and cell viability shown in Figs. 4a and b, respectively, are meaningful or not. Proper controls or replicates should be provided to fully understand if this difference is biologically meaningful. Likewise, what is the evidence that P-Tex cells are self-renewing rather than expanding?

  2. Jan 2023
    1. eLife assessment

      In this work, the authors provide convincing evidence about the existence of two distinct osteoclast populations with specific expression profiles and properties and show that the probiotic yeast S. boulardii may be useful in managing inflammation-mediated bone loss, including estrogen deprivation-mediated osteoporosis. The reported study aims to bring the concept of heterogeneous osteoclasts into a proof-of-principle therapeutic application, which may mean that the use of probiotics might combat osteoporosis towards a better bone quality than current therapies. The molecular mechanism of how the probiotic yeast S. boulardii treatment acts via the receptors remains obscure since it might act via changes in the gut permeability or by components directly released by the fungus.

    2. Reviewer #1 (Public Review):

      Osteoclasts, giant multinucleated bone-resorbing cells, are crucial regulators of bone homeostasis and pathology. An underestimated aspect of their biology is that they are very heterogeneous, with at least 2 sub-populations (inflammatory osteoclasts and tolerogenic osteoclasts) that exert different actions, especially in the context of inflammatory bone loss. In this report, Madel, Halper (co-first authors), and colleagues present an interesting study investigating this heterogeneity and showing that the probiotic yeast S. boulardii (probably through β-glucans) may be useful in managing inflammation-mediated bone loss, including oestrogen deprivation-mediated osteoporosis, as the authors show in vivo using an OVX mouse model.

      The authors first evaluate the differences in the transcriptional landscape of tolerogenic vs inflammatory osteoclasts with RNAseq, and then they evaluate the differences in miRNA expression between the two. Finding that some of the pathways/genes that vary are related to pattern recognition receptors (PRRs), which specialize in recognizing non-self antigens including those arising from bacteria and yeasts, they ask whether the probiotic yeast S. boulardii could influence the balance between tolerogenic and inflammatory osteoclasts. Indeed, when the authors treat OVX mice, which are characterised by an increase in inflammatory osteoclasts and estrogen deprivation/inflammation-induced bone loss, with the probiotic, the bone loss is prevented and inflammatory osteoclasts are reduced. This challenges the classical way in which osteoclast-mediated bone loss is treated, since specifically targeting the inflammatory ("bad") osteoclasts could allow the "good" osteoclasts to keep working and improve bone health and immunity. Current treatments are not able to distinguish between the two, which can cause a paradoxical degradation in bone health and atypical fractures. The report is therefore potentially very important for the field, and although quite focused on a specific strain, it can pave the way to treating bone diseases with probiotics or specific molecules derived from them, including beta-glucans.

    3. Reviewer #2 (Public Review):

      The authors apply their previously developed concept that osteoclasts exist in at least two flavors, tolerogenic and inflammatory osteoclasts, to the treatment of osteoporosis. They suggest that selectively targeting inflammatory osteoclasts with agonists of pattern recognition receptors (PRRs), which are more highly expressed on inflammatory osteoclasts, attenuates ovariectomy-induced bone loss. The vision would be that the tolerogenic osteoclasts are still functioning, allowing bone remodeling with high bone quality, while the strongly resorbing inflammatory osteoclasts are inhibited. By expression profiling, they detected differentially expressed PRRs and confirmed these by flow cytometry and RT-qPCR. Activation of Tlr2, Dectin-1, and Mincle reduced inflammatory osteoclast generation in vitro and affected their resorptive activity. Dendritic cell-specific deletion of Syk abrogated the differentiation of this osteoclast subset as well. Administration of the yeast Saccharomyces boulardii (S.b.) to mice attenuated trabecular (but not cortical) bone loss and seemed to inhibit the generation of inflammatory osteoclasts in vitro.

      Strength:
      - The expression profiling of very defined in vitro generated osteoclasts, which are somewhat extreme phenotypes, provides a good tool to discern gene signatures at the osteoclast level.
      - The candidate PRRs were evaluated for their expression at the protein level by flow cytometry, and their function was evaluated by loss-of-function studies.
      - The effect of S.b. treatment is striking, and exploiting such probiotic fungi could be an elegant way to treat osteoporosis.

      Weakness:
      - The osteoclasts are generated in vitro in the presence of M-CSF to induce tolerogenic osteoclasts or GM-CSF/IL-4 to generate inflammatory osteoclasts. A demonstration of these cell populations in the S.b.-treated mice in vivo is missing, although this is admittedly challenging. The authors tried to tackle this by analyzing the differentiation potential of bone marrow progenitor cells from S.b.-treated animals, which provides some information.
      - The effect on tolerogenic osteoclasts could have been evaluated further, to establish whether they are not affected at all or whether there are also effects.

      The authors strikingly show that PRR agonists strongly affect GM-CSF/IL-4 progenitor-derived osteoclasts. They show that t-Ocl and i-Ocl differ in their gene signature and convincingly show the differential expression of the PRRs, with the exception of Mincle, which is clearly acknowledged. The molecular mechanism of how S.b. treatment acts via the receptors remains obscure, since it might act via changes in gut permeability or via components directly released by the fungus. The kinase Syk could play a role; at least some in vitro data suggest this.

      Conceptually, the authors tried to translate the knowledge previously generated by the group (published in 2016 and 2020) into a therapeutic approach. If the use of a probiotic fungus were indeed beneficial, this could be a suitable drug with few side effects, much superior to current treatments of osteoporosis.
      For me, an intriguing question arises from this study: why do these i-Ocl express these receptors and thus become so "vulnerable" to agonists that decrease their activity? Possibly as a negative feedback to prevent overshooting reactions?